Your Home directory contains the genomic data and other data resources. It is accessed by clicking the 'Home' application on the desktop. You can also access your Home directory directly through the 'terminal' application.
Home directory structure
The Home directory can be accessed via the Home icon on the desktop.
You have been provided with a Home directory within both the Inuvika Desktop Environment and the Helix HPC. In addition to this we have provided a significant amount of storage in an attached directory for you to store your working data, scripts and results.
If you reach the 10 GB limit set on these Home locations, you are likely to be unable to launch tools, or unable to log into the Research Environment at all, until the data has been cleared out. This may result in the loss of any work or results stored in these locations, as we cannot guarantee that we will be able to migrate the files for you.
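One way to keep an eye on the Home limit is to compare the size reported by du with the 10 GB figure. The following is a minimal sketch, not an official tool: the function names are hypothetical, the limit is assumed to mean 10 GiB, and how the Research Environment actually measures usage is not stated here, so treat the result as an estimate.

```shell
#!/usr/bin/env bash
# Sketch only: estimate how much of the 10 GB Home limit a directory uses.
# Assumptions (not from the documentation): the limit is treated as 10 GiB,
# and both function names below are hypothetical.

LIMIT_KB=$((10 * 1024 * 1024))   # 10 GiB expressed in 1 KiB blocks

# Print the integer percentage of the limit represented by a size in KiB.
percent_of_limit() {
    echo $(( $1 * 100 / LIMIT_KB ))
}

# Print the usage of a directory (default: $HOME) against the limit.
report_usage() {
    local dir="${1:-$HOME}"
    local used_kb
    # du -s summarises the whole tree; -k reports the size in 1 KiB blocks.
    used_kb=$(du -sk "$dir" | cut -f1)
    echo "$dir: ${used_kb} KiB used, $(percent_of_limit "$used_kb")% of the limit"
}
```

Running report_usage with no arguments checks $HOME. By default du does not follow symbolic links inside the directory, so linked folders should not be counted towards the total.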
For academic users, your data storage area will be located within ~/re_gecip (/re_gecip within Helix); for Discovery Forum members this will be ~/discovery_forum (/discovery_forum within Helix). The relevant locations are accessible from both the HPC and the Inuvika Desktop Environment. We have had to limit the size of the Home directories, both in the Inuvika Desktop Environment and the HPC, so the Home directories should be reserved for software tools that need this space to store their configurations. For more information on these locations, please review the section below that is relevant to you.
In your home directory there are links to a number of important folders. These folders are:
|Folder||Description|
|genomes||All the genomic data provided by our sequence partner Illumina (see Genomic Data from Illumina).|
|gel_data_resources||Outputs from the Genomics England internal pipelines (see Genomics England Data).|
|public_data_resources||Public data resources such as 1,000 genomes data, reference genomes, example scripts etc (see Public data resources).|
|specific shared folder||Backed-up working space for each group (e.g. re_gecip). There are several petabytes of storage space for use and collaboration.|
The genomes, gel_data_resources, and public_data_resources folders are read-only, whilst the specific group shared folder is read-write and the ideal place to store all your work.
All of these folders are also mounted on the HPC at root ( / ) so you can access them when running programs on the HPC. Your home folder in the Research Environment desktop is NOT available on the HPC. Please use the specific group share instead.
If you attempt to write anything to public_data_resources you will get a 'permission denied' error. Please note that this also happens if you run gunzip on a file there without redirecting the output, because by default gunzip writes the decompressed file alongside the original. Consider using the following command instead:
gunzip -c file_name > /path/to/output.file
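The gunzip -c pattern above can be tried end to end with stand-in folders. This is an illustrative sketch: the two temporary directories and the example file are created only for the demonstration and do not correspond to real Research Environment paths.

```shell
#!/usr/bin/env bash
# Sketch: decompress a file held in a read-only folder without writing
# into that folder. All paths are temporary stand-ins created here.

src_dir=$(mktemp -d)   # stands in for a read-only folder
out_dir=$(mktemp -d)   # stands in for your writable working space

printf 'line one\nline two\n' > "$src_dir/example.txt"
gzip "$src_dir/example.txt"      # replaces the file with example.txt.gz
chmod a-w "$src_dir"             # remove write permission on the folder

# -c writes the decompressed data to standard output and leaves the
# .gz file untouched; the redirect chooses where the copy is saved.
gunzip -c "$src_dir/example.txt.gz" > "$out_dir/example.txt"

# The same stream can be piped straight into another tool, so no
# decompressed copy needs to be stored at all.
line_count=$(gunzip -c "$src_dir/example.txt.gz" | wc -l | tr -d ' ')
```

Piping the output straight into the next tool is worth considering for large files, since it avoids using working space for a decompressed copy.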
GeCIP Shared Working Space
If you are a member of GeCIP, you will be able to read and write to the 're_gecip' folder. Use this folder as your working space. Within this folder are sub-folders categorised by GeCIP domain (e.g. neurology, cardiovascular, skin). In the Research Environment, you will be able to 'see' all of these sub-folders; however, you will only have read-write access to the sub-folders of the domains that you are a member of. The re_gecip folder is mounted on the HPC, so any files and folders you save here will be accessible from the HPC. We recommend saving all your work to your domain folder within the re_gecip folder, as you have a much larger storage allocation there. How you organise your domain's shared working space is entirely up to you!
Discovery Forum Shared Working Space
Each Discovery Forum group will have its own specific shared folder, which should be used as the shared working space. This folder has several petabytes of storage available and is mounted on the HPC at root ( / ). Access to each folder is restricted to the members of that particular Discovery Forum group.
Please be aware that some tools within the Research Environment will require the production of transient or temporary files. The configuration of the Helix HPC means that the /tmp location on the cluster can rapidly become unavailable and severely impact other users of the resource. You should create a directory at the following location:
and set the location for this temporary file directory in your .bashrc or as an environment variable within your script:
Our recommendation would be to set this in the .bashrc so that the environment variable is generally accessible to your profile. Using a private scratch location will ensure that your temporary files remain both accessible and private. Please note that, as the scratch location is designed to be used for the temporary storage of transient and intermediary files needed by analyses, we are not able to guarantee that these files will be covered by the Research Environment's backup processes or would be recoverable beyond 1 month. We strongly advise that the location be reviewed prior to launching new analyses to ensure that any files that are no longer required are cleared from the location.
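The temporary-directory setup described above could look something like the following. This is a hedged sketch under stated assumptions: the actual scratch path and variable name are not shown in this text, so the path is a placeholder you must replace, use_private_tmp is a hypothetical helper name, and TMPDIR is assumed because it is the conventional POSIX variable for temporary files. Some tools have their own temporary-directory setting and may not honour TMPDIR.

```shell
#!/usr/bin/env bash
# Sketch only. The real scratch path and variable name are not shown in
# this text: TMPDIR and every path below are assumptions.

# Create a private directory and export it as TMPDIR.
# use_private_tmp is a hypothetical helper name.
use_private_tmp() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir"        # owner-only access
    export TMPDIR="$dir"
}

# Example for ~/.bashrc; replace the placeholder with your real scratch path:
# use_private_tmp "/path/to/your/scratch/tmp"
```

Placing the call in .bashrc applies the setting to every new shell, whereas calling it at the top of a single script limits the change to that script.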