April-June 2020: User migration to the new HPC Helix
For information on the migration of your account to the new HPC called Helix, please visit the following page: HPC (Helix) Migration 2020. We will be updating our User Guide accordingly.
March-May 2020: Genome migration
We have recently begun migrating all genomes in the Research Environment to a new storage system. The migration is expected to last approximately two months. Each genome will be temporarily unavailable while it is being migrated. In practice, this means that any HPC job trying to access a genome at the time of its migration will fail. In most cases, as each genome will only be unavailable for a couple of minutes, restarting your job will resolve the issue. However, jobs that access multiple genomes will be more severely affected. While this will cause some disruption over the next two months, once the migration is completed, the high-performance compute cluster will run significantly faster.
It is therefore recommended that you consider adding extra file-checking steps to your code so that it fails gracefully if a file is unavailable. If your code is looping over many genome files, you might consider modifying it such that it is able to skip any files that appear to be unavailable, while keeping track of them so that they can be tried again later.
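The skip-and-retry approach described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `process_available` and `process_with_retries`, the `retries` and `wait_seconds` parameters, and their default values are hypothetical, and `process` stands in for whatever your own code does with each genome file.

```python
import os
import time
from typing import Callable, Iterable, List, Tuple


def process_available(
    paths: Iterable[str],
    process: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Run `process` on every readable path; return (done, skipped)."""
    done: List[str] = []
    skipped: List[str] = []
    for path in paths:
        # Skip files that are missing or unreadable at this moment.
        if not (os.path.isfile(path) and os.access(path, os.R_OK)):
            skipped.append(path)
            continue
        try:
            process(path)
        except OSError:
            # The file may become unavailable between the check and the read.
            skipped.append(path)
        else:
            done.append(path)
    return done, skipped


def process_with_retries(
    paths: Iterable[str],
    process: Callable[[str], None],
    retries: int = 3,            # hypothetical default
    wait_seconds: float = 120.0,  # hypothetical default
) -> List[str]:
    """Retry skipped files a few times; return those still unavailable."""
    pending = list(paths)
    for attempt in range(retries + 1):
        _, pending = process_available(pending, process)
        if not pending:
            break
        if attempt < retries:
            # Give the migration time to finish before trying again.
            time.sleep(wait_seconds)
    return pending
```

Any paths returned by `process_with_retries` were still unavailable after the final attempt and can be logged and resubmitted in a later job.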
5th August 2019: Duplicate rows in the cancer_analysis table
There are seven completely duplicated rows in the cancer_analysis table. The interpretation request IDs for these duplicated rows are: 15807-1, 21450-1, 22977-1, 14581-2, 30736-1, 31083-1, 42295-1. Please account for these duplicates in any analysis, for example by removing the repeated rows before counting or joining.
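Because the affected rows are complete duplicates, one way to handle them is to drop exact repeats after exporting the table. The sketch below assumes the rows have already been read into Python as sequences of values; the function name `drop_exact_duplicates` and the two-column example rows are hypothetical and do not reflect the real table layout.

```python
from typing import Iterable, List, Sequence, Tuple


def drop_exact_duplicates(rows: Iterable[Sequence[str]]) -> List[Tuple[str, ...]]:
    """Keep the first copy of each row, preserving the original order."""
    seen = set()
    unique: List[Tuple[str, ...]] = []
    for row in rows:
        key = tuple(row)  # the whole row must match to count as a duplicate
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical rows: (interpretation request ID, some other column).
example_rows = [
    ("15807-1", "value_a"),
    ("15807-1", "value_a"),
    ("21450-1", "value_b"),
]
deduplicated = drop_exact_duplicates(example_rows)
```

Rows that share an interpretation request ID but differ in any other column are kept, since only fully identical rows are treated as duplicates.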
5th August 2019: Participant with incorrect gender in the cancer_analysis table
A participant has been identified with incorrect gender assignment in the cancer_analysis table. The participant has two interpretation request IDs for two different tumour-normal pairings. The IDs are: 14497-1 and 48870-1. As a rule of thumb, please use the latest interpretation request ID (48870-1) in all cases for the correct gender assignment.
31st July 2019: LabKey API
The Rlabkey R package for querying LabKey contains a bug whereby only the first 100,000 rows of a table are imported when using the labkey.selectRows() function (even when the maxRows argument is set to > 100,000).
To work around this issue, please use the labkey.executeSql() function instead. For example, to select the entire sequencing_report table for the Main Programme V7 release, use the following code:
Please see: Using the LabKey API for more SQL examples.
4th April 2019: IGV Genome Browser
The Broad Institute have changed where they host the genome files for IGV (from igvdata.broadinstitute.org/* to https://s3.amazonaws.com/igvdata.broadinstitute.org/*). As the latter is not yet 'white-listed' within the Research Environment, the genome files are not accessible, and an error occurs when you attempt to load them within IGV.
We have manually downloaded the FASTA, RefSeq annotation, chromosome cytoband, and chromosome aliases from the Broad for hg19/GRCh37 and hg38/GRCh38 reference assemblies and have created .genome files from them. These are identical to the default ones present in IGV and can be loaded into IGV using the instructions below.
1. Open the IGV application from the desktop.
2. Click 'Cancel' on the subsequent pop-up box.
3. On the IGV toolbar, navigate to 'Genomes' > 'Load Genome from File...'.
4. In the file browser, navigate to 'Home' > 'public_data_resources'.
5. Navigate to 'IGV'.
6. Select the reference assembly of choice (hg19/GRCh37 or hg38/GRCh38).
7. Click on the corresponding '.genome' file.
8. This will load the selected genome file as a local version.