These pages comprise the “Airlock policy guidelines” for those conducting research on de-identified data from the Genomics England Research Environment, as referred to in the Airlock policy document. Please ensure you read the Airlock Policy and the Participants' consent at Genomics England documentation before attempting to transfer data into or out of the Research Environment.
Participants' consent at Genomics England states that:
‘…although researchers can look at your data and ask questions about it, they can only take away the answers to their questions (their results). They can’t copy or take away any of your individual data’
Genomics England does not allow the re-identification of any participant outside the Research Environment – either from the material alone, or through aggregation with other data available now or in the future. The Airlock systems and processes are therefore designed to prevent identifiable data being released from the Research Environment without the consent of the individual concerned.
By adopting a ‘reading library’ approach to data security, we are able to allow researchers to access a richer dataset than would otherwise be possible. Furthermore, this approach allows continuous evolution as new datasets become available and the longitudinal datasets expand, without requiring any additional applications from the researchers. The Airlock is also designed to prevent other data providers’ data from being released in a manner that breaches their agreements with Genomics England or their study participants.
The Genomics England Research Environment (RE) has been developed with the intention that all data analysis is carried out within it and that the only data to leave it are analytical results. This means that the data in an Airlock export must, under normal circumstances, be the end results of analysis; exports of raw data for analysis outside the RE will generally not be allowed. Exemptions may be granted in exceptional cases where approved work cannot feasibly be done inside the RE. In such cases, the requester must explain fully on the request form why their work cannot feasibly be done inside the RE, and the exemption must be granted by the Airlock Review Manager and the Airlock Review Committee.
The Airlock applies specifically to the Research Environment; the process for extracting data for the purposes of clinical care is different and is described elsewhere.
The Airlock enables material (data, files, tools, etc.) to be moved into or out of the Research Environment in a controlled and supervised manner, facilitating research and discovery while maintaining control of security and access.
The File Transfer/Airlock application is the method for moving files and data through the Airlock.
Please submit a ticket to the Genomics England Service Desk if you require any further assistance.
Rules applied to all transfer requests
- All relevant details of the files to be transferred must be provided with every request.
- GeCIP members wishing to use the Airlock system must be members of a registered project and must provide the RR number of that registered project when making an Airlock request.
- All files transferred may be checked by Genomics England to ensure compliance with the relevant policies. Users will be notified of any files rejected along with the reason for the rejection.
- All files transferred will be checked for viruses and malware and those failing this test will be rejected. It is the responsibility of the requesters to resolve such issues before re-submitting the file for transfer.
- Files requested for transfer are assessed using the following criteria:
  - whether the request aligns with the user’s Access Review Committee (ARC) approval
  - whether the request can clearly be demonstrated to be aligned with a registered project in the Research Environment. Please note that GeCIP members with no registered project cannot export any data via the Airlock. Commercial researchers who have been approved for pre-research but do not have an ARC-approved project will be heavily limited in what they can export via the Airlock.
  - whether the associated project has been registered in the Research Registry for a minimum of 3 months. You will generally not be able to export data for a project until this period has passed, though exceptions are sometimes granted.
  - any data security implications
  - any disclosure risks
  - the technical feasibility and associated cost of the request
  - when importing data, its scientific value to the community of researchers within the Research Environment, and when and how it will be shared
  - when importing data, whether the data importer owns the data and holds the correct consents and approvals.
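As an informal self-check before submitting a request, the three-month registration criterion can be sketched in Python. This is only an illustration: the function names are hypothetical, and counting “3 months” as calendar months from the registration date is an assumption, since the guidelines do not say how the period is measured. Because exceptions are sometimes granted, a negative result does not by itself mean a request will be refused.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month
    (for example, 30 November + 3 months gives the last day of February).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def meets_minimum_registration(registered_on: date, today: date,
                               min_months: int = 3) -> bool:
    """Assumption: the minimum period is counted in calendar months."""
    return today >= add_months(registered_on, min_months)
```

For example, under these assumptions a project registered on 15 January would first pass the check on 15 April of the same year.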
The Airlock Process
Analysed results will be inspected to ensure they cannot be used to disclose the identity of a participant. Checking of statistical output by the Airlock Review Team will be governed by a generalisable set of principles that will guide individual decisions and ensure flexible evaluation of the Genomics England dataset. By using a principles-based approach in which each case is assessed individually, the security of the dataset is maintained by exporting only ‘safe’ data.
The Airlock process is governed by the Airlock Policy, which defines both the process itself and its governance.
A set of Airlock Policy Guidelines presents the rules-of-thumb/principles that will be referenced by both the researcher (during preparation of analysis results) and the output checker (during output-checking).
Summary data still carry a risk of participant identification, and that risk is considerably higher when the data in question are in the public domain. Accordingly, transfer requests that result in public sharing or publication of data (in this context, publication in journal articles and conference abstracts, posters or presentations) will be checked more stringently and must reference the relevant Project within the Research Registry. Any approved Airlock export can only be used for the specific use detailed in the original export request.
While completing a transfer request, please:
Be transparent and precise
Your application should be simple but clear and must cover all the uses of the data. Be specific and unambiguous, and ensure there is consistency across all the fields. Use only the minimum amount of data required to adequately describe the statistical analysis carried out, or the trend in the data you are presenting. Any Airlock approval granted will only apply to the use cases you have specified in the original application. Remember that your application may be made public, so ensure that it can be easily understood – expand acronyms and assume that the application will be read in isolation from other documents.
Consider the safety of the data
All Airlock requests will be considered, in part, according to the risk they pose for participant re-identification. All applicants should consider the safety of the data they are submitting, including when exporting analysis results. A central principle of the Airlock is that only summary data, not individual-level data, will be exported through the Airlock. However, on rare occasions export of individual-level data will be considered. As always, it will be the responsibility of the applicant to justify why its use is needed, why a summary of the data is not feasible (e.g. a case study of a single family represents individual-level data that could not otherwise be summarised), and how disclosure risks have been minimised. Further details on this can be found in the Airlock Policy Guidelines.
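To illustrate the difference between individual-level and summary data, the sketch below collapses a list of per-participant records into per-group counts. The field names (`participant_id`, `diagnosis`) and the records themselves are invented for the example and do not reflect the actual structure of the Genomics England dataset. Summarising in this way does not by itself make an export safe: summary data still carry a risk of participant identification, and every request is assessed individually.

```python
from collections import Counter


def summarise_by_group(records, group_key):
    """Collapse individual-level records into per-group counts.

    Only the grouping field survives; identifiers and all other
    per-participant fields are dropped from the result.
    """
    return dict(Counter(record[group_key] for record in records))


# Hypothetical individual-level records (not suitable for export as-is).
records = [
    {"participant_id": "P001", "diagnosis": "A"},
    {"participant_id": "P002", "diagnosis": "B"},
    {"participant_id": "P003", "diagnosis": "A"},
]

# Summary table: one count per diagnosis group.
summary = summarise_by_group(records, "diagnosis")
```

Here `summary` holds a count for each diagnosis group and contains no participant identifiers.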
Understand the implications
Breach of the Airlock (for example, by taking screenshots of the Research Environment or copying data longhand), or contravention of the policies of the Airlock (for example, using exported materials for uses other than those within the original request), will be considered a serious breach of the GeCIP Rules. Genomics England reserves the right to ban the researcher’s institution, and all its researchers, from accessing the 100,000 Genomes Project dataset and to pursue legal action against both the individual and the institution.