R and RStudio are available within the Research Environment. You can use the latest version of R or specify a previous version if you prefer. You can also install R packages from both CRAN and BioConductor using the internal mirror.
Follow the steps below to configure your environment to install R packages.
Selecting a version of R to use
Default version of R
The default installation of R, both standalone and within RStudio on the Desktop, is version 4.0.3 (Sept 2021). You are free to use this default, but since we migrated to AWS (Sept 2021) it does not have all packages available. If you wish to use the pre-installed packages, use the approach below and manually load R v4.0.3 (module load R/4.0.3).
Specifying another version of R
To use a specific version of R in RStudio, open the terminal app on the Desktop and enter commands that first scan for all available versions of R and then load RStudio using the version you want, for example R 4.0.2.
Choosing the version matters, as different libraries are available for different versions of R. For more information on loading and installing R packages, see "Installing R packages from CRAN".
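The version selection described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact commands from the original page: it assumes the Environment Modules `module` command is available in the Desktop terminal (the text above already uses `module load R/4.0.3`) and that RStudio is started with a command named `rstudio`, which is a guess. The function name `start_rstudio_with` is hypothetical.

```shell
# Sketch only. Assumptions: `module` is available in this terminal and
# RStudio is launched with `rstudio` (the launcher name is a guess).
R_VERSION="4.0.2"   # the example version named in the text

start_rstudio_with() {
    version="$1"
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this in a Research Environment terminal" >&2
        return 1
    fi
    module avail R                # list the R versions that can be loaded
    module load "R/${version}"    # same form as `module load R/4.0.3`
    rstudio &                     # assumption: start RStudio from this shell
}

# Usage: start_rstudio_with "$R_VERSION"
```

Starting RStudio from the same terminal session is what lets it pick up the R version loaded by the module.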
Configuration of R
Because the Research Environment and the HPC are closed environments, you need to perform a small number of steps to configure your R instances correctly. This is required to reach resources such as our internal CRAN mirror and Bioconductor, and to apply other rerouting.
Please follow the steps below to configure R:
- Open the terminal application from the Desktop in the Research Environment.
- Type (or copy and paste) the following line into the terminal:
cp -a ~/gel_data_resources/example_config_files/Inuvika/. ./
- Done!
Please note that the command produces a warning message. This is expected and normal: it only means that the timestamps of the original files were copied, because they come from a mounted file system. You will not see this warning when configuring Helix.
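As a hedged sketch, the copy step can be wrapped in a small helper that checks the source folder exists before copying and then lists what was copied. The function name `copy_r_config` is hypothetical, and defaulting the destination to your home directory is an assumption (the original command copies into the current directory, which is presumed to be home when the terminal opens).

```shell
# Sketch only: copy_r_config is a hypothetical helper, not a provided tool.
copy_r_config() {
    src="$1"
    dest="${2:-$HOME}"   # assumption: the config files belong in $HOME
    if [ ! -d "$src" ]; then
        echo "Config folder not found: $src" >&2
        return 1
    fi
    # The trailing "/." copies the folder's contents, including dotfiles;
    # -a preserves timestamps and permissions where the file system allows it.
    cp -a "$src"/. "$dest"/
    # List the names that were copied, so hidden files are visible too
    ls -A "$src"
}

# Usage on the Desktop (path taken from the step above):
#   copy_r_config ~/gel_data_resources/example_config_files/Inuvika
```

Listing with `ls -A` is useful here because the copied files are dotfiles, which a plain `ls` would not show.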
Contents of the added files
For the configuration of R on the Research Environment, three files are added. The contents are displayed here for reference:
Configuration of R on Helix
Please note that the above configures your R instances only for the Research Environment itself. If you wish to set up your R instances on Helix, please follow the steps below.
- Open the terminal application from the Desktop in the Research Environment.
- Log in to the HPC (see https://research-help.genomicsengland.co.uk/display/GERE/4.+In-Depth+Guide+to+HPC+Usage)
- Type (or copy and paste) the following line into the terminal:
cp -a /gel_data_resources/example_config_files/Helix/. ./
- Done!
Please note that the command used for Helix is slightly different (it has no leading ~) and refers to different files and folders. This is due to how the file systems are mounted on the HPC compared with the Research Environment sessions. The .netrc files remain the same; however, the .Renviron file will be different.
Contents of the added files
For the configuration of R on Helix, three files are added. The contents are displayed here for reference:
Installing R packages from CRAN
You can install R packages from CRAN yourself within the Research Environment, as we have an internal mirror. To ensure that your R sessions can reach this mirror, please follow the instructions in "Configuration of R". Please also note that you can only install R packages from the Desktop environment; you cannot install them directly on the HPC. However, various packages are already pre-installed. Please see the "Loading R Packages" section below.
Loading R Packages
The Inuvika Desktop Environment currently defaults to R version 4.0.3, which is loaded automatically when you launch RStudio. Because the Research Environment is a closed environment, we have tried to provide a range of packages "out of the box" to help you progress your work. These System Packages do not need to be installed and can simply be loaded within a script with the command library(library_name), or by selecting the appropriate package in the "Packages" tab in RStudio.
If the package is not available, it is best to use the approach outlined here: https://research-help.genomicsengland.co.uk/display/GERE/Loading+R+packages+when+versions+are+not+synchronized
If you had previously installed the package but it now throws an error, see whether this page helps resolve the issue: R package "CURL_OPENSSL_3" not found
If the package of your interest is not available under https://artifactory.gel.zone/artifactory/cran/src/contrib, please raise a ticket at the Genomics England Service Desk so it can be installed.
Finally, please note that the above approaches may not resolve dependencies, and each dependency would need to be installed using the same approach. If there are many dependencies that would need to be resolved, please raise a ticket at the Genomics England Service Desk so we can install the primary package of interest.
Installation from the Desktop environment
The default R installation within the Research Environment Desktop contains just the base packages. If you want to install packages yourself, you will have to follow the steps outlined above in "Configuration of R". We always recommend installing your packages in your /re_gecip/yourDomain/ or /re_df/yourDomain/ folders. There are various benefits to doing so, one of which is that the packages will also be accessible to your scripts that run on the HPC.
Afterwards, you will be able to install packages accordingly:
Make a folder where you want to store your R packages, for example:
~/re_gecip/yourDomain/R_packages
Install the package and specify the installation path with lib:
install.packages("<package_name>", lib="~/re_gecip/yourDomain/R_packages")
Load the package from the same folder with lib.loc:
library(<package_name>, lib.loc="~/re_gecip/yourDomain/R_packages")
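The three steps above can be sketched from the Desktop terminal as follows. This is a minimal illustration, assuming `Rscript` is on your PATH (for example after `module load R/4.0.3`) and that "Configuration of R" has been completed so R can reach the internal mirror. The helper names `prepare_r_lib` and `install_r_package` are hypothetical.

```shell
# Sketch only: helper names are hypothetical.
prepare_r_lib() {
    lib_dir="$1"
    # Create the library folder if needed and print its path
    mkdir -p "$lib_dir" && printf '%s\n' "$lib_dir"
}

install_r_package() {
    pkg="$1"
    lib_dir="$2"
    prepare_r_lib "$lib_dir" >/dev/null || return 1
    # Pass the values through the environment to avoid quoting problems.
    # install.packages() uses whichever repository your R configuration sets.
    PKG="$pkg" LIB="$lib_dir" Rscript \
        -e 'pkg <- Sys.getenv("PKG"); lib <- Sys.getenv("LIB")' \
        -e 'install.packages(pkg, lib = lib)' \
        -e 'library(pkg, lib.loc = lib, character.only = TRUE)'
}

# Usage (replace yourDomain and <package_name> with your own values):
#   install_r_package "<package_name>" "$HOME/re_gecip/yourDomain/R_packages"
```

The final `library()` call is only there to confirm that the package loads from the folder it was just installed into.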
All R packages that are located on GitHub require Genomics England admins to install them. Please submit a service desk ticket if you require this.
Loading from the HPC environment
You can only install R packages from the Desktop environment. If you need to use these packages on the HPC, install them from the Desktop into a folder that is writeable and mounted on the HPC. For GeCIP members this shared folder is under /re_gecip/, and for Discovery Forum members it is under /re_df/.
To load a pre-installed R package from the HPC environment, use the following command: library(<package_name>, lib.loc="/re_gecip/yourDomain/R_packages"). Notice the leading / in the HPC environment compared to ~/ in the Desktop environment.
Installing and configuring packages from BioConductor
You can also install BioConductor packages from within the Research Environment after a one-off configuration, as shown in "Configuration of R". Follow the same setup as for CRAN packages by installing them to a shared folder that is mounted on the HPC (such as re_gecip). Note: as of R version 3.6.0, BioConductor packages are installed through BiocManager (see below).
BioConductor for R versions up to 3.5.1
Open R (or RStudio) and run the following:
- source("https://bioconductor.org/biocLite.R")
- biocLite("<package_name>")
- library(<package_name>)
BioConductor for R versions 3.6.0 and later (Recommended)
Open R (or RStudio) and run the following:
- library("BiocManager")
- BiocManager::install("<package_name>")
- library(<package_name>)
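To make the version rule above concrete, here is a small sketch that reports which installer applies to a given R version. It treats anything below 3.6.0 as the biocLite case, which is an assumption for versions between 3.5.1 and 3.6.0 that the text does not cover. The function name `bioc_installer_for` is hypothetical, and `sort -V` assumes GNU coreutils.

```shell
# Sketch only: bioc_installer_for is a hypothetical helper.
bioc_installer_for() {
    r_version="$1"
    # sort -V orders version strings numerically; if 3.6.0 comes first
    # (or the two are equal), the given version is 3.6.0 or later.
    lowest=$(printf '%s\n%s\n' "3.6.0" "$r_version" | sort -V | head -n 1)
    if [ "$lowest" = "3.6.0" ]; then
        echo "BiocManager"
    else
        echo "biocLite"
    fi
}

# Usage with the R currently on your PATH:
#   bioc_installer_for "$(Rscript -e 'cat(as.character(getRversion()))')"
```

Comparing with `sort -V` rather than as plain strings avoids mis-ordering versions such as 3.10.0 and 3.6.0.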
Plotting in R on the HPC
When R is run on the HPC as a module, it will not be able to output plots. You might see errors such as "Unable to start device PNG" or "Unable to open connection to X11 display".
This can be solved by running R inside an X Virtual Frame Buffer, for example:
xvfb-run R
If you do not want to type xvfb-run R each time, you can add an alias to your .bashrc file that does this for you.
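One possible way to set up such an alias is sketched below. The exact alias line from the original page is not preserved, so `alias R='xvfb-run R'` is an assumption, and the helper name `add_xvfb_alias` is hypothetical. The helper only appends the line if it is not already present, so it is safe to run more than once.

```shell
# Sketch only: the alias text is an assumption, not the page's original line.
add_xvfb_alias() {
    rc_file="${1:-$HOME/.bashrc}"
    alias_line="alias R='xvfb-run R'"
    touch "$rc_file"
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "$alias_line" "$rc_file"; then
        printf '%s\n' "$alias_line" >> "$rc_file"
    fi
}

# Usage: add_xvfb_alias            # edits ~/.bashrc
# Then open a new terminal, or run: source ~/.bashrc
```

Note that an alias only affects interactive shells, so scripts submitted to the HPC would still need to call `xvfb-run` explicitly.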
With this in place you can save an R plot to disk. A quick example would create a .png file called plot.png in the current working directory with your data plotted.
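A minimal sketch of saving a plot to disk on the HPC is shown below. The original example is not preserved, so the plotted data here is made up for illustration and the helper name `plot_on_hpc` is hypothetical. It assumes `xvfb-run` and `Rscript` are both on your PATH, for example after `module load R/4.0.3`.

```shell
# Sketch only: plot_on_hpc is hypothetical and the plotted data is illustrative.
plot_on_hpc() {
    out_file="${1:-plot.png}"
    if ! command -v xvfb-run >/dev/null 2>&1 || ! command -v Rscript >/dev/null 2>&1; then
        echo "xvfb-run or Rscript not found; load R first (for example: module load R/4.0.3)" >&2
        return 1
    fi
    # Run R inside a virtual frame buffer so the png device can start
    OUT_FILE="$out_file" xvfb-run Rscript \
        -e 'png(Sys.getenv("OUT_FILE"))' \
        -e 'plot(1:10, (1:10)^2, main = "Example plot")' \
        -e 'dev.off()'
}

# Usage: plot_on_hpc            # writes plot.png to the current directory
```

Calling `dev.off()` matters: the file is only completely written once the graphics device is closed.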