

R and RStudio are available within the Research Environment. You can use the latest version of R or select an earlier version. You can also install R packages from both CRAN and BioConductor using the internal mirror.

Follow the steps below to configure your environment to install R packages. 



Selecting a version of R to use

Default version of R

The default installation of R, including the R used by RStudio on the Desktop, is version 4.0.3 (Sept 2021). You are free to use this default R and RStudio, but since we migrated to AWS (Sept 2021) it does not have all packages available. If you wish to use the pre-installed packages, use the approach below and manually load R v4.0.3 (module load R/4.0.3).

Specifying another version of R

To use a specific version of R in RStudio, open the terminal app on the Desktop and enter the following commands:

module avail R/
module load R/4.0.2 #select your version here
rstudio

The first command lists all available versions of R. The second loads R 4.0.2, and the third launches RStudio using that version.

This is important because different libraries are available for the different versions of R. For more information on loading and installing R packages, see "Installing R packages from CRAN" below.
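
To confirm which version is active before launching RStudio, the commands above can be wrapped in a small shell function. This is only a sketch: use_r_version is a hypothetical helper name, and it assumes the module command is available in your terminal session, as it is on the Desktop.

```shell
# Hypothetical helper: load a given R module and print the active R version.
# Assumes the `module` command from the Research Environment is available.
use_r_version() {
    version="$1"
    if [ -z "$version" ]; then
        echo "usage: use_r_version <version>, e.g. use_r_version 4.0.2" >&2
        return 2
    fi
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this in a Research Environment terminal" >&2
        return 1
    fi
    # Load the requested module, then show the first line of R's version banner.
    module load "R/${version}" && R --version | head -n 1
}
```

Calling use_r_version 4.0.2 and then rstudio in the same terminal should start RStudio with that version.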


Configuration of R

Because the Research Environment and the HPC are closed environments, users have to perform a small number of steps to configure their R instances correctly. This is required to access resources such as our internal CRAN mirror and Bioconductor, and for other rerouting.

Please follow the steps below to configure your R:

  1. Open the terminal application from the Desktop in the Research Environment.
  2. Type (or copy and paste) the following line into the terminal:

    cp -a ~/gel_data_resources/example_config_files/Inuvika/. ./
  3. Done!


Please note that the command prints a warning message. This is expected and normal: it relates to the timestamps of the original files being copied from a mounted file system. You will not see this warning in the Helix configuration.

Contents of the added files

For the configuration of R on the Research Environment, three files are added. The contents are displayed here for reference:

.Renviron
no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com,.gel.zone"

.Rprofile
myrepo = getOption("repos")
myrepo["CRAN"] = "https://artifactory.gel.zone/artifactory/cran"
options(repos = myrepo)
rm(myrepo)

.netrc
machine labkey-embassy.gel.zone
login yourusername
password yourPasswordHere
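
After copying, it can be useful to check that the files listed above are actually in place. The following is a minimal sketch, not part of the official setup: check_r_config is a hypothetical helper name, and it assumes the files were copied into your home directory (that is, the cp command was run from there).

```shell
# Hypothetical helper: report which R configuration files exist in a directory.
# Usage: check_r_config [dir] [file ...]
# With no file names given, it checks .Renviron, .Rprofile and .netrc.
check_r_config() {
    dir="${1:-$HOME}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    if [ "$#" -eq 0 ]; then
        set -- .Renviron .Rprofile .netrc
    fi
    missing=0
    for f in "$@"; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    # Exit status is 0 only when every file was found.
    return "$missing"
}
```

Run check_r_config "$HOME" to check the default set, or pass file names after the directory to check a different set.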


Configuration of R on Helix

Please note that the above configures your R instances only for the Research Environment itself. If you wish to set up your R instances on Helix, please follow the steps below.

  1. Open the terminal application from the Desktop in the Research Environment.
  2. Login to the HPC (see https://research-help.genomicsengland.co.uk/display/GERE/4.+In-Depth+Guide+to+HPC+Usage)
  3. Type (or copy and paste) the following line into the terminal:

    cp -a /gel_data_resources/example_config_files/Helix/. ./
  4. Done!

Please note that the command used for Helix is slightly different (the source path starts with / rather than ~/) and refers to different files and folders. This is due to how the file systems are mounted on the HPC compared with the Research Environment sessions. The .netrc file remains the same, but the .Renviron file is different.


Contents of the added files

For the configuration of R on Helix, two files are added. The contents are displayed here for reference:

.Renviron
http_proxy=http://pfsense.int.corp.gel.ac:3128
ftp_proxy=http://pfsense.int.corp.gel.ac:3128
rsync_proxy=http://pfsense.int.corp.gel.ac:3128
https_proxy=http://pfsense.int.corp.gel.ac:3128
no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com,.gel.zone,.cluster


.netrc
machine labkey-embassy.gel.zone
login yourusername
password yourPasswordHere
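
The .netrc file holds a password in plain text, so once it contains your credentials it may be worth restricting its permissions. The sketch below is a general Unix precaution rather than a step documented for this environment; secure_netrc is a hypothetical helper name, and the default path assumes the file is in your home directory.

```shell
# Hypothetical helper: make a .netrc file readable and writable by its owner only.
secure_netrc() {
    netrc_file="${1:-$HOME/.netrc}"
    if [ ! -f "$netrc_file" ]; then
        echo "no such file: $netrc_file" >&2
        return 1
    fi
    chmod 600 "$netrc_file"
    # Show the resulting permissions so the change can be confirmed.
    ls -l "$netrc_file"
}
```

Calling secure_netrc with no argument applies the change to ~/.netrc.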


Installing R packages from CRAN

You can install R packages yourself within the Research Environment from CRAN, as we have an internal mirror. To ensure that your R sessions can reach this CRAN mirror, please follow the instructions in "Configuration of R" above. Please also note that you can only install R packages from the Desktop environment. You cannot install R packages directly on the HPC. However, we do already have various packages pre-installed. Please see the "Loading R Packages" section below.

Loading R Packages

The Inuvika Desktop Environment currently defaults to R version 4.0.3, which is loaded automatically when you launch RStudio. Because the Research Environment is closed, we have tried to provide a range of packages "out of the box" so that you can progress your work. These system packages do not need to be installed and can simply be loaded within a script with the command library(library_name) or by selecting the appropriate package in the "Packages" tab in RStudio.

If the package is not available, use the approach outlined here: https://research-help.genomicsengland.co.uk/display/GERE/Loading+R+packages+when+versions+are+not+synchronized

If you previously installed the package but it now throws an error, the following page may help resolve the issue: R package "CURL_OPENSSL_3" not found

If the package of your interest is not available under https://artifactory.gel.zone/artifactory/cran/src/contrib, please raise a ticket at the Genomics England Service Desk so it can be installed. 

Finally, please note that the above approaches may not resolve dependencies, and each dependency would need to be installed using the same approach. If there are many dependencies that would need to be resolved, please raise a ticket at the Genomics England Service Desk so we can install the primary package of interest.


Installation from the Desktop environment

The default R installation within the Research Environment Desktop contains just the base packages. If you want to install packages yourself, you will have to follow the steps outlined above in "Configuration of R". We recommend installing your packages in your /re_gecip/yourDomain/ or /re_df/yourDomain/ folders. One of the benefits of doing so is that the packages will also be accessible to your scripts that run on the HPC.

Afterwards, you will be able to install packages accordingly:

  1. Make a folder where you want to store your R packages, for example: ~/re_gecip/yourDomain/R_packages

  2. Install the package and specify the installation path with lib: install.packages("<package_name>", lib="~/re_gecip/yourDomain/R_packages")

  3. Load the library from the same path with lib.loc: library(<package_name>, lib.loc="~/re_gecip/yourDomain/R_packages")
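
The three steps above can also be run from the terminal. The following is a sketch under stated assumptions rather than an official procedure: prepare_r_lib and install_r_package are hypothetical helper names, Rscript is assumed to be on the PATH (for example after module load R/<version>), and the package is assumed to be available on the internal CRAN mirror.

```shell
# Hypothetical helpers; the function names are illustrative only.

# Create the package folder if needed and check that it is writable.
prepare_r_lib() {
    lib_dir="$1"
    mkdir -p "$lib_dir" && [ -w "$lib_dir" ]
}

# Install a package into lib_dir, then load it from there as a check.
# The package name and folder are passed to R as command-line arguments.
install_r_package() {
    pkg="$1"
    lib_dir="$2"
    prepare_r_lib "$lib_dir" || return 1
    Rscript -e 'args <- commandArgs(trailingOnly = TRUE)' \
            -e 'install.packages(args[1], lib = args[2])' \
            -e 'library(args[1], lib.loc = args[2], character.only = TRUE)' \
            "$pkg" "$lib_dir"
}
```

A call would look like install_r_package <package_name> ~/re_gecip/yourDomain/R_packages. The final library() call fails if the installation did not succeed, which gives the function a non-zero exit status.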


All R packages that are located on GitHub require Genomics England admins to install them. Please submit a service desk ticket if you require this.

Loading from the HPC environment

You can only install R packages from the Desktop environment. If you need to use these packages on the HPC environment then you will need to install the packages to a folder on the Desktop that is writeable and mounted on the HPC. If you are in GeCIP, then your specific shared folder is called /re_gecip/ and for Discovery Forum members this will be placed in /re_df/.

To load a package that you have already installed from the HPC environment, you can use the following command: library(<package_name>, lib.loc="/re_gecip/yourDomain/R_packages"). Notice the leading / in the HPC environment compared to ~/ in the Desktop environment.
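
Because the same folder is reached through ~/re_gecip on the Desktop and /re_gecip on the HPC, a script that runs in both places needs to pick whichever path exists. A minimal sketch, where find_r_lib is a hypothetical helper name and yourDomain is the placeholder used on this page:

```shell
# Hypothetical helper: print the first candidate directory that exists.
find_r_lib() {
    for candidate in "$@"; do
        if [ -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no R package folder found" >&2
    return 1
}

# Example call (paths use the yourDomain placeholder from this page):
# R_LIB_DIR=$(find_r_lib "/re_gecip/yourDomain/R_packages" "$HOME/re_gecip/yourDomain/R_packages")
```

The resulting path can then be passed to library() through lib.loc, so the same script works in either environment.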

Installing and configuring packages from BioConductor

You can also install BioConductor packages from within the Research Environment after the one-off configuration shown in "Configuration of R". As with CRAN packages, install them to a shared folder that is also mounted on the HPC (such as 're_gecip'). Note: As of R 3.6.0, BioConductor packages are installed through BiocManager (see below).

BioConductor for R versions up to 3.5.1

Open R (or RStudio) and run the following

BioConductor for R versions 3.6.0 and later (Recommended)

Open R (or RStudio) and run the following:

  • library("BiocManager")
  • BiocManager::install("<package_name>")
  • library(<package_name>)
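
To install a BioConductor package into a shared folder from the terminal, the commands above can be combined with a lib argument, which BiocManager::install() forwards to install.packages(). This is a sketch only: install_bioc_package is a hypothetical helper name, and it assumes that Rscript is on the PATH, that BiocManager is already installed, and that the internal mirror has been configured as described in "Configuration of R".

```shell
# Hypothetical helper: install a BioConductor package into a chosen folder
# and load it from there as a check.
install_bioc_package() {
    pkg="$1"
    lib_dir="$2"
    if [ -z "$pkg" ] || [ -z "$lib_dir" ]; then
        echo "usage: install_bioc_package <package_name> <lib_dir>" >&2
        return 2
    fi
    # update = FALSE and ask = FALSE avoid interactive prompts about other packages.
    Rscript -e 'args <- commandArgs(trailingOnly = TRUE)' \
            -e 'BiocManager::install(args[1], lib = args[2], update = FALSE, ask = FALSE)' \
            -e 'library(args[1], lib.loc = args[2], character.only = TRUE)' \
            "$pkg" "$lib_dir"
}
```

A call would look like install_bioc_package <package_name> ~/re_gecip/yourDomain/R_packages.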

Plotting in R on the HPC

When R is run on the HPC as a module, it will not be able to output plots. You might see errors such as: Unable to start device PNG, Unable to open connection to X11 display.

This can be solved by running R inside an X Virtual Frame Buffer. The following is an example of how to do this:

module load R/3.5.1
xvfb-run -a R

If you do not want to type xvfb-run -a R each time, you can set up an alias in your .bashrc file that will do this for you. Add the following line to your .bashrc file:

alias R='xvfb-run -a R'

A quick example of saving an R plot to disk:

png("plot.png")
plot(1)
dev.off()

This will create a .png file called plot.png in the current working directory with your data plotted.
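
Aliases defined in .bashrc are generally not expanded in non-interactive shell scripts, so a batch job should call xvfb-run explicitly. The sketch below wraps the plotting example in a shell function. plot_headless is a hypothetical helper name, and it assumes that xvfb-run and Rscript are both on the PATH after an R module has been loaded.

```shell
# Hypothetical helper: write a test plot to a PNG file without a display.
plot_headless() {
    out_file="${1:-plot.png}"
    for tool in xvfb-run Rscript; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            return 1
        fi
    done
    # Run the same png()/plot()/dev.off() sequence inside a virtual frame buffer.
    xvfb-run -a Rscript -e 'args <- commandArgs(trailingOnly = TRUE)' \
                        -e 'png(args[1]); plot(1); dev.off()' \
                        "$out_file"
}
```

Calling plot_headless results.png from a job script should write results.png to the current working directory.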

