Submitting an R SBATCH Job Script

This tutorial covers how to submit an R batch job to the Aloe supercomputer. A batch job is a job that is submitted as a complete script and runs unattended. This is done through an SBATCH script.

Submitting a Batch Job

Follow these steps to submit an R batch job:

  1. Create the SBATCH script. You can create this script on your workstation and then upload it to the Aloe supercomputer. You can also create this script through the shell. To do this:

    1. Connect to the web portal and log on with your credentials.

    2. Select “System” in the navigation bar, then select “Aloe Shell Access”. You will be prompted for your credentials and a DUO push will be sent to you.

    3. In the command prompt, use the following command to create an SBATCH script called “R_test_script.sh”.

      nano R_test_script.sh

    4. In the file editor, specify the resources, R version, and the Rscript file needed for your computation. Here is an example of an R SBATCH script:

      #!/bin/bash
      #SBATCH -c 1           # number of cores (CPUs per task)
      #SBATCH -N 1           # keep all tasks on the same node
      #SBATCH --mem=120G     # request 120 GB of memory
      #SBATCH -t 0-1         # request 1 hour of walltime (days-hours)

      module load r-4.3.0-gcc-11.2.0-6t5   # load R version 4.3.0

      # Use ONE of the following commands to run R; keep the others commented out:
      R --no-save --quiet < regression.r
      # OR
      # Rscript regression.r
      # OR
      # Rscript regression.r <arguments to be passed into R>

      This job will allocate one core and 120 GB of memory on one node for one hour. It will execute the “regression.r” file with R version 4.3.0.

    5. Save the SBATCH script and exit the file editor.

    6. In the command prompt, use the following command to submit the batch job:

      sbatch R_test_script.sh

    7. Done. Your job has been submitted. Here are some helpful commands to manage your job:
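
As a starting point for managing a submitted job, the sketch below uses standard Slurm commands (squeue, scontrol, scancel). It assumes you pass it the confirmation line that sbatch prints, which has the form "Submitted batch job <id>". The file name manage_job.sh, the helper names extract_job_id and show_job, and the job ID 12345 are hypothetical and used for illustration only.

```shell
#!/bin/bash
# Sketch of common Slurm job-management commands.
# Usage (hypothetical file name and job ID):
#   bash manage_job.sh "Submitted batch job 12345"

# sbatch confirms a submission with "Submitted batch job <id>".
# This helper keeps only the text after the last space, i.e. the ID.
extract_job_id() {
    echo "${1##* }"
}

# Print the status of one job and a reminder of how to cancel it.
show_job() {
    local jobid="$1"
    squeue -u "$USER"            # all of your pending and running jobs
    squeue -j "$jobid"           # only this job
    scontrol show job "$jobid"   # full details: resources, state, node
    echo "To cancel the job, run: scancel $jobid"
}

# Run only when a confirmation line is given and Slurm is available.
if [ -n "${1:-}" ] && command -v squeue >/dev/null 2>&1; then
    show_job "$(extract_job_id "$1")"
fi
```

The script only reports on the job; cancelling is left as a manual step so that running the sketch never stops a job by accident.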

     

How will I know how many and which resources to use?

Learning to use the supercomputer, like learning to use any new technology, takes trial and error. A good starting point is one core for two hours in the interactive RStudio graphical application while you get familiar with the supercomputer. You can also request resources that match your current workstation to create a baseline. Through this trial and error, you will identify which resources improve your computation’s performance. See the “Requesting Resources on Aloe” guide for additional tips.
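
One way to make the trial and error concrete is to compare what a finished job actually used with what it requested. The sketch below is a minimal illustration, not an Aloe-specific tool: the helper mem_efficiency and the figures (a 30 GB peak against the 120 GB requested in the example script) are hypothetical, and the sacct call assumes Slurm accounting is enabled on the cluster.

```shell
#!/bin/bash
# Sketch: compare what a finished job used with what it requested.
# Usage (hypothetical file name and job ID): bash check_usage.sh 12345

# Integer percentage of the requested memory that was actually used.
# Both arguments must be in the same unit (MB here).
mem_efficiency() {
    local used_mb="$1" requested_mb="$2"
    echo $(( used_mb * 100 / requested_mb ))
}

# Hypothetical figures: a job that peaked at 30 GB (30720 MB)
# out of a 120 GB (122880 MB) request.
echo "Memory used: $(mem_efficiency 30720 122880)% of request"

# On the cluster, sacct reports the real figures for a finished job.
# MaxRSS is the peak memory used; ReqMem is the memory requested.
if [ -n "${1:-}" ] && command -v sacct >/dev/null 2>&1; then
    sacct -j "$1" --format=JobID,Elapsed,TotalCPU,MaxRSS,ReqMem,State
fi
```

A low percentage suggests the next submission could request less memory, and an Elapsed time close to the walltime limit suggests requesting more time.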