Creating a job array SBATCH script
This document provides a minimal, viable SBATCH script that uses job arrays to dispatch numerous sub-jobs from a single SBATCH script and input file. Job arrays can help parallelize a wide variety of workloads; this is merely one method of parallelizing work whose input is highly tabular.
This specific job uses a mapping file, input_data, which correlates each sub-job to its respective input parameters.
This tutorial will use three (3) separate files in your $HOME directory:

jobarray.sh (the SBATCH script, which defines the resource allocation and the sub-job script)
input_data (a simple, text-only input file, in which each individual line translates to one sub-job)
the_work.sh (a bash shell script, which defines all the actual work needing to be done)
jobarray.sh
This is the script that is submitted via sbatch jobarray.sh. Its successful execution will spawn 20 separate instances of the_work.sh, with a maximum of 10 jobs running concurrently.
#!/bin/bash
#SBATCH --job-name=array
#SBATCH --array=1-20%10 #20 total sub-jobs, at most 10 running concurrently
#SBATCH --time=0-00:00:10 #upper-bound time limit for each sub-job to finish
#SBATCH --partition=general
#SBATCH --qos=public
#SBATCH --ntasks=1 #each sub-job is a single task
srun -n 1 ./the_work.sh $SLURM_ARRAY_TASK_ID
Note that job arrays do not, and are not meant to, guarantee that work completes in a reliable order. The key assumption behind job arrays is that all sub-jobs are fully independent of each other, so the order in which they complete does not matter.
If subsequent processing steps do need to occur in order, consider job dependencies.
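For example, a post-processing job can be chained after the entire array using sbatch's dependency flags (a sketch; merge_results.sh is a hypothetical post-processing script):

```shell
# Submit the array and capture its job ID with --parsable, then queue a
# follow-up job that starts only after every array task finishes successfully.
jid=$(sbatch --parsable jobarray.sh)
sbatch --dependency=afterok:${jid} merge_results.sh
```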
With --array=1-20, we expect the compute work to be invoked on the backend as follows:
./the_work.sh 1
./the_work.sh 2
./the_work.sh 3
./the_work.sh 4
...
./the_work.sh 20
input_data
The following sample data shows latitude and longitude pairs, one per line. Based on this example, only lines 1-20 will be computed; lines 21-24 are omitted because they fall outside the range given to the --array directive.
25.78642, 20.40898
-74.72462, -160.2885
54.37842, 146.27341
-11.2249, -4.36196
30.70403, 97.13326
-54.3891, -88.66462
-63.55478, 136.06104
36.98927, -25.04504
42.28435, 125.48806
-39.07869, -58.36421
-39.10534, 162.02482
-12.08924, -117.58317
87.46025, 171.04233
-63.30626, 120.42349
62.53605, -92.54694
-83.28753, 117.9856
32.16439, -22.9092
81.5248, 151.89049
77.69849, 112.53154
-67.29104, -18.73863
61.78938, -120.69627
-37.00708, -140.53009
45.08802, 66.40128
-24.61395, 28.29682
the_work.sh
This is the processing script, which does the computation component of this jobscript. As we saw in jobarray.sh, this script receives a single parameter, $SLURM_ARRAY_TASK_ID, which the_work.sh interprets as $1.
Consider input_data, the comma-separated latitude/longitude coordinates. This script will:

1. retrieve bash argument $1, which corresponds to $SLURM_ARRAY_TASK_ID (1-20).
2. open input_data and extract exactly one line from that file, corresponding to $1.
3. clean up the data, which in this example means simply removing the comma.
4. create an array containing the two coordinate values, splitting them on a whitespace character.
5. return the processed results, appending one line at a time to a new file, my.out.
Finally, my.out is produced, showing you the results of each job's work. Note that the order of its lines is not sequential, but they can be sorted in post-processing if necessary.
Remember, job arrays are in no way limited to having all input squeezed into a single file, one line per sub-job. It is also possible to have 20 consistently-named files, such as input1, input2, … input20, and to avoid parsing (using sed) altogether. It is also possible to have your input data map files and numbers together. The possibilities for correlating incrementing numbers ($SLURM_ARRAY_TASK_ID) to arguments are limitless. Another such approach that could go inside the_work.sh:

python guess_country.py "${arr[0]}" "${arr[1]}" >> my.out
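For the per-file layout, the parsing step disappears entirely; each sub-job just builds its file name from the task ID (a sketch; the input${SLURM_ARRAY_TASK_ID} naming scheme is an assumption):

```shell
# Simulate one array task: Slurm exports SLURM_ARRAY_TASK_ID to each
# sub-job, which can use it to pick its own input file directly.
SLURM_ARRAY_TASK_ID=7   # set by Slurm at runtime; hard-coded here for illustration
input_file="input${SLURM_ARRAY_TASK_ID}"
echo "this sub-job would read: ${input_file}"
```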