This document provides a minimal, viable SBATCH script that employs a job array to dispatch numerous sub-jobs from a single SBATCH script and input file. Keep in mind that job arrays can help parallelize a wide variety of job formats: this is merely one method of parallelizing work whose input is highly tabular.

This specific job uses a mapping file, input_data, which maps each sub-job to its respective input parameters.

This tutorial will use three (3) separate files in your $HOME directory:

  1. jobarray.sh (the SBATCH script, which defines the resource allocation and invokes the sub-job script)

  2. input_data (a simple, text-only input file, in which each individual line translates to one sub-job)

  3. the_work.sh (a bash shell script which defines all the actual work needing to be done)

jobarray.sh

This is the script that is submitted via sbatch jobarray.sh. Its successful execution will spawn 20 separate instances of the_work.sh, with a maximum of 10 concurrent jobs at any time.

Code Block
#!/bin/bash

#SBATCH --job-name=array
#SBATCH --array=1-20%10         #20 total sub-jobs, at most 10 running concurrently
#SBATCH --time=0-00:00:10       #upper bound time limit for each sub-job to finish
#SBATCH --partition=general
#SBATCH --qos=public
#SBATCH --ntasks=1              #each sub-job runs a single task

srun -n 1 ./the_work.sh $SLURM_ARRAY_TASK_ID

This job will spawn 20 separate instances of the_work.sh, with at most 10 running concurrently at any time. Note, job arrays do not (and are not meant to) guarantee that work is completed in a reliable order. The key behind job arrays is that all sub-jobs are able to run fully independently of each other, and the order of completion of each sub-job is not important.

Info

If subsequent processing steps do need to occur in order, use filenames or data embedded in the contents to help determine the correct order. See the_work.sh and its use of printing LINE n.

Because each sub-job corresponds directly to one line of input_data, --array=1-20 therefore also means only the first twenty lines are processed. If you wish to have all lines processed, be sure to adjust --array= to match the line count of input_data. If later steps require strict ordering across sub-jobs, also consider job dependencies.
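One way to keep the --array= range in sync with the input is to compute it at submission time; sbatch accepts options on the command line, which override the #SBATCH directives in the script. A self-contained sketch, shown as a dry run with a small stand-in input_data:

```shell
# Demo with a small stand-in for input_data (three lines here).
printf '25.78642, 20.40898\n-74.72462, -160.2885\n54.37842, 146.27341\n' > input_data

# Line count = number of sub-jobs; command-line options override #SBATCH lines.
NLINES=$(wc -l < input_data)

# Dry run: drop the leading "echo" to actually submit.
echo sbatch --array=1-"${NLINES}" jobarray.sh
```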

With --array=1-20, we expect the compute work to be invoked as follows behind the scenes:

Code Block
./the_work.sh 1
./the_work.sh 2
./the_work.sh 3
./the_work.sh 4
...
./the_work.sh 20

input_data

The following sample data shows latitude and longitude pairs, one per line. With --array=1-20, only lines 1-20 will be computed; lines 21-24 fall outside the range of the --array= directive and are omitted.

Code Block
25.78642, 20.40898
-74.72462, -160.2885
54.37842, 146.27341
-11.2249, -4.36196
30.70403, 97.13326
-54.3891, -88.66462
-63.55478, 136.06104
36.98927, -25.04504
42.28435, 125.48806
-39.07869, -58.36421
-39.10534, 162.02482
-12.08924, -117.58317
87.46025, 171.04233
-63.30626, 120.42349
62.53605, -92.54694
-83.28753, 117.9856
32.16439, -22.9092
81.5248, 151.89049
77.69849, 112.53154
-67.29104, -18.73863
61.78938, -120.69627
-37.00708, -140.53009
45.08802, 66.40128
-24.61395, 28.29682

the_work.sh

Code Block
#!/bin/bash

# Extract exactly line $1 from input_data: "q" quits after printing that
# line, and "d" suppresses every other line
LINE=$(sed "$1q;d" input_data)

# Clean up the data by removing the comma
REMOVEDCOMMA=$(echo "$LINE" | sed 's|,||g')

# Split the two values on whitespace into a bash array
arr=($REMOVEDCOMMA)

# Append this sub-job's result to the shared output file
echo "LINE $1 LAT ${arr[0]} LON ${arr[1]}" >> my.out
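The line-extraction idiom sed "$1q;d" prints line $1 and then quits. A quick self-contained sketch of how it behaves (the sample file name is illustrative):

```shell
# Create a three-line sample file.
printf 'alpha\nbeta\ngamma\n' > sample.txt

# "2q;d" quits right after printing line 2 and deletes (suppresses)
# every other line, so only line 2 is emitted.
sed "2q;d" sample.txt        # prints: beta
```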

This is the processing script: it receives the line number as a passed argument and performs the computation component of this jobscript.

As we saw in jobarray.sh, this script receives a single parameter, $SLURM_ARRAY_TASK_ID, which the_work.sh interprets as $1. The first processed job will be passed 1, followed by 2, all the way to the upper bound 20.

Consider the input_data: comma-separated latitude/longitude coordinates. The steps this script will take are:

  1. retrieve bash argument $1, which corresponds to $SLURM_ARRAY_TASK_ID (1-20).

  2. open input_data and extract exactly one line from that file, the line corresponding to $1.

  3. clean up the data, which in this example means simply removing the comma.

  4. create an array which contains the contents of the coordinates, splitting the two values on a whitespace character.

  5. return the processed results, appending one line at a time to a new file, my.out.

Finally, my.out will be produced, showing you the results of each job's work:

Code Block
$ cat my.out
LINE 1 LAT 25.78642 LON 20.40898
LINE 2 LAT -74.72462 LON -160.2885
LINE 4 LAT -11.2249 LON -4.36196
LINE 3 LAT 54.37842 LON 146.27341
...

Note that the order here is not sequential, but the lines can be sorted back into order in post-processing, if necessary.
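Since each line of my.out embeds its own line number in the second field, a numeric sort restores the original order. A sketch against a small out-of-order sample:

```shell
# Recreate a small out-of-order my.out.
printf 'LINE 1 LAT 25.78642 LON 20.40898\nLINE 4 LAT -11.2249 LON -4.36196\nLINE 3 LAT 54.37842 LON 146.27341\n' > my.out

# Numeric sort on field 2 (the line number) restores input order.
sort -k2,2n my.out
```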

Info

Remember, job arrays are in no way limited to having all input squeezed onto the lines of a single file.

It is also possible for you to have 20 consistently-named files, such as input1, input2, … input20, and to avoid parsing (using sed) altogether. It is also possible to have your input data map files and numbers together. The possibilities for correlating incrementing numbers ($SLURM_ARRAY_TASK_ID) to arguments are limitless. Another such approach that could go inside the_work.sh:
python guess_country.py ${arr[0]} ${arr[1]} >> my.out
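A minimal sketch of the per-file variant, assuming files named input1 … input20 each hold one coordinate pair (the helper name process_one is illustrative, not part of the tutorial):

```shell
# Sub-job N reads its own file, inputN, so no sed line extraction is needed.
process_one() {
    # Strip the comma, leaving "LAT LON" separated by a space.
    coords=$(tr -d ',' < "input$1")
    # Split on the space with parameter expansion.
    echo "FILE $1 LAT ${coords% *} LON ${coords#* }"
}

# Demo with one stand-in input file.
echo "25.78642, 20.40898" > input1
process_one 1        # prints: FILE 1 LAT 25.78642 LON 20.40898
```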
