Parabricks 4.0

https://docs.nvidia.com/clara/parabricks/4.0.0/WhatsNew.html

Parabricks now runs without a license and is available on the cluster via Singularity.

To run Parabricks, you will need a GPU node that meets the following requirements.

  • Any NVIDIA GPU that supports CUDA architecture 60, 70, 75, or 80 and has at least 16GB of GPU RAM. Parabricks has been tested on the following NVIDIA GPUs:

    • V100

    • T4

    • A10, A30, A40, A100, A6000

  • System Requirements:

    • A 2 GPU server should have at least 100GB CPU RAM and at least 24 CPU threads.

    • A 4 GPU server should have at least 196GB CPU RAM and at least 32 CPU threads.

    • An 8 GPU server should have at least 392GB CPU RAM and at least 48 CPU threads.


Once you have a system that fulfills the above requirements, you will need to load Singularity and the version of Parabricks you want to use. As of this writing, Parabricks 4.0 is the latest.

module load singularity/3.8.0
module load parabricks/4.0
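
If you are not sure which Parabricks versions are installed on the cluster, the module system can list them (a standard module command; the versions shown will depend on the cluster):

module avail parabricks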

After loading these, you can use the following to drop into the Parabricks container.

singularity shell --nv `which parabricks.simg`
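
Once inside the container shell, a quick sanity check is to confirm the GPUs are visible; nvidia-smi should work here because the --nv flag binds in the host NVIDIA driver utilities:

nvidia-smi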

Once in the container, if you have the sample data, you can run the following:

pbrun fq2bam \
    --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
    --out-bam fq2bam_output.bam
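
The same command can also be run without entering the container, which is convenient for batch scripts. This is a sketch that simply wraps the pbrun call above with singularity exec, using the same image lookup as the shell command earlier:

singularity exec --nv `which parabricks.simg` pbrun fq2bam \
    --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
    --out-bam fq2bam_output.bam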

To get to any of the GPU models that can run Parabricks, you can use either the wildfire partitions (no time limit) or htcgpu (4 hour time limit). All GPUs in the htcgpu partition can run Parabricks.
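
As a rough sketch, assuming the scheduler is Slurm, an interactive GPU allocation on htcgpu might look like the following. The GPU count, CPU, memory, and time values here are placeholders; match them to the system requirements listed above and to your cluster's GPU resource naming:

salloc -p htcgpu --gres=gpu:2 -c 24 --mem=100G -t 4:00:00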

An additional note about running in a container: if you have data (such as reference genomes or FASTQs) in a /data directory, those directories must be explicitly bind-mounted into the container with the "-B" flag. First define them as variables and then pass them in:
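
For example, a minimal sketch (the /data paths below are hypothetical; substitute your own reference and FASTQ directories):

REF_DIR=/data/references
FASTQ_DIR=/data/fastqs

singularity shell --nv -B ${REF_DIR}:${REF_DIR} -B ${FASTQ_DIR}:${FASTQ_DIR} `which parabricks.simg`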