Summer Maintenance 2024 Information

General Updates:

  • Loading a CUDA module is now required

    • CUDA is no longer installed directly on the nodes. Providing it through modules gives more control over the CUDA version and improves compatibility, but jobs must now load a specific CUDA version with module load.

    • A new version, CUDA 12.5, is available.

    • Current jobs that rely on an unspecified CUDA version may fail; their sbatch scripts should be adjusted to load a specific CUDA module.

  • Node naming convention update

    • All nodes are now prefixed with an “s”

    • Login nodes are now named sol-login0[1-3]

    • HTC jobs will only run on public nodes

      • This affects previously submitted and future jobs.

      • This change will temporarily make the 32 private nodes on the Sol supercomputer unavailable for HTC jobs.

      • This will allow private node job recovery. 

      • This change will last up to a week.

  • Rocky OS update

    • Updated from 8.9 to 8.10 for security updates

  • Slurm version update

    • Updated from 23.11 to 24.05 for security updates

  • 16 additional GPU MIG instances are available

  • Modernized interactive script 

    • The “interactive” script has been updated to provide a more seamless experience on compute nodes.

    • The old version of interactive is still available via the command “classic-interactive”. 
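
For the CUDA module change listed above, one way to find affected sbatch scripts is to scan them for a module load line that pins a CUDA version. The following is a minimal sketch, not an official tool: the function name pinned_cuda_version is hypothetical, and the module name cuda/12.5 assumes modules follow a cuda/<version> naming scheme. Run module avail cuda on the cluster to see the names that are actually installed.

```python
import re

# Assumption: CUDA modules are named "cuda/<version>", e.g. "cuda/12.5".
# The pattern also accepts other modules listed before it on the same line.
CUDA_LOAD = re.compile(r"module\s+load\s+(?:\S+\s+)*?cuda/(\d+(?:\.\d+)*)")


def pinned_cuda_version(script_text):
    """Return the CUDA version pinned by a `module load` line, or None."""
    for line in script_text.splitlines():
        stripped = line.strip()
        # Skip the shebang, #SBATCH directives, and ordinary comments.
        if stripped.startswith("#"):
            continue
        match = CUDA_LOAD.match(stripped)
        if match:
            return match.group(1)
    return None


# A script that relied on the CUDA previously installed on the node.
old_script = """#!/bin/bash
#SBATCH --gres=gpu:1
nvcc --version
"""

# The same script adjusted to load a specific (assumed) CUDA module.
new_script = """#!/bin/bash
#SBATCH --gres=gpu:1
module load cuda/12.5
nvcc --version
"""

for name, text in [("old_script", old_script), ("new_script", new_script)]:
    version = pinned_cuda_version(text)
    if version is None:
        print(f"{name}: no pinned CUDA module found, job may fail")
    else:
        print(f"{name}: loads CUDA {version}")
```

A script that only says module load cuda, with no version, is reported as unpinned as well, since the change asks for a specific version.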

 

Benchmarks completed (https://top500.org/):

  • Green500 - results posted in November 

  • Top500 - results posted in November

Technical Updates:

  • Implementation of Warewulf

    • Deployment tool for OS images

    • Keeps OS images consistent across all of the compute nodes

  • The account creation script has been revamped for better usability and functionality across the supercomputers

  • The arbiter tool for monitoring the login nodes has been updated to version 2.1

  • The Slurm scheduler now runs in a container

  • The proxy servers for Sol have been moved into containers

  • The Cholla storage array software has been updated from version 6.2.0.1 to 7.1.0.1

 
