...

Notice that the bash prompt changed from username@login02 to username@c008, indicating that we are now on node c008. From here we can use the following commands to view information about our job.

CPU Usage - top

Code Block
top -u "$USER"  # Show CPU and memory usage for our processes

...

This shows our processes and their CPU and memory usage. CPU usage is reported as a percentage of a single core: 100% means one fully used core, so a process using 8 cores may show 800%, or top may list 8 processes at 100% each.

Press q to quit top.
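The percentage rule above can be turned into a rough "cores in use" figure. The following is a minimal sketch, not part of the original page: the function name cpu_percent_to_cores is hypothetical, and it simply sums a column of %CPU values and divides by 100.

```bash
# Sum a column of %CPU values (one per line) and convert to cores in use.
# 100% CPU corresponds to one fully used core.
# cpu_percent_to_cores is a hypothetical helper name used for illustration.
cpu_percent_to_cores() {
    awk '{ total += $1 } END { printf "%.2f\n", total / 100 }'
}

# One process reported at 800%:
printf '800\n' | cpu_percent_to_cores

# Eight processes reported at 100% each:
printf '100\n100\n100\n100\n100\n100\n100\n100\n' | cpu_percent_to_cores
```

Both calls print 8.00. On a compute node, the input could come from a command such as ps -u "$USER" -o pcpu= (assuming the procps version of ps). Keep in mind that ps averages CPU use over each process's lifetime while top shows a recent sample, so the two numbers can differ.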

GPU Usage - nvtop

Code Block
nvtop  # Show GPU usage for our job; this only works on GPU nodes

...

Once a job has completed, been canceled, or failed, pulling its statistics is simple. There are two main commands for this: seff and mysacct.

Seff

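seff is short for “Slurm efficiency”. It displays how much of the allocated CPU time and memory a job actually used, as percentages: CPU efficiency compares the CPU time used against the core-walltime (cores multiplied by run time), and memory efficiency compares the memory used against the memory requested. The goal is high efficiency, so that jobs do not hold resources they are not using.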

Example of seff for an inefficient job

Code Block
[jeburks2@login02:~]$ seff 11273084
Job ID: 11273084
Cluster: sol
User/Group: jeburks2/grp_rcadmins
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 00:00:59
CPU Efficiency: 12.55% of 00:07:50 core-walltime
Job Wall-clock time: 00:07:50
Memory Utilized: 337.73 MB
Memory Efficiency: 16.85% of 2.00 GB
[jeburks2@login02:~]$ 

This shows that the job held one CPU core for 7 minutes and 50 seconds but only used it for 59 seconds, a CPU efficiency of about 12.5%. Memory use was also low: 337.73 MB, which seff reports as 16.85% of the 2 GB requested.

Example of seff for a CPU-efficient job

Code Block
[jeburks2@login02:~]$ seff 11273083
Job ID: 11273083
Cluster: sol
User/Group: jeburks2/grp_rcadmins
State: TIMEOUT (exit code 0)
Nodes: 1
Cores per node: 4
CPU Utilized: 00:59:39
CPU Efficiency: 98.98% of 01:00:16 core-walltime
Job Wall-clock time: 00:15:04
Memory Utilized: 337.73 MB
Memory Efficiency: 4.12% of 8.00 GB

In this example, the job kept all four allocated cores busy for about 99% of its run time. The core-walltime is the number of CPU cores multiplied by the job's wall-clock time, so this 15-minute job with 4 cores had a core-walltime of about one hour (01:00:16). However, the memory efficiency is low (4.12% of 8 GB). This tells us that if we run this job again, we can request less memory, which reduces the impact on our fair share and uses the system more efficiently.
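The arithmetic behind the CPU Efficiency line can be reproduced by hand. The sketch below is an illustration, not how seff itself is implemented; the helper names hms_to_seconds and cpu_efficiency are hypothetical, and it assumes times in plain HH:MM:SS form (no day prefix, as a multi-day job might have).

```bash
# Convert HH:MM:SS to seconds. awk treats "08" as the number 8,
# so leading zeros are safe.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# cpu_efficiency CORES WALL_HMS CPU_HMS
# core-walltime = cores * wall-clock time
# efficiency    = CPU time used / core-walltime
cpu_efficiency() {
    wall=$(hms_to_seconds "$2")
    cpu=$(hms_to_seconds "$3")
    awk -v n="$1" -v w="$wall" -v c="$cpu" 'BEGIN {
        cw = n * w
        if (cw == 0) { print "core-walltime is zero"; exit 1 }
        printf "core-walltime=%ds efficiency=%.2f%%\n", cw, 100 * c / cw
    }'
}

# Values taken from the two seff examples above.
cpu_efficiency 1 00:07:50 00:00:59   # inefficient job
cpu_efficiency 4 00:15:04 00:59:39   # CPU-efficient job
```

The first call prints core-walltime=470s efficiency=12.55% and the second prints core-walltime=3616s efficiency=98.98%, matching the seff output for jobs 11273084 and 11273083 (3616 seconds is 01:00:16).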

Note

seff does not display statistics for GPUs, so a GPU-heavy job will likely have inaccurate seff results.

sacct / mysacct

The sacct and mysacct commands let a user easily pull up information about past jobs that have completed.

...

Note

If a + is listed at the end of a field, that field has likely been truncated to fit into a fixed number of characters. Consider increasing the width by appending a % followed by a number to the field name. For example, allocTRES%42 overrides the default width and sets it to 42 characters.
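As a minimal sketch of the width override described above, the following passes a field list with %N suffixes to sacct through its --format option. The particular field selection and the reuse of job ID 11273083 are illustrative assumptions, and whether the site-specific mysacct wrapper accepts the same option is not confirmed here.

```bash
# Fields to display; a %N suffix sets that column's width to N characters.
# This selection of fields is an example, not a site default.
fields="JobID,JobName%20,AllocTRES%42,Elapsed,State"

# Only call sacct where Slurm is installed (for example, a login node).
if command -v sacct >/dev/null 2>&1; then
    sacct -j 11273083 --format="$fields"
else
    echo "sacct not found; would run: sacct -j 11273083 --format=$fields"
fi
```

If a column still ends in a +, raise the number after the % for that field and run the command again.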

Additional Help

If you require further assistance on this topic, please don't hesitate to contact the Research Computing Team. To create a support ticket, send an email to rtshelp@asu.edu. For quick inquiries, you're welcome to reach out via our #rc-support Slack channel or attend our office hours for live assistance.

We also offer a series of workshops. More information here: Educational Opportunities and Workshops
