1 Fairshare and Equitable Use of the Supercomputer
2 Checking Fairshare Score
3 Working with Fairshare Score

Fairshare and Equitable Use of the Supercomputer

Computational resources on the supercomputer are free for ASU faculty, students, and collaborators. To keep availability equitable, jobs in the queue are prioritized according to the submitting user’s Fairshare score, a factor between zero and one that scales a base priority and decreases with recent usage. A user’s fairshare halves for every 10,000 core-hour equivalents (CHE) of usage; e.g., 20,000 CHE consumed today would reduce a user’s fairshare to 0.25. More specifically, FairShare = 2^{-(current CHE usage) / 10,000}.
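As a minimal sketch of the formula above, the following Python function evaluates the fairshare factor for a given amount of remembered usage. The function and parameter names (fairshare, usage_che, halving_che) are illustrative assumptions, not part of any tool on the cluster.

```python
def fairshare(usage_che: float, halving_che: float = 10_000.0) -> float:
    """Return the fairshare factor for a remembered usage given in CHE.

    Implements FairShare = 2^(-usage / 10,000) as stated in the text.
    The names used here are hypothetical and chosen for illustration.
    """
    if usage_che < 0:
        raise ValueError("usage must be non-negative")
    return 2.0 ** (-usage_che / halving_che)


# No usage gives a factor of 1; every 10,000 CHE halves it.
for usage in (0, 10_000, 20_000):
    print(f"{usage:>6} CHE -> fairshare {fairshare(usage)}")
```

With 20,000 CHE of usage this gives 0.25, matching the example in the text.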

Usage is “forgotten” at an exponential rate, with a half-life of one week. For instance, 20,000 CHE consumed today would be “remembered” as 10,000 CHE a week from now and, after 26 weeks, as about 1.07 core-second equivalents. For more details on the way fairshare is dynamically controlled on RC’s systems, see this 2020 PEARC proceedings paper.
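The decay can be checked with a short calculation. This is a sketch that assumes a clean exponential decay with a one-week half-life, as described above; the name remembered_usage is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def remembered_usage(usage_che: float, weeks_elapsed: float,
                     half_life_weeks: float = 1.0) -> float:
    """Return how much of an initial usage (in CHE) is still counted
    after the given number of weeks, assuming exponential decay."""
    return usage_che * 0.5 ** (weeks_elapsed / half_life_weeks)


after_one_week = remembered_usage(20_000, 1)
after_26_weeks = remembered_usage(20_000, 26)

print(after_one_week)                     # 10000.0
# Convert core-hours to core-seconds for the 26-week figure.
print(after_26_weeks * SECONDS_PER_HOUR)  # roughly 1.07
```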

CHE are tracked based on a linear combination of different hardware allocations, i.e.,

CHE = (core-hour equivalents) = (
  (number of cores)
  + (total RAM allocated) / (4 GiB)
  + 3 * (number of Multi-Instance GPU slices)
  + 20 * (number of A30 GPUs)
  + 25 * (number of A100 GPUs)
  + 40 * (number of H100 GPUs)
) * (job runtime in hours)

Thus, utilizing a single core with four GiB of RAM and one A100 GPU for four hours equates to (1 + 1 + 25) * 4 = 108 CHE. Researchers who carefully manage their hardware allocations to maximize job efficiency (see our documentation page on seff and mysacct for tools to track efficiency) will experience less impact on their FairShare.
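To estimate the cost of a job before submitting it, the CHE formula can be written out as a small Python helper. This is an illustrative sketch: the Allocation class and core_hour_equivalents function are hypothetical names, and the weights are copied from the formula above.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    """Hardware requested by a job (hypothetical container for illustration)."""
    cores: int = 0
    ram_gib: float = 0.0
    mig_slices: int = 0
    a30_gpus: int = 0
    a100_gpus: int = 0
    h100_gpus: int = 0


def core_hour_equivalents(alloc: Allocation, runtime_hours: float) -> float:
    """Apply the per-resource weights from the CHE formula, then
    multiply by the job runtime in hours."""
    weight = (
        alloc.cores
        + alloc.ram_gib / 4.0      # every 4 GiB of RAM counts as one core
        + 3 * alloc.mig_slices
        + 20 * alloc.a30_gpus
        + 25 * alloc.a100_gpus
        + 40 * alloc.h100_gpus
    )
    return weight * runtime_hours


# The example from the text: 1 core, 4 GiB RAM, 1 A100 GPU, 4 hours.
job = Allocation(cores=1, ram_gib=4, a100_gpus=1)
print(core_hour_equivalents(job, 4))  # 108.0
```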

All jobs will eventually run; however, researchers with higher recent utilization (resulting in a lower score) may experience longer wait times as other jobs are prioritized.

Requesting more or fewer resources does not affect your position in line. However, if your job can run on resources without delaying any job from a user with a greater fairshare, it becomes eligible for backfilling and can be prioritized for immediate processing.

Checking Fairshare Score

The fairshare score is printed in the login welcome text. Two equivalent commands are also available on both Sol and Phx to check the fairshare score:

myfairshare
mybalance

The value in the last column is the actual fairshare score, i.e., the final result of the calculation. If the output is broken or abnormal, re-run the command; it may take several attempts.

Working with Fairshare Score

Here are two examples of how to work around a low fairshare score and get jobs started as soon as possible:

#	Scenario	Consequence	Workaround
1	A job asking for 300 CPUs and 7 days on Sol.	This is a very bulky job and will wait a very long time even with a perfect fairshare score, given how busy Sol is.	Break the job into 300 * 7 * 24 = 50400 small jobs asking for 1 CPU and 1 hour each, using the htc partition. Using the private QOS also helps.
2	Launching 50400 small jobs one after another using a Python script.	Each submission takes a deduction on the fairshare score, so the waiting time gets longer and longer; some of the jobs will wait for days.	Instead of submitting the jobs one by one, launch a job array with 50400 sub-jobs, which takes only a one-time deduction on the fairshare score, so all the sub-jobs start queuing with the same fairshare score.
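The job-array workaround from scenario 2 can be sketched in Python as a generator for a single Slurm batch script, instead of a loop that submits 50400 separate jobs. This is only a sketch under stated assumptions: array_job_script and ./process_chunk.sh are hypothetical names, the script is printed rather than submitted, and the cluster's configured maximum array size may require splitting the array into several smaller ones.

```python
def array_job_script(n_tasks: int, command: str,
                     partition: str = "htc",
                     time_limit: str = "01:00:00") -> str:
    """Build the text of one Slurm batch script that runs `command`
    as a job array with n_tasks sub-jobs (1 CPU each)."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --array=0-{n_tasks - 1}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --cpus-per-task=1",
        f"#SBATCH --time={time_limit}",
        "",
        "# Each sub-job receives its own index in SLURM_ARRAY_TASK_ID.",
        f'{command} "$SLURM_ARRAY_TASK_ID"',
    ]
    return "\n".join(lines) + "\n"


# 300 CPUs * 7 days * 24 hours = 50400 one-hour, one-CPU sub-jobs.
# "./process_chunk.sh" is a placeholder for the user's own work script.
script = array_job_script(300 * 7 * 24, "./process_chunk.sh")
print(script)
```

Saving the printed text to a file and submitting it once with sbatch queues all sub-jobs in a single submission.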
