Connecting to Aloe (HPC Environment)

Your first time using a high-performance computing (HPC) environment like Aloe can be intimidating, but it doesn’t have to be: this guide will get you started with the basics.

This article assumes a basic familiarity with the Linux command line. If you are new to Linux, or need a refresher, RC has created a guide on git called The Linux Shell; its instructions are general enough to apply to the Aloe supercomputer.

This document also assumes you have already requested and been granted an account. If not, please see the Creating a User Account page.

Please also familiarize yourself with our Required Trainings and Acceptable Use Policy before getting started.

Quick Start

For users who have never used an HPC environment before, we recommend reading through the Detailed Start section.

For those who wish to get started quickly, here is the general overview:

Choose a connection method (SSH / web portal)
Connect to the ASU VPN
Transfer files as needed
Log in with your WINEDS username & password
Run an interactive session or create an SBATCH script

Important Terms

Login Node: A node intended as a launching point to compute nodes. Login nodes have minimal resources and should not be used for any application that consumes a lot of CPU or memory. Also known as a head node.
Compute Node: A node intended for heavy computation. This is where all heavy processing should be done.
HPC: Short for “High Performance Computing,” it refers to a group (cluster) of computers designed for parallelism across many computers at once. Publicly, these are often called “supercomputers.”
Cluster: A group of interconnected computers that can work cooperatively or independently.
Job: Work assigned to be done on a compute node. Any time a compute node is assigned, a job is created.
Scheduler: The application on our end that assigns compute resources for jobs.
Slurm: The brand name of our scheduler, which manages and allocates resources.

Detailed Start

Choosing a connection method

There are a few methods for connecting to the supercomputer; each has its advantages and disadvantages.

Connecting to the supercomputer with SSH is the most versatile method, though it tends to be slower with interactive graphical applications. If you intend to use applications that rely on a point-and-click interface, we recommend the web portal “Aloe Desktop” instead.

Our web portal has become the standard for new users, as it provides a file system viewer and editor, a job submission tool, the ability to view the job queue, and a zoo of interactive applications including a virtual desktop, Jupyter Lab, and RStudio. In the file manager, uploading files is as easy as dragging and dropping them into the interface!

Connect through the Cisco VPN

All Aloe resources require the user to be connected to the ASU Cisco VPN. Be sure to connect to sslvpn.asu.edu/2fa. If prompted for a “second password,” enter one of the following: a DUO code, push to receive a DUO push request, phone to authenticate via a phone call, or sms to authenticate via a text message.

For additional details or to install the software, please go to the SSL VPN page.

PLEASE NOTE: If you are having trouble connecting to the ASU VPN, you will need to contact ASU support. RC does not have any control over or insight into the VPN and cannot assist with VPN issues.

Transfer needed files

Transferring files can be done with:

the online web portal under “Files”
globus.org through the File Manager, listed as the “ASRE Storage” endpoint
SCP/SFTP clients, such as FileZilla, Cyberduck, sftp, and scp
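
As a rough sketch of a command-line transfer with scp, the snippet below builds an upload and a download command and prints them for review rather than running them. The hostname aloe.example.edu, the username asurite_id, and the file and directory names are placeholders chosen for illustration (assumptions, not Aloe's actual values); only the /home/<username> layout comes from this page.

```shell
#!/bin/sh
# Placeholders: replace with the real login host and your own username.
ALOE_HOST="aloe.example.edu"   # hypothetical hostname, not the real address
ALOE_USER="asurite_id"         # hypothetical username

# Upload a local directory recursively (-r) to your home directory.
UPLOAD_CMD="scp -r ./my_project ${ALOE_USER}@${ALOE_HOST}:/home/${ALOE_USER}/"

# Download a single results file into the current local directory.
DOWNLOAD_CMD="scp ${ALOE_USER}@${ALOE_HOST}:/home/${ALOE_USER}/my_project/results.csv ."

# Print the commands for review; paste one into a terminal to run it.
echo "$UPLOAD_CMD"
echo "$DOWNLOAD_CMD"
```

Either command only works while you are connected to the ASU VPN, and scp will prompt for your password when it connects.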

Log in to Aloe

You should now be ready to reach the login node. You can get to a terminal prompt via:

the online web portal under “System / Aloe Shell Access”
SSH clients, such as PuTTY, Termius, MobaXterm, or another terminal application
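
If you prefer a plain terminal, a login with the OpenSSH command-line client might look like the sketch below. It checks whether an ssh client is installed and prints the login command instead of running it. The hostname aloe.example.edu and the username asurite_id are placeholders (assumptions); substitute the login address and username you were given.

```shell
#!/bin/sh
# Placeholders: replace with the real login host and your own username.
ALOE_HOST="aloe.example.edu"   # hypothetical hostname, not the real address
ALOE_USER="asurite_id"         # hypothetical username

# Check whether an ssh client is available on this machine.
if command -v ssh >/dev/null 2>&1; then
    SSH_STATUS="found"
else
    SSH_STATUS="missing"
fi

# The login command: ssh <username>@<host>.
SSH_CMD="ssh ${ALOE_USER}@${ALOE_HOST}"

echo "ssh client: $SSH_STATUS"
echo "login command (run after connecting to the VPN): $SSH_CMD"
```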

Run Interactive or SBATCH

Once you have a command prompt, there are two ways to get to a compute node:

Starting an Interactive Session: Assigns a compute node and connects your command prompt to it. This is useful when working by hand to establish the commands needed to run your work. When your session disconnects, the interactive session also closes, and any unsaved work will be lost.

Scheduling Batch Scripts (Example): This tells the scheduler that you want a job run unattended. Once an sbatch job is submitted, it runs until it completes, fails, or runs out of time, and you do not need to remain connected to the supercomputer while it runs.
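
To make the two options concrete, here is a minimal sketch using generic Slurm commands. It writes a small batch script and submits it only if sbatch is available. The file name hello_job.sh and all resource values are illustrative assumptions, not Aloe recommendations, and Aloe may provide its own wrapper or preferred options for interactive sessions.

```shell
#!/bin/sh
# Write a minimal batch script. All resource values are illustrative.
cat > hello_job.sh <<'EOF'
#!/bin/bash
# Name shown in the job queue.
#SBATCH --job-name=hello
# One task (process) using one CPU core.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# Memory for the job and wall-time limit (HH:MM:SS).
#SBATCH --mem=1G
#SBATCH --time=00:05:00
# Output file; %j expands to the job ID.
#SBATCH --output=hello_%j.out

echo "Running on $(hostname)"
EOF

# Submit only where Slurm is installed (for example, on a login node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh     # prints the ID of the submitted job
    squeue -u "$USER"       # lists your pending and running jobs
else
    echo "sbatch not found; run this on a login node"
fi

# For an interactive session, generic Slurm offers, for example:
#   srun --ntasks=1 --time=00:30:00 --pty bash
```

The batch job keeps running after you log out; the interactive srun session ends when your connection does.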

Recommended Reading

That covers the basic steps, but you may still be wondering, “How do I get my specific work done?” Here is a little more reading that may help you get fully started.

Modules and Software

RC already provides many software packages, often in several versions, and they can be accessed using modules. You can see all available modules with the command module avail:

$ module avail
------------------- /rc/packages/modulefiles/apps ---------------------------
jupyter/latest  mamba/latest  matlab/2023a  matlab/latest
rstudio/desktop  sas/9.4  stata/18  

------------------- /rc/packages/modulefiles/spack --------------------------
gcc-11.2.0-gcc-8.5.0-gs2  gcc-12.3.0-gcc-8.5.0-52r  gmake-4.4.1-gcc-8.5.0-se5
openjdk-11.0.17_8-gcc-11.2.0-mcl perl-5.38.0-gcc-11.2.0-3h3  r-4.3.0-gcc-11.2.0-6t5

You can then load a module to make its applications available in your terminal session:

$ module load mamba/latest
$ mamba create -n pytorch -c pytorch ....

Users may install software to their home directory so long as it does not require a license. Users can also request a software install if they prefer to have a module available and the module is not already present. Software that is free for ASU but requires a license is acceptable for modules. Paid licenses are not covered.

Using GPUs

Scientific research increasingly takes advantage of the power of GPUs. See our page on Requesting Resources on Aloe.
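
As a hedged illustration only, a GPU request in generic Slurm syntax might look like the sketch below, which writes a small batch script to gpu_job.sh. The --gres=gpu:1 option is standard Slurm, but the partition, GPU type names, and limits that Aloe expects are not covered here, and the use of nvidia-smi assumes NVIDIA GPUs; treat the Requesting Resources on Aloe page as the authority.

```shell
#!/bin/sh
# Write a minimal GPU batch script. Values are illustrative assumptions.
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=gpu-check
#SBATCH --ntasks=1
# Request one GPU of any type via Slurm's generic resource (GRES) option.
#SBATCH --gres=gpu:1
#SBATCH --time=00:05:00

# Report the GPU(s) visible to the job (assumes NVIDIA drivers on the node).
nvidia-smi
EOF

echo "Wrote gpu_job.sh; submit it from a login node with: sbatch gpu_job.sh"
```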

File Systems

There are two primary file systems, referred to as home and tmp. These are accessed at the paths /home/<username> and /tmp. Home provides 100 GB of storage by default, while tmp serves as scratch space for compute jobs: only actively computed data may reside on the tmp file system.
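
One way to follow this split is to do the heavy work in a temporary directory under /tmp and copy back only what you want to keep. The sketch below shows that pattern; the directory name results_demo and the file output.txt are hypothetical, and the echo line stands in for a real computation. On Aloe you would point RESULTS_DIR at a location under /home/<username>.

```shell
#!/bin/sh
# Create a private working directory on the tmp file system.
WORK_DIR=$(mktemp -d /tmp/aloe_job_XXXXXX)

# Hypothetical results directory (relative to the current directory here).
RESULTS_DIR="./results_demo"
mkdir -p "$RESULTS_DIR"

# Stand-in for real computation: write an intermediate file in scratch.
echo "intermediate data" > "$WORK_DIR/output.txt"

# Copy what you want to keep to permanent storage, then clean up scratch.
cp "$WORK_DIR/output.txt" "$RESULTS_DIR/"
rm -rf "$WORK_DIR"

# Show how much space the results directory uses.
du -sh "$RESULTS_DIR"
```

Running du -sh on your home directory in the same way gives a quick check against the 100 GB default.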