Python, Pip, and Permissions

Python Binary Locations

The supercomputers have several versions of Python and pip installed:

$ which python2
/usr/bin/python2
$ which python3
/usr/bin/python3

$ which pip
/usr/bin/pip
$ which pip3
/usr/bin/pip3

These are the system-default installations. Their binaries and libraries are stored locally on each individual host (e.g., c001 or g050), not on shared storage.
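To confirm which interpreter a session is actually using, and roughly where its packages live, a short check from inside Python can help. This is a minimal sketch using only the standard library. The helper name describe_interpreter is made up for illustration, and the reported path is the interpreter's default install scheme, which a distribution-patched pip may override.

```python
import sys
import sysconfig


def describe_interpreter():
    """Return the running interpreter's path, version, and default site-packages."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return {
        "executable": sys.executable,  # e.g. /usr/bin/python3 for the system install
        "version": version,
        # Default location for pure-Python packages under this interpreter.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this with python3 on a login node and again inside a job shows whether both sessions resolve to the same interpreter.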

Simply trying to install new Python packages with pip3 install <packagename> fails, because the system installation of Python writes to root-owned locations on that specific node.

$ pip3 install hmmlearn
Collecting hmmlearn
  Downloading https://files.pythonhosted.org/packages/1f/58/d8aa966456550e3741043b6345d63f62740b4ff3f749bde0c2fb6acde2c5/hmmlearn-0.2.6-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (367kB)
    100% |████████████████████████████████| 368kB 1.4MB/s
...
Installing collected packages: numpy, scipy, threadpoolctl, joblib, scikit-learn, hmmlearn
Exception:
Traceback (most recent call last):
...
PermissionError: [Errno 13] Permission denied: '/usr/local/lib64/python3.6'

As the output shows, pip3 install tries to write to /usr/local/lib64/, which ends in a PermissionError.
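The failure can be anticipated before running pip by checking whether the current user may write to the interpreter's package directory. The following is a sketch under stated assumptions: nearest_existing_dir and is_writable_target are hypothetical helper names, and the "purelib" path from sysconfig only approximates where a given pip build will write.

```python
import os
import sysconfig


def nearest_existing_dir(path):
    """Walk upward from path until a directory that exists is found."""
    probe = os.path.abspath(path)
    while not os.path.isdir(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the filesystem root
            break
        probe = parent
    return probe


def is_writable_target(path):
    """True if the current user could create files at (or under) path."""
    # The target may not exist yet, so test the closest existing ancestor.
    return os.access(nearest_existing_dir(path), os.W_OK | os.X_OK)


if __name__ == "__main__":
    target = sysconfig.get_paths()["purelib"]
    if is_writable_target(target):
        print(f"{target}: writable")
    else:
        print(f"{target}: not writable, a plain pip install would likely fail")
```

The check walks up to the nearest existing ancestor because, as in the traceback, the failing directory may not exist yet and pip cannot create it.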

Installing on a specific node is counterproductive in HPC, because computation should not depend on resources that exist on only one node. Instead, install packages on shared storage, which is accessible from every node in the supercomputer.

Examples of shared storage include:

  1. /home/[myasurite]

  2. /scratch/[myasurite]

  3. /data/grp_[mygrp]

How to Install on Shared Storage

The Pip Way

$ pip3 install --user hmmlearn

Applying the --user flag tells pip to install to the user’s $HOME directory, where users have full permissions. These files are put into ~/.local/lib/python3.6/site-packages, where 3.6 matches the version of python/pip currently first in your $PATH variable.

Because your $HOME is mounted on every node you connect to (and every node your jobs run on), you can rely on these packages being available for import whenever you use any 3.6 version of python.
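The location that --user resolves to can be read from Python itself, which also makes it clear why the directory is tied to one minor version of the interpreter. A minimal sketch using the standard site module follows; user_site_info is an illustrative name, not part of any library.

```python
import site
import sys


def user_site_info():
    """Report where `pip install --user` places packages for this interpreter."""
    return {
        # Base of the per-user tree, typically ~/.local on Linux.
        "user_base": site.getuserbase(),
        # Version-specific package directory under that base.
        "user_site": site.getusersitepackages(),
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


if __name__ == "__main__":
    for key, value in user_site_info().items():
        print(f"{key}: {value}")
```

Running it under two different Python versions should print two different user_site paths, so a package installed with --user under one version is not importable from the other.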

The Mamba Way

$ module load mamba/latest
$ mamba create -n myenv
$ source activate myenv
(myenv) $ pip install hmmlearn

It is preferable to use mamba install rather than pip install where possible, but sometimes packages are only available through pip.

When using pip inside a mamba environment, there is no need for --user, because the environment's own pip already targets the environment. Installing packages with pip in this scenario places the files in ~/.conda/envs/myenv/lib/python3.6/site-packages.

Using --user will still work (the files go into the ~/.local tree described under The Pip Way), but it is generally not advisable.
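One likely reason for this advice is that the per-user site-packages directory is normally added to sys.path by every interpreter of the same minor version, including one inside a mamba environment, so packages installed with --user can be picked up by environments that never asked for them. The sketch below reports whether that is happening in the current session; user_site_status is a hypothetical helper name.

```python
import os
import site
import sys


def user_site_status():
    """Summarize whether the per-user site-packages is visible to this interpreter."""
    user_site = site.getusersitepackages()
    return {
        "user_site": user_site,
        # False or None means the interpreter was told to ignore the user site.
        "enabled": bool(site.ENABLE_USER_SITE),
        # Only true when the directory exists and was added at startup.
        "on_sys_path": user_site in sys.path,
        # CONDA_PREFIX is set when a conda/mamba environment is active.
        "in_conda_env": bool(os.environ.get("CONDA_PREFIX")),
    }


if __name__ == "__main__":
    for key, value in user_site_status().items():
        print(f"{key}: {value}")
```

If both on_sys_path and in_conda_env are true, imports in the environment may resolve to packages under ~/.local instead of the environment's own copies.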

(qiime2-2021.4) $ which python
~/.conda/envs/qiime2-2021.4/bin/python
(qiime2-2021.4) $ which pip3
~/.conda/envs/qiime2-2021.4/bin/pip3

As you can see, a mamba environment has its own python and pip binaries, which avoids conflicts between environments and between Python versions.
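The same check can be made from inside a script or job, which is useful for failing early when the intended environment was not activated. This is a sketch that relies on the CONDA_PREFIX and CONDA_DEFAULT_ENV variables set on activation; the function names active_conda_env and interpreter_in_env are illustrative only.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda/mamba environment, or None."""
    environ = os.environ if environ is None else environ
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return None
    # Fall back to the last path component if the name variable is missing.
    name = environ.get("CONDA_DEFAULT_ENV") or os.path.basename(prefix)
    return name, prefix


def interpreter_in_env(prefix):
    """True if the running interpreter was started from the given prefix."""
    return os.path.realpath(sys.prefix) == os.path.realpath(prefix)


if __name__ == "__main__":
    env = active_conda_env()
    if env is None:
        print("No conda/mamba environment is active")
    else:
        name, prefix = env
        print(f"Active environment: {name} ({prefix})")
        print(f"Interpreter belongs to it: {interpreter_in_env(prefix)}")
```

An active environment whose prefix does not match the running interpreter suggests that python is being resolved from somewhere else on $PATH, such as /usr/bin.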

Summary of Installation Methods

Method          Using Mamba?    Target Destination                      Recommended?
pip             no              /usr/local/lib64/                       no
pip             yes             ~/.conda/envs/[myenv]                   yes
pip --user      no              ~/.local/lib/python3.x/site-packages    sometimes
pip --user      yes             ~/.local/lib/python3.x/site-packages    no
conda install   yes             ~/.conda/envs/[myenv]                   yes