Python, Pip, and Permissions
Python Binary Locations
On the supercomputers, several versions of python and pip are installed.
$ which python2
/usr/bin/python2
$ which python3
/usr/bin/python3
$ which pip
/usr/bin/pip
$ which pip3
/usr/bin/pip3
These are system-default installations, and their binaries and libraries exist locally on one specific host, e.g., c001 or g050.
Simply trying to install new python packages with pip3 install <packagename> will therefore fail, because the system-installation python points to root-owned locations on that specific node.
$ pip3 install hmmlearn
Collecting hmmlearn
Downloading https://files.pythonhosted.org/packages/1f/58/d8aa966456550e3741043b6345d63f62740b4ff3f749bde0c2fb6acde2c5/hmmlearn-0.2.6-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (367kB)
100% |████████████████████████████████| 368kB 1.4MB/s
...
Installing collected packages: numpy, scipy, threadpoolctl, joblib, scikit-learn, hmmlearn
Exception:
Traceback (most recent call last):
...
PermissionError: [Errno 13] Permission denied: '/usr/local/lib64/python3.6'
As the traceback shows, pip3 install tries to write under /usr/local/lib64/, which is owned by root, so the install fails with a PermissionError.
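Before running pip, you can ask the interpreter where it would place packages and whether your account can write there. The following is a minimal sketch rather than part of the cluster tooling: the helper name install_target_status is hypothetical, and it assumes the default sysconfig install scheme, so the directory it reports may differ from the exact path pip prints on some distributions.

```python
import os
import sysconfig


def install_target_status():
    """Return the default site-packages directory and whether it is writable."""
    # "purelib" is the directory where pure-Python packages are installed
    # for the interpreter running this script.
    target = sysconfig.get_paths()["purelib"]
    # os.access checks the current user's permissions; a directory that
    # does not exist is reported as not writable.
    writable = os.access(target, os.W_OK)
    return target, writable


if __name__ == "__main__":
    target, writable = install_target_status()
    print(f"install target: {target}")
    print(f"writable by you: {writable}")
```

If the second line reports False for the system python, a plain pip3 install can be expected to hit the same PermissionError shown above.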
Installing on a specific node is counterproductive in HPC, because computation should not rely on resources tied to one node. Instead, install packages on shared storage, which is accessible from every node of the supercomputer.
Examples of shared storage include:
/home/[myasurite]
/scratch/[myasurite]
/data/grp_[mygrp]
How to Install on Shared Storage
The Pip Way
$ pip3 install --user hmmlearn
The --user flag tells pip to install to the user's $HOME directory, where users have full permissions. The files are placed in ~/.local/lib/python3.6/site-packages, where 3.6 matches the version of python/pip currently first in your $PATH variable.
Because your $HOME is mounted on every node you connect to (and every node your jobs run on), you can rely on these packages being available for import whenever you use any 3.6 version of python.
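To see which directory a --user install would target for the python currently first in your $PATH, you can query the standard site module. This is a small sketch; the function name user_install_location is hypothetical, and the paths in the comments are what is typical on Linux rather than guaranteed.

```python
import site
import sys


def user_install_location():
    """Return the per-user install directories for the running interpreter."""
    return {
        # Major.minor version, which appears in the directory name on Linux.
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Base of the per-user tree, typically ~/.local on Linux.
        "user_base": site.getuserbase(),
        # Directory that pip install --user writes packages into.
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for key, value in user_install_location().items():
        print(f"{key}: {value}")
```

Running it with a different python version should report a different user_site, which is why packages installed with --user under one version are not importable from another.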
The Mamba Way
$ module load mamba/latest
$ mamba create -n myenv
$ source activate myenv
(myenv) $ pip install hmmlearn
It is preferable to use mamba install rather than pip install where possible, but some packages are only available through pip.
When using pip with mamba, there is no need for --user, because the environment's own pip already installs into your mamba environment. Packages installed with pip in this scenario go to ~/.conda/envs/myenv/lib/python3.6/site-packages.
Using --user will still work (the files go to ~/.local/lib/python3.6/site-packages, as described above), but it is generally not advisable.
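One reason to be cautious is that packages under ~/.local are visible to every interpreter of the same version, including the one inside a mamba environment, unless the user site is disabled (for example with the PYTHONNOUSERSITE environment variable). The sketch below lists what is installed there; the function names are hypothetical, and it relies on importlib.metadata, which requires Python 3.8 or newer and so would not run on a 3.6 interpreter.

```python
import os
import site
from importlib import metadata


def user_site_is_active():
    """Return True if this interpreter adds the per-user site to sys.path."""
    # ENABLE_USER_SITE is False or None when the user site is disabled,
    # e.g. by PYTHONNOUSERSITE or the -s command-line option.
    return bool(site.ENABLE_USER_SITE)


def packages_in_user_site():
    """List the names of distributions installed in the per-user site."""
    user_site = site.getusersitepackages()
    if not os.path.isdir(user_site):
        return []  # nothing has been installed with --user yet
    names = {dist.metadata["Name"] for dist in metadata.distributions(path=[user_site])}
    # Skip entries whose metadata has no usable name.
    return sorted(name for name in names if name)


if __name__ == "__main__":
    print(f"user site active: {user_site_is_active()}")
    for name in packages_in_user_site():
        print(name)
```

If this prints package names while a mamba environment is active, those packages can be imported from inside the environment as well, which may conflict with the versions the environment itself provides.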
(qiime2-2021.4) $ which python
~/.conda/envs/qiime2-2021.4/bin/python
(qiime2-2021.4) $ which pip3
~/.conda/envs/qiime2-2021.4/bin/pip3
As you can see, a mamba environment has its own python and pip binaries, which avoids conflicts with other environments and with the system python.
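The same check can be made from inside python, which is handy in job scripts where the shell prompt is not visible. This is an illustrative sketch with hypothetical function names; it assumes that activating the environment sets the CONDA_PREFIX and CONDA_DEFAULT_ENV variables, as conda-style activation normally does.

```python
import os
import sys


def describe_interpreter():
    """Collect where the running python lives and which environment is active."""
    return {
        "executable": sys.executable,  # path of the python binary in use
        "prefix": sys.prefix,          # installation root of that python
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # active env name, if any
        "conda_prefix": os.environ.get("CONDA_PREFIX"),    # active env directory, if any
    }


def running_in_active_env(info):
    """Return True if the interpreter belongs to the activated environment."""
    conda_prefix = info["conda_prefix"]
    if conda_prefix is None:
        return False  # no environment has been activated
    # Compare resolved paths so symlinked home directories still match.
    return os.path.realpath(info["prefix"]) == os.path.realpath(conda_prefix)


if __name__ == "__main__":
    info = describe_interpreter()
    for key, value in info.items():
        print(f"{key}: {value}")
    print(f"using the environment's python: {running_in_active_env(info)}")
```

A False result with an environment activated suggests that a different python, such as /usr/bin/python3, is still first in $PATH.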
Summary of Installation Methods
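Method | Using Mamba? | Target Destination | Recommended? |
---|---|---|---|
pip install | no | /usr/local/lib64/python3.6 | no |
pip install | yes | ~/.conda/envs/myenv/lib/python3.6/site-packages | yes |
pip install --user | no | ~/.local/lib/python3.6/site-packages | sometimes |
pip install --user | yes | ~/.local/lib/python3.6/site-packages | no |
mamba install | yes | ~/.conda/envs/myenv/lib/python3.6/site-packages | yes |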