Using custom Python packages on Linux in the institute
This page explains how to install and use Python packages on your own.
Overview
Software installation in the context of EnterpriseComputing is done by IT to prevent users from interfering with each other's work. This is why you cannot, for example, change local device drivers or networking libraries. However, new versions of research-related software are released frequently, and you most likely want to work with a recent version, a very specific one (for example, because a long-running study needs a stable code base), or both. In either case it is best to have user control over software installation, which is easy when it comes to Python packages.
Important:
- Use Python 3 and do not use Python 2 anymore. Do not start analyzing new projects/studies using Python 2, because it is no longer supported by its authors. For some edge cases it is still installed on G7 computers, but you should never use it for your own scripts!
- An environment like CUSTOMPYTHON has to be activated once whenever you open a new terminal window!
Some theory
- Python packages can be installed into any folder.
- Python packages installed by IT are installed into folders that no user has write access to.
- You can instruct Python to look for packages in multiple folders in a specific order.
- When you activate CUSTOMPYTHON, the installed packages in your custom folder will take precedence over the default installations.
- There are other ways of installing custom Python modules and even actual Python binaries.
- These are not supported by IT ...
- ... but you're free to use those anyway (Anaconda, Virtualenv, ...)
- You might need additional development packages installed locally by IT. We'll take care of this, just write a ticket.
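The search-order idea from the list above can be tried out in isolation. The following is a minimal sketch, not the mechanism CUSTOMPYTHON itself uses (this page does not describe it): two temporary folders stand in for the IT-managed folder and your custom folder, and the hypothetical module name demo_pkg_order exists in both. Because the custom folder comes first in sys.path, Python imports that copy.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def first_match_wins(module_name: str = "demo_pkg_order") -> str:
    """Create two folders holding a module of the same name and
    report which copy Python actually imports."""
    with tempfile.TemporaryDirectory() as it_dir, \
            tempfile.TemporaryDirectory() as user_dir:
        # Stand-ins for the IT-managed folder and your custom folder
        # (both are temporary and purely illustrative).
        Path(it_dir, f"{module_name}.py").write_text("ORIGIN = 'it'\n")
        Path(user_dir, f"{module_name}.py").write_text("ORIGIN = 'user'\n")

        saved_path = list(sys.path)
        try:
            # Earlier entries in sys.path are searched first.
            sys.path[:0] = [user_dir, it_dir]
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            return module.ORIGIN
        finally:
            # Restore the interpreter state so the demo leaves no traces.
            sys.path[:] = saved_path
            sys.modules.pop(module_name, None)


if __name__ == "__main__":
    print(first_match_wins())
```

Swapping the two folders in the sys.path assignment makes the other copy win, which is the whole point: the order of the folders decides, not the package itself.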
A concrete example
- You want to use the latest matplotlib release.
- Prepare the environment to use an institute default per-user folder for software installation:
user@host > CUSTOMPYTHON
If you do not have a personal software folder, create one first: https://userportal.cbs.mpg.de/storageunified/request/user.
- Type this command to install the package including all dependencies:
custompython_user@host > pip3 install matplotlib
- Start your script or IDE from within the CUSTOMPYTHON environment to use the installed packages.
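To check that a script really picks up the package you just installed, something like the following sketch may help. It assumes it is run from within the CUSTOMPYTHON environment; the helper name describe_package is made up for this illustration and is not part of any institute tooling. It reports the file Python would import and the version recorded at installation time.

```python
import importlib.metadata
import importlib.util


def describe_package(name: str) -> dict:
    """Return where Python would import `name` from and which
    version is installed (hypothetical helper for illustration)."""
    # find_spec locates the package without importing it.
    spec = importlib.util.find_spec(name)
    try:
        # Looks up the installed distribution; for matplotlib the
        # distribution name and the import name are identical.
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # not installed, or a standard-library module
    return {
        "name": name,
        "location": spec.origin if spec is not None else None,
        "version": version,
    }


if __name__ == "__main__":
    print(describe_package("matplotlib"))
```

If the reported location is not below your personal software folder, the environment was probably not activated in that terminal.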
To make it more complex
The matplotlib example takes different platform generations into account and installs packages into a per-platform folder (e.g. /data/u_user_software/python/debian-bullseye-amd64).
That is necessary because some Python modules contain compiled C code, which is OS-specific.
However, you might want to add another layer for your different studies to version-lock Python modules for specific projects (e.g. /data/u_user_software/python/debian-bullseye-amd64/12345):
- Have a look at the script /usr/bin/prepare-custom-python. This is a "source" script, which is meant to be sourced into an existing shell rather than executed on its own.
- Copy the script to your ~/bin folder and name it e.g. py12345.
- Add another path component to the _python_dir assignment:
_python_dir=/data/${_storage_block}/python/$(distri -f)/12345
Make sure to keep $(distri -f) as a path component. This is necessary to keep your stuff running smoothly during generation changes (like the G7 upgrade in 2022).
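How the per-platform, per-study folder is composed can be sketched as follows. This is only an illustration of the path layout described above: the function name study_package_dir is hypothetical, and the platform string is passed in as a plain argument, whereas the real script obtains it from $(distri -f).

```python
from pathlib import PurePosixPath


def study_package_dir(storage_block: str, platform: str, study: str) -> PurePosixPath:
    """Build /data/<storage_block>/python/<platform>/<study>.

    Hypothetical helper; it mirrors the _python_dir layout only.
    """
    for part in (storage_block, platform, study):
        # Each argument must be a single, non-empty path component.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath("/data") / storage_block / "python" / platform / study


if __name__ == "__main__":
    # Example values taken from the paths shown on this page.
    print(study_package_dir("u_user_software", "debian-bullseye-amd64", "12345"))
```

Keeping the platform as its own component means the same study gets a separate folder per platform generation, so compiled modules built for one generation are never picked up on another.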
How to start from scratch
You might want to start over after you updated a package to a wrong version. When you start the CUSTOMPYTHON environment, you are shown a base path. Whatever you installed via pip can be found there and can be removed. The most brute-force way is to:
- Exit the CUSTOMPYTHON environment (e.g. close all terminals it is being used in).
- Remove the whole base path.
- Start CUSTOMPYTHON in a terminal.
- Re-install everything you need via pip3 install.
A little bit more precise:
- Identify the package you want to remove, e.g. matplotlib.
- Make sure you are in the CUSTOMPYTHON environment.
- Remove the package via:
user@host > pip3 uninstall matplotlib
A bit more low-level:
- Go to the module path.
- Manually remove all the modules you don't want.
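Before removing anything by hand, it can help to see which packages actually live in the module path. A minimal sketch, assuming the path shown by CUSTOMPYTHON is passed as the first command-line argument; installed_in is a hypothetical helper name. It relies on importlib.metadata.distributions from the standard library (Python 3.8 or newer) and only reads metadata, it does not delete anything.

```python
import importlib.metadata
import sys


def installed_in(folder: str) -> list:
    """Return sorted (name, version) pairs for all packages whose
    metadata is found directly in `folder`."""
    found = []
    for dist in importlib.metadata.distributions(path=[folder]):
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            found.append((name, dist.version))
    return sorted(found)


if __name__ == "__main__":
    # Pass the module path as the first argument;
    # without an argument the current directory is inspected.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, version in installed_in(folder):
        print(f"{name}=={version}")
```

Packages listed here are the ones pip3 uninstall can remove cleanly; anything in the folder that does not show up was probably copied there by other means.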