A Short Guide to Python on the DTU HPC Cluster.
Patrick M. Jensen, patmjen@dtu.dk, 25-11-2022
Contents
- I. Preliminaries
- II. Setup Virtualenv
- III. Jupyter notebooks on ThinLinc
- IV. Jupyter notebooks on the cluster in your own browser
I. Preliminaries
First, get to a terminal on the cluster, either through ThinLinc (Applications->Terminal Emulator) or ssh:
ssh <USERNAME>@login2.hpc.dtu.dk
where <USERNAME> is your DTU user name.
For actual work, you should log on to a real node rather than stay on the login node. For a CPU node, enter:
linuxsh
For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
voltash
sxm2sh
a100sh
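To double-check that you actually left the login node, you can inspect the hostname from Python. The sketch below is only a heuristic: it assumes compute nodes are named like n-62-20-9 (the node name that appears in the example Jupyter output in Section IV), and the function name looks_like_compute_node is made up for this illustration.

```python
import re
import socket

# Assumption: compute nodes are named like "n-62-20-9", the pattern seen in the
# example Jupyter output in Section IV. Adjust the pattern if your nodes differ.
COMPUTE_NODE_PATTERN = re.compile(r"^n-\d+-\d+-\d+$")


def looks_like_compute_node(hostname: str) -> bool:
    """Return True if the short hostname matches the assumed compute-node pattern."""
    short_name = hostname.split(".")[0]  # drop any domain suffix
    return COMPUTE_NODE_PATTERN.match(short_name) is not None


if __name__ == "__main__":
    host = socket.gethostname()
    if looks_like_compute_node(host):
        print(f"{host}: looks like a compute node")
    else:
        print(f"{host}: may be a login node; consider running linuxsh or a GPU shell")
```

A negative result does not prove you are on a login node; it only means the name did not match the assumed pattern.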
II. Setup Virtualenv
-
Get to a CPU or GPU node on the cluster (see Section I).
-
Navigate to your project folder.
-
Download scripts/init.sh and place it in your project folder. This only needs to be done the first time.
Tip: You can do this by calling
wget https://lab.compute.dtu.dk/patmjen/hcp_tutorials/-/raw/main/scripts/init.sh
in the terminal.
-
Call
source init.sh
This will set up and activate your virtualenv. You must do this every time you log in or change node (e.g. by calling sxm2sh)!
Tip: To configure the virtualenv, change the following variables at the top of init.sh:
# Configuration
# This is what you should change for your setup
VENV_NAME=venv         # Name of your virtualenv (default: venv)
VENV_DIR=.             # Where to store your virtualenv (default: current directory)
PYTHON_VERSION=3.11.9  # Python version (default: 3.11.9)
CUDA_VERSION=11.8      # CUDA version (default: 11.8)
-
You are done! You can now install packages with
pip install <PACKAGE>
and run Python 3 code with
python
Troubleshooting: If pip doesn't work, you may need to manually install it with:
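easy_install pip
If easy_install is not available either (newer setuptools versions no longer ship it), try the standard-library alternative:
python -m ensurepip --upgrade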
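As a quick sanity check after sourcing init.sh, the following sketch reports which interpreter is running and whether it lives in a virtualenv. It relies only on the standard library; the helper name in_virtualenv is hypothetical, and it assumes the virtualenv was created with a reasonably recent tool (one that sets sys.base_prefix).

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a virtualenv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python executable:", sys.executable)
    print("Python version:   ", sys.version.split()[0])
    print("Virtualenv active:", in_virtualenv())
```

If the last line prints False, or the version differs from PYTHON_VERSION in init.sh, the virtualenv is probably not active in this shell and init.sh needs to be sourced again.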
III. Jupyter notebooks on ThinLinc
-
Open a terminal (Applications->Terminal Emulator).
-
If you want GPU, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
voltash -X
sxm2sh -X
a100sh -X
Remember the -X, which enables X11 forwarding! It is needed to open a browser for the notebook.
-
Navigate to your project folder and activate the virtualenv. Same steps as in section II.
-
Install Jupyter with:
pip install jupyter jupyterlab
-
Start Jupyter lab with
jupyter lab
...or start Jupyter notebooks with
jupyter notebook
This should open a browser with the Jupyter notebook or Jupyter lab interface.
Troubleshooting: Jupyter lab sometimes has issues. If it does, fall back to Jupyter notebook.
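If no browser opens, a likely cause is that X11 forwarding is not active on the node. X11 clients find their display through the DISPLAY environment variable, so a minimal check, assuming that the -X shells set it when forwarding works, could look like this (x11_display is a hypothetical helper name):

```python
import os
from typing import Mapping, Optional


def x11_display(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the X11 display string, or None if no display is configured."""
    env = os.environ if env is None else env
    display = env.get("DISPLAY", "").strip()
    return display or None


if __name__ == "__main__":
    display = x11_display()
    if display is None:
        print("DISPLAY is not set: did you forget -X when changing node?")
    else:
        print(f"DISPLAY={display}: graphical programs should be able to open windows")
```

A set DISPLAY is necessary but not sufficient, so treat a positive result as a hint rather than a guarantee.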
IV. Jupyter notebooks on the cluster in your own browser
WARNING: This will be a bit involved... Credit goes to Niels Jeppesen who figured all this out.
-
Open a terminal on the cluster, either through ThinLinc or ssh.
-
Call sxm2sh or linuxsh, as described in section I, so you are not on a login node.
-
Start a tmux session by running:
tmux
This way, your notebook will keep running even if you lose your internet connection. You can reconnect to the tmux session by running:
tmux attach
in a terminal on the cluster.
-
Take note of the node's hostname - you will need it later. You can see this by running:
echo $HOSTNAME
-
Navigate to your project folder, and activate the virtualenv. Same steps as in section II.
-
Start a Jupyter lab or Jupyter notebook server by entering one of the following:
jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser
This should start a server and print something like:
To access the server, open this file in a browser:
    file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
Or copy and paste one of these URLs:
    http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
 or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
-
Open a terminal on your own computer and run
ssh <USERNAME>@login2.hpc.dtu.dk -NL44000:<HOSTNAME>:44000
where <USERNAME> is your DTU user name and <HOSTNAME> is the hostname you found in step 4. This should prompt you for your DTU password, and then NOTHING SHOULD HAPPEN: the command keeps running silently while the tunnel is open, so leave this terminal open.
-
Open your browser and enter the URL printed in step 6 (when you started the Jupyter server) that starts with 127.0.0.1 (e.g. http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0). This should open the Jupyter interface. Any commands you run will be executed on the HPC cluster.
Troubleshooting: If no URL beginning with 127.0.0.1 was printed in step 6, change the first part manually to 127.0.0.1 before entering it in your browser. In the example from step 6, you would change n-62-20-9 to 127.0.0.1.
Troubleshooting: If the number after http://127.0.0.1: is not 44000, Jupyter selected another port. This happens if the port we request with --port=44000 is not available. In this case, redo step 7 (the ssh command) with 44000 replaced by the number from the URL printed by Jupyter.
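Both troubleshooting cases come down to reading the URL that Jupyter printed. As a sketch using only Python's standard library (the helper names to_local_url and server_port are made up here), the host can be swapped for 127.0.0.1 and the actual port extracted like this:

```python
from urllib.parse import urlsplit, urlunsplit


def to_local_url(jupyter_url: str) -> str:
    """Replace the host of a Jupyter URL with 127.0.0.1, keeping port, path and token."""
    parts = urlsplit(jupyter_url)
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def server_port(jupyter_url: str) -> int:
    """Return the port Jupyter actually bound to, as shown in the printed URL."""
    port = urlsplit(jupyter_url).port
    if port is None:
        raise ValueError(f"no port in URL: {jupyter_url!r}")
    return port


if __name__ == "__main__":
    # URL taken from the example server output in this guide.
    printed = "http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0"
    print(to_local_url(printed))
    print(server_port(printed))
```

If server_port returns something other than 44000, that is the number to use on both sides of the hostname in the ssh command.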
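If you close your browser, you can reconnect by entering the URL again. If you lose your internet connection, you can reconnect by repeating steps 7 and 8 (re-open the ssh tunnel, then enter the URL in your browser again); the server itself keeps running in the tmux session.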
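Since reconnecting means retyping the ssh port-forwarding command, it can help to generate it. The sketch below assembles the command from the user name, node hostname and port; tunnel_command is a hypothetical helper, the user name s123456 is a placeholder, and the options are written as -N -L, which is equivalent to the combined -NL form used above.

```python
import shlex


def tunnel_command(username: str, hostname: str, port: int = 44000,
                   login_host: str = "login2.hpc.dtu.dk") -> str:
    """Build the ssh port-forwarding command for a given node and port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    # -N: run no remote command; -L: forward local <port> to <hostname>:<port>.
    args = ["ssh", f"{username}@{login_host}", "-N", "-L", f"{port}:{hostname}:{port}"]
    return shlex.join(args)


if __name__ == "__main__":
    # Placeholder user name; the hostname is the one from the example output.
    print(tunnel_command("s123456", "n-62-20-9"))
```

Paste the printed command into a terminal on your own computer, passing a different port if Jupyter did not get 44000.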
NOTE: You can make inline plots in the notebook, but cannot open new windows for plotting. The closest you can get is by starting the server on a ThinLinc node with X11 forwarding. This will allow you to have the notebook in your own browser, but new windows will be opened in ThinLinc.