# A Short Guide to Python on the DTU HPC Cluster
Patrick M. Jensen, patmjen@dtu.dk
## I. Preliminaries
First, get to a terminal on the cluster.
Either through ThinLinc (Applications->Terminal Emulator) or ssh:
```
ssh <USERNAME>@login2.hpc.dtu.dk
```
where `<USERNAME>` is your DTU user name.
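If you log in often, you can add a host alias to your SSH configuration. A minimal sketch (the alias `dtu-hpc` is just an example name; replace `<USERNAME>` with your DTU user name):
```
Host dtu-hpc
    HostName login2.hpc.dtu.dk
    User <USERNAME>
```
After this, `ssh dtu-hpc` is enough to log in.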
For actual work, you should log on to a compute node.
For a CPU node, enter:
```
linuxsh
```
For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
```
voltash
sxm2sh
a100sh
```
## II. First time virtualenv setup
1. Navigate to your project folder.
2. Load modules for Python by entering:
```
module load python3/3.9.14
module load numpy/1.23.3-python-3.9.14-openblas-0.3.21
module load scipy/1.9.1-python-3.9.14
module load matplotlib/3.6.0-numpy-1.23.3-python-3.9.14
module load cuda/11.6
```
We load `numpy`, `scipy`, and `matplotlib` as modules, because the HPC team have made optimized versions for the HPC cluster.
> __NOTE:__ This guide uses Python 3.9 and CUDA 11.6 but other versions are available.
3. Create a virtualenv by running:
```
virtualenv <VENV_NAME>
```
4. Activate the virtualenv by running:
```
source <VENV_NAME>/bin/activate
```
You should now be able to install packages with `pip install <PACKAGE>` as normal.
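To check that the virtualenv is really active, you can ask Python itself. A small sketch (standard library only; works for environments created by recent `virtualenv` or `venv`):
```python
import sys

# Inside an active virtualenv, sys.prefix points into the venv,
# while sys.base_prefix still points at the base interpreter.
in_venv = sys.prefix != sys.base_prefix
print("virtualenv active:", in_venv)
print("interpreter:", sys.executable)
```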
> __Troubleshooting:__ If pip doesn't work, you may need to manually install it with:
> ```
> easy_install pip
> ```
## III. Everyday virtualenv use
> __NOTE:__ These steps must be repeated every time you log in, and also whenever you switch from a login node to a GPU node (e.g. by calling `sxm2sh`).
1. Navigate to your project folder.
2. Load modules for Python by entering:
```
module load python3/3.9.14
module load numpy/1.23.3-python-3.9.14-openblas-0.3.21
module load scipy/1.9.1-python-3.9.14
module load matplotlib/3.6.0-numpy-1.23.3-python-3.9.14
module load cuda/11.6
```
We load `numpy`, `scipy`, and `matplotlib` as modules, since the HPC team have made optimized versions for the HPC cluster.
3. Activate the virtualenv by running:
```
source <VENV_NAME>/bin/activate
```
> __Pro tip:__ To make life easy, put these commands in a bash script called `init.sh`:
> ```
> #!/bin/bash
> module load python3/3.9.14
> module load numpy/1.23.3-python-3.9.14-openblas-0.3.21
> module load scipy/1.9.1-python-3.9.14
> module load matplotlib/3.6.0-numpy-1.23.3-python-3.9.14
> module load cuda/11.6
>
> source <VENV_NAME>/bin/activate
> ```
>
> which you can then run by entering:
> ```
> source init.sh
> ```
>
> and this will prepare everything.
## IV. Jupyter notebooks on ThinLinc
1. Open a terminal (Applications->Terminal Emulator).
2. If you want a GPU, you need to enable X11 forwarding by adding the `-X` option.
Enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
```
voltash -X
sxm2sh -X
a100sh -X
```
3. Navigate to your project folder, load modules, and activate the virtualenv. Same steps as in section III.
4. Install Jupyter with:
```
pip install jupyter jupyterlab
```
5. Start Jupyter notebooks with
```
jupyter notebook
```
This should open a browser with the Jupyter notebooks interface.
6. Start JupyterLab with
```
jupyter lab
```
This should open a browser with the JupyterLab interface.
> __Troubleshooting:__ Sometimes JupyterLab has issues and you need to fall back to Jupyter Notebook.
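To verify that the environment works end to end, you can run a small test cell in the notebook. A sketch (assumes the `numpy` and `matplotlib` modules from section II are loaded; `sine.png` is just an example filename):
```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; in a notebook, use %matplotlib inline
import matplotlib.pyplot as plt

# Plot one period of a sine wave and write it to disk.
x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x))
plt.savefig("sine.png")
```
If `sine.png` appears in your project folder, numpy and matplotlib are wired up correctly.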
## V. Jupyter notebooks on the cluster in your own browser
> __WARNING:__ This will be a bit involved...
> Credit goes to Niels Jeppesen who figured all this out.
1. Open a terminal on the cluster, either through ThinLinc or ssh.
2. Call `sxm2sh` or `linuxsh`, as described in section I, so you are not on a login node.
3. Start a tmux session by running:
```
tmux
```
> _**If you lose your internet connection**_ your notebook will keep running.
> You can _**reconnect**_ to the tmux session by running:
> ```
> tmux attach
> ```
> in a terminal on the cluster.
4. Take note of the node's hostname - you will need it later. You can see it by running:
```
echo $HOSTNAME
```
5. Navigate to your project folder, load modules, and activate the virtualenv. Same steps as in section III.
6. Start a Jupyter lab or Jupyter notebook server by entering one of the following:
```
jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser
```
This should start a server and print something like:
```
To access the server, open this file in a browser:
file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
Or copy and paste one of these URLs:
http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
```
7. On your own machine (not the cluster), open a new terminal and enter:
```
ssh <USERNAME>@login2.hpc.dtu.dk -NL44000:<HOSTNAME>:44000
```
where `<USERNAME>` is your DTU user name and `<HOSTNAME>` is the hostname you found in step 4.
This should prompt you for your DTU password, _**and then NOTHING SHOULD HAPPEN**_.
8. Open your browser and enter the URL printed in step 6 that starts with `127.0.0.1`
(e.g. `http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0`).
This should open the Jupyter interface. Any commands you run will be executed on the HPC node.
> **Troubleshooting:** If no URL beginning with `127.0.0.1` was printed in step 6, change the first part
> of the URL manually to `127.0.0.1` before entering it in your browser. In the example from step 6, you
> would enter `http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0`.
> **Troubleshooting:** If the number after `http://127.0.0.1:` is not `44000`, Jupyter selected another port. In this case, redo step 7 where 44000 is replaced with the number from the URL printed by Jupyter. This happens if the port we request with `--port=44000` is not available.
If you close your browser, you can reconnect by entering the URL again.
If you lose your internet connection, you can reconnect by repeating steps 7 and 8.
> **NOTE:** You can make inline plots in the notebook, but cannot open new windows for plotting.
> The closest you can get is by starting the server on a ThinLinc node with X11 forwarding.
> This will allow you to have the notebook in your own browser, but new windows will be opened
> in ThinLinc.