# A Short Guide to Python on the DTU HPC Cluster

Patrick M. Jensen, patmjen@dtu.dk

## I. Preliminaries

First, get to a terminal on the cluster, either through ThinLinc (Applications->Terminal Emulator) or ssh:

```
ssh <USERNAME>@login2.hpc.dtu.dk
```

where `<USERNAME>` is your DTU user name. For actual work, you should log on to a real node. For a CPU node, enter:

```
linuxsh
```

For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):

```
voltash
sxm2sh
a100sh
```

## II. First time virtualenv setup

1. Navigate to your project folder.
2. Load modules for Python by entering:
   ```
   module load python3/3.8.9
   module load numpy/1.19.5-python-3.8.9-openblas-0.3.13
   module load scipy/1.5.4-python-3.8.9
   module load cuda/11.0
   ```
   We load `numpy` and `scipy` as modules, since the HPC team have made optimized versions for the HPC cluster.
   NOTE: this uses Python 3.8 but others are available.
3. Create a virtualenv by running:
   ```
   virtualenv <VENV_NAME>
   ```
4. Activate the virtualenv by running:
   ```
   source <VENV_NAME>/bin/activate
   ```

You should now be able to install packages with `pip install <PACKAGE>` as normal. If pip doesn't work, you may need to install it manually with:

```
easy_install pip
```

To run Python, remember to use the `python3` command.

## III. Virtualenv activation

1. Navigate to your project folder.
2. Load modules for Python by entering:
   ```
   module load python3/3.8.9
   module load numpy/1.19.5-python-3.8.9-openblas-0.3.13
   module load scipy/1.5.4-python-3.8.9
   module load cuda/11.0
   ```
   We load `numpy` and `scipy` as modules, since the HPC team have made optimized versions for the HPC cluster.
3.
   Activate the virtualenv by running:
   ```
   source <VENV_NAME>/bin/activate
   ```

To make life easy, put these commands in a bash script called `init.sh`:

```
#!/bin/bash
module load python3/3.8.9
module load numpy/1.19.5-python-3.8.9-openblas-0.3.13
module load scipy/1.5.4-python-3.8.9
module load cuda/11.0
source <VENV_NAME>/bin/activate
```

which you can then run by entering:

```
source init.sh
```

and this will prepare everything.

## IV. Jupyter notebooks on ThinLinc

1. Open a terminal (Applications->Terminal Emulator).
2. If you want a GPU node, you need to enable X11 forwarding by adding the `-X` option. Enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
   ```
   voltash -X
   sxm2sh -X
   a100sh -X
   ```
3. Navigate to your project folder and activate the virtualenv.
4. Install Jupyter with:
   ```
   pip install jupyter jupyterlab
   ```
5. Start Jupyter notebooks with:
   ```
   jupyter notebook
   ```
   This should open a browser with the Jupyter notebooks interface.
6. Alternatively, start Jupyter lab with:
   ```
   jupyter lab
   ```
   This should open a browser with the Jupyter lab interface. Sometimes this has issues, in which case you need to fall back to Jupyter notebooks.

## V. Jupyter notebooks on the cluster in your own browser

**WARNING:** This will be a bit involved... Credit goes to Niels Jeppesen who figured all this out.

1. Open a terminal on the cluster, either through ThinLinc or ssh.
2. Make sure you are on a real node as described in section I. If you are using ssh, it is a good idea to start a tmux session by running:
   ```
   tmux
   ```
   This way, _**if you lose your internet connection**_ your notebook will keep running. You can _**reconnect**_ to the tmux session by running:
   ```
   tmux attach
   ```
   in a terminal on the cluster.
3. Take note of the node's hostname. You can see this by running:
   ```
   echo $HOSTNAME
   ```
4. Navigate to your project folder and activate the virtualenv.
5.
   Start a Jupyter lab or Jupyter notebook server by entering one of the following:
   ```
   jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
   jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser
   ```
   This should start a server and print something like:
   ```
   To access the server, open this file in a browser:
       file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
   Or copy and paste one of these URLs:
       http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
    or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
   ```
6. Open a terminal on your own computer and run:
   ```
   ssh <USERNAME>@login2.hpc.dtu.dk -g -L44000:<HOSTNAME>:44000 -N
   ```
   where `<USERNAME>` is your DTU user name and `<HOSTNAME>` is the hostname you found in step 3. This should prompt you for your DTU password, _**and then NOTHING SHOULD HAPPEN**_. The command keeps running without printing anything, so leave this terminal open.
7. Open your browser and enter the URL printed in step 5 that starts with `127.0.0.1` (e.g. `http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0`). This should open the Jupyter interface. Any commands you run will be executed on the HPC cluster.

   **NOTE:** If no URL beginning with `127.0.0.1` was printed in step 5, change the first part manually to `127.0.0.1` before entering it in your browser. In the example from step 5, you would change `n-62-20-9` to `127.0.0.1`.

If you close your browser, you can reconnect by entering the URL again. If you lose your internet connection, you can reconnect by repeating steps 6 and 7.

**NOTE:** You can make inline plots in the notebook, but cannot open new windows for plotting. The closest you can get is by starting the server on a ThinLinc node with X11 forwarding. This will allow you to have the notebook in your own browser, but new windows will be opened in ThinLinc.
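
The two manual steps in section V (building the ssh port-forwarding command and swapping the node hostname for `127.0.0.1` in the printed URL) are easy to get wrong by hand. Below is a minimal sketch of how they could be scripted on your own computer. The function names `to_local_url` and `tunnel_command` are hypothetical helpers made up for this illustration; they are not part of Jupyter or of any DTU tooling, and the port `44000` is simply the one used in the examples above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_local_url(url: str) -> str:
    """Replace the host of a Jupyter URL with 127.0.0.1.

    The scheme, port, path, and token query string are kept unchanged.
    (Hypothetical helper, for illustration only.)
    """
    parts = urlsplit(url)
    netloc = "127.0.0.1"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def tunnel_command(username: str, hostname: str, port: int = 44000) -> str:
    """Build the ssh port-forwarding command shown in section V.

    (Hypothetical helper; it only formats the command string, it does not run ssh.)
    """
    return f"ssh {username}@login2.hpc.dtu.dk -g -L{port}:{hostname}:{port} -N"


if __name__ == "__main__":
    # Example values taken from the server output shown in section V.
    printed_url = "http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0"
    print(to_local_url(printed_url))
    print(tunnel_command("<USERNAME>", "n-62-20-9"))
```

Run the printed ssh command in a terminal on your own computer, then paste the rewritten URL into your browser.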