# A Short Guide to Python on the DTU HPC Cluster.
Patrick M. Jensen, patmjen@dtu.dk, 25-11-2022

**Contents**
* [I. Preliminaries](#i-preliminaries)
* [II. Setup Virtualenv](#ii-setup-virtualenv)
* [III. Jupyter notebooks on ThinLinc](#iii-jupyter-notebooks-on-thinlinc)
* [IV. Jupyter notebooks on the cluster in your own browser](#iv-jupyter-notebooks-on-the-cluster-in-your-own-browser)

## I. Preliminaries
First, get to a terminal on the cluster, either through ThinLinc (Applications->Terminal Emulator) or ssh:
```
ssh <USERNAME>@login2.hpc.dtu.dk
```
where `<USERNAME>` is your DTU user name.

For actual work, you should log on to a real node rather than stay on the login node. For a CPU node, enter:
```
linuxsh
```
For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
```
voltash
sxm2sh
a100sh
```

## II. Setup Virtualenv
1. Get to a CPU or GPU node on the cluster (see [Section I](#i-preliminaries)).
2. Navigate to your project folder.
3. Download `scripts/init.sh` and place it in your project folder. This **only** needs to be done the first time.
   > **Tip:** You can do this by calling
   > ```bash
   > wget https://lab.compute.dtu.dk/patmjen/hcp_tutorials/-/raw/main/scripts/init.sh
   > ```
   > in the terminal.
4. Call
   ```
   source init.sh
   ```
   This will set up and activate your virtualenv. You must **do this every time** you log in or change node (e.g. by calling `sxm2sh`)!
   > **Tip:** To configure the virtualenv, change the following variables at the top of `init.sh`:
   > ```bash
   > # Configuration
   > # This is what you should change for your setup
   > VENV_NAME=venv         # Name of your virtualenv (default: venv)
   > VENV_DIR=.             # Where to store your virtualenv (default: current directory)
   > PYTHON_VERSION=3.11.9  # Python version (default: 3.11.9)
   > CUDA_VERSION=11.8      # CUDA version (default: 11.8)
   > ```
5. You are done! You can now install packages with `pip install <PACKAGE>` and run Python 3 code with `python`.
   > __Troubleshooting:__ If pip doesn't work, you may need to install it manually with `easy_install pip`. On recent Python versions `easy_install` is no longer available; there, `python -m ensurepip --upgrade` bootstraps pip instead.

## III. Jupyter notebooks on ThinLinc
1. Open a terminal (Applications->Terminal Emulator).
2. If you want GPU, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
   ```
   voltash -X
   sxm2sh -X
   a100sh -X
   ```
   Remember the `-X`, which enables X11 forwarding! It is needed to open a browser for the notebook.
3. Navigate to your project folder and activate the virtualenv. Same steps as in [section II](#ii-setup-virtualenv).
4. Install Jupyter with:
   ```
   pip install jupyter jupyterlab
   ```
5. Start Jupyter lab with
   ```
   jupyter lab
   ```
   **...or** start Jupyter notebooks with
   ```
   jupyter notebook
   ```
   This should open a browser with the Jupyter notebook or Jupyter lab interface.
   > __Troubleshooting:__ If Jupyter lab fails to start or misbehaves, fall back to Jupyter notebooks.

## IV. Jupyter notebooks on the cluster in your own browser
> __WARNING:__ This will be a bit involved...
> Credit goes to Niels Jeppesen who figured all this out.

1. Open a terminal on the cluster, either through ThinLinc or ssh.
2. Call `sxm2sh` or `linuxsh`, as described in [section I](#i-preliminaries), so you are not on a login node.
3. Start a tmux session by running:
   ```
   tmux
   ```
   > _**If you lose your internet connection**_ your notebook will keep running.
   > You can _**reconnect**_ to the tmux session by running:
   > ```
   > tmux attach
   > ```
   > in a terminal on the cluster.
4. Take note of the node's hostname - you will need it later. You can see this by running:
   ```
   echo $HOSTNAME
   ```
5. Navigate to your project folder, and activate the virtualenv. Same steps as in [section II](#ii-setup-virtualenv).
6. Start a Jupyter lab or Jupyter notebook server by entering one of the following:
   ```
   jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
   jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser
   ```
   This should start a server and print something like:
   ```
   To access the server, open this file in a browser:
       file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
   Or copy and paste one of these URLs:
       http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
    or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
   ```
7. Open a terminal on your own computer and run
   ```
   ssh <USERNAME>@login2.hpc.dtu.dk -NL44000:<HOSTNAME>:44000
   ```
   where `<USERNAME>` is your DTU user name and `<HOSTNAME>` is the hostname you found in step 4.
   This should prompt you for your DTU password, _**and then NOTHING SHOULD HAPPEN**_. The command stays silent while it keeps the tunnel open, so leave this terminal running.
8. Open your browser and enter the URL printed in step 6 that starts with `127.0.0.1` (e.g. `http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0`).
   This should open the Jupyter interface. Any commands you run will be executed on the HPC cluster.
   > **Troubleshooting:** If no URL beginning with `127.0.0.1` was printed in step 6, change the first part
   > manually to `127.0.0.1` before entering it in your browser. In the example from step 6, you
   > would change `n-62-20-9` to `127.0.0.1`.

   > **Troubleshooting:** If the number after `http://127.0.0.1:` is not `44000`, Jupyter selected another port. This happens if the port requested with `--port=44000` is not available. In this case, redo step 7 with 44000 replaced by the number from the URL printed by Jupyter.

If you close your browser, you can reconnect by entering the URL again.
If you lose your internet connection, you can reconnect by repeating steps 7 and 8.

> **NOTE:** You can make inline plots in the notebook, but cannot open new windows for plotting.
> The closest you can get is by starting the server on a ThinLinc node with X11 forwarding.
> This will allow you to have the notebook in your own browser, but new windows will be opened
> in ThinLinc.
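
The port and host juggling in steps 7 and 8 of Section IV is easy to get wrong by hand, so here is a minimal sketch of how it could be scripted. The function names (`url_port`, `local_url`, `tunnel_cmd`), the user name `s123456`, and the token `abc` are all made up for illustration; none of them come from `init.sh` or the cluster tooling. The functions only print text and do not start ssh or Jupyter, and they assume the URL contains an explicit port, as in the example output of step 6.

```bash
#!/usr/bin/env bash
# Illustrative helpers only: the function names are hypothetical and not part
# of init.sh or the cluster tooling. They print strings; nothing is executed.

# Print the port number of a Jupyter URL such as http://host:port/lab?token=...
# Assumes the URL contains an explicit port.
url_port() {
    local rest="${1#*://}"          # drop the scheme -> host:port/path
    local after_host="${rest#*:}"   # drop the host   -> port/path
    printf '%s\n' "${after_host%%/*}"
}

# Print the same URL with the host replaced by 127.0.0.1
# (the manual change described in the first troubleshooting tip).
local_url() {
    local scheme="${1%%://*}"
    local rest="${1#*://}"
    printf '%s://127.0.0.1:%s\n' "$scheme" "${rest#*:}"
}

# Print the ssh tunnel command from step 7 for a user, node, and port.
# The port defaults to 44000 when it is not given.
tunnel_cmd() {
    local user="$1" node="$2" port="${3:-44000}"
    printf 'ssh %s@login2.hpc.dtu.dk -NL%s:%s:%s\n' "$user" "$port" "$node" "$port"
}

# Example: Jupyter fell back to port 44001 on node n-62-20-9.
# s123456 and the token are placeholders.
url='http://n-62-20-9:44001/lab?token=abc'
tunnel_cmd s123456 n-62-20-9 "$(url_port "$url")"
# prints: ssh s123456@login2.hpc.dtu.dk -NL44001:n-62-20-9:44001
local_url "$url"
# prints: http://127.0.0.1:44001/lab?token=abc
```

Printing the command instead of running it keeps the sketch safe to try anywhere: you can read the output, check it against step 7, and then paste it into a terminal on your own computer.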