# A Short Guide to Python on the DTU HPC Cluster
Patrick M. Jensen, patmjen@dtu.dk

## I. Preliminaries

First, open a terminal on the cluster, either through ThinLinc (Applications->Terminal Emulator) or via ssh:
```
ssh <USERNAME>@login2.hpc.dtu.dk
```
where `<USERNAME>` is your DTU user name.

For actual work, log on to a real node instead of staying on the login node.
For a CPU node, enter:
```
linuxsh
```

For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
```
voltash
sxm2sh
a100sh
```

## II. First time virtualenv setup

1. Navigate to your project folder.

2. Load modules for Python by entering:
   ```
   module load python3/3.8.9
   module load numpy/1.19.5-python-3.8.9-openblas-0.3.13
   module load scipy/1.5.4-python-3.8.9
   module load cuda/11.0
   ```
   We load `numpy` and `scipy` as modules because the HPC team provides versions optimized for the cluster.
   NOTE: This uses Python 3.8, but other versions are available (run `module avail python3` to list them).

3. Create a virtualenv by running:
   ```
   virtualenv <VENV_NAME>
   ```

4. Activate the virtualenv by running:
   ```
   source <VENV_NAME>/bin/activate
   ```

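You should now be able to install packages with `pip install <PACKAGE>` as normal.
If `pip` doesn't work, you may need to install it manually with:
```
easy_install pip
```
If `easy_install` is unavailable, `python3 -m ensurepip --upgrade` is the standard-library way to bootstrap `pip`.
To run Python, remember to use the `python3` command.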
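
To confirm that the virtualenv is really the one in use, a small check like the one below may help. It is only a sketch and not part of the setup itself: the function name `in_virtualenv` is made up for illustration, and the check relies on how `virtualenv` and `venv` set `sys.prefix`, which should hold for current releases.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # instead make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    # The interpreter path should point inside <VENV_NAME>/bin after activation.
    print("interpreter:", sys.executable)
    print("in virtualenv:", in_virtualenv())
```

Run it with `python3` after activating. If the interpreter path does not point into your virtualenv folder, the activation step probably did not take effect in this shell.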

## III. Virtualenv activation

1. Navigate to your project folder.

2. Load modules for Python by entering:
   ```
   module load python3/3.8.9
   module load numpy/1.19.5-python-3.8.9-openblas-0.3.13
   module load scipy/1.5.4-python-3.8.9
   module load cuda/11.0
   ```
   We load `numpy` and `scipy` as modules, since the HPC team have made optimized versions for the HPC cluster.

3. Activate the virtualenv by running:
   ```
   source <VENV_NAME>/bin/activate
   ```

To make life easy, put these commands in a bash script called `init.sh`:
```
#!/bin/bash
module load python3/3.8.9
module load numpy/1.19.5-python-3.8.9-openblas-0.3.13
module load scipy/1.5.4-python-3.8.9
module load cuda/11.0

source <VENV_NAME>/bin/activate
```

You can then run it by entering:
```
source init.sh
```

This loads the modules and activates the virtualenv in your current shell.
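
After sourcing `init.sh`, it can be useful to check that the module-provided `numpy` and `scipy` are actually importable from inside the virtualenv. The snippet below is a hedged sketch of such a check; `report_versions` is a hypothetical helper name, and whether the packages are visible depends on how the cluster modules set up the Python path.

```python
import importlib


def report_versions(names):
    """Map each package name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not visible to this interpreter.
            versions[name] = None
        else:
            # Most packages expose __version__; fall back to "unknown" if not.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in report_versions(["numpy", "scipy"]).items():
        print(name, "not importable" if version is None else version)
```

If the printed versions differ from the ones in the `module load` lines, a `pip`-installed copy inside the virtualenv may be shadowing the module version.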

## IV. Jupyter notebooks on ThinLinc

1. Open a terminal (Applications->Terminal Emulator).

2. If you want a GPU, enable X11 forwarding by adding the `-X` option when logging on to the GPU node.
   Enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
   ```
   voltash -X
   sxm2sh -X
   a100sh -X
   ```

3. Navigate to your project folder and activate the virtualenv.

4. Install Jupyter with:
   ```
   pip install jupyter jupyterlab
   ```

5. Start Jupyter notebooks with:
   ```
   jupyter notebook
   ```
   This should open a browser with the Jupyter notebooks interface.

6. Alternatively, start Jupyter lab with:
   ```
   jupyter lab
   ```
   This should open a browser with the Jupyter lab interface.
   Sometimes Jupyter lab has issues, in which case you need to fall back to Jupyter notebooks.
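
If `jupyter` starts but cannot see your packages, one possible cause is that the command being run lives outside the virtualenv. The sketch below checks where a command resolves to; `command_in_env` is a hypothetical helper name, and the check simply assumes that commands installed by `pip` end up under `sys.prefix`.

```python
import os
import shutil
import sys


def command_in_env(name):
    """Return None if `name` is not on PATH, else whether it lives under sys.prefix."""
    path = shutil.which(name)
    if path is None:
        return None
    # Resolve symlinks on both sides before comparing the paths.
    prefix = os.path.realpath(sys.prefix)
    return os.path.realpath(path).startswith(prefix + os.sep)


if __name__ == "__main__":
    # True: installed in this environment, False: found elsewhere, None: not found.
    print("jupyter in this environment:", command_in_env("jupyter"))
```

Run it with `python3` from the activated virtualenv. A result of `False` or `None` suggests repeating the `pip install` step with the virtualenv active.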

## V. Jupyter notebooks on the cluster in your own browser

**WARNING:** This will be a bit involved...
Credit goes to Niels Jeppesen who figured all this out.

1. Open a terminal on the cluster, either through ThinLinc or ssh.

2. Make sure you are on a real node as described in section I.
   If you are using ssh, it is a good idea to start a tmux session by running:
   ```
   tmux
   ```
   This way, _**if you lose your internet connection**_, your notebook will keep running.
   You can _**reconnect**_ to the tmux session by running:
   ```
   tmux attach
   ```
   in a terminal on the cluster.

3. Take note of the node's hostname. You can see this by running:
   ```
   echo $HOSTNAME
   ```

4. Navigate to your project folder and activate the virtualenv.

5. Start a Jupyter lab or Jupyter notebook server by entering one of the following:
   ```
   jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
   jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser
   ```
   This should start a server and print something like:
   ```
   To access the server, open this file in a browser:
           file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
       Or copy and paste one of these URLs:
           http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
        or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
   ```

6. Open a terminal on your own computer and run:
   ```
   ssh <USERNAME>@login2.hpc.dtu.dk -g -L44000:<HOSTNAME>:44000 -N
   ```
   where `<USERNAME>` is your DTU user name and `<HOSTNAME>` is the hostname you found in step 3.
   This should prompt you for your DTU password, _**and then NOTHING SHOULD HAPPEN**_.
   The command only forwards port 44000 and prints nothing, so leave this terminal open while you work.

7. Open your browser and enter the URL printed in step 5 that starts with `127.0.0.1`
   (e.g. `http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0`).
   This should open the Jupyter interface. Any commands you run will be executed on the HPC
   cluster.

   **NOTE:** If no URL beginning with `127.0.0.1` was printed in step 5, change the first part
   manually to `127.0.0.1` before entering it in your browser. In the example from step 5, you
   would change `n-62-20-9` to `127.0.0.1`.

If you close your browser, you can reconnect by entering the URL again.
If you lose your internet connection, you can reconnect by repeating steps 6 and 7.
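
The two manual edits above (filling in the ssh command and swapping the host in the URL) are easy to get wrong by hand. The following sketch shows one way to generate both strings. The helper names `tunnel_command` and `local_url` and the user name `s123456` are made up for illustration; the port and login host are the ones used in this guide.

```python
from urllib.parse import urlsplit, urlunsplit


def tunnel_command(username, hostname, port=44000, login="login2.hpc.dtu.dk"):
    """Build the ssh port-forwarding command from this guide for a given node."""
    return f"ssh {username}@{login} -g -L{port}:{hostname}:{port} -N"


def local_url(url):
    """Replace the host in a Jupyter URL with 127.0.0.1, keeping port, path and token."""
    parts = urlsplit(url)
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    # "s123456" is a placeholder user name; "n-62-20-9" is the example node above.
    print(tunnel_command("s123456", "n-62-20-9"))
    # prints: ssh s123456@login2.hpc.dtu.dk -g -L44000:n-62-20-9:44000 -N
    print(local_url("http://n-62-20-9:44000/lab?token=abc"))
    # prints: http://127.0.0.1:44000/lab?token=abc
```

Run this on your own computer, then paste the first line into a terminal and the second into your browser. If you start the server on a different port, pass the same port to `tunnel_command`.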

**NOTE:** You can make inline plots in the notebook, but cannot open new windows for plotting.
The closest you can get is by starting the server on a ThinLinc node with X11 forwarding.
This will allow you to have the notebook in your own browser, but new windows will be opened
in ThinLinc.