    # A Short Guide to Python on the DTU HPC Cluster.
    Patrick M. Jensen, patmjen@dtu.dk
    
    ## I. Preliminaries
    
    First, open a terminal on the cluster, either through ThinLinc (Applications->Terminal Emulator) or ssh:
    ```
    ssh <USERNAME>@login2.hpc.dtu.dk
    ```
    where `<USERNAME>` is your DTU user name.
    
    For actual work, you should log on to a real node instead of staying on the login node.
    For a CPU node, enter:
    ```
    linuxsh
    ```
    
    For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
    ```
    voltash
    sxm2sh
    a100sh
    ```
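
    To confirm which node you ended up on, a quick check like the following can help. This is a minimal sketch, not part of the cluster's tooling: `describe_node` is a hypothetical helper name, and it assumes that `nvidia-smi` is on the `PATH` on GPU nodes and absent elsewhere.

    ```shell
    # Hypothetical helper: print the current node name and, if the NVIDIA
    # tools are available, the GPUs visible on this node.
    describe_node() {
        echo "node: $(uname -n)"
        if command -v nvidia-smi >/dev/null 2>&1; then
            # Assumption: nvidia-smi is only on the PATH on GPU nodes.
            nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
                || echo "gpu: query failed"
        else
            echo "gpu: none detected (nvidia-smi not found)"
        fi
    }

    describe_node
    ```

    If the node name still looks like a login node, the `linuxsh` or GPU shell command probably did not go through.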
    
    ## II. First time virtualenv setup
    
    1. Navigate to your project folder.
    
    2. Load modules for Python by entering:
       ```
       module load python3/3.9.14
       module load numpy/1.23.3-python-3.9.14-openblas-0.3.21
       module load scipy/1.9.1-python-3.9.14
       module load matplotlib/3.6.0-numpy-1.23.3-python-3.9.14
       module load cuda/11.6
       ```
    
       We load `numpy`, `scipy`, and `matplotlib` as modules, because the HPC team have made optimized versions for the HPC cluster.
       
       > __NOTE:__ This guide uses Python 3.9 and CUDA 11.6 but other versions are available.
    
    3. Create a virtualenv by running:
       ```
       virtualenv <VENV_NAME>
       ```
    
    4. Activate the virtualenv by running:
        ```
        source <VENV_NAME>/bin/activate
        ```
    
    You should now be able to install packages with `pip install <PACKAGE>` as normal.
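
    Before installing anything, it can be worth checking that the virtualenv really is active in the current shell. The following is a small sketch; `venv_status` is a hypothetical helper name, and it relies only on the `VIRTUAL_ENV` variable that the `activate` script sets.

    ```shell
    # Hypothetical helper: report whether a virtualenv is active in this shell.
    # The activate script sets VIRTUAL_ENV to the virtualenv's directory.
    venv_status() {
        if [ -n "${VIRTUAL_ENV:-}" ]; then
            echo "active: $VIRTUAL_ENV"
        else
            echo "no virtualenv active"
        fi
    }

    venv_status
    ```

    With the virtualenv active, `command -v pip` should also point inside `<VENV_NAME>/bin`.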
    
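    > __Troubleshooting:__ If pip doesn't work, you may need to manually install it with:
    > ```
    > easy_install pip
    > ```
    > If `easy_install` is not available either, `python3 -m ensurepip --upgrade` is another way to bootstrap pip.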
    
    ## III. Virtualenv activation
    
    
    > __NOTE:__ These steps must be repeated every time you log in, and again whenever you change from a login node to another node (e.g. by calling `sxm2sh`).

    1. Navigate to your project folder.
    
    2. Load modules for Python by entering:
       ```
       module load python3/3.9.14
       module load numpy/1.23.3-python-3.9.14-openblas-0.3.21
       module load scipy/1.9.1-python-3.9.14
       module load matplotlib/3.6.0-numpy-1.23.3-python-3.9.14
       module load cuda/11.6
       ```
       We load `numpy`, `scipy`, and `matplotlib` as modules, since the HPC team have made optimized versions for the HPC cluster.
    
    3. Activate the virtualenv by running:
       ```
       source <VENV_NAME>/bin/activate
       ```
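
    After these steps, a quick way to see which package versions Python actually picks up is to ask the interpreter directly. This is only a sketch: `show_versions` is a hypothetical helper name, and it assumes `python3` is the interpreter provided by the loaded module or the virtualenv.

    ```shell
    # Hypothetical helper: print the version of each package as seen by
    # python3, or a note if the package cannot be imported.
    show_versions() {
        for pkg in "$@"; do
            python3 -c "import $pkg; print('$pkg', $pkg.__version__)" 2>/dev/null \
                || echo "$pkg not importable"
        done
    }

    show_versions numpy scipy matplotlib
    ```

    If a package is reported as not importable, the corresponding `module load` line was most likely skipped in this shell.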
    
    
    > __Pro tip:__ To make life easy, put these commands in a bash script called `init.sh`:
    > ```
    > #!/bin/bash
    >
    > module load python3/3.9.14
    > module load numpy/1.23.3-python-3.9.14-openblas-0.3.21
    > module load scipy/1.9.1-python-3.9.14
    > module load matplotlib/3.6.0-numpy-1.23.3-python-3.9.14
    > module load cuda/11.6
    >
    > source <VENV_NAME>/bin/activate
    > ```
    >
    > which you can then run by entering:
    > ```
    > source init.sh
    > ```
    >
    > This loads the modules and activates the virtualenv in your current shell.
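
    If `init.sh` is sourced from the wrong folder, the `source <VENV_NAME>/bin/activate` line fails with a fairly terse error. One possible guard is sketched below; `activate_venv` is a hypothetical helper, not something the cluster provides.

    ```shell
    # Hypothetical helper for init.sh: activate a virtualenv only if it exists.
    activate_venv() {
        venv_dir="$1"
        if [ -f "$venv_dir/bin/activate" ]; then
            # Source the activate script into the current shell.
            . "$venv_dir/bin/activate"
        else
            echo "virtualenv not found: $venv_dir (are you in your project folder?)" >&2
            return 1
        fi
    }
    ```

    In `init.sh`, the `source <VENV_NAME>/bin/activate` line would then become `activate_venv <VENV_NAME>`.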
    
    ## IV. Jupyter notebooks on ThinLinc
    
    1. Open a terminal (Applications->Terminal Emulator).
    
    2. If you want a GPU node, you need to enable X11 forwarding by adding the `-X` option.
       Enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):
       ```
       voltash -X
       sxm2sh -X
       a100sh -X
       ```
    
    
    3. Navigate to your project folder, load modules, and activate the virtualenv. Same steps as in section III.
    
    4. Install Jupyter with:
       ```
       pip install jupyter jupyterlab
       ```
    
    5. Start Jupyter notebook with:
       ```
       jupyter notebook
       ```
       This should open a browser with the Jupyter notebooks interface.
    
    6. Alternatively, start Jupyter lab with:
       ```
       jupyter lab
       ```
       This should open a browser with the Jupyter lab interface.
    
       > __Troubleshooting:__ If Jupyter lab has issues, fall back to Jupyter notebook (step 5).
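
    Both commands need a display to open the browser in. With X11 forwarding the `DISPLAY` variable is set in the shell, so a quick check before starting Jupyter could look like the sketch below. `check_display` is a hypothetical helper name.

    ```shell
    # Hypothetical helper: check that an X11 display is available in this shell.
    check_display() {
        if [ -n "${DISPLAY:-}" ]; then
            echo "display available: $DISPLAY"
        else
            echo "no display: start the node shell with the -X option" >&2
            return 1
        fi
    }

    check_display || true
    ```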
    
    ## V. Jupyter notebooks on the cluster in your own browser
    
    
    > __WARNING:__ This will be a bit involved...
    > Credit goes to Niels Jeppesen who figured all this out.
    
    1. Open a terminal on the cluster, either through ThinLinc or ssh.
    
    
    2. Call `sxm2sh` or `linuxsh`, as described in section I, so you are not on a login node.
    
    3. Start a tmux session by running:
    
       ```
       tmux
       ```
    
       > _**If you lose your internet connection**_ your notebook will keep running.
       > You can _**reconnect**_ to the tmux session by running:
       > ```
       > tmux attach
       > ```
       > in a terminal on the cluster.
    
    4. Take note of the node's hostname - you will need it later. You can see this by running:
    
       ```
       echo $HOSTNAME
       ```
    
    
    5. Navigate to your project folder, load modules, and activate the virtualenv. Same steps as in section III.
    
    6. Start a Jupyter lab or Jupyter notebook server by entering one of the following:
    
       ```
       jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
       jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser
       ```
       This should start a server and print something like:
       ```
       To access the server, open this file in a browser:
               file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
           Or copy and paste one of these URLs:
               http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
            or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
       ```
    
    
    7. Open a terminal on your own computer and run:
       ```
       ssh <USERNAME>@login2.hpc.dtu.dk -N -L 44000:<HOSTNAME>:44000
       ```
       where `<USERNAME>` is your DTU user name and `<HOSTNAME>` is the hostname you found in step 4.
       This should prompt you for your DTU password, _**and then NOTHING SHOULD HAPPEN**_: the `-N` option tells ssh not to run a remote command, so the terminal just keeps the tunnel open. Leave it running.
    
    
    8. Open your browser and enter the URL printed in step 6 that starts with `127.0.0.1`
       (e.g. `http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0`).
       This should open the Jupyter interface. Any commands you run will be executed on the HPC cluster.

       > **Troubleshooting:** If no URL beginning with `127.0.0.1` was printed in step 6, change the first part
       > manually to `127.0.0.1` before entering it in your browser. In the example from step 6, you
       > would change `n-62-20-9` to `127.0.0.1`.
    
    
       > **Troubleshooting:** If the number after `http://127.0.0.1:` is not `44000`, Jupyter selected another port. This happens if the port we request with `--port=44000` is not available. In this case, redo step 7 with `44000` replaced by the number from the URL printed by Jupyter.
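
    The host and port for the ssh command in step 7 can also be read straight off the URL that Jupyter prints. The following is a minimal sketch using plain shell parameter expansion; `tunnel_command` and `local_url` are hypothetical helper names, and the URL is the example output shown earlier.

    ```shell
    # Hypothetical helpers: derive the ssh tunnel command and the local URL
    # from the URL that Jupyter prints (http://<HOSTNAME>:<PORT>/...).
    tunnel_command() {
        url="$1"; user="$2"
        hostport="${url#http://}"      # drop the scheme
        hostport="${hostport%%/*}"     # keep only <HOSTNAME>:<PORT>
        host="${hostport%%:*}"
        port="${hostport##*:}"
        # -N -L is the same as the combined -NL form of the ssh options.
        echo "ssh $user@login2.hpc.dtu.dk -N -L $port:$host:$port"
    }

    local_url() {
        url="$1"
        rest="${url#http://}"                # <HOSTNAME>:<PORT>/lab?token=...
        echo "http://127.0.0.1:${rest#*:}"   # swap the hostname for 127.0.0.1
    }

    url='http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0'
    tunnel_command "$url" '<USERNAME>'
    local_url "$url"
    ```

    For the example URL this prints the tunnel command with `44000:n-62-20-9:44000` and the `http://127.0.0.1:44000/lab?token=...` address to paste into the browser. If Jupyter picked another port, both lines follow it automatically.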
    
    If you close your browser, you can reconnect by entering the URL again.
    If you lose your internet connection, you can reconnect by repeating steps 7 and 8.
    
    
    > **NOTE:** You can make inline plots in the notebook, but cannot open new windows for plotting.
    > The closest you can get is by starting the server on a ThinLinc node with X11 forwarding.
    > This will allow you to have the notebook in your own browser, but new windows will be opened
    > in ThinLinc.