A Short Guide to Python on the DTU HPC Cluster.

Patrick M. Jensen, patmjen@dtu.dk, 25-11-2022

I. Preliminaries

First, open a terminal on the cluster, either through ThinLinc (Applications->Terminal Emulator) or via ssh:

ssh <USERNAME>@login2.hpc.dtu.dk

where <USERNAME> is your DTU user name.

For actual work, you should log on to a real node instead of staying on the login node. For a CPU node, enter:

linuxsh

For a GPU node, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):

voltash
sxm2sh
a100sh
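
To confirm which node you ended up on, a small Python check can help. The following is a minimal sketch, not part of the cluster tooling: the helper name describe_node is made up for illustration, and finding nvidia-smi on the PATH is only a hint that NVIDIA drivers are installed, not proof that a GPU is usable.

```python
import shutil
import socket


def describe_node() -> dict:
    """Collect a few hints about the machine this interpreter runs on."""
    return {
        # Network name of the current machine.
        "hostname": socket.gethostname(),
        # Full path to nvidia-smi if it is on PATH, otherwise None.
        # This is only a hint, not a guarantee that a GPU can be used.
        "nvidia_smi": shutil.which("nvidia-smi"),
    }


info = describe_node()
print(f"Running on: {info['hostname']}")
print(f"nvidia-smi: {info['nvidia_smi'] or 'not found on PATH'}")
```

If the reported hostname still looks like a login node, call linuxsh or one of the GPU commands above and run the check again.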

II. Set up virtualenv

  1. Get to a CPU or GPU node on the cluster (see Section I).

  2. Navigate to your project folder.

  3. Download scripts/init.sh and place it in your project folder. This only needs to be done the first time.

    Tip: You can do this by calling

    wget https://lab.compute.dtu.dk/patmjen/hcp_tutorials/-/raw/main/scripts/init.sh

    in the terminal.

  4. Call

    source init.sh

    This will set up and activate your virtualenv. You must do this every time you log in or change node (e.g. by calling sxm2sh)!

    Tip: To configure the virtualenv, change the following variables at the top of init.sh:

    # Configuration
    # This is what you should change for your setup
    VENV_NAME=venv         # Name of your virtualenv (default: venv)
    VENV_DIR=.             # Where to store your virtualenv (default: current directory)
    PYTHON_VERSION=3.11.9  # Python version (default: 3.11.9)
    CUDA_VERSION=11.8      # CUDA version (default: 11.8)

  5. You are done! You can now install packages with pip install <PACKAGE> and run Python 3 code with python.

    Troubleshooting: If pip doesn't work, you may need to install it manually with: python -m ensurepip --upgrade
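
Since the virtualenv must be re-activated after every login or node change, it can be worth checking that python really is the interpreter from your virtualenv. The following is a minimal sketch using only the standard library. The helper name in_virtualenv is made up for illustration, and the check assumes init.sh creates a standard venv or virtualenv environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if this interpreter appears to run inside a virtualenv."""
    # Old virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("Python version:", sys.version.split()[0])
print("Interpreter:   ", sys.executable)
print("In virtualenv: ", in_virtualenv())
```

If the last line prints False, or the interpreter path does not point into your virtualenv folder, run source init.sh again.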

III. Jupyter notebooks on ThinLinc

  1. Open a terminal (Applications->Terminal Emulator).

  2. If you want GPU, enter one of the following (see https://www.hpc.dtu.dk/?page_id=2129):

    voltash -X
    sxm2sh -X
    a100sh -X

    Remember the -X, which enables X11 forwarding! It is needed to open a browser for the notebook.

  3. Navigate to your project folder and activate the virtualenv, following the same steps as in Section II.

  4. Install Jupyter with:

    pip install jupyter jupyterlab

  5. Start Jupyter lab with

    jupyter lab

    ...or start Jupyter notebooks with

    jupyter notebook

    This should open a browser with the Jupyter lab or Jupyter notebook interface.

    Troubleshooting: If Jupyter lab fails to start or misbehaves, fall back to Jupyter notebook.
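
If no browser opens, one thing to check is whether X11 forwarding is actually active in the current shell. X11 clients find their display through the DISPLAY environment variable, so a missing or empty value suggests the -X flag was forgotten. The following is a minimal sketch. The helper name x11_display is made up for illustration, and a set DISPLAY is a necessary hint rather than a guarantee that a window can open.

```python
import os
from typing import Mapping, Optional


def x11_display(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the X11 display string, or None if DISPLAY is unset or empty."""
    env = os.environ if env is None else env
    return env.get("DISPLAY") or None


display = x11_display()
if display is None:
    print("DISPLAY is not set; a browser window probably cannot open here.")
else:
    print(f"DISPLAY={display}; X11 forwarding appears to be available.")
```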

IV. Jupyter notebooks on the cluster in your own browser

WARNING: This will be a bit involved... Credit goes to Niels Jeppesen who figured all this out.

  1. Open a terminal on the cluster, either through ThinLinc or ssh.

  2. Call sxm2sh or linuxsh, as described in section I, so you are not on a login node.

  3. Start a tmux session by running:

    tmux

    Because the server will run inside tmux, your notebook will keep running if you lose your internet connection. You can reconnect to the tmux session by running:

    tmux attach

    in a terminal on the cluster.

  4. Take note of the node's hostname - you will need it later. You can see this by running:

    echo $HOSTNAME

  5. Navigate to your project folder and activate the virtualenv, following the same steps as in Section II.

  6. Start a Jupyter lab or Jupyter notebook server by entering one of the following:

    jupyter lab --port=44000 --ip=$HOSTNAME --no-browser
    jupyter notebook --port=44000 --ip=$HOSTNAME --no-browser

    This should start a server and print something like:

    To access the server, open this file in a browser:
            file:///zhome/9d/d/98006/.local/share/jupyter/runtime/jpserver-4566-open.html
        Or copy and paste one of these URLs:
            http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
         or http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0

  7. Open a terminal on your own computer and run

    ssh -N -L 44000:<HOSTNAME>:44000 <USERNAME>@login2.hpc.dtu.dk

    where <USERNAME> is your DTU user name and <HOSTNAME> is the hostname you found in step 4. This should prompt you for your DTU password. After that, nothing more should be printed: the command keeps running silently for as long as the tunnel is open.

  8. Open your browser and enter the URL printed in step 6 that starts with 127.0.0.1 (e.g. http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0). This should open the Jupyter interface. Any commands you run will be executed on the HPC cluster.

    Troubleshooting: If no URL beginning with 127.0.0.1 was printed in step 6, change the host part manually to 127.0.0.1 before entering it in your browser. In the example from step 6, you would change n-62-20-9 to 127.0.0.1.

    Troubleshooting: If the number after http://127.0.0.1: is not 44000, Jupyter selected another port. In this case, redo step 7 where 44000 is replaced with the number from the URL printed by Jupyter. This happens if the port we request with --port=44000 is not available.
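
The two troubleshooting tips above come down to reading the port from the URL that Jupyter printed and swapping the host for 127.0.0.1. The following is a minimal sketch using only the standard library. The helper names server_port and to_local_url are made up for illustration, and the URL is the example output shown earlier in this section.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def server_port(url: str) -> Optional[int]:
    """Return the port Jupyter actually bound to, or None if the URL has none."""
    return urlsplit(url).port


def to_local_url(url: str, local_port: Optional[int] = None) -> str:
    """Replace the host with 127.0.0.1, keeping the path and token unchanged."""
    parts = urlsplit(url)
    # Keep the port from the URL unless a different local port is requested.
    port = parts.port if local_port is None else local_port
    netloc = "127.0.0.1" if port is None else f"127.0.0.1:{port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


printed = "http://n-62-20-9:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0"
print(server_port(printed))   # 44000
print(to_local_url(printed))  # http://127.0.0.1:44000/lab?token=401720c25a3e9411a5f28d9015591b19a9032fc90989ffa0
```

If server_port returns something other than 44000, that is the number to use in the ssh command.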

If you close your browser, you can reconnect by entering the URL again. If you lose your internet connection, you can reconnect by repeating steps 7 and 8.
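
Because the ssh tunnel has to be re-created after every lost connection, and again whenever Jupyter picks a different port, it can be handy to generate the command instead of retyping it. The following is a minimal sketch that only builds the command string and does not run it. The helper name tunnel_command and the user name s123456 are placeholders made up for illustration.

```python
import shlex

LOGIN_HOST = "login2.hpc.dtu.dk"  # login node used throughout this guide


def tunnel_command(username: str, node: str, port: int = 44000) -> str:
    """Build the ssh command that forwards a local port to the same port on a node."""
    args = [
        "ssh",
        "-N",                           # do not run a remote command
        "-L", f"{port}:{node}:{port}",  # local port -> node:port via the login node
        f"{username}@{LOGIN_HOST}",
    ]
    return shlex.join(args)


# s123456 is a placeholder user name; n-62-20-9 is the example node from this section.
print(tunnel_command("s123456", "n-62-20-9"))
# ssh -N -L 44000:n-62-20-9:44000 s123456@login2.hpc.dtu.dk
```

Paste the printed command into a terminal on your own computer, not on the cluster.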

NOTE: You can make inline plots in the notebook, but cannot open new windows for plotting. The closest you can get is by starting the server on a ThinLinc node with X11 forwarding. This will allow you to have the notebook in your own browser, but new windows will be opened in ThinLinc.