This page assumes that Jupyter is available. For more details on connecting to Jupyter, see the guide on port forwarding. This page only covers how to start Jupyter.

Warning

Do not run Jupyter notebook servers on CHTC submit nodes except through the process described on this page.

Launching Jupyter through this process lets the CHTC admins effectively monitor the resource usage of the Jupyter process, and makes debug and info messages easier to display. If you persist the Jupyter session through other means such as tmux, the CHTC admins may kill your entire tmux session if it consumes too much CPU or memory.

Running Jupyter through Dask-CHTC

Attention

You may want to interact with your Dask cluster through a Jupyter notebook. Dask-CHTC provides a way to run a Jupyter notebook server on a CHTC submit node.

Warning

Jupyter must be installed, which amounts to running conda install jupyterlab or adding jupyter to your environment.yml file. For more detail, see the Jupyter install documentation.

You can run a notebook server with the jupyter subcommand of the Dask-CHTC command line tool. The command line tool will run the notebook server as an HTCondor job. To see the detailed documentation for this subcommand, run:

$ dask-chtc jupyter --help
Usage: dask-chtc jupyter [OPTIONS] COMMAND [ARGS]...

    [... long description cut ...]

    Commands:
      start   Start a Jupyter notebook server as a persistent HTCondor job.
      run     Run a Jupyter notebook server as an HTCondor job.
      status  Get information about your running Jupyter notebook server.
      stop    Stop a Jupyter notebook server that was started via "start".

The four sub-sub-commands of dask-chtc jupyter (run, start, status, and stop) let us run and interact with a Jupyter notebook server. You can run dask-chtc jupyter <subcommand> --help to get detailed documentation on each of them, but for now, let’s try out the run subcommand.

Using the run subcommand

The run subcommand is the simplest way to launch a Jupyter notebook server. It is designed to mimic the behavior of running a Jupyter notebook server on your local machine. Any command line arguments you pass to it will be passed to the actual jupyter command line tool.

Jupyter Lab instances are normally started with jupyter lab. The equivalent command for Dask-CHTC is dask-chtc jupyter run lab:

$ dask-chtc jupyter run lab
000 (7858010.000.000) 2020-07-13 10:38:46 Job submitted from host: <128.104.100.44:9618?addrs=128.104.100.44-9618+[2607-f388-107c-501-92e2-baff-fe2c-2724]-9618&alias=submit3.chtc.wisc.edu&noUDP&sock=schedd_4216_675f>
001 (7858010.000.000) 2020-07-13 10:38:47 Job executing on host: <128.104.100.44:9618?addrs=128.104.100.44-9618+[2607-f388-107c-501-92e2-baff-fe2c-2724]-9618&alias=submit3.chtc.wisc.edu&noUDP&sock=starter_5948_a76b_2712469>
[... Jupyter startup logs cut ...]
[I 10:38:51.582 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 10:38:51.587 LabApp]

    To access the notebook, open this file in a browser:
        file:///home/karpel/.local/share/jupyter/runtime/nbserver-2187556-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=fedee94f539b0beea492bb358d549ed79025b714f3b308c4
     or http://127.0.0.1:8888/?token=fedee94f539b0beea492bb358d549ed79025b714f3b308c4

Dask-CHTC mixes HTCondor job diagnostic information into the normal Jupyter output stream. These messages may be helpful if your notebook server job is unexpectedly interrupted.
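If you save this mixed output to a file, it can help to pull the HTCondor event lines out from the Jupyter log lines. The following is a minimal sketch, assuming the event header lines always look like the ones shown above (a three-digit event number, a job ID in parentheses, a timestamp, and a message). The function name condor_events and the regular expression are illustrative and are not part of Dask-CHTC.

```python
import re

# Pattern for an HTCondor event header line as it appears in the output above:
# three-digit event number, (cluster.proc.subproc), timestamp, message.
# This format is inferred from the sample output, not from a Dask-CHTC API.
EVENT_HEADER = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.(?P<subproc>\d+)\) "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<message>.*)$"
)


def condor_events(output):
    """Return (event code, job ID, message) for each HTCondor event header line."""
    events = []
    for line in output.splitlines():
        match = EVENT_HEADER.match(line)
        if match:
            # Render the job ID as cluster.proc, e.g. 7858010.0
            job_id = f"{match['cluster']}.{int(match['proc'])}"
            events.append((match["code"], job_id, match["message"]))
    return events


# Abridged sample; the host address is shortened to <...> for readability.
sample = """\
000 (7858010.000.000) 2020-07-13 10:38:46 Job submitted from host: <...>
[I 10:38:51.582 LabApp] Use Control-C to stop this server
004 (7858010.000.000) 2020-07-13 10:40:36 Job was evicted.
    (0) CPU times
009 (7858010.000.000) 2020-07-13 10:40:36 Job was aborted.
"""

for code, job_id, message in condor_events(sample):
    print(code, job_id, message)
```

Jupyter log lines and the indented detail lines under an event are skipped, so only the event headers remain.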

Just like running jupyter lab, if you press Control-C, the notebook server will be stopped:

^C
[C 10:40:35.962 LabApp] received signal 15, stopping
[I 10:40:35.963 LabApp] Shutting down 0 kernels
004 (7858010.000.000) 2020-07-13 10:40:36 Job was evicted.
    (0) CPU times
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
        Usr 0 00:00:01, Sys 0 00:00:00  -  Run Local Usage
    0  -  Run Bytes Sent By Job
    0  -  Run Bytes Received By Job
009 (7858010.000.000) 2020-07-13 10:40:36 Job was aborted.
    Shut down Jupyter notebook server (by user karpel)

You can think of this notebook server as being tied to your ssh session. If your ssh session disconnects (either because you quit manually, or because it timed out, or because you closed your laptop, or any number of other possible reasons), your notebook server will also stop. The next section will discuss how to run your notebook server in a more persistent manner.

Using the start, status, and stop subcommands

The start subcommand is similar to the run subcommand, except that the notebook server is not stopped when you end the command with Control-C or when your terminal session ends. The command will still “take over” your terminal, echoing log messages just like the run subcommand did:

$ dask-chtc jupyter start lab
000 (7858021.000.000) 2020-07-13 10:52:51 Job submitted from host: <128.104.100.44:9618?addrs=128.104.100.44-9618+[2607-f388-107c-501-92e2-baff-fe2c-2724]-9618&alias=submit3.chtc.wisc.edu&noUDP&sock=schedd_4216_675f>
001 (7858021.000.000) 2020-07-13 10:52:51 Job executing on host: <128.104.100.44:9618?addrs=128.104.100.44-9618+[2607-f388-107c-501-92e2-baff-fe2c-2724]-9618&alias=submit3.chtc.wisc.edu&noUDP&sock=starter_5948_a76b_2713469>
[... Jupyter startup logs cut ...]
[I 10:52:56.060 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 10:52:56.066 LabApp]

    To access the notebook, open this file in a browser:
        file:///home/karpel/.local/share/jupyter/runtime/nbserver-2209285-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=3342f18a95d7d61c51a2b8cf80b836e932ac53f9ebdb3965
     or http://127.0.0.1:8888/?token=3342f18a95d7d61c51a2b8cf80b836e932ac53f9ebdb3965
^C

Even though we pressed Control-C, the notebook server is still running. We can look at the status of our notebook server job using the status subcommand, which shows diagnostic information about both the Jupyter notebook server and the HTCondor job it is running inside:

$ dask-chtc jupyter status
█ RUNNING  jupyter lab
├─ Contact Address: http://127.0.0.1:8888/?token=3342f18a95d7d61c51a2b8cf80b836e932ac53f9ebdb3965
├─ Python Executable: /home/karpel/.python/envs/dask-chtc/bin/python3.7
├─ Working Directory:  /home/karpel/dask-chtc
├─ Job ID: 7858021.0
├─ Last status change at:  2020-07-13 15:52:51+00:00 UTC (4 minutes ago)
├─ Originally started at: 2020-07-13 15:52:51+00:00 UTC (4 minutes ago)
├─ Output: /home/karpel/.dask-chtc/jupyter-logs/current.out
├─ Error:  /home/karpel/.dask-chtc/jupyter-logs/current.err
└─ Events: /home/karpel/.dask-chtc/jupyter-logs/current.events

This may be particularly useful for recovering the contact address of a notebook server that you started running in a previous ssh session.
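If you want to recover the contact address from a script rather than by eye, one option is to parse the status text. The sketch below assumes the status output keeps the tree layout shown above, with one "Label: value" pair per line. The helper parse_status is hypothetical and is not provided by Dask-CHTC, and the sample text is copied from the output above.

```python
import re
from urllib.parse import urlsplit

# One field line of the status tree, e.g. "├─ Job ID: 7858021.0".
# The layout is inferred from the sample status output, so treat it as an assumption.
FIELD = re.compile(r"^[├└]─ (?P<label>[^:]+):\s*(?P<value>.*)$")


def parse_status(status_output):
    """Return a dict mapping each field label in the status tree to its value."""
    fields = {}
    for line in status_output.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match["label"].strip()] = match["value"].strip()
    return fields


# Sample text copied from the status output shown above (abridged).
sample = """\
█ RUNNING  jupyter lab
├─ Contact Address: http://127.0.0.1:8888/?token=3342f18a95d7d61c51a2b8cf80b836e932ac53f9ebdb3965
├─ Job ID: 7858021.0
└─ Events: /home/karpel/.dask-chtc/jupyter-logs/current.events
"""

status = parse_status(sample)
address = status["Contact Address"]
print(address)
print(urlsplit(address).port)  # the port number in the contact address
```

The port extracted from the contact address is the one you would forward when following the port forwarding guide. If the real command output contains terminal color codes or a different layout, the pattern would need adjusting.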

To stop your notebook server, run:

$ dask-chtc jupyter stop
[C 11:02:57.820 LabApp] received signal 15, stopping
[I 11:02:57.821 LabApp] Shutting down 0 kernels
004 (7858021.000.000) 2020-07-13 11:02:58 Job was evicted.
    (0) CPU times
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
        Usr 0 00:00:01, Sys 0 00:00:00  -  Run Local Usage
    0  -  Run Bytes Sent By Job
    0  -  Run Bytes Received By Job
009 (7858021.000.000) 2020-07-13 11:02:58 Job was aborted.
    Shut down Jupyter notebook server (by user karpel)

What’s Next?

Once you’re able to connect to your Jupyter notebook server, you should move on to Dask Cluster Creation to learn how to create a CHTCCluster.