DaskHub

Warning

Deploying DaskHub requires an existing Kubernetes cluster.

Introduction

DaskHub provides a multi-user, Dask-Gateway enabled JupyterHub. Dask is a library for scaling Python code across a cluster.

Launch configuration

To get started, in the Platforms tab, press the New Platform button, and select DaskHub.

You will then be presented with launch configuration options to fill in:

Option	Explanation
Platform name	A name to identify the DaskHub platform
Kubernetes cluster	The Kubernetes platform on which to deploy DaskHub. If one hasn't already been created, check out the Kubernetes Overview.
App version	The version of the DaskHub Azimuth Application to use.
Notebook CPUs	The number of CPUs to allocate to each user notebook.
Notebook RAM	The amount of RAM to allocate to each user notebook.
Notebook storage	The amount of disk storage to allocate to each user notebook.

Using DaskHub

Accessing DaskHub

Once the DaskHub platform has been created, its JupyterHub is automatically exposed through Azimuth's Zenith proxy.

It can be accessed via the link under Services.

Using Dask Gateway

Dask Gateway is configured to integrate with the JupyterHub environment, so creating a Dask cluster requires very little configuration.

For example, the following creates a Dask cluster that scales between 0 and 10 workers on the underlying Kubernetes cluster depending on the workload:

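from dask_gateway import Gateway

# Connect to the gateway using the defaults configured in the JupyterHub environment
gateway = Gateway()
# Request a new Dask cluster from the gateway
cluster = gateway.new_cluster()
# Scale adaptively between 0 and 10 workers depending on the workload
cluster.adapt(minimum=0, maximum=10)
# As the last expression in a notebook cell, this displays the cluster
cluster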

If your Kubernetes cluster has autoscaling of nodes configured, scaling up the Dask cluster may cause the Kubernetes cluster itself to grow in size to accommodate the additional workers. Once you have finished with the Dask cluster, the Kubernetes nodes will scale back down again when possible.
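Because the Kubernetes nodes can only scale back down once the Dask cluster has been released, it can help to shut the cluster down explicitly when the work is done. The following is a minimal sketch of one way to do that, not a prescribed pattern: the helper names adaptive_cluster and sum_of_squares are hypothetical and introduced here for illustration only, while new_cluster, adapt, get_client and shutdown are the Dask Gateway calls they rely on.

```python
from contextlib import contextmanager


@contextmanager
def adaptive_cluster(gateway, minimum=0, maximum=10):
    """Hypothetical helper: yield a Dask client backed by a new adaptive
    cluster, then close the client and shut the cluster down on exit."""
    cluster = gateway.new_cluster()
    try:
        # Let the gateway scale workers between the given bounds
        cluster.adapt(minimum=minimum, maximum=maximum)
        client = cluster.get_client()
        try:
            yield client
        finally:
            client.close()
    finally:
        # Releasing the cluster lets idle Kubernetes nodes be scaled down
        cluster.shutdown()


def sum_of_squares(client, n):
    """Hypothetical example workload: square 0..n-1 on the workers and
    sum the results locally."""
    futures = client.map(lambda x: x * x, range(n))
    return sum(client.gather(futures))
```

In a notebook, this could be used by importing Gateway from dask_gateway and running the workload inside "with adaptive_cluster(Gateway()) as client:", for example by calling sum_of_squares(client, 100). The cluster is shut down when the with block exits, even if the computation raises an error.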