User Image Stacks

Overview

Our Docker image stack hierarchy is inspired by jupyter/docker-stacks, but customized for our specific needs. This document outlines our image hierarchy, key features, and customizations.

Image Hierarchies

jupyter/docker-stacks Reference Hierarchy

graph TD
    A[ubuntu LTS with point release] --> B[docker-stacks-foundation]
    B --> C[base-notebook]
    C --> D[minimal-notebook]
    D --> E[scipy-notebook]
    D --> F[r-notebook]
    D --> G[julia-notebook]
    E --> H[tensorflow-notebook]
    E --> I[pytorch-notebook]
    E --> J[datascience-notebook]
    E --> K[pyspark-notebook]
    K --> L[all-spark-notebook]

Our Custom Hierarchy

graph TD
    A["ubuntu<br>(LTS with point release)"] --> B[docker-stacks-foundation]
    C["ubuntu<br>+ CUDA"] --> B["polusai/notebooks-hub-stacks-foundation<br><ul><li>add Lmod</li><li>use apt-fast</li></ul>"]
    B --> D["polusai/notebook<br> <img src='../../img/app_logos/jupyter_logo.png' style='max-width:30px;min-height:0'/>"]
    B --> E["polusai/dashboard-base<br>(add jhsingle-native-proxy)"]
    E --> F["polusai/vscode<br> <img src='../../img/app_logos/vscode_logo.png' style='max-width:30px;min-height:0'/>"]
    E --> G["polusai/rstudio<br> <img src='../../img/app_logos/rstudio_logo.png' style='max-width:60px;min-height:0'/>"]
    E --> H["polusai/rshiny<br> <img src='../../img/app_logos/rshiny_logo.png' style='max-width:60px;min-height:0'/>"]
    E --> I["polusai/pshiny<br> <img src='../../img/app_logos/pshiny_logo.png' style='max-width:60px;min-height:0'/>"]
    E --> J["polusai/dash<br> <img src='../../img/app_logos/dash_logo.png' style='max-width:60px;min-height:0'/>"]
    E --> K["polusai/solara<br> <img src='../../img/app_logos/solara_logo.png' style='max-width:60px;min-height:0'/>"]
    E --> L["polusai/streamlit<br> <img src='../../img/app_logos/streamlit_logo.png' style='max-width:60px;min-height:0'/>"]
    E --> M["polusai/voila<br> <img src='../../img/app_logos/voila_logo.png' style='max-width:60px;min-height:0'/>"]

Base Images

polusai/notebooks-hub-stacks-foundation

This image is based on jupyter/docker-stacks-foundation with additional customizations.

Features from jupyter/docker-stacks-foundation:

  • Package managers:

    • conda: Cross-platform, language-agnostic binary package manager

    • mamba: Faster reimplementation of conda in C++ (default package manager)

  • Unprivileged user jovyan (uid=1000, configurable) in group users (gid=100)

  • tini and start.sh script as container entry point

  • run-hooks.sh script for sourcing/running files in a given directory

  • Options for passwordless sudo

  • Common system libraries (bzip2, ca-certificates, locales)

  • wget for downloading external files

  • No preinstalled scientific computing packages

Additional customizations:

  • Lmod for managing environment modules across Notebooks Hub

  • apt-fast for faster .deb package installation

  • Additional tools: gdebi-core, vim, nano, jq

  • Symbolic link to shared directory: /home/jovyan/shared -> /opt/shared

polusai/notebook

Based on the foundation image, this adds:

  • JupyterLab

  • Related packages and extensions

polusai/dashboard-base

This image adds:

  • jhsingle-native-proxy for making any web-based application compatible with JupyterHub by presenting itself as a Jupyter Server.

  • A wrapper script for starting applications

Application-specific Images

The following images are based on the dashboard-base image:

  1. polusai/vscode: VS Code development environment based on Coder

  2. polusai/rstudio: RStudio IDE for R development

  3. polusai/rshiny: R Shiny for interactive web applications

  4. polusai/pshiny: Shiny for Python (PyShiny) for interactive web applications

  5. polusai/dash: Plotly Dash for analytical web applications

  6. polusai/solara: Streamlit-like framework for data apps

  7. polusai/streamlit: Streamlit for data applications

  8. polusai/voila: Voilà for converting Jupyter notebooks to standalone applications

Usage

Building a Dashboard Image

To build a custom dashboard image, navigate to the directory containing your Dockerfile and run the following command:

docker build . -t <dashboard_image>  

By default, this command builds your custom dashboard image on top of the latest dashboard-base image, provided your Dockerfile uses polusai/dashboard-base:latest as its base.
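If the Dockerfile declares its base image through a build argument, as the integration Dockerfile later in this document does with ROOT_CONTAINER, the base can be chosen at build time instead of relying on the default. The following is a minimal sketch under that assumption; the function name build_dashboard_image is hypothetical, and the base image you pass must be one that actually exists in your registry.

```shell
#!/bin/bash
# Hypothetical helper: build a dashboard image on an explicitly chosen base.
# Assumes the Dockerfile in the current directory begins with:
#   ARG ROOT_CONTAINER=polusai/dashboard-base:latest
#   FROM $ROOT_CONTAINER
build_dashboard_image() {
  base_image=$1    # base image reference, e.g. polusai/dashboard-base:latest
  image_name=$2    # name for the resulting image
  docker build . --build-arg "ROOT_CONTAINER=${base_image}" -t "${image_name}"
}
```

For example, `build_dashboard_image polusai/dashboard-base:latest my-custom-dashboard` is equivalent to the plain docker build command above, but makes the base image explicit.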

Running the Dashboard Container

To run the dashboard container, you need to set the DASHBOARD_PORT environment variable and ensure that the necessary port (default is 8888) is open. You can use the following docker run command to start your container:

docker run -d -p 8888:8888 -e DASHBOARD_PORT=8888 <dashboard_image>  

Here’s a breakdown of the command:

  • docker run -d: Run the container in detached mode.

  • -p 8888:8888: Map the container’s port 8888 to the host’s port 8888.

  • -e DASHBOARD_PORT=8888: Set the DASHBOARD_PORT environment variable to 8888.

  • <dashboard_image>: Replace this with the name of your custom dashboard image.

After running this command, your dashboard application should be accessible via http://localhost:8888.

Example

Suppose you have built an image named my-custom-dashboard. You would start it with:

docker run -d -p 8888:8888 -e DASHBOARD_PORT=8888 my-custom-dashboard  

This will launch your custom dashboard and make it accessible at http://localhost:8888.

By following these steps, you can easily build and run your custom dashboard images within the Notebooks Hub environment.
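If port 8888 is already taken on the host, only the host side of the -p mapping needs to change. The sketch below wraps the docker run command from this section; the function name run_dashboard is hypothetical, and it assumes the proxy inside the container listens on the port given in DASHBOARD_PORT.

```shell
#!/bin/bash
# Hypothetical helper: start a dashboard container on a chosen host port.
# Assumption: inside the container the proxy listens on DASHBOARD_PORT (8888 here).
run_dashboard() {
  image_name=$1
  host_port=${2:-8888}   # host port to publish on; defaults to 8888
  docker run -d -p "${host_port}:8888" -e DASHBOARD_PORT=8888 "${image_name}"
}
```

With `run_dashboard my-custom-dashboard 9000`, the dashboard would be reachable at http://localhost:9000 while the container keeps using port 8888 internally.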

Customization

The polusai/dashboard-base image is designed to simplify the adoption of new web-based applications. This flexibility allows you to easily add and integrate various applications into your Notebooks Hub environment.

Prerequisites for New Applications

To successfully integrate a new application using the dashboard-base image, ensure that the application meets these two key requirements:

  1. Command-line Startup: The application must be capable of starting from the command line in a Linux environment.

  2. Web Application Port: The application should serve its web interface on a port, ideally one that is configurable.

Steps to Integrate a New Application

  1. Create a New Dockerfile: Start with the polusai/dashboard-base image as your base:

    ARG ROOT_CONTAINER=polusai/dashboard-base:latest
    FROM $ROOT_CONTAINER
    
    USER root
    
    COPY start-dashboard.sh /usr/bin
    RUN chmod +x /usr/bin/start-dashboard.sh
    
    USER $NB_USER
    
    # Start the dashboard
    CMD ["/usr/bin/dashboard-wrapper.sh"]
    
  2. Create a start-dashboard.sh script: This script will contain the specific logic to start your application. Here’s an example based on the Streamlit implementation:

    #!/bin/bash
    
    # Run the dashboard based on DASHBOARD_TYPE
    if [ "$DASHBOARD_TYPE" == "file" ]; then
      # Install the application into the Python environment at runtime
      "$PYTHON_EXEC_PATH" -m pip install your-application==1.0.0 dependency1 dependency2
      # Start the application on port 8080 behind jhsingle-native-proxy
      $JHSINGLE_COMMAND --destport 8080 "$PYTHON_EXEC_PATH" -m your-application run "$DASHBOARD_PATH" --port 8080 --other-options
    elif [ "$DASHBOARD_TYPE" == "folder" ]; then
      echo "ERROR: Your application requires a file path to be provided" >&2
      exit 1
    elif [ "$DASHBOARD_TYPE" == "none" ]; then
      echo "ERROR: Your application requires a file path to be provided" >&2
      exit 1
    fi
    
  3. Understand the dashboard-wrapper.sh script: The dashboard-wrapper.sh script in the dashboard-base image sets up the environment and calls your start-dashboard.sh script. Key points:

    • Activates Lmod and loads environment modules.

    • Sets up the JHSINGLE_COMMAND with appropriate options.

    • Sources your start-dashboard.sh script.

  4. Configure jhsingle-native-proxy: The JHSINGLE_COMMAND in dashboard-wrapper.sh handles the proxying of your application. Ensure your start-dashboard.sh uses this command correctly.

  5. Port Configuration: Use the $DASHBOARD_PORT environment variable (set in dashboard-wrapper.sh) for your application’s port.

  6. Build and Test: Build your new image and test it to ensure the application starts correctly and is accessible through the proxy.
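For the final build-and-test step, a small retry helper can confirm that the application actually comes up behind the proxy. This is only a sketch: the function name wait_until is hypothetical, and the probe command is passed in by the caller rather than fixed.

```shell
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Run the probe command quietly; stop as soon as it succeeds.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # Every attempt failed.
  return 1
}
```

After starting the container, a call such as `wait_until 30 1 curl -fsS http://localhost:8888` would poll the dashboard once per second for up to 30 attempts and return a non-zero status if it never responds.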

Example: Integrating a Hypothetical Web App

Let’s say you want to integrate a hypothetical web application called “DataViz”. Here’s how you might set it up:

Dockerfile:

ARG ROOT_CONTAINER=polusai/dashboard-base:latest
FROM $ROOT_CONTAINER

COPY start-dashboard.sh /usr/bin

CMD ["/usr/bin/dashboard-wrapper.sh"]

start-dashboard.sh:

#!/bin/bash

if [ "$DASHBOARD_TYPE" == "file" ]; then
  # Install DataViz and its dependencies at runtime
  "$PYTHON_EXEC_PATH" -m pip install dataviz==1.0.0 required-dependency1 required-dependency2
  # Start DataViz on port 8080 behind jhsingle-native-proxy
  $JHSINGLE_COMMAND --destport 8080 "$PYTHON_EXEC_PATH" -m dataviz run "$DASHBOARD_PATH" --port 8080 --server.address=0.0.0.0 --server.headless True
elif [ "$DASHBOARD_TYPE" == "folder" ]; then
  echo "ERROR: DataViz requires a file path to be provided" >&2
  exit 1
elif [ "$DASHBOARD_TYPE" == "none" ]; then
  echo "ERROR: DataViz requires a file path to be provided" >&2
  exit 1
fi

This setup installs the DataViz application, sets its port, and uses jhsingle-native-proxy (via $JHSINGLE_COMMAND) to make it available through the standard Notebooks Hub interface.

Separation of dashboarding tools from the environment

In Notebooks Hub, there are broadly two types of dashboarding tools: those that can be separated from the environment they use and those that cannot.

Tools that can run independently

Examples of tools that can run independently:

  • Voilà. Its --VoilaExecutor.kernel_name flag lets you point it to a separate environment.

  • RStudio. Its --rsession-which-r flag lets you point it to the R executable in a separate environment.

  • RShiny. It implicitly uses the R executable, which can be remapped to a separate environment.

Tools that need runtime installation of the dashboarding library

These tools normally import Python packages from the same environment they are installed in and don’t have any configuration flags to point them to a separate environment:

  • PyShiny

  • Streamlit

  • Solara

  • Dash

We install these packages at runtime, in the start-dashboard.sh script, because the dashboarding library and its dependencies (for example, Shiny) are combined with the user-provided environment, so they cannot be installed during the Docker build.
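One way to keep runtime installation cheap is to skip it when the library is already importable in the user-provided environment. The sketch below illustrates the idea and is not the actual implementation used by the polusai images; ensure_python_package is a hypothetical name, and PYTHON_EXEC_PATH is the variable provided by the dashboard-base image.

```shell
#!/bin/bash
# Hypothetical helper: install a dashboarding library into the user-provided
# environment at container start, unless its module already imports there.
ensure_python_package() {
  module_name=$1    # importable module name, e.g. streamlit
  requirement=$2    # pip requirement string; the version pin is your choice
  if "$PYTHON_EXEC_PATH" -c "import ${module_name}" >/dev/null 2>&1; then
    return 0        # already available, nothing to install
  fi
  "$PYTHON_EXEC_PATH" -m pip install "$requirement"
}
```

Inside start-dashboard.sh this could replace the unconditional pip install line, so that environments which already ship the library start faster.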

Best Practices

✅ Use environment variables like $DASHBOARD_TYPE, $DASHBOARD_PATH, and $PYTHON_EXEC_PATH provided by the dashboard-base image.

✅ Include necessary dependencies and their versions in your start-dashboard.sh script.

✅ Handle different DASHBOARD_TYPE scenarios appropriately.

✅ Use the $JHSINGLE_COMMAND for starting your application to ensure proper integration with the proxy.

✅ Test thoroughly to ensure compatibility with the Notebooks Hub environment.

✅ Consider adding health checks or additional error handling in your start-dashboard.sh script.

✅ Consider the type of dashboarding application when adding a new one to ensure proper configuration and integration.
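As one possible form of the additional error handling mentioned above, start-dashboard.sh can fail early when a variable it depends on is missing. This is a minimal sketch; require_env is a hypothetical helper, not something the dashboard-base image is known to provide.

```shell
#!/bin/bash
# Hypothetical helper: fail early if any named environment variable is unset or empty.
require_env() {
  for var_name in "$@"; do
    # Indirectly read the value of the variable whose name is in var_name.
    eval "value=\${${var_name}:-}"
    if [ -z "$value" ]; then
      echo "ERROR: required environment variable ${var_name} is not set" >&2
      return 1
    fi
  done
  return 0
}
```

Placed near the top of start-dashboard.sh, a line such as `require_env DASHBOARD_TYPE PYTHON_EXEC_PATH JHSINGLE_COMMAND || exit 1` would stop the container with a clear message instead of failing later inside the application.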

By following these guidelines and the provided example, you can easily extend the functionality of your Notebooks Hub environment with new web-based applications, leveraging the flexibility provided by the dashboard-base image.