Environments
############
Create New Module
*****************
This walkthrough showcases a user creating a new "Python-Data-Science" module. After creating the module, follow the `Install Conda Dependencies`_ walkthrough below to finish setting it up.
Demo
====
.. raw:: html
Step-By-Step Walkthrough
========================
.. raw:: html
Update Existing Module
**********************
This walkthrough showcases a user updating an existing "Python-Data-Science" module.
Demo
====
.. raw:: html
Step-By-Step Walkthrough
========================
.. raw:: html
Share Existing Module
*********************
This walkthrough showcases a user sharing an existing module for use by other users.
Demo
====
.. raw:: html
Step-By-Step Walkthrough
========================
.. raw:: html
.. _Install Conda Dependencies:
Install Conda Dependencies
**************************
This walkthrough showcases how to install Conda dependencies on an existing module.
Demo
====
.. raw:: html
Step-By-Step Walkthrough
========================
.. raw:: html
General Documentation
=====================
Introduction
------------
One of the main features of Notebooks Hub is the ability to manage
software environments. Environments isolate software
dependencies and make results reproducible. At their core,
environments are sets of binaries, libraries, and other dependencies
together with an environment definition that dictates how to load and use
them (e.g. by updating ``$PATH``). We rely on Lmod and Lua modulefiles to define the environments.
Custom environments that are either shared with or installed by the user
appear in the ``/opt/modules`` folder. The associated binaries are
found within the ``/opt/modules/binaries`` folder, and the modulefiles are
found within the ``/opt/modules/modulefiles/<username>/``
folder, where ``<username>`` is either you or the user who shared the module
with you.
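To see which custom modules are visible on a server, the modulefiles folder can be walked with a small shell function. This is only a minimal sketch and not part of Notebooks Hub: the function name ``list_custom_modules`` is hypothetical, and it assumes modulefiles are stored as ``<username>/<name>/<version>.lua`` below the modulefiles folder.

```shell
# Hypothetical helper: print "<owner> <name>/<version>" for every Lua
# modulefile found under a modulefiles root.
# Assumes the layout <root>/<owner>/<name>/<version>.lua.
list_custom_modules() {
    root="${1:-/opt/modules/modulefiles}"
    root="${root%/}"                      # drop a trailing slash, if any
    [ -d "$root" ] || return 0            # nothing to list
    find "$root" -mindepth 3 -maxdepth 3 -type f -name '*.lua' | sort |
    while IFS= read -r file; do
        rel="${file#"$root"/}"            # owner/name/version.lua
        owner="${rel%%/*}"
        module="${rel#*/}"                # name/version.lua
        printf '%s %s\n' "$owner" "${module%.lua}"
    done
}
```

Each printed ``name/version`` pair has the same form as the wizard examples in this document, such as ``test-env/0.1.0``.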
All application types in Notebooks Hub support loading environments.
There are multiple ways to load environments:
* When launching a Server in the Notebooks Hub UI, you can select an environment
  from the list of available environments in the wizard.
* In JupyterLab, you can select an environment from the list of available
  environments in the extension sidebar.
* In any application that provides a command line, you can load an environment
  using the command:

  .. code-block:: bash

     module load <environment-name>
Creating a new user environment
-------------------------------
Since Lmod is a very flexible and open-ended system, you can create
environments with almost any software or language you need.
The general steps are the same for all environments:

1. Use Notebooks Hub to create a new environment, which creates
   a binary installation folder at ``/opt/modules/binaries/{id}/``
   and the associated Lua modulefile under ``/opt/modules/modulefiles/<username>/``.
   See the Lmod documentation for instructions on how to write a modulefile.
2. Install binaries and libraries in the environment location ``/opt/modules/binaries/{id}/``.
Conda environment
~~~~~~~~~~~~~~~~~
Conda provides a way to specify and install software dependencies in a
reproducible way. It is language-agnostic and provides excellent
support for Python and R environments.
Python
^^^^^^
1. Navigate to the `Environments` tab on Notebooks Hub and click `Create New`
   to open the environment wizard.
2. Fill out the environment metadata in the first step of the wizard, including name,
   version, and description (e.g. ``test-env/0.1.0``). Then click `Next`.
3. Create a modulefile in the second step of the wizard by selecting the
   appropriate functions and providing an input for each. Every command is
   the combination of a function and an input; example commands are shown below:
.. code:: lua
help([[
Test GPU kernel
]])
whatis("Version: 0.1.0")
whatis("Keywords: GPU, PyTorch")
prepend_path("JUPYTER_PATH", module_path .. "/share/jupyter")
setenv("JUPYTER_KERNEL_NAME", "My Test Environment")
setenv("PYTHON_EXEC_PATH", module_path .. "/bin/python")
Note:
- The ``module_path`` variable is a reference to the binary folder which is autogenerated by Notebooks Hub.
- In this example, we are using ``JUPYTER_PATH`` and ``JUPYTER_KERNEL_NAME`` to provide the Jupyter kernel for the environment. We also use ``PYTHON_EXEC_PATH`` to provide a Python interpreter for the environment.
4. Click `Create Module`, and verify that the metadata looks correct in the right-hand
   sidebar. At this point, Notebooks Hub has created the associated binary folder and
   modulefile for this environment.
5. Go to the `Servers` tab, and create and launch a new JupyterLab server using the new environment.
   This mounts the binary folder on the server so that you can install
   conda dependencies into it from an `environment.yaml` file, as shown
   in steps 6 and 7.
6. Create the `environment.yaml` file. The file can be saved in any location. An example is
   shown below:
.. code:: yaml
name: gpu-env
channels:
- pytorch
dependencies:
- python=3.9
- pip=22.2.2
- ipykernel
- pytorch=1.11.0=py3.9_cuda11.3_cudnn8.2.0_0
- torchvision=0.12.0=py39_cu113
- torchaudio=0.11.0=py39_cu113
- cudatoolkit=11.3.1
7. Build the conda environment in the binary folder. The correct ``{id}`` can be found in the
   `module_path` variable in the modulefile.
.. code:: sh
conda env create --prefix /opt/modules/binaries/{id} --file environment.yaml
8. (Optional) Modify the Jupyter kernel name:

   - Rename the folder ``/opt/modules/binaries/{id}/share/jupyter/kernels/python3`` to ``test-kernel``.
   - Modify the file ``/opt/modules/binaries/{id}/share/jupyter/kernels/test-kernel/kernel.json`` to change ``display_name`` so the new kernel won’t clash with the existing Python 3 kernel.
9. The new module now appears in Notebooks Hub. You can load it at any time, either when
   launching new servers or at runtime.
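The optional kernel rename in step 8 can also be scripted. The following is a minimal sketch rather than an official tool: the function name ``rename_kernel`` is hypothetical, and it assumes that ``kernel.json`` stores the name as ``"display_name": "..."`` and that the new display name contains no ``/``, ``&``, ``\`` or ``"`` characters.

```shell
# Hypothetical helper: rename a kernelspec folder inside an environment
# prefix and set the "display_name" shown in JupyterLab.
# Usage: rename_kernel <prefix> <old-folder> <new-folder> <display-name>
rename_kernel() {
    kernels="$1/share/jupyter/kernels"
    old="$kernels/$2"
    new="$kernels/$3"
    display_name="$4"

    if [ ! -d "$old" ]; then
        echo "kernel folder not found: $old" >&2
        return 1
    fi
    if [ -e "$new" ]; then
        echo "target already exists: $new" >&2
        return 1
    fi

    mv "$old" "$new" || return 1

    # Replace only the display_name value; the rest of kernel.json is kept.
    sed "s/\"display_name\": *\"[^\"]*\"/\"display_name\": \"$display_name\"/" \
        "$new/kernel.json" > "$new/kernel.json.tmp" &&
        mv "$new/kernel.json.tmp" "$new/kernel.json"
}
```

For the example in this section, the call would be ``rename_kernel /opt/modules/binaries/{id} python3 test-kernel "My Test Environment"``, with ``{id}`` replaced by the value from ``module_path``.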
Using Poetry with Conda
^^^^^^^^^^^^^^^^^^^^^^^
Poetry is a tool for dependency management and packaging in Python. While
Conda is a general-purpose package and environment manager with
cross-language support, Poetry is specifically designed for Python
projects, providing dependency management, packaging, and project
metadata features. Poetry can easily be used in conjunction with Conda
environments.
1. Create a new minimal Conda environment with Poetry pre-installed
.. code:: sh
conda create --prefix /opt/modules/binaries/{id} python=3.11 poetry
conda activate /opt/modules/binaries/{id}
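2. Clone an existing project

   .. code:: sh

      git clone <repository-url>
      cd <project-directory>

3. Initialize Poetry (only needed if the project does not already contain a ``pyproject.toml``)

   .. code:: sh

      poetry init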
4. Install the project
.. code:: sh
poetry install
References:

- Poetry documentation
- "what is difference between conda and poetry? when to use conda over poetry?"
- "Conda and Poetry: A Harmonious Fusion"
R
^
1. Navigate to the `Environments` tab on Notebooks Hub and click `Create New`
   to open the environment wizard.
2. Fill out the environment metadata in the first step of the wizard, including name,
   version, and description (e.g. ``r-env/0.1.0``). Then click `Next`.
3. Create a modulefile in the second step of the wizard by selecting the
   appropriate functions and providing an input for each. Every command is
   the combination of a function and an input; example commands are shown below:
.. code:: lua

   help([[
   Conda environment with R packages
   ]])
   whatis("Version: 0.1.0")
   whatis("Keywords: Scientific/Engineering, Software Development, R")
   setenv("R", module_path .. "/bin/R")
   setenv("RSTUDIO_WHICH_R", module_path .. "/bin/R")
   setenv("R_LIBS", module_path .. "/lib")
   setenv("R_LIBS_USER", module_path .. "/lib/R/library")
Notes:
- The ``module_path`` variable is a reference to the binary folder which is autogenerated by Notebooks Hub.
- The last 3 lines are required to get the environment working in our implementation of RStudio IDE and R Shiny dashboard.
4. Click `Create Module`, and verify that the metadata looks correct in the right-hand
   sidebar. At this point, Notebooks Hub has created the associated binary folder and
   modulefile for this environment.
5. Go to the `Servers` tab, and create and launch a new JupyterLab server using the new environment.
   This mounts the binary folder on the server so that you can install
   conda dependencies into it from an `environment.yaml` file, as shown
   in steps 6 and 7.
6. Create the `environment.yaml` file. The file can be saved in any location. An example is
   shown below:
.. code:: yaml
name: r-env
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- r-base=4.3.3
- r-essentials=4.3
- r-devtools=2.4.5
7. Build the conda environment in the binary folder. The correct ``{id}`` can be found in the
   `module_path` variable in the modulefile.
.. code:: sh
conda env create --prefix /opt/modules/binaries/{id} --file environment.yaml
8. (Optional) Modify the Jupyter kernel name:

   - Rename the folder ``/opt/modules/binaries/{id}/share/jupyter/kernels/ir`` (the default folder name used by IRkernel) to ``test-kernel``.
   - Modify the file ``/opt/modules/binaries/{id}/share/jupyter/kernels/test-kernel/kernel.json`` to change ``display_name`` so the new kernel won’t clash with an existing R kernel.
9. The new module now appears in Notebooks Hub. You can load it at any time, either when
   launching new servers or at runtime.
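Before relying on the module, it can help to confirm that the conda build produced the paths that the example R modulefile points to (``bin/R`` and ``lib/R/library``). The function below is a minimal sketch under that assumption; the name ``check_r_env`` is hypothetical and not part of Notebooks Hub or conda.

```shell
# Hypothetical helper: confirm that an environment prefix contains the paths
# referenced by the example R modulefile (bin/R and lib/R/library).
# Returns 0 when both are present, 1 otherwise.
check_r_env() {
    prefix="$1"
    rc=0
    if [ ! -x "$prefix/bin/R" ]; then
        echo "missing or not executable: $prefix/bin/R" >&2
        rc=1
    fi
    if [ ! -d "$prefix/lib/R/library" ]; then
        echo "missing directory: $prefix/lib/R/library" >&2
        rc=1
    fi
    return "$rc"
}
```

A typical call would be ``check_r_env /opt/modules/binaries/{id} && echo "environment looks complete"``, with ``{id}`` replaced by the value from ``module_path``.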
Debian package
~~~~~~~~~~~~~~
1. Navigate to the `Environments` tab on Notebooks Hub and click `Create New`
   to open the environment wizard.
2. Fill out the environment metadata in the first step of the wizard, including name,
   version, and description (e.g. ``libmariadb-dev/10.6.16``). Then click `Next`.
3. Create a modulefile in the second step of the wizard by selecting the
   appropriate functions and providing an input for each. Every command is
   the combination of a function and an input; example commands are shown below:
.. code:: lua
help([[
Debian package libmariadb-dev
]])
whatis("Version: 10.6.16")
whatis("Keywords: Database, Development, C")
append_path("INCLUDE_DIR", module_path .. "/dpkg/libmariadb-dev/usr/include")
append_path("LIB_DIR", module_path .. "/dpkg/libmariadb-dev/usr/lib")
append_path("LD_LIBRARY_PATH", module_path .. "/dpkg/libmariadb-dev/usr/lib")
append_path("PATH", module_path .. "/dpkg/libmariadb-dev/usr/bin")
Note:

- The ``module_path`` variable is a reference to the binary folder which is autogenerated by Notebooks Hub.
- Normally, in Ubuntu and Debian, you can install packages using ``apt-get`` or ``apt``. However, since user Servers are containerized and don’t have root privileges, you can’t use these tools to install packages. Instead, you can download the package, extract it into the module’s binary directory using ``dpkg -x``, and point to it using the ``LD_LIBRARY_PATH`` and ``PATH`` environment variables.

References:

- Install a package in custom location
- Example of download page
4. Click `Create Module`, and verify that the metadata looks correct in the right-hand
   sidebar. At this point, Notebooks Hub has created the associated binary folder and
   modulefile for this environment. The environment modulefile is saved at
   ``/opt/modules/modulefiles/<username>/libmariadb-dev/10.6.16.lua``.
5. Install binaries using ``dpkg`` in the proper location following steps 6 and 7.
6. Download the package from the official repository. For example, to
download ``libmariadb-dev`` package, you can use the following
command:
.. code:: sh
wget http://security.ubuntu.com/ubuntu/pool/universe/m/mariadb-10.6/libmariadb-dev_10.6.16-0ubuntu0.22.04.1_amd64.deb
7. Extract the package into the module's binary folder:
.. code:: sh
dpkg -x libmariadb-dev_10.6.16-0ubuntu0.22.04.1_amd64.deb /opt/modules/binaries/{id}/dpkg/libmariadb-dev
The binary ``{id}`` can be found from ``module_path``, which is defined as a variable
in the modulefile noted previously at ``/opt/modules/modulefiles/<username>/libmariadb-dev/10.6.16.lua``.
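Looking up ``module_path`` by hand can be replaced with a small shell function. This is a sketch under a loud assumption: it expects the modulefile to contain a line of the form ``local module_path = "/opt/modules/binaries/<id>"`` (with or without ``local``). The exact line written by Notebooks Hub may differ, and the name ``print_module_path`` is hypothetical.

```shell
# Hypothetical helper: print the value assigned to module_path in a modulefile.
# Assumes a line such as:  local module_path = "/opt/modules/binaries/<id>"
# (the exact form generated by Notebooks Hub is an assumption).
print_module_path() {
    # Match an optional "local", then module_path = "<value>", and print <value>.
    sed -n 's/^[[:space:]]*\(local[[:space:]][[:space:]]*\)\{0,1\}module_path[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\2/p' "$1" |
        head -n 1
}
```

If the assumption holds, the extraction step could then be written as ``dpkg -x libmariadb-dev_10.6.16-0ubuntu0.22.04.1_amd64.deb "$(print_module_path "$modulefile")/dpkg/libmariadb-dev"``, where ``$modulefile`` is a variable you set to the path of the modulefile.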