Pangeo Docker Images

The images defined in this repository capture reproducible computing environments used by Pangeo Cloud. They build on top of the Ubuntu operating system and include conda environments with a curated set of Python packages for geospatial analysis. While initially intended for Pangeo Cloud, they can be used outside of Pangeo infrastructure too!

More details can be found in our documentation.

Images are hosted on DockerHub and on Quay.io

Image             Description
base-image        Foundational Dockerfile for builds
base-notebook     minimally functional image for pangeo hubs
pangeo-notebook   base-notebook + core earth science analysis packages
pytorch-notebook  pangeo-notebook + GPU-enabled pytorch
ml-notebook       pangeo-notebook + GPU-enabled tensorflow2

Click on the image name in the table above for a current list of installed packages and versions

graph TD;
    base-image-->base-notebook;
    base-notebook-->pangeo-notebook;
    pangeo-notebook-->pytorch-notebook;
    pangeo-notebook-->ml-notebook;
    click base-image "https://hub.docker.com/r/pangeo/base-image" "Open this in a new tab" _blank
    click base-notebook "https://hub.docker.com/r/pangeo/base-notebook" "Open this in a new tab" _blank
    click pangeo-notebook "https://hub.docker.com/r/pangeo/pangeo-notebook" "Open this in a new tab" _blank
    click pytorch-notebook "https://hub.docker.com/r/pangeo/pytorch-notebook" "Open this in a new tab" _blank
    click ml-notebook "https://hub.docker.com/r/pangeo/ml-notebook" "Open this in a new tab" _blank
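
The inheritance shown in the graph can also be written down as a small lookup table, which is handy when you need to know every image that a given image builds on. The following is a minimal sketch, not part of this repository: the names `PARENT`, `lineage`, and `dockerhub_url` are hypothetical, and the URL pattern is simply the one used by the DockerHub links in the graph.

```python
from typing import Dict, List, Optional

# Parent of each image, transcribed from the graph above.
# base-image has no parent within this repository.
PARENT: Dict[str, Optional[str]] = {
    "base-image": None,
    "base-notebook": "base-image",
    "pangeo-notebook": "base-notebook",
    "pytorch-notebook": "pangeo-notebook",
    "ml-notebook": "pangeo-notebook",
}


def lineage(image: str) -> List[str]:
    """Return the chain of images from base-image down to `image`."""
    chain = []
    current: Optional[str] = image
    while current is not None:
        chain.append(current)
        current = PARENT[current]  # raises KeyError for unknown image names
    return chain[::-1]


def dockerhub_url(image: str) -> str:
    """Build the DockerHub page URL, following the links in the graph."""
    return f"https://hub.docker.com/r/pangeo/{image}"


if __name__ == "__main__":
    print(" -> ".join(lineage("ml-notebook")))
    print(dockerhub_url("ml-notebook"))
```

For example, `lineage("pytorch-notebook")` walks up through pangeo-notebook and base-notebook to base-image, so a change to any of those images affects pytorch-notebook as well.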

Using the image with Singularity on HPC systems

If you want to use these images on an HPC system (including a GPU system), we recommend using Singularity. Please see the Singularity guide.

Dask-gateway compatibility

The primary use of these Docker images is running on Pangeo Cloud deployments with dask-gateway. Generally, the dask-gateway library version built into the image must match the dask-gateway version deployed in the cloud environment. The following table records the first tagged image in which each new dask-gateway version appears:

dask-gateway  Image tag
0.9           2020.11.06
0.8           2020.07.28
0.7           2020.04.22
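
Because image tags are dates, the table can be used to look up which dask-gateway version a given tag falls under. The sketch below is illustrative only: `FIRST_TAG_BY_GATEWAY_VERSION`, `parse_tag`, and `gateway_version_for_tag` are hypothetical names, and the lookup knows nothing beyond the three rows listed here, so for tags newer than 2020.11.06 it can only report the last version the table records.

```python
from typing import Optional, Tuple

# First image tag in which each dask-gateway version appears
# (copied from the table above; later versions are not listed there).
FIRST_TAG_BY_GATEWAY_VERSION = {
    "0.9": "2020.11.06",
    "0.8": "2020.07.28",
    "0.7": "2020.04.22",
}


def parse_tag(tag: str) -> Tuple[int, ...]:
    """Turn a date-style tag such as '2020.11.06' into (2020, 11, 6)."""
    return tuple(int(part) for part in tag.split("."))


def gateway_version_for_tag(tag: str) -> Optional[str]:
    """Return the dask-gateway version the table implies for `tag`.

    Picks the row with the most recent first-appearance tag that is not
    later than `tag`. Returns None if `tag` predates every row.
    """
    target = parse_tag(tag)
    best = None  # (first_tag_as_tuple, version)
    for version, first_tag in FIRST_TAG_BY_GATEWAY_VERSION.items():
        first = parse_tag(first_tag)
        if first <= target and (best is None or first > best[0]):
            best = (first, version)
    return best[1] if best else None


if __name__ == "__main__":
    print(gateway_version_for_tag("2020.10.08"))
```

To confirm what is actually installed, you can compare this against `dask_gateway.__version__` inside a running image and against the version your deployment reports.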

Other notes

  • Since 2020.10.16, mamba is installed into the base-image and conda-lock environment and is used by default to solve for a compatible environment (see #146)
  • For a simple list of packages for a given image, you can use a link like this: https://github.com/pangeo-data/pangeo-docker-images/blob/2020.10.08/pangeo-notebook/packages.txt
  • To compare changes between two images, you can use a link like this: https://github.com/pangeo-data/pangeo-docker-images/compare/2020.10.03..2020.10.08
  • As of 2024.05.21, the ml-notebook and pytorch-notebook Docker images contain machine learning libraries built with CUDA 12. For earlier versions, we suggested that ml-notebook users install cuda-nvcc manually to obtain JAX and/or TensorFlow with XLA optimization. This workaround should no longer be needed with ml-notebook 2024.06.02 or newer, which comes with cuda-nvcc pre-installed.
  • There used to be a pangeo/forge image, built for use with pangeo-forge. It is no longer actively maintained or used, but you can still use the historical tags if you wish.
  • Users of zarr-python are advised to avoid image tags 2025.01.10 and 2025.01.24 because of a bug in zarr-python>=3.0.0,<=3.0.7 that may result in data loss. See #606 for details.
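
The package-list and comparison links in the notes above follow a regular pattern, so they can be generated for any image and tag. This is a small sketch under that assumption: `packages_url`, `compare_url`, and `is_zarr_affected` are hypothetical helpers, and the code does not check that a tag or file actually exists in the repository.

```python
REPO_URL = "https://github.com/pangeo-data/pangeo-docker-images"

# Tags the notes above advise zarr-python users to avoid (see #606).
ZARR_AFFECTED_TAGS = {"2025.01.10", "2025.01.24"}


def packages_url(image: str, tag: str) -> str:
    """Link to the plain package list for `image` at `tag`."""
    return f"{REPO_URL}/blob/{tag}/{image}/packages.txt"


def compare_url(old_tag: str, new_tag: str) -> str:
    """Link to the GitHub comparison between two tags."""
    return f"{REPO_URL}/compare/{old_tag}..{new_tag}"


def is_zarr_affected(tag: str) -> bool:
    """True if `tag` is one of the tags flagged in the zarr-python note."""
    return tag in ZARR_AFFECTED_TAGS


if __name__ == "__main__":
    print(packages_url("pangeo-notebook", "2020.10.08"))
    print(compare_url("2020.10.03", "2020.10.08"))
    print(is_zarr_affected("2025.01.10"))
```

Called with the example values from the notes, the two URL helpers reproduce the links given there.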
