Introduction to Guix

This webinar is based on a workshop authored by Ludovic Courtès and presented at COMPAS 2025.

Software deployment in an HPC context

Deploying software in an HPC context is often a non-trivial task.

Let's review different solutions.

The module command

  • ✔ software environment flexibility
  • ❌ varies from machine to machine and over time
  • ❌ often depends on system administration teams

Spack, EasyBuild, Conda, etc.

  • ✔ software environment flexibility
  • ✔ software building automation
  • ❓ allows deploying the exact same software stack from machine to machine

    • not really: it depends strongly on software pre-installed on the host
  • ❌ consumes disk space, inodes and computation time

Docker, Singularity, Shifter, etc.

  • ✔ software environment flexibility: custom content in Dockerfile or equivalent
  • ✔ automates building of an image of the software stack
  • ✔ allows deploying the exact same software stack from machine to machine
  • ❌ no traceability (what are these binaries? where do they come from?)
  • ❌ consumes disk space and network bandwidth

Guix

  • ✔ software environment flexibility
  • ✔ software building automation
  • ✔ reproducibility: same code on all the machines
  • ✔ versatility: package management, container image generation, etc.
  • ✔ useful for application deployment, but also for application development

    • the same development environment can be shared within a team
  • ❌ less common in HPC, but the situation is improving

    • natively available in some Tier-2 centers (GRICAD, GLICID, etc.)
    • in the near future, natively available in a Tier-1 center?
    • can be used even on a cluster where it is not available (see this tutorial)

What is Guix?

  • a package manager (like apt) and its packages (like Debian)
  • an environment manager like module but automated/integrated
  • an environment manager like VirtualEnv but not limited to Python
  • a container image generation tool (like Docker, Singularity, etc.)
  • a Linux distribution
  • a command line tool

Searching and installing software

Warning

During the webinar, the various commands should be run from an OAR allocation on a node.

First, connect to Grid5000 on the Lille frontend:

ssh lille.g5k

From the frontend, use oarsub to allocate a single core on a single node (resources have been reserved so this should be instantaneous):

oarsub -l host=1/core=1,walltime=2 -p chiclet -q default -I \
       --project=lab-2025-numpex-exadi-guix-introduction \
       -t allowed=special

Searching locally on your computer

Search for a text editor (plain text search in package name and description):

guix search text editor

Installing software on your computer

Install hello in your Guix profile:

guix install hello
hello
  • Questions:

    • Is hello working?
    • If not, how to make it work?
      Temporary solution
      GUIX_PROFILE="$HOME/.guix-profile"
      . "$GUIX_PROFILE/etc/profile"
      Permanent solution
      # Quote EOF so that $HOME and $GUIX_PROFILE are written literally
      cat >> ~/.bashrc <<'EOF'
      GUIX_PROFILE="$HOME/.guix-profile"
      . "$GUIX_PROFILE/etc/profile"
      EOF
      source ~/.bashrc
    • Where is the hello binary installed?
      Hint

      You can either:

      • use which
      • check $PATH
      Solution
      which hello
      # $HOME/.guix-profile/bin/hello
      ls -l $HOME/.guix-profile
      # /var/guix/profiles/per-user/$USER/guix-profile
      ls -lH $HOME/.guix-profile
      # lrwxrwxrwx 1 root root   60 Jan  1  1970 bin -> /gnu/store/yjdlb3mfz600nk1dvyqxr1p81d5rds5k-hello-2.12.1/bin
      ls -l /gnu/store/yjdlb3mfz600nk1dvyqxr1p81d5rds5k-hello-2.12.1/bin
      # -r-xr-xr-x 2 root root 47648 Jan  1  1970 hello

Uninstall hello from your Guix profile:

guix remove hello
hello

Searching online on the web

Creating environments with guix shell (documentation)

Creating a development environment (GCC/OpenMPI)

guix shell gcc-toolchain openmpi
gcc --version
mpirun --version
which gcc
  • Question: How is that working?
    Hint

    Either:

    • check $PATH
    • check GUIX_* environment variables
    Solution
    echo $GUIX_ENVIRONMENT
    echo $PATH

    Guix creates a profile entry in the store containing the requested packages and expands PATH with its bin subdirectory.

echo $GUIX_ENVIRONMENT
echo $PATH
ls $GUIX_ENVIRONMENT/bin
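
The PATH expansion described above can also be checked with a small helper. This is a minimal sketch, not part of Guix: store_path_entries is a hypothetical function name, and it only assumes that profile directories live under /gnu/store, as seen earlier in this section.

```shell
# Hypothetical helper (not a Guix command): print the entries of a
# colon-separated PATH-like string that are located under /gnu/store.
store_path_entries() {
  printf '%s\n' "$1" | tr ':' '\n' | grep '^/gnu/store/' || true
}

# Inside a guix shell, this should print the bin directory of the profile.
# Outside of it, the output is expected to be empty on most systems.
store_path_entries "$PATH"
```

Comparing the output inside and outside the shell shows which entries were added by guix shell.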

Creating a containerized environment containing GCC and OpenMPI

guix shell -C gcc-toolchain openmpi
gcc --version
mpirun --version
which gcc

Tip

The -C / --container option requires the user namespaces kernel feature to be enabled (see man 7 user_namespaces).

  • Question: Why is which not working?
    Hint

    which belongs to the which package.

    Solution
    guix shell -C gcc-toolchain openmpi which
    ls $GUIX_ENVIRONMENT/bin
echo $GUIX_ENVIRONMENT
echo $PATH
ls $GUIX_ENVIRONMENT/bin
  • Question: How to get the ls command working?
    Hint

    ls belongs to the coreutils package.

    Solution
    guix shell -C gcc-toolchain openmpi coreutils
    ls $GUIX_ENVIRONMENT/bin

Generating an isolated environment when -C is not available

guix shell --pure gcc-toolchain openmpi
gcc --version
mpirun --version
which gcc
echo $GUIX_ENVIRONMENT
echo $PATH
/bin/which gcc
  • Is --pure working as expected?
guix shell --pure gcc-toolchain openmpi --check

Protip
$ guix shell --pure gcc-toolchain openmpi -- /bin/sh --norc
sh-4.2$ echo $PATH
/gnu/store/aqlfyp9jp3pcgf5hkm9h7gnrh5dgx66q-profile/bin:/gnu/store/aqlfyp9jp3pcgf5hkm9h7gnrh5dgx66q-profile/sbin

Hands-on: creating an environment with Python, NumPy, Pandas, Matplotlib

Now it's your turn. Don't forget to check that everything works as expected.

Proposed solution
# Search for the packages
guix search pandas
guix search numpy
guix search matplotlib
# Once you know the package name, create a shell containing the packages at the desired version
guix shell -C python python-pandas python-numpy@1 python-matplotlib
# Ensure that you can import the Python modules
python3
>>> import matplotlib
>>> import numpy
>>> import pandas

You might have gotten an error mentioning binary incompatibility if you requested python-numpy without a version. This is due to a mismatch between python-numpy and python-pandas: the python-pandas package depends on version 1.x of python-numpy, whereas requesting python-numpy on the command line without a version selects version 2.x (at the time of writing). This can be seen with the following commands:

$ guix show python-pandas

name: python-pandas
version: 2.2.3
outputs:
+ out: everything
systems: x86_64-linux
dependencies: meson-python@0.17.1 python-beautifulsoup4@4.12.3 python-cython-next@3.0.11 python-dateutil@2.8.2 python-html5lib@1.1
+ python-jinja2@3.1.2 python-lxml@4.9.1 python-matplotlib@3.8.2 python-numpy@1.26.2 python-openpyxl@3.1.5 python-pytest-asyncio@0.24.0
+ python-pytest-localserver@0.9.0.post0 python-pytest-mock@3.14.0 python-pytest-xdist@3.6.1 python-pytest@8.3.3 python-pytz@2023.3.post1
+ python-tzdata@2023.4 python-versioneer@0.29 python-xlrd@2.0.1 python-xlsxwriter@3.2.0 which@2.21 xclip@0.13 xsel@1.2.0-1.062e6d3
location: gnu/packages/python-science.scm:2073:2
homepage: https://pandas.pydata.org
license: Modified BSD
synopsis: Data structures for data analysis, time series, and statistics
description: Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with
+ structured (tabular, multidimensional, potentially heterogeneous) and time series data both easy and intuitive.  It aims to be the
+ fundamental high-level building block for doing practical, real world data analysis in Python.

name: python-pandas
version: 1.5.3
outputs:
+ out: everything
systems: x86_64-linux
dependencies: python-beautifulsoup4@4.12.3 python-cython@0.29.35 python-dateutil@2.8.2 python-html5lib@1.1 python-jinja2@3.1.2
+ python-lxml@4.9.1 python-matplotlib@3.8.2 python-numpy@1.26.2 python-openpyxl@3.1.5 python-pytest-mock@3.14.0
+ python-pytest-xdist@3.6.1 python-pytest@8.3.3 python-pytz@2023.3.post1 python-setuptools@67.6.1 python-wheel@0.40.0 python-xlrd@2.0.1
+ python-xlsxwriter@3.2.0 which@2.21 xclip@0.13 xorg-server@21.1.15 xsel@1.2.0-1.062e6d3
location: gnu/packages/python-science.scm:1971:2
homepage: https://pandas.pydata.org
license: Modified BSD
synopsis: Data structures for data analysis, time series, and statistics  
description: Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with
+ structured (tabular, multidimensional, potentially heterogeneous) and time series data both easy and intuitive.  It aims to be the
+ fundamental high-level building block for doing practical, real world data analysis in Python.

$ guix show python-numpy

name: python-numpy
version: 2.2.5
outputs:
+ out: everything
systems: x86_64-linux i686-linux
dependencies: bash@5.1.16 gfortran@11.4.0 meson-python@0.17.1 ninja@1.11.1 openblas@0.3.29 pkg-config@0.29.2 python-hypothesis@6.54.5
+ python-mypy@1.13.0 python-pytest-xdist@3.6.1 python-pytest@8.3.3 python-setuptools@67.6.1 python-typing-extensions@4.12.2
+ python-wheel@0.40.0
location: gnu/packages/python-xyz.scm:10114:2
homepage: https://numpy.org
license: Modified BSD
synopsis: Fundamental package for scientific computing with Python  
description: NumPy is the fundamental package for scientific computing with Python.  It contains among other things: a powerful
+ N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, useful linear
+ algebra, Fourier transform, and random number capabilities.

name: python-numpy
version: 1.26.2
outputs:
+ out: everything
systems: x86_64-linux i686-linux
dependencies: bash@5.1.16 gfortran@11.4.0 meson-python@0.17.1 openblas@0.3.29 pkg-config@0.29.2 python-hypothesis@6.54.5
+ python-mypy@1.13.0 python-pytest-xdist@3.6.1 python-pytest@8.3.3 python-setuptools@67.6.1 python-typing-extensions@4.12.2
+ python-wheel@0.40.0
location: gnu/packages/python-xyz.scm:9964:2
homepage: https://numpy.org
license: Modified BSD
synopsis: Fundamental package for scientific computing with Python  
description: NumPy is the fundamental package for scientific computing with Python.  It contains among other things: a powerful
+ N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, useful linear
+ algebra, Fourier transform, and random number capabilities.

This can be fixed by specifying a version for python-numpy using python-numpy@1.

Tip

The arguments given to guix shell are package specifications: a package name, optionally followed by an at-sign and version number, optionally followed by a colon and the name of one of the outputs of a package, e.g. package-name@X.Y.Z, package-name:some-output or package-name@X.Y.Z:some-output.

If no version number is specified, the newest available version is selected.

If the specified version number matches multiple versions (e.g. 12 matches 12.1 and 12.3), the newest matching version is selected (in the previous example, version 12.3).

If no output is specified, the default out output is selected.

Note that for most packages, a single version with only the default output is available (an example with multiple versions and multiple outputs is gcc-toolchain, see the output of guix show gcc-toolchain).

More information in the documentation.
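
To make the structure of a package specification concrete, here is a minimal sketch that splits one into its parts with shell parameter expansion. parse_spec is a hypothetical helper written for illustration only. It does not query Guix, so it cannot tell whether the package, version or output actually exists.

```shell
# Hypothetical helper: split a specification of the form
# NAME[@VERSION][:OUTPUT] into its three parts.
parse_spec() {
  local spec="$1"
  local version=""     # empty: no version requested
  local output="out"   # default output when none is given
  case "$spec" in
    *:*) output="${spec##*:}"; spec="${spec%:*}" ;;
  esac
  case "$spec" in
    *@*) version="${spec#*@}"; spec="${spec%%@*}" ;;
  esac
  printf 'name=%s version=%s output=%s\n' "$spec" "${version:-newest}" "$output"
}

parse_spec python-numpy@1           # name=python-numpy version=1 output=out
parse_spec gcc-toolchain@12:static  # name=gcc-toolchain version=12 output=static
parse_spec git:send-email           # name=git version=newest output=send-email
```

The output is stripped first because it always comes last in the specification. What remains is then split at the at-sign.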

Compiling an MPI application

Here is a simple MPI program:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  // Initialize the MPI environment
  MPI_Init(NULL, NULL);

  // Get the number of processes
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  // Get the rank of the process
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Get the name of the processor
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  // Print off a hello world message
  printf("Hello world from processor %s, rank %d out of %d processors\n",
         processor_name, world_rank, world_size);

  // Finalize the MPI environment.
  MPI_Finalize();
}
  • Questions:

    • How to compile it?
      Solution
      • Copy the content to a file, e.g. example.c.
      • Create a shell containing the openmpi and gcc-toolchain packages (mpicc requires a C compiler):

          guix shell -C openmpi gcc-toolchain
      • Compile the source file:

          mpicc -o example example.c
    • What are the potential difficulties?
      Hints
      • Check ldd ./example
      • Check the environment variables (using env so add coreutils to your shell)
      Some possible answers
      • If you don't use the --pure / --container option, the generated binary might be linked against libraries coming from the underlying operating system and not from Guix
      • It might also use headers coming from the underlying system
      • If you don't add gcc-toolchain to your shell, you might use gcc from the underlying system
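
The ldd hint above can be turned into a quick check. The following is a minimal sketch, assuming the usual "name => path (address)" layout of ldd output; non_store_libs is a hypothetical helper, not a Guix command. It only looks at lines containing "=>", so the dynamic loader line is not examined.

```shell
# Hypothetical helper: read `ldd` output on standard input and print the
# resolved libraries that do NOT come from /gnu/store.
non_store_libs() {
  awk '$2 == "=>" && $3 ~ /^\// && $3 !~ /^\/gnu\/store\// { print $3 }'
}

# Usage, once ./example has been built:
#   ldd ./example | non_store_libs
```

An empty result suggests that every resolved library comes from Guix. Any path printed points to a library picked up from the host system.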

Running the resulting binary

For this section, a new allocation with 2 hosts and 2 cores per host is needed:

# Exit the guix container
exit
# Exit the OAR allocation
exit
# You should now be on the frontend
# Allocate 2 nodes with 2 cores per node
oarsub -l host=2/core=2,walltime=2 -p chiclet -q default -I \
       --project=lab-2025-numpex-exadi-guix-introduction \
       -t allowed=special

Warning

There is currently a restriction on Grid5000 that prevents running an MPI application in a partial allocation from a containerized Guix shell: oarsh is needed and is not available in Guix. Environment isolation is therefore done with --pure.

See the related Grid5000 documentation.

# You should be on a node of the new allocation, outside of any Guix shell
# Run the binary as a single process
./example
# Enter a new shell isolated with --pure, keeping OAR related
# environment variables with -E
guix shell --pure openmpi -E "^OAR" -- /bin/bash --norc
# Use oarsh for internode communication
OMPI_MCA_plm_rsh_agent=/usr/bin/oarsh mpirun -machinefile $OAR_NODEFILE ./example

Tip

This application can be compiled and run from an allocation using full nodes in a containerized environment with the following commands:

# Enter a container exposing OAR related files and variables.
# OpenSSH is needed for internode communication.
# -N allows network access.
guix shell -CN openmpi -E "^OAR" --expose=/var/lib/oar openssh gcc-toolchain coreutils
# Compile the program
mpicc -o example example.c
# Run it as parallel MPI processes
mpirun -machinefile $OAR_NODEFILE ./example

guix shell can also run a command directly when it is given after the -- separator on the command line.

The code previously compiled can be launched on the fly in a shell using the following command (inside the same OAR allocation):

# Exit the guix shell
# You should be on the node, within the OAR allocation
guix shell --pure openmpi -E "^OAR" -- /usr/bin/env OMPI_MCA_plm_rsh_agent=/usr/bin/oarsh \
     mpirun -machinefile $OAR_NODEFILE ./example

Transforming packages on the fly (package transformation options)

Package definitions can be modified from the command-line (to some extent) using package transformations.

The following example shows dependency substitution using MPICH instead of OpenMPI for the Intel MPI Benchmarks.

guix shell -C --with-input=openmpi=mpich \
     intel-mpi-benchmarks gcc-toolchain
ldd $GUIX_ENVIRONMENT/bin/IMB-P2P

More information on package transformations:

guix build --help-transform

Declarative environments, manifests

Using guix shell with many packages and/or transformations can become tedious.

The package list and transformations can be stored in a manifest file, which can be shared with collaborators, put under version control, etc.

Such a file can be automatically generated with the --export-manifest option:

guix shell python python-numpy python-scipy \
     --export-manifest > manifest.scm

The manifest file can then be used with the -m / --manifest option (this works for guix shell but also for other commands such as guix pack):

guix shell -m manifest.scm
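
For reference, a manifest is a short Scheme expression. The sketch below writes an equivalent file by hand for the three packages used above. The comments and layout actually produced by --export-manifest may differ, so treat this as an illustration rather than the exact generated content.

```shell
# Write a minimal manifest listing the same three packages.
# specifications->manifest turns package specifications (the same
# strings accepted by `guix shell`) into a manifest.
cat > manifest.scm <<'EOF'
(specifications->manifest
  (list "python"
        "python-numpy"
        "python-scipy"))
EOF
```

Because the entries are package specifications, a version or output can be given here exactly as on the command line, e.g. "python-numpy@1".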

Updating, channels and pinning

Updating Guix and packages

  • Both the guix command and the package definitions are updated using guix pull:

    guix pull

    It's the rough equivalent of sudo apt update in Debian or Ubuntu.

  • Packages that have been installed using guix install can be updated using guix package:

    guix package -u

    It's the rough equivalent of sudo apt upgrade in Debian or Ubuntu.

Tip

Since guix shell environments are dynamically generated, they always use the current package definitions and don't need to be "updated".

Adding channels for additional packages

Channels are basically Git repositories that extend Guix with package definitions and more.

Each user can use a personal channel configuration, stored by default in ~/.config/guix/channels.scm.

For example, in order to add the Guix Science and Guix HPC channels:

mkdir -p ~/.config/guix
cat > ~/.config/guix/channels.scm <<EOF
(append (list
          (channel
           (name 'guix-science)
           (url "https://codeberg.org/guix-science/guix-science.git")
           (introduction
            (make-channel-introduction
             "b1fe5aaff3ab48e798a4cce02f0212bc91f423dc"
             (openpgp-fingerprint
              "CA4F 8CF4 37D7 478F DA05  5FD4 4213 7701 1A37 8446"))))
          (channel
           (name 'guix-hpc)
           (url "https://gitlab.inria.fr/guix-hpc/guix-hpc.git")))
        %default-channels)
EOF

guix pull

guix pull must be run to update the channel definitions.

See https://hpc.guix.info/channels for a list of existing channels.

What about version numbers?!

😱 The manifest file generated above doesn't contain package version numbers. How can we make sure to select a specific version?

  • Question: is the version number representative of a software environment? Food for thought:

    guix graph python | dot -Tpng > python-graph.png

    Python dependency graph.

Getting and setting Guix versions (git commit of the channels)

The guix describe command lists the channels that are currently available to the guix command, together with their revision:

guix describe

This output can be saved in a file using a format that can be later reused by guix:

guix describe -f channels > channels.scm

This makes it possible to reuse the exact same Guix version, together with the exact same package definitions, as in the current environment.

This file can also be shared among a team (together with a manifest file) in order to make sure every person uses the exact same software environment, using e.g. guix pull -C channels.scm (to configure the guix user environment) or using guix time-machine (to instantiate a dynamic Guix environment).
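
What makes such a file reproducible is that each channel entry carries a commit field. A hand-written channels file that only gives URLs, or that relies on %default-channels, follows the latest revision instead. As a rough sanity check, here is a minimal sketch (is_pinned is a hypothetical helper, not a Guix feature) based on a simple line count. It assumes one "(channel" and one "(commit" per line, which is how these files are usually laid out.

```shell
# Hypothetical helper: succeed when every (channel ...) entry of the
# given file has a (commit ...) field, i.e. the file pins revisions.
is_pinned() {
  local channels commits
  channels=$(grep -c '(channel' "$1" || true)
  commits=$(grep -c '(commit' "$1" || true)
  # %default-channels does not specify a commit, so it is not pinned.
  if grep -q '%default-channels' "$1"; then
    return 1
  fi
  [ "$channels" -gt 0 ] && [ "$channels" -eq "$commits" ]
}

# Usage:
#   is_pinned channels.scm && echo "pinned" || echo "not pinned"
```

This is only a heuristic on the text of the file. The authoritative check remains comparing the output of guix describe on both machines.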

Time-traveling ⏲

The guix time-machine command allows running any Guix command using a different Guix revision and/or different channels without modifying the default environment.

For example, the following command uses two files, channels.scm and manifest.scm, to deploy the exact same software stack in a reproducible way:

guix time-machine -C channels.scm -- shell -m manifest.scm

guix time-machine can also be used to activate different channels without modifying the user channels. The following example shows how to instantiate a shell containing the package quantum-espresso, available in Guix Science, which is not activated in the default environment:

$ guix describe
  guix d756fb9
    repository URL: https://git.guix.gnu.org/guix.git
    branch: master
    commit: d756fb91ce099774f688bc5fcca380572c3e7d84
$ guix shell quantum-espresso
guix shell: error: quantum-espresso: unknown package
$ cat > ./channels.scm <<EOF
(append (list
          (channel
           (name 'guix-science)
           (url "https://codeberg.org/guix-science/guix-science.git")
           (introduction
            (make-channel-introduction
             "b1fe5aaff3ab48e798a4cce02f0212bc91f423dc"
             (openpgp-fingerprint
              "CA4F 8CF4 37D7 478F DA05  5FD4 4213 7701 1A37 8446"))))
          (channel
           (name 'guix-hpc)
           (url "https://gitlab.inria.fr/guix-hpc/guix-hpc.git")))
        %default-channels)
EOF
$ guix time-machine -C channels.scm -- describe
Updating channel 'guix-science' from Git repository at 'https://codeberg.org/guix-science/guix-science.git'...
Updating channel 'guix-hpc' from Git repository at 'https://gitlab.inria.fr/guix-hpc/guix-hpc.git'...
Updating channel 'guix' from Git repository at 'https://git.guix.gnu.org/guix.git'...
Computing Guix derivation for 'x86_64-linux'... /
[...]
building package cache...
building profile with 3 packages...
  guix-science 7bbff91
    repository URL: https://codeberg.org/guix-science/guix-science.git
    branch: master
    commit: 7bbff91a771830003d4349c449d7b3358c7302d1
  guix-hpc ae851c4
    repository URL: https://gitlab.inria.fr/guix-hpc/guix-hpc.git
    branch: master
    commit: ae851c41af80a6b406cf281dd9f99ad107f54e0b
  guix 97c45bb
    repository URL: https://git.guix.gnu.org/guix.git
    branch: master
    commit: 97c45bbfc50bcc29a9b2f8081961ac165b4fc7cc
$ guix time-machine -C channels.scm -- shell quantum-espresso
[...]
building profile with 1 package...
[env] $

Using CUDA

The Guix project promotes the use of Free software.

Due to CUDA being proprietary, CUDA package definitions and CUDA enabled packages are located in separate channels, such as Guix Science nonfree and Guix HPC non-free.

For this section, we will use a specific channels.scm file with the guix time-machine command.

  wget https://guix.bordeaux.inria.fr/eval/9004271/channels.scm

Tip
  • /dev/nvidia* and libcuda.so from the host machine need to be accessible from within the containerized environment. This is achieved with the --expose option.
  • LD_PRELOAD should be set to the path leading to libcuda.so to replace the libcuda.so stub library provided by cuda-toolkit. This is machine specific.
  • On K40 and older hardware, CUDA 11 is required.
  • On more recent hardware, the default CUDA 12 version is used.
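
Since the location of libcuda.so is machine specific, it can help to look for it before setting LD_PRELOAD. The following is a minimal sketch: find_libcuda is a hypothetical helper, and its default directories are simply the two host paths that appear in the commands of this section. Other machines may use different locations.

```shell
# Hypothetical helper: print the first libcuda.so found in the given
# directories (defaults: the two host locations used in this section).
find_libcuda() {
  [ "$#" -gt 0 ] || set -- /usr/lib/x86_64-linux-gnu /usr/lib64
  local dir
  for dir in "$@"; do
    if [ -e "$dir/libcuda.so" ]; then
      printf '%s\n' "$dir/libcuda.so"
      return 0
    fi
  done
  return 1
}

# Usage on the host, before entering the Guix environment:
#   find_libcuda || echo "libcuda.so not found in the default locations"
```

The directory that is found is also the one to pass to --expose when a containerized shell is used.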

Listing the CUDA devices with StarPU

  • With OAR:

    # Allocate a full node with 2 NVIDIA GPUs
    oarsub -l host=1,walltime=2 -p chifflot -q default -I \
           --project=lab-2025-numpex-exadi-guix-introduction \
           -t allowed=special
    # Launch starpu
    guix time-machine -C channels.scm -- shell -C coreutils starpu-cuda \
         --expose=/dev/ --expose=/usr/lib/x86_64-linux-gnu -- \
         env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcuda.so \
         starpu_machine_display
  • With SLURM:

    guix time-machine -C channels.scm -- shell --pure starpu-cuda slurm@23 -- \
         srun -C p100 -N1 \
         /usr/bin/env LD_PRELOAD=/usr/lib64/libcuda.so \
         starpu_machine_display
Tip

starpu_machine_display fails when it is not run on a full node.

Launching a computation from a dynamic environment with Chameleon

  • With OAR:

    # Allocate 1 full node with 2 NVIDIA GPUs
    oarsub -l host=1/gpu=2,walltime=2 -p chifflot -q default -I \
           --project=lab-2025-numpex-exadi-guix-introduction \
           -t allowed=special
    # Launch chameleon
    guix time-machine -C channels.scm -- shell -C coreutils chameleon-cuda \
         --expose=/dev/ --expose=/usr/lib/x86_64-linux-gnu -- \
         env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcuda.so \
         chameleon_stesting -o gemm -H -g 2 -n 4000 -b 1000
  • With SLURM:

    guix time-machine -C channels.scm -- shell --pure chameleon-cuda slurm@23 -- \
         srun -C p100 \
         /usr/bin/env LD_PRELOAD=/usr/lib64/libcuda.so \
         chameleon_stesting -o gemm -H -g 2 -n 4000 -b 1000

Bonus: Using Guix on machines without Guix

The guix pack documentation describes the command as follows:

The guix pack command creates […] a tarball or some other archive containing the binaries of the software you’re interested in, and all its dependencies.

The resulting archive can be used on any machine that does not have Guix, and people can run the exact same binaries as those you have with Guix.

The pack itself is created in a bit-reproducible fashion, so anyone can verify that it really contains the build results that you pretend to be shipping.

How to use guix pack

guix pack [options] <package1> [<package2> ...]

or

guix pack [options] -m <manifest_file>

Archive formats

The --format / -f option selects the archive format:

  • tarball: Default format. Produces a tarball containing the whole software stack in /gnu/store, with symbolic links if specified.
  • docker: Produces a tar archive which respects the Docker image specification and can be used with Docker and other related tools.
  • squashfs: Produces a SquashFS image that is compatible with Singularity.

-R / --relocatable option

Produce relocatable binaries, […] that can be placed anywhere in the file system hierarchy and run from there.

When this option is passed once, the resulting binaries require support for user namespaces in the kernel Linux; when passed twice, relocatable binaries fall back to other techniques if user namespaces are unavailable, and essentially work anywhere […].