Modern HPC Workflow Example (Spack)


This is the second part of the Workflow Tutorial. In the previous part, we showed how to use Singularity and Guix for our running example, Chameleon, on HPC clusters (Modern HPC Workflow Example (Guix)).

Warning

This tutorial relies on a GitLab access token for the registry. Since the tutorial took place, this token has expired.

In this second part, we will use Spack instead of Guix. We will also produce Spack-generated containers, for easy reproducibility of the workflow across different computers.

In summary, we are going to:

  • Install Spack on Grid'5000.
  • Build Chameleon with CUDA support.
  • Push the packages into a container registry.
  • Pull the packages as a Singularity container.
  • Run the container on the GPU partition of Grid'5000 or on another supercomputer.

About Container Registries

There are two ways to generate containers with Spack:

  • spack containerize, which generates a container recipe (a Dockerfile) from the environment.
  • Build Caches, which push already-built packages to an OCI container registry.

The containerize option has a number of drawbacks, so we will use the Build Caches option instead. This also has the benefit that packages can be built and cached on CI/CD, allowing for quicker deployments.
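
For reference, spack containerize turns the environment's spack.yaml into a container recipe that rebuilds everything from source. A minimal sketch (run from the directory containing the spack.yaml); we will not use it in this tutorial:

$ spack containerize > Dockerfile   # emits a container recipe generated from the spack.yaml in the current directory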

The Spack build cache requires setting up a container registry on a Git forge. Both GitHub and GitLab provide their own container registry solutions. This guide explains how to create one: Setup a Container Registry on GitLab.

For this tutorial, we will use the container registry hosted at Inria’s GitLab.

Build the Container on Grid'5000

We will connect to the Lille site of Grid'5000, exactly as in the Guix guide.

Note

If you are having trouble at any step, you can skip this and download the container directly:

$ wget --user=<your_g5k_login> --ask-password https://api.grid5000.fr/sid/sites/lille/public/fayatsllamas/chameleon-spack.sif

Connect to the Lille frontend and create a working directory:

$ ssh lille.g5k
$ mkdir tuto-spack && cd tuto-spack

Spack is installed at the user level. To install it, clone the Spack repository and load it with source:

$ git clone -c feature.manyFiles=true https://github.com/spack/spack

$ cd spack
$ git checkout b7f556e4b444798e5cab2f6bbfa7f6164862700e
$ cd ..

$ source spack/share/spack/setup-env.sh
$ spack --version
1.0.0.dev0 (b7f556e4b444798e5cab2f6bbfa7f6164862700e)
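
Optionally, to have Spack available in every new shell on the frontend, append the source line to your shell startup file (this assumes Spack was cloned into your home directory):

$ echo 'source $HOME/spack/share/spack/setup-env.sh' >> ~/.bashrc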

We will create a Spack environment that holds our configuration and installed packages. Creating the environment generates a spack.yaml file, which we will edit:

$ spack env create --dir ./myenv   # this may be a bit slow
$ spack env activate ./myenv
$ spack env status
==> In environment /home/fayatsllamas/myenv

Open the ./myenv/spack.yaml with your favorite editor, and you will see something like this:

# ./myenv/spack.yaml
spack:
  specs: []
  view: true
  concretizer:
    unify: true

We will perform three modifications:

  • Add Chameleon to the list of installed packages.
  • Configure Spack to build our packages for generic x86_64. This ensures we don't mix the ISAs of the nodes we will use.
  • Configure two mirrors:
    • inria-pull is a mirror I populated with caches of the packages for the tutorial.
    • inria-<name> is a mirror you will use to push the packages you build, as an example.
Important

Change inria-<name> and the URL .../buildcache-<name> to a unique name. You will push to this cache as an example, so that we don't collide with each other. You can use your G5k login, for example.

# ./myenv/spack.yaml
spack:
  specs:
  - chameleon@master+cuda
  packages:
    all:
      require: target=x86_64

  view: true
  concretizer:
    unify: true

  mirrors:
    inria-pull:
      url: oci://registry.gitlab.inria.fr/numpex-pc5/tutorials/buildcache
      signed: false

    inria-<name>:
      url: oci://registry.gitlab.inria.fr/numpex-pc5/tutorials/buildcache-<name>
      access_pair:
      - guest
      - glpat-x_uFkxezH1iTKi6KmLrb
      signed: false

Edit the spack.yaml file and save it. Once the environment has been modified, we call spack concretize to "lock" our changes into a spack.lock file. We can then use spack spec to preview the status of the environment: it shows which packages still need to be built.
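
Before concretizing, you can optionally confirm that both mirrors were picked up; spack mirror list prints every mirror Spack currently knows about:

$ spack mirror list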

Note

spack concretize locks the characteristics of the environment to the current machine. We are concretizing on the frontend node for convenience, and so that we can test our packages on it.

$ spack concretize
$ spack spec
 -   chameleon@master%gcc@10.2.1+cuda~fxt~ipo+mpi+shared~simgrid build_system=cmake build_type=Release cuda_arch=none generator=make runtime=starpu arch=linux-debian11-x86_64
 -       ^cmake@3.31.4%gcc@10.2.1~doc+ncurses+ownlibs~qtgui build_system=generic build_type=Release arch=linux-debian11-x86_64
 -           ^curl@8.11.1%gcc@10.2.1~gssapi~ldap~libidn2~librtmp~libssh~libssh2+nghttp2 build_system=autotools libs=shared,static tls=openssl arch=linux-debian11-x86_64
 -               ^nghttp2@1.64.0%gcc@10.2.1 build_system=autotools arch=linux-debian11-x86_64
...
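
If you later edit the specs in spack.yaml (for example to change a variant), the environment has to be concretized again; spack concretize accepts a flag to force this:

$ spack concretize --force   # re-resolves the whole environment after spack.yaml changes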

The next step is to build the packages. Usually, this is a CPU-intensive job. Let’s move to a CPU node of G5k for this (chiclet):

$ oarsub -t allowed=special --project lab-2025-numpex-exadi-guix-spack-in-hpccenters -I -p chiclet -l /nodes=1
[compute]$ source spack/share/spack/setup-env.sh
[compute]$ spack env activate --dir ./myenv

To build our software stack, just call spack install. We have configured a pull-only build cache previously, so packages will not be re-compiled:

[compute]$ spack install

You may want to check that everything was built, by running spack spec again:

[compute]$ spack spec 
[+]  chameleon@master%gcc@10.2.1+cuda~fxt~ipo+mpi+shared~simgrid build_system=cmake build_type=Release cuda_arch=none generator=make runtime=starpu arch=linux-debian11-zen
[+]      ^cmake@3.31.4%gcc@10.2.1~doc+ncurses+ownlibs~qtgui build_system=generic build_type=Release arch=linux-debian11-zen
[+]          ^curl@8.11.1%gcc@10.2.1~gssapi~ldap~libidn2~librtmp~libssh~libssh2+nghttp2 build_system=autotools libs=shared,static tls=openssl arch=linux-debian11-zen
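
Alternatively, spack find lists the packages installed in the active environment:

[compute]$ spack find -l chameleon   # --long also prints the installation hashes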

After the packages have been built, let’s push them into the container registry.

Important

To push our packages to be used as containers, we must add the --base-image flag. Since Spack doesn't build everything from scratch, we must provide a base image from which the libc library will be taken. Your --base-image must match the system that built the packages. We built the packages under Grid'5000's Debian 11 installation, so the base image should be Debian 11 too. Not matching this, or omitting --base-image, will render the pushed image unusable.

Because Docker Hub may rate-limit image pulls and we are all sharing the same IP address (10 downloads per hour per IP), I mirrored the Debian 11 image to the Inria registry. Please use this image instead (otherwise, the flag would be --base-image debian:11):

[compute]$ spack buildcache push --base-image registry.gitlab.inria.fr/numpex-pc5/tutorials/debian:11 inria-<name> chameleon
....
==> [55/55] Tagged chameleon@master/7p4fs2v as registry.gitlab.inria.fr/numpex-pc5/tutorials/buildcache:chameleon-master-7p4fs2vb7isbfol3ceumigyxqs7bhuoq.spack

Note down the URL that Spack gives you.

Because Singularity can use a lot of CPU and memory, we build the container image while still on the compute node. The output is a SIF file (Singularity Image Format).

[compute]$ module load singularity
[compute]$ singularity pull chameleon-spack.sif docker://registry.gitlab.inria.fr/numpex-pc5/tutorials/<.....>
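[compute]$ # optional sanity check: show the image metadata (singularity inspect is part of the standard Singularity CLI)
[compute]$ singularity inspect chameleon-spack.sif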
[compute]$ exit

Deploying on NVIDIA GPUs

The commands for running the container on other machines, such as Jean-Zay, Vega, etc., are the same as in the Guix Tutorial.
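
For orientation only, a batch submission on a Slurm-based machine could look roughly like the sketch below; the node count, GPU request, time limit and module name are assumptions to adapt to the target site, not values taken from this tutorial:

#!/bin/bash
# job.slurm -- orientation-only sketch, adapt resources and modules to the target machine
#SBATCH --job-name=chameleon-gemm
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:2
#SBATCH --time=00:10:00

module load singularity   # module name is site-specific

srun singularity exec --nv \
     --bind /tmp:/tmp chameleon-spack.sif \
     chameleon_stesting -o gemm -n 4000 -b 160 --nowarmup -g 2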

We will demonstrate how to run the container on the GPU partition of Grid'5000.

Deploying Chameleon on Grid'5000

Before trying the container, we can try our Spack-installed Chameleon directly:

$ oarsub -t allowed=special --project lab-2025-numpex-exadi-guix-spack-in-hpccenters -I -p chifflot -l /host=2,walltime=0:10:00
[compute]$ source spack/share/spack/setup-env.sh
[compute]$ spack env activate ./myenv
[compute]$ mpirun -machinefile $OAR_NODEFILE -x PATH chameleon_stesting -o gemm -n 4000 -b 160 --nowarmup -g 2
[compute]$ exit

To use the Singularity container:

$ oarsub -t allowed=special --project lab-2025-numpex-exadi-guix-spack-in-hpccenters -I -p chifflot -l /host=2,walltime=0:10:00
[compute]$ mpirun -machinefile $OAR_NODEFILE \
                  --bind-to board \
                  singularity exec \
                    --bind /usr/lib/x86_64-linux-gnu/:/usr/lib/x86_64-linux-gnu/ \
                    --bind /tmp:/tmp chameleon-spack.sif \
                    chameleon_stesting -o gemm -n 4000 -b 160 --nowarmup -g 2
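
On machines where the host NVIDIA driver libraries do not live under /usr/lib/x86_64-linux-gnu/, Singularity's --nv flag is the usual way to expose the driver inside the container. A hedged variant of the same command (not tested in this tutorial):

[compute]$ mpirun -machinefile $OAR_NODEFILE \
                  --bind-to board \
                  singularity exec --nv \
                    --bind /tmp:/tmp chameleon-spack.sif \
                    chameleon_stesting -o gemm -n 4000 -b 160 --nowarmup -g 2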