AVBP with Guix
This document describes how to deploy the AVBP software using Guix, whether or not Guix is available on the target machine.
Prerequisites
The AVBP packages are available in the Guix-HPC-non-free channel.
See https://gitlab.inria.fr/guix-hpc/guix-hpc-non-free and https://guix.gnu.org/manual/en/html_node/Specifying-Additional-Channels.html for instructions regarding how to configure Guix to use software from this channel.
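For reference, a minimal ~/.config/guix/channels.scm declaring this channel on top of the default ones might look like the sketch below (verify the URL and any required branch against the channel's own documentation before using it):

```scheme
;; ~/.config/guix/channels.scm -- sketch; check the channel URL and
;; branch against the Guix-HPC-non-free documentation before use.
(cons (channel
        (name 'guix-hpc-non-free)
        (url "https://gitlab.inria.fr/guix-hpc/guix-hpc-non-free.git"))
      %default-channels)
```

After editing this file, run guix pull so the channel's packages (including avbp) become available.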
Running AVBP with Guix locally
AVBP can be installed using the avbp Guix package.
In order to build this package, the following environment variables need to be set:
AVBP_GIT_REPO: path to your local clone of the Git repository containing the source code of AVBP
AVBP_LIBSUP: path to the local folder containing the AVBP license file
The following commands instantiate a containerized environment in which a simulation is run:
# Go to the folder containing your simulation.
cd /path/to/simulation
# Either export the required environment variables...
export AVBP_GIT_REPO=... AVBP_LIBSUP=...
# ...and run the guix command
guix shell --container avbp coreutils openmpi@4 openssh
# Or set the environment variables on the command line
AVBP_GIT_REPO=... AVBP_LIBSUP=... guix shell --container avbp coreutils openmpi@4 openssh
# Run AVBP from the folder containing the run.params file.
cd RUN && avbp
# Alternatively, start a parallel simulation using Open MPI
cd RUN && mpirun -np 12 avbp
Notes:
- In order to run AVBP from a containerized environment, the coreutils, openmpi@4 and openssh packages have to be explicitly selected (openssh being required by Open MPI).
- In order to run a simulation, the root directory of the simulation must be accessible. This won't be the case if the containerized shell is started from the RUN subdirectory. An alternative command that starts a simulation directly from within the RUN folder is:
AVBP_GIT_REPO=... AVBP_LIBSUP=... guix shell --container avbp coreutils openmpi@4 openssh --share=/path/to/simulation -- mpirun -np 12 avbp
Running AVBP on supercomputers
At the time of writing, Guix is not natively available on the national supercomputers.
In order to use AVBP on national supercomputers, Guix provides the
guix pack command, which builds an archive containing the full
software stack required to run AVBP.
This archive can then be deployed and run on the supercomputer.
So far, the techniques that have been tested are:
- Relocatable binaries on Adastra and Jean-Zay (see the Example procedure on Adastra below, which can be adapted to Jean-Zay)
- Singularity on Jean-Zay (see the Example procedure on Jean-Zay below)
Note: the following procedures use SLURM's srun command to start a
simulation (in both interactive and batch mode). SLURM's srun command
communicates directly with Open MPI using the library selected with
the --mpi switch (see Open MPI documentation). When using Open MPI
4.x (the current default version in Guix), this option has to be set
to --mpi=pmi2 for proper communication with SLURM.
Example procedure on Adastra (relocatable binaries)
On a machine with Guix installed
The following commands:
- create an archive that contains the avbp package,
- copy the archive to the supercomputer.
# On the local machine, create the archive...
AVBP_GIT_REPO=... AVBP_LIBSUP=... guix pack -R -S /bin=bin -C zstd avbp
[...]
/gnu/store/xxxxxxxxxxxxxxx-avbp-tarball-pack.tar.zst
# ...then copy it to Adastra
scp /gnu/store/xxxxxxxxxxxxxxx-avbp-tarball-pack.tar.zst user@adastra.cines.fr:/path/to/$CCFRWORK/avbp-pack.tar.zst
On Adastra
The following commands:
- unpack the archive in the $CCFRWORK directory
- set the required environment variables
- start a simulation
# Uncompress the archive in the $CCFRWORK space.
cd $CCFRWORK && mkdir avbp-pack && zstd -d avbp-pack.tar.zst && tar xf avbp-pack.tar -C avbp-pack
# Make sure no external library is loaded from the host machine
unset LD_LIBRARY_PATH
# This is needed by Slingshot when starting many MPI processes (hybrid mode gets message queue overflow).
export FI_CXI_RX_MATCH_MODE=software
# This is needed to run on a full node (192 cores) due to multiple PML being selected when not set.
# This PML uses libfabric for Slingshot support.
export OMPI_MCA_pml=cm
# Start an interactive job from the folder containing the run.params file
cd /path/to/simulation/run && srun -A user \
--time=0:20:00 \
--constraint=GENOA \
--nodes=10 \
--ntasks-per-node=192 \
--cpus-per-task=1 \
--threads-per-core=1 \
--mpi=pmi2 \
$CCFRWORK/avbp-pack/bin/avbp
An example sbatch script can be found below:
#!/bin/bash
#SBATCH -A user
#SBATCH --constraint=GENOA
#SBATCH --time=03:00:00
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=192
#SBATCH --cpus-per-task=1
#SBATCH --threads-per-core=1
# Make sure no external library is loaded from the host machine.
unset LD_LIBRARY_PATH
cd /path/to/simulation/run
# Enforce the use of PMI2 to communicate with Open MPI 4, default Open MPI version in Guix.
srun --mpi=pmi2 $CCFRWORK/avbp-pack/bin/avbp
Caveats
- Interconnection errors may occur when starting too many MPI processes on Adastra.
Example procedure on Jean-Zay (Singularity)
On a machine with Guix installed
The following commands:
- create an archive that contains the avbp, coreutils and bash packages (the last one being a Singularity requirement),
- copy the archive to the supercomputer.
# On the local machine, create the archive...
AVBP_GIT_REPO=... AVBP_LIBSUP=... guix pack -f squashfs -S /bin=bin --entry-point=/bin/bash avbp coreutils bash
[...]
/gnu/store/xxxxxxxxxxxxxxx-avbp-coreutils-bash-squashfs-pack.gz.squashfs
# ...then copy it to Jean-Zay
scp /gnu/store/xxxxxxxxxxxxxxx-avbp-coreutils-bash-squashfs-pack.gz.squashfs user@jean-zay.idris.fr:/path/to/$WORK/avbp.sif
Note: the coreutils package is required when running AVBP in a
containerized environment.
On Jean-Zay
The Singularity image has to be copied to an authorized folder, as described in the Jean-Zay documentation:
# Make the image available to Singularity
idrcontmgr cp $WORK/avbp.sif
The following commands start a simulation in interactive mode:
# Load the Singularity environment
module load singularity
# Clean the environment variable
unset LD_LIBRARY_PATH
# Run the simulation on one full node
srun -A user@cpu \
--nodes=1 \
--ntasks-per-node=40 \
--cpus-per-task=1 \
--time=01:00:00 \
--hint=nomultithread \
--mpi=pmi2 \
singularity exec \
--bind $WORK:/work \
  $SINGULARITY_ALLOWED_DIR/avbp.sif \
bash -c 'cd /work/path/to/simulation/run && avbp'
Below is a sample sbatch script:
#!/bin/bash
#SBATCH -A user@cpu
#SBATCH --job-name=avbp
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --hint=nomultithread
module purge
module load singularity
unset LD_LIBRARY_PATH
srun --mpi=pmi2 singularity exec --bind $WORK:/work $SINGULARITY_ALLOWED_DIR/avbp.sif /bin/bash -c 'cd /work/path/to/simulation/run && avbp'
Caveats
- Singularity doesn't seem to honour the -W flag, which sets the working directory. This requires using bash -c with multiple commands.
- The $WORK space doesn't seem to be accessible from within the container: the --bind $WORK:/work option makes it accessible through the /work path.
- Open MPI parameters need to be tweaked when running on multiple nodes and multiple cores at the same time on Jean-Zay.
- Open MPI 5.x does not work on Jean-Zay at the time of writing.
Example procedure on Irene with PCOCC
PCOCC can import Docker images generated by Guix.
On a machine with Guix installed
The following commands:
- create an archive that contains the avbp, coreutils and bash packages,
- copy the archive to the supercomputer.
# On the local machine, create the archive...
AVBP_GIT_REPO=... AVBP_LIBSUP=... guix pack -f docker bash coreutils avbp
# ...then copy it to Irene
scp /gnu/store/xxxxxxxxxxxxxxx-bash-coreutils-avbp-docker-pack.tar.gz \
  user@irene-fr.ccc.cea.fr:/path/to/$CCFRWORK/avbp.tar.gz
On Irene
The Docker image has to be imported using PCOCC (see TGCC documentation for more details):
pcocc-rs image import docker-archive:$CCFRWORK/avbp.tar.gz avbp
The following commands start a simulation in interactive mode:
cd /path/to/RUN
ccc_mprun -p rome \
-N 2 \
-n 256 \
-c 1 \
-E '--mpi=pmi2' \
-m work,scratch \
-A project_id \
-T 600 \
  -C avbp -- avbp
General notes related to HPC
- Open MPI 4.x uses PMI2 to communicate with SLURM. This requires launching AVBP using srun --mpi=pmi2.
- The LD_LIBRARY_PATH environment variable often gets in the way, causing the execution to fail, hence the unset.
Running a test suite
The avbp-tests package provides a script running a subset of the
AVBP test suite with a single MPI process.
In order to build this package, an additional environment variable has to be defined:
AVBP_TEST_SUITE: path to the folder containing the AVBP test suite (it can be the local clone of the testcases repository).
The following command builds the package and runs the test suite:
AVBP_GIT_REPO=... AVBP_LIBSUP=... AVBP_TEST_SUITE=... guix build avbp-tests
It is also possible to build the avbp-tests package without actually
running the tests. This is useful if you want to run the tests
manually and have a look at the output files. This can be achieved
using the --without-tests flag:
AVBP_GIT_REPO=... AVBP_LIBSUP=... AVBP_TEST_SUITE=... guix build --without-tests=avbp-tests avbp-tests
If you want to run a subset of the standard test cases, simply copy
them to some directory on your system, set AVBP_TEST_SUITE to point
there and (re)build the avbp-tests package.
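As a sketch, this workflow could look like the following (the paths and case names here are placeholders, not real entries from the testcases repository; adapt them to the cases you want to run):

```shell
# Copy a subset of test cases to a scratch directory
# (CASE_A and CASE_B are hypothetical names).
mkdir -p "$HOME/avbp-test-subset"
cp -r /path/to/testcases/CASE_A /path/to/testcases/CASE_B "$HOME/avbp-test-subset/"

# Point AVBP_TEST_SUITE at the subset and (re)build the package.
export AVBP_TEST_SUITE="$HOME/avbp-test-subset"
AVBP_GIT_REPO=... AVBP_LIBSUP=... guix build avbp-tests
```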
AVBP development environment
In order to instantiate a development environment for AVBP, the
AVBP_LIBSUP variable has to be set.
On a machine using Guix
The following command instantiates a containerized development environment for AVBP:
cd /path/to/avbp/source
AVBP_LIBSUP=... guix shell --container --development avbp --expose=/path/to/avbp/license
Notes:
- you might want to instantiate the containerized environment from the top-level directory of the AVBP sources so you can actually perform the build
- you probably want to expose the path to the AVBP license inside the container; this is done with the --expose flag
- you might want to add other packages to the development environment, for example grep, coreutils or a text editor; simply add them to the command line (see the documentation)
You can also store the list of packages for a development environment in a manifest file, track it under version control (for example, in your branch/fork of the AVBP source code) and use it later:
guix shell --export-manifest package1 package2 ... --development avbp > avbp-development-environment.scm
# [...]
export AVBP_LIBSUP=...
guix shell --container \
--manifest=avbp-development-environment.scm \
--expose=/path/to/avbp/license
Using Singularity
Generate the Singularity image
A development environment can also be generated with guix pack by
providing a manifest file (see above):
AVBP_LIBSUP=... guix pack -f squashfs --entry-point=/bin/bash -m avbp-development-environment.scm
[...]
/gnu/store/...-pack.gz.squashfs
Deploy the image (example on Jean-Zay)
The generated image then has to be copied to the remote machine and launched using Singularity.
Below is an example of how to deploy the image on Jean-Zay:
# On the local machine: copy the image to Jean-Zay.
scp /gnu/store/...-pack.gz.squashfs jean-zay.idris.fr:/path/to/$WORK/avbp-development-environment.sif
# On Jean-Zay: copy the image to the authorized directory...
idrcontmgr cp $WORK/avbp-development-environment.sif
# ... load the Singularity module ...
module load singularity
# ... and launch the container (in this example a full node is allocated).
srun \
-A user@cpu \
--time=02:00:00 \
--exclusive \
--nodes=1 \
--ntasks-per-node=1 \
--cpus-per-task=40 \
--pty \
--hint=nomultithread \
singularity shell \
--bind $WORK:/work \
$SINGULARITY_ALLOWED_DIR/avbp-development-environment.sif
Notes:
- The --pty flag sets pseudo-terminal mode in order to properly handle the interactive shell.
- When --cpus-per-task is not specified, only a single core is associated with the shell task.