Using Grid'5000


The purpose of this tutorial is to let you experiment with the Grid'5000 platform, a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

As an example, we will run an implementation of Conway’s Game of Life that uses the Message Passing Interface (MPI) for parallelization.

Set up a Grid'5000 account

To request an account on Grid’5000, fill in that form and select the appropriate Group Granting Access, Team and Project. Members of the NumPEx-PC5 Team should use the values documented here.

Then make sure to generate an SSH key pair on your PC and to upload the public key to Grid'5000; this will allow direct connection using ssh from your PC. Detailed explanations are given here.
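For reference, here is a minimal sketch of that setup, following the configuration recommended by the Grid'5000 documentation. The login jdoe is a placeholder; replace it with your own Grid'5000 username:

jdoe@MyPC:~$ ssh-keygen -t ed25519
jdoe@MyPC:~$ cat ~/.ssh/config
# Alias for the Grid'5000 access machine
Host g5k
  User jdoe
  Hostname access.grid5000.fr
  ForwardAgent no
# Aliases like grenoble.g5k, jumping through the access machine
Host *.g5k
  User jdoe
  ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
  ForwardAgent no

The public key itself (~/.ssh/id_ed25519.pub) is uploaded through the account management web interface, as described in the documentation linked above.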

Read the documentation

Very extensive documentation is available on the Grid'5000 User Portal. For this tutorial you may start with these two articles:

If you are not familiar with MPI, you might also have a look here.
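If MPI is completely new to you, here is a minimal "hello world" showing the basic structure of any MPI program. It is not part of the Game of Life code; the file name hello_mpi.c is just an example:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);                /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                        /* shut the runtime down */
    return 0;
}

Compile it with mpicc hello_mpi.c -o hello_mpi and run it with, for instance, mpirun -np 4 ./hello_mpi.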

Prepare the work

Connect to one Grid'5000 site

If you applied the correct SSH configuration on your PC (see here), you should be able to connect directly to a given Grid'5000 front-end, say Grenoble, with a simple ssh command:

jcharousset@DEDIPPCY117:~$ ssh grenoble.g5k
Linux fgrenoble 5.10.0-30-amd64 #1 SMP Debian 5.10.218-1 (2024-06-01) x86_64
----- Grid'5000 - Grenoble - fgrenoble.grenoble.grid5000.fr -----

** This site has 5 clusters (more details at https://www.grid5000.fr/w/Grenoble:Hardware)
 * Available in queue default with exotic job type:
 - drac   (2016): 12 nodes (2 CPUs POWER8NVL 1.0, 10 cores/CPU, 4 GPUs Tesla P100-SXM2-16GB, 128GB RAM, 2x931GB HDD, 1 x 10Gb Ethernet, 2 x 100Gb InfiniBand)
 - yeti   (2017): 4 nodes (4 CPUs Intel Xeon Gold 6130, 16 cores/CPU, 768GB RAM, 447GB SSD, 2x1490GB SSD, 3x1863GB HDD, 1 x 10Gb Ethernet, 1 x 100Gb Omni-Path)
 - troll  (2019): 4 nodes (2 CPUs Intel Xeon Gold 5218, 16 cores/CPU, 384GB RAM, 1536GB PMEM, 447GB SSD, 1490GB SSD, 1 x 25Gb Ethernet, 1 x 100Gb Omni-Path)
 - servan (2021): 2 nodes (2 CPUs AMD EPYC 7352, 24 cores/CPU, 128GB RAM, 2x1490GB SSD, 1 x 25Gb Ethernet, 2 x 100Gb FPGA/Ethernet)
 * Available in queue default:
 - dahu   (2017): 32 nodes (2 CPUs Intel Xeon Gold 6130, 16 cores/CPU, 192GB RAM, 223GB SSD, 447GB SSD, 3726GB HDD, 1 x 10Gb Ethernet, 1 x 100Gb Omni-Path)

** Useful links:
 - users home: https://www.grid5000.fr/w/Users_Home
 - usage policy: https://www.grid5000.fr/w/Grid5000:UsagePolicy
 - account management (password change): https://api.grid5000.fr/ui/account
 - support: https://www.grid5000.fr/w/Support

** Other sites: lille luxembourg lyon nancy nantes rennes sophia strasbourg toulouse

Last login: Fri Jun 14 11:40:27 2024 from 192.168.66.33
jcharous@fgrenoble:~$

Build the example application

Let’s first retrieve the original source code by cloning the Github repository:

jcharous@fgrenoble:~$ git clone https://github.com/giorgospan/Game-Of-Life.git
Cloning into 'Game-Of-Life'...
remote: Enumerating objects: 127, done.
remote: Total 127 (delta 0), reused 0 (delta 0), pack-reused 127
Receiving objects: 100% (127/127), 171.69 KiB | 1.95 MiB/s, done.
Resolving deltas: 100% (48/48), done.
Tip

There is a distinct home directory on each Grid'5000 site, so what has been stored in Grenoble will not be available if you connect to Lyon or Nancy.
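If you later need the same files on another site, you can usually copy them directly between front-ends. A sketch, assuming inter-site ssh works by site name (as it does on most front-ends):

jcharous@fgrenoble:~$ rsync -avz ~/Game-Of-Life/ lyon:Game-Of-Life/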

To generate more verbose output, you might want to uncomment lines 257 to 264 of the file Game-Of-Life/mpi/game.c, so that a subset of the matrix is printed at each generation.

		/*Uncomment following lines if you want to see the generations of process with
		rank "my_rank" evolving*/
		
		
		// if(my_rank==0)
		// {
			// printf("Generation:%d\n",gen+1);
			// for(i=0;i<local_M;++i)
				// putchar('~');
			// putchar('\n');
			// print_local_matrix();
		// }
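After uncommenting, the block should look like this:

		/* Print the generations of the process with rank "my_rank" evolving */
		if(my_rank==0)
		{
			printf("Generation:%d\n",gen+1);
			for(i=0;i<local_M;++i)
				putchar('~');
			putchar('\n');
			print_local_matrix();
		}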

The next step is to build the application:

jcharous@fgrenoble:~$ cd Game-Of-Life/
jcharous@fgrenoble:~/Game-Of-Life$ make mpi
rm -f ./mpi/functions.o ./mpi/game.o ./mpi/main.o ./mpi/gameoflife
mpicc -g -O3 -c mpi/functions.c -o mpi/functions.o
mpicc -g -O3 -c mpi/game.c -o mpi/game.o
mpicc -g -O3 -c mpi/main.c -o mpi/main.o
mpicc -o mpi/gameoflife mpi/functions.o mpi/game.o mpi/main.o
jcharous@fgrenoble:~/Game-Of-Life$

The resulting executable is available at ~/Game-Of-Life/mpi/gameoflife.

Run the computation

Request nodes for a computation

Now we will ask the Grid'5000 platform to give us access to one node (comprising multiple CPU cores; 32 in our case) for an interactive session. We also use the walltime option to set an upper limit of 1 hour on our session; after that time the session will be automatically killed.

jcharous@fgrenoble:~/Game-Of-Life$ oarsub -I -l nodes=1,walltime=1
# Filtering out exotic resources (servan, drac, yeti, troll).
OAR_JOB_ID=2344074
# Interactive mode: waiting...
# [2024-06-14 13:41:46] Start prediction: 2024-06-14 13:41:46 (FIFO scheduling OK)

Let’s wait until the scheduler decides to serve our request… be patient. Eventually our request will be picked up from the queue and the scheduler will grant us access to one computation node (dahu-28 in our example):

# [2024-06-14 13:45:13] Start prediction: 2024-06-14 13:45:13 (FIFO scheduling OK)
# Starting...
jcharous@dahu-28:~/Game-Of-Life$
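Note that while the job is waiting (or running), you can check its state from another terminal on the front-end with OAR's oarstat command; a quick sketch:

jcharous@fgrenoble:~$ oarstat -u

A job you no longer need can be cancelled with oardel followed by its job id.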

Launch the computation

And finally we can execute the command to launch the computation:

jcharous@dahu-28:~/Game-Of-Life$ mpirun --mca pml ^ucx -machinefile $OAR_NODEFILE ./mpi/gameoflife -n 3200 -m 3200 -max 100

Where:

  • mpirun is the command that launches an MPI application across multiple CPUs and cores,
  • --mca pml ^ucx tells Open MPI not to try to use the high-performance interconnect hardware, which avoids a huge amount of warnings being shown,
  • $OAR_NODEFILE points to a file listing the CPU cores to be used for the computation; this file was generated by the oarsub command in the previous section (you can inspect it, as shown below),
  • -n 3200 -m 3200 -max 100 are the parameters of our application, asking for a 3200×3200 grid and 100 generations.
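Before launching, you can peek at the machine file to see how many MPI ranks will be started (one line per core) and on which node(s):

jcharous@dahu-28:~/Game-Of-Life$ wc -l $OAR_NODEFILE
jcharous@dahu-28:~/Game-Of-Life$ sort -u $OAR_NODEFILE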

You should see printouts of the matrix at each generation (provided you uncommented the verbose output earlier), followed by the total time spent.

Congratulations, you did it 👋

What’s next?

This very simple exercise should give you the basic idea. There are still a lot of additional topics you can explore:

  • Fine-tune the Open MPI configuration for performance optimization,
  • Use OAR batch jobs instead of interactive sessions (see the sketch after this list),
  • Use OAR options to precisely specify the resources you want, requesting a specific hardware property (e.g. 2 nodes with an SSD and 256GB or more RAM) and/or a specific topology (e.g. 16 cores distributed over 4 different CPUs of the same node),
  • Automate complete workflows including transport of data, executables and/or source code to and from Grid'5000, before and after a computation,
Caution

Grid'5000 does NOT have any BACKUP service for users’ home directories; it is your responsibility to save what needs to be saved somewhere outside Grid'5000.

  • Run computations on multiple nodes,
Tip

For this you will need to properly configure the high-performance interconnect hardware available on the specific nodes assigned to your computation, either InfiniBand or Omni-Path. See the specific subsection in Run MPI On Grid'5000.

  • Customize the software environment, add new packages, deploy specific images,
  • Make use of GPU acceleration,
  • Learn tools for debugging, benchmarking and monitoring.
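As a starting point for the batch-job topic mentioned above, here is a minimal sketch. The script name run_gol.sh and its contents are hypothetical; adapt them to your needs. OAR executes the script on the first allocated node:

jcharous@fgrenoble:~/Game-Of-Life$ cat run_gol.sh
#!/bin/bash
# Hypothetical batch script: re-runs the interactive command non-interactively
cd ~/Game-Of-Life
mpirun --mca pml ^ucx -machinefile $OAR_NODEFILE ./mpi/gameoflife -n 3200 -m 3200 -max 100
jcharous@fgrenoble:~/Game-Of-Life$ chmod +x run_gol.sh
jcharous@fgrenoble:~/Game-Of-Life$ oarsub -l nodes=1,walltime=1 ./run_gol.sh

The job's standard output and error are written to OAR.<JOB_ID>.stdout and OAR.<JOB_ID>.stderr in the submission directory.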