Notes on MPI
MCA parameters
Open MPI uses the MCA (Modular Component Architecture) as a framework for configuring the run-time parameters of an MPI application.
MCA parameters can be adjusted with the `--mca <name> <value>` flag of `mpirun`.
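As a sketch of how this looks in practice (the program name and rank count are placeholders; Open MPI also reads any parameter from an `OMPI_MCA_<name>` environment variable):

```shell
# Set the "btl" MCA parameter on the command line for a 4-rank run.
mpirun --mca btl tcp,self,sm -np 4 ./my_app

# Equivalently, set the same parameter through the environment.
OMPI_MCA_btl=tcp,self,sm mpirun -np 4 ./my_app
```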
There are two ways to inspect which MCA parameters are in use:

- `ompi_info --all` will display all the MCA parameters that are available a priori.
- The `mpi_show_mca_params` MCA parameter can be set to `all`, `default`, `file`, `api` or `enviro` to display the selected values. Sometimes they will just show as `key = (default)`, which is not useful.
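The two inspection paths above might look like this (the program name is a placeholder, and the `grep` filter is just one way to cut down the large output):

```shell
# List every available MCA parameter; filter for the component of interest.
ompi_info --all | grep btl_tcp

# Have rank 0 print the MCA parameters in effect when the job starts.
mpirun --mca mpi_show_mca_params all -np 2 ./my_app
```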
Network drivers
There are three modes for MPI to select networks, ob1, cm and ucx, that can be
set with `--mca pml <ob1|cm|ucx>` (PML: Point-to-point Messaging Layer).
- `ucx` manages the devices on its own. It should be used for InfiniBand networks. UCX can be further configured with UCX-specific environment variables, for example `mpirun --mca pml ucx -x UCX_LOG_LEVEL=debug ...`.
- `ob1` is the multi-device, multi-rail engine and is the "default" choice. It is configured with `--mca pml ob1`. It uses different backends for the Byte Transfer Layer (BTL), which can be configured with `--mca btl <name>`, such as:
  - `tcp`: TCP
  - `self`: loopback (a process sending to itself)
  - `sm`: shared memory
  - `ofi`: Libfabric, alternate way
  - `uct`: UCX, alternate way
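A sketch of selecting ob1 and its BTLs (program name and interface name are placeholders; `btl_tcp_if_include` is a real Open MPI parameter for pinning the TCP BTL to specific interfaces):

```shell
# Force the ob1 PML with the TCP, loopback and shared-memory BTLs.
mpirun --mca pml ob1 --mca btl tcp,self,sm -np 4 ./my_app

# Restrict the TCP BTL to a single network interface (eth0 is a placeholder).
mpirun --mca pml ob1 --mca btl tcp,self \
       --mca btl_tcp_if_include eth0 -np 4 ./my_app
```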
- `cm` can interface with "matching" network cards that are MPI-enabled. It uses MTLs (not BTLs), which can be set with `--mca mtl <name>`:
  - `psm2`: single-threaded Omni-Path
  - `ofi`: Libfabric
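The corresponding cm invocations might look like this (program name is a placeholder; which MTL works depends on the fabric actually present):

```shell
# Select the cm PML with the psm2 MTL, e.g. on an Omni-Path cluster.
mpirun --mca pml cm --mca mtl psm2 -np 4 ./my_app

# Or go through Libfabric instead.
mpirun --mca pml cm --mca mtl ofi -np 4 ./my_app
```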
In short: ucx provides the performance path for InfiniBand, cm can be used for
specific setups, and ob1 serves as the fallback for low-performance TCP or
local-device communication. Libfabric can be used through either cm or ob1.
TODO: Discuss MCA transports for CUDA