The Cluster
The aforementioned instructions also apply to apl05.eti.pg.gda.pl. Access to the cluster is restricted to selected users! Please let me know if you need access!
The cluster is based on the OpenHPC system. It consists of a frontend node (apl05.eti.pg.gda.pl) and compute nodes (apl05-c[01-31]). Each node has 16GB of RAM. Multiple MPI implementations are available:
- mpich3 - default, uses InfiniBand in ucx mode,
- openmpi 4 - during code execution the ofi module needs to be disabled (--mca mtl ^ofi),
- mvapich2.
To switch the MPI implementation, use the module command.
Other software includes gcc9, python 3.6.
Users should log in through the appropriate frontend to the cluster (apl13 or apl15). During the first login a public/private key pair should be generated. Users should follow the on-screen instructions if such keys were not previously generated. If the procedure does not start automatically, users can generate the keys manually using the procedure for computers in lab 527 (below). The passphrase should be empty! If the user has a dedicated domain account, the keys, once generated, can be used on every cluster/machine in the laboratory.
Multiple MPI implementations are available, with OpenMPI in /usr/lib64/openmpi/bin/mpirun set as default. To compile the code, users should use the proper compiler wrapper from the following locations:
- /usr/lib64/openmpi/bin/mpirun - uses InfiniBand as default transport medium
- /usr/lib64/mpich/bin/mpirun
- /opt/openmpi/bin/mpirun - uses InfiniBand as default transport medium
- /opt/openmpi_postfix/bin/mpirun - implementation supporting MPI_THREAD_MULTIPLE, only apl13
- /opt/mpich2_local/bin/mpirun - only apl13
- /opt/mpich2/gnu/bin/mpirun - only apl13
- /opt/SUNWhpc/HPC8.2.1/ - only apl13
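Mixing a compiler wrapper from one installation with mpirun from another is a common source of startup errors. The sketch below shows one way to keep them together: it puts a single installation directory first on PATH. The function name use_mpi is made up for this example, and it assumes that the wrappers (mpicc and friends) sit next to mpirun in the same bin directory, which the list above does not state explicitly.

```shell
# Hypothetical helper: put one MPI installation first on PATH so that
# mpicc and mpirun are resolved from the same directory.
use_mpi() {
    mpi_bin="$1"
    if [ ! -d "$mpi_bin" ]; then
        echo "use_mpi: no such directory: $mpi_bin" >&2
        return 1
    fi
    PATH="$mpi_bin:$PATH"
    export PATH
}
```

For example, after use_mpi /usr/lib64/openmpi/bin, running command -v mpicc mpirun should print two paths inside that directory if the assumption holds.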
All frontends have screen installed. This software allows disconnecting from a running session without killing the processes. To start it, just execute screen. To detach from the session press Ctrl+a and then d. The commands run inside screen will keep running in the detached state. To reattach run screen -r. If there is more than one detached session, screen -r pid reconnects to the one identified by pid. To close screen just type exit or press Ctrl+d.
Process cleanup
Students should kill all unused/hanging processes. Many such programs can affect the stability of the cluster. Before running a new instance of a program, students should check whether the previous one exited correctly and that no resources are locked. If an application hangs, use the kill command. To terminate all processes belonging to a given user (on a given node) the following command can be used:
kill -9 `ps -u user -o "pid="`
Replace user with the username whose processes we want to kill. Remember to use the quotation marks exactly as in the example provided! As a result of this command all our processes will be killed and the user will be logged out of the server. Because the command only affects the node it is run on, it should be run on all nodes that we used to run our parallel program!
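Running the cleanup on every node by hand is easy to forget, so a loop over the hostfile may help. The following is only a sketch: the function name cleanup_nodes and the DRY_RUN variable are invented here, it assumes password-less SSH to each node, and it uses pkill -u in place of the kill/ps pair shown above because it is easier to quote through ssh.

```shell
# Hypothetical helper: run the cleanup for every host in a hostfile.
# Only the first column is used; blank lines and '#' comments are skipped.
# With DRY_RUN=1 the ssh commands are printed instead of executed.
cleanup_nodes() {
    user="$1"
    hostfile="$2"
    while read -r host _rest || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh -n $host pkill -9 -u $user"
        else
            # -n keeps ssh from reading the rest of the hostfile from stdin
            ssh -n "$host" pkill -9 -u "$user"
        fi
    done < "$hostfile"
}
```

A cautious first step is DRY_RUN=1 cleanup_nodes "$USER" hostfile, which only lists what would be run. Note that the node you are currently logged in on is treated like any other, so your own session there ends as described above.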
MPI on workstations in room 527
Available versions
There are 2 MPI implementations installed:
- mpich-4.0-3
- openmpi-4.1.2 - default
/opt/openmpi_postfix/bin/mpirun - implementation with support for MPI_THREAD_MULTIPLE
The default implementation is openmpi; mpich can be run by adding the .mpich suffix to the standard commands or by changing environment variables.
Internode communication
For MPI to work correctly it is recommended to generate SSH keys for password-less authentication. Keys can be generated using the following command:
ssh-keygen -t rsa
In the .ssh directory in the user's home directory 2 files will be created: id_rsa and id_rsa.pub. The id_rsa.pub file (public key) should be added to the ~/.ssh/authorized_keys file (e.g. using the cat id_rsa.pub >> ~/.ssh/authorized_keys command) on the computer where we want to log in remotely. We need to copy it there first.
If we have a multi-node cluster, we should repeat the operation for all nodes! If we are using a KASK account with shared home directories, it is sufficient to perform the operation on one node. All the nodes will then share the authentication keys.
- delete ~/.ssh/known_hosts
- log into each desXY.kask computer and accept the SSH key
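Appending the public key with cat more than once leaves duplicate entries in authorized_keys. As a small illustration of the key installation step described in this section, the sketch below appends the key only if it is not already present and tightens the file permissions. The function name install_pubkey is hypothetical, and both paths are passed as arguments so it can be tried on scratch files first.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# unless an identical line is already there.
install_pubkey() {
    pubkey="$1"
    auth="$2"
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -x: whole-line match, -F: fixed strings, -f: read the pattern from the key file
    if ! grep -qxFf "$pubkey" "$auth"; then
        cat "$pubkey" >> "$auth"
    fi
    # sshd with StrictModes (the default) ignores key files writable by others
    chmod 600 "$auth"
}
```

On the target computer this would be called as install_pubkey id_rsa.pub ~/.ssh/authorized_keys after the public key has been copied there.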
Running MPI applications on Ubuntu machines
Due to the laboratory configuration and host naming scheme you have to run OpenMPI with the proper switches. MPI usually tries to find the best way to communicate, but this often fails when there are multiple network interfaces.
The best solution so far is the command:
mpirun -np <no_of_nodes> --mca btl tcp,self --mca orte_keep_fqdn_hostnames t --mca btl_tcp_if_include 172.20.83.0/24 --machinefile hostfile <MPI_code>
The first switch (--mca btl tcp,self) tells MPI to use TCP and local sockets, the second (--mca orte_keep_fqdn_hostnames t) forces the use of the full computer name from the hostfile (with the .kask suffix), and the third (--mca btl_tcp_if_include <network>) tells MPI to use only the interface associated with the given network.
Alternatively, one can exclude the docker0, docker_gwbridge and lo interfaces from MPI:
mpirun -np <no_of_nodes> --mca orte_keep_fqdn_hostnames t --mca btl_tcp_if_exclude docker0,docker_gwbridge,lo --hostfile hostfile.des <MPI_code>
or, instead of excluding interfaces, you can run the following command to include only one interface:
mpirun -np <no_of_nodes> --mca orte_keep_fqdn_hostnames t --mca btl_tcp_if_include enp0s31f6 --machinefile ./hostfile.des <MPI_code>
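The interface name passed to btl_tcp_if_include may differ between machines, so it can be worth looking it up rather than typing it from memory. The sketch below filters the one-line-per-address output of ip -o -4 addr show and prints the interfaces whose IPv4 address starts with a given prefix. The function name iface_for_prefix is made up for this example, and matching on a textual prefix is a simplification that is only adequate for networks like the /24 used here.

```shell
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and print
# the names of interfaces whose IPv4 address starts with the given prefix.
iface_for_prefix() {
    awk -v prefix="$1" '$3 == "inet" && index($4, prefix) == 1 { print $2 }'
}
```

It would be used as ip -o -4 addr show | iface_for_prefix 172.20.83. and the printed name can then be given to --mca btl_tcp_if_include.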
With the aforementioned switches the code should run with hostfiles using names or IP addresses as shown below. The slots option denotes the number of processes run on the given node. It might not work with the mpich implementation; if so, remove it from the machines file.
172.20.83.201 slots=X
172.20.83.202 slots=X
...
or:
des01.kask slots=X
des02.kask slots=X
...
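Writing such a hostfile by hand gets tedious for more than a few machines. As a sketch of the name-based format above, the following function prints entries for des01.kask up to a chosen number with the same slot count on each. The function name make_hostfile is hypothetical, and it assumes the machines are numbered consecutively from 01.

```shell
# Hypothetical helper: print a hostfile for des01.kask .. desNN.kask,
# giving every host the same number of slots.
make_hostfile() {
    count="$1"
    slots="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'des%02d.kask slots=%s\n' "$i" "$slots"
        i=$((i + 1))
    done
}
```

For instance, make_hostfile 4 2 > hostfile.des writes four entries with two slots each.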
„No protocol specified” message
The message comes from the hwloc library that is used by openmpi. To suppress the warning please add the following line to the ~/.bashrc file on each computer:
export HWLOC_COMPONENTS=-gl
For KASK accounts it is enough to add it on any one computer, as they all share the home directories. For student accounts you have to add it on each node. Please check first whether it has already been added by other students!
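Since several people may edit the same ~/.bashrc, the check for an existing entry can be automated. The sketch below appends a line only when an identical line is not already in the file. The function name add_line_once is invented for this example.

```shell
# Hypothetical helper: append a line to a file unless the exact same
# line is already present.
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: whole-line match, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

It would be called as add_line_once 'export HWLOC_COMPONENTS=-gl' ~/.bashrc, and calling it again leaves the file unchanged.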
MPI on apl11 and apl12
The aforementioned servers require some initial setup before MPI can be used. Due to the fact that mpi-selector is a bad hack from the OFED project, it has been dropped entirely in favor of environment-modules. The environment-modules package creates a shell command used to load and unload the necessary environment variables for the mpi packages. To see what modules are available, use this command:
module avail
Loading a module is done via
module load <module-name>
Unload is similar
module unload <module-name>
In order to emulate the previous behavior, it is sufficient for a user to place a call to module load in their personal .bashrc (or similar shell init script if they use a different command shell) to cause the proper module to be loaded at login each time.
MPI on APL12 with IntelPhi
Running MPI with the ability to use both the IntelPhi and the host requires some environment variable setup and proper code compilation/execution. Assuming that the code is in the current directory in a file called mpi_example.c, the following commands should be executed. The application should then run on apl12, mic0 and mic1 with 2, 3 and 5 processes respectively.
source /opt/intel/composer_xe_2013_sp1/bin/compilervars.sh intel64
source /opt/intel/impi/4.1.3.048/bin64/mpivars.sh
mpiicc -mmic mpi_example.c -o mpi_example.mic
mpiicc mpi_example.c -o mpi_example.host
scp mpi_example.mic mic0:
scp mpi_example.mic mic1:
cp mpi_example.host ~/
export I_MPI_MIC=enable
export I_MPI_FABRICS=shm:tcp
cd ~
mpirun -n 2 -host apl12 ./mpi_example.host : -n 3 -host mic0 ./mpi_example.mic : -n 5 -host mic1 ./mpi_example.mic