Home Publications
Research
LogGOPSim CoMPIler
NB Collectives
MPI Topologies
LibTopoMap
MPIParMETIS
MPI Datatypes
Netgauge
Network Topologies
Ethernet BTL eth
ORCS
DFSSSP
Older Projects
Teaching
BLOG
Travel
Miscellaneous
CV
Events
Recent Events
|
LibTopoMap - A Generic Topology Mapping Library
LibTopoMap is a library that enables generic topology mapping for
arbitrary process topologies to arbitrary network topologies. Currently,
it supports MPI's distributed graph topology [1] but it can be adopted
to provide topology mapping functions for any parallel programming runtime.
The current implementation is serial, i.e., mappings are computed on each
rank of the parallel allocation. Libtopomap uses different starting values
for greedy mapping and different strategies on different processes to
use the available parallelism in the run. The processes then agree on the
single best solution by either a minimal average dilation or minimal
maximum congestion metric. The library supports weighted application
graphs and heterogeneous networks (different bandwidths in different links).
The library supports multicore with k-way partitioning and provides four
mapping algorithms: (1) Greedy, (2) Recursive, (3) RCM, and (4) SCOTCH
static mapping. Simulated annealing can be used to improve the found
solution further. A detailed description and evaluation can be found in [2].
LibTopoMap requires ParMETIS and supports SCOTCH.
|
|
Download LibTopoMap:
|
|
Building LibTopoMap
- unpack tgz
- edit Makefile.inc and set CXX and ParMETIS (and optionally SCOTCH) paths correctly
- METIS may not support linking to C++, thus, a patch adding 'extern "C"' statements may be necessary, for example, for metis-4.0: metis_4.0-extern_c-patch.diff - (1.62 kb)
- make
- build sparse matrix test: cd matvec-test; make
|
|
Running the Sparse Matrix Vector test
This example explains how to run the sparse matrix-vector multiplication
mapping on a set of processes with a 3x3x3 torus topology. Topology map
and process-to-node mapping ("fake mapping") are delivered with the
package. We describe how to create and evaluate arbitrary network graphs
and multicore below.
- download and unpack a matrix from UFL collection.
e.g., $ wget http://www.cise.ufl.edu/research/sparse/MM/GHS_indef/aug2dc.tar.gz
- start 12 processes with aug2dc.mtx, the topology file ../3x3x3.map, pretending they run on hosts in ../3x3x3.fake.
e.g., $ mpirun -n 12 ./reader 0 ./aug2dc/aug2dc.mtx ../3x3x3.map ../3x3x3.fake
|
|
Using LibTopoMap
LibTopoMap has multiple ways for determining which node a process runs
on. It supports the native interface on BlueGene/P to determine the
Torus topology of the system and the location of each process. For other
architectures, LibTopoMap reads a file that specifies an adjacency list
of the complete physical network topology. Each node is identified by
its hostname in the file and LibTopoMap calls gethostname() to retrieve
the name of the node a process is running on. Thus, hostnames must be
unique in the network.
The next section describes the format of the topology map file by
example:
Format of the Topology Map File
The first example defines a simple ring (direct) network with 3 nodes
called "host0", "host1", and "host2":
num: 3
0 host0
1 host1
2 host2
0 1 2
1 2 0
2 0 1
The second example defines a central switch-based (indirect) network
with 3 nodes called "host0", "host1", and "host2" and one switch.
num: 4
0 host0
1 host1
2 host2
3 switch
0 3
1 3
2 3
3 0 1 2
The format of the topology files is as follows:
- Line 1 specifies number of vertices (switches or endpoints) in the network
- Line 2 to <nvertexes>+1 lists the hostnames of each vertex (the mapper library will identify where each process is located by gethostbyname())
- Line <nvertexes>+1 to 2*<nvertexes>+1 lists the adjacency list of each vertex
The distribution includes a 3x3x3 torus example (3x3x3.map).
Fake Node Allocations
Sometimes, it is not possible to run the topomapper on the target
system. Thus, the library allows a simulation mode where the user
specifies a mapping of processes to nodes in a so-called "fake file".
The format is simply: "<rank> <hostname>" per line.
The user can also emulate a multi-core allocation (multiple processes
pre node in the physical topology).
Single-core allocation (process 0 to host0 etc.):
0 host0
1 host1
2 host2
Dual-core allocation (process 0+1 to host0 etc.):
0 host0
1 host0
2 host1
3 host1
4 host2
5 host2
The distribution includes a fake file for the 3x3x3 torus topology
(3x3x3.fake).
Environment Variables
The behavior of LibTopoMap can be influenced with several environment
variables. LibTopoMap uses the prefix TPM_.
- TPM_STRATEGY - select the mapping strategy (either none, greedy, recursive, rcm, or scotch)
- TPM_ANNEAL - enable (=yes) or disable simulated annealing to improve solution
- TPM_PARMETIS - enable (=yes) partitioning the multicore allocation before mapping with METIS k-way partitioning (only applicable if all nodes have the same number of processes)
- TPM_FIX_PARMETIS - enable (=yes) balancing the resulting k-way partitioned graph (necessary for RCM or load balancing, is enabled automatically if necessary)
- TPM_FIX_GRAPH - enable (=yes) to make input graph symmetric (necessary for RCM or load balancing, is enabled automatically if necessary)
|
References
| CCPE | [1] T. Hoefler, R. Rabenseifner, H. Ritzdorf, B. R. de Supinski, R. Thakur and J. L. Traeff: | | | The Scalable Process Topology Interface of MPI 2.2 Concurrency and Computation: Practice and Experience. Vol 23, Nr. 4, pages 293-310, John Wiley & Sons, Ltd., ISSN: 1532-0634, Aug. 2010,  |
|