Name: PnMPI
Owner: Lawrence Livermore National Laboratory
Description: Virtualization Layer for the MPI Profiling Interface
Created: 2010-10-17 04:54:06.0
Updated: 2018-02-19 18:41:01.0
Pushed: 2018-02-08 23:18:17.0
Homepage: https://computing.llnl.gov/code/pnmpi/
Size: 1365
Language: C
GitHub Committers
User | Most Recent Commit | # Commits |
---|
Other Committers
User | Most Recent Commit | # Commits |
---|
PnMPI is a dynamic MPI tool infrastructure that builds on top of the standardized PMPI interface. It allows the user to
The package contains two main components:
So far, this software has mainly been tested on Linux clusters with RHEL-based OS distributions as well as IBM's BG/P systems. Continuous integration tests run at Ubuntu 12.04 and 14.04, OSX El Capitan and macOS Sierra. Some preliminary experiments have also included SGI Altix systems. Ports to other platforms should be straightforward, but this is not extensively tested. Please file an issue if you run into problems porting PnMPI or if you successfully deployed PnMPI on a new platform.
Many thanks to our contributors.
PnMPI uses CMake for its build system.
argp-standalone
package.In addition, PnMPI uses git submodules for several CMake modules, wrap and adept-utils. While the deploy source tarball includes all required submodules, git users need to checkout them with the following command in the root of the cloned repository:
git submodule update --init --recursive
In the simplest case, you can run this in the top-level directory of the PnMPI tree:
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/destination
make
make install
This will configure, build, and install PnMPI to the destination specified.
PnMPI supports parallel make with the -j
parameter. E.g., for using eight
build tasks, use:
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/destination
make -j8
make install
On more complex machines, such as those with filesystems shared among multiple platforms, you will want to separate out your build directories for each platform. CMake makes this easy.
Create a new build directory named according to the platform you are using, cd
into it, an run cmake
there. For example:
cd <pnmpi>
mkdir x86_64
cd x86_64
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/destination ..
Here, <pnmpi>
is the top-level directory in the PnMPI tree. Note that when you
run CMake this way, you need to supply the path to the PnMPI source directory
as the last parameter. Here, that's just ..
as we are building in a
subdirectory of the source directory. Once you run CMake, simply run make and
make install as before:
make -j8
make install
The PnMPI build should auto-detect your MPI installation and determine library and header locations. If you want to build with a particular MPI that is NOT the one auto-detected by the build, you can supply your particular MPI compiler as a parameter:
cmake \
-DCMAKE_INSTALL_PREFIX=/path/to/install/destination \
-DMPI_C_COMPILER=/path/to/my/mpicc \
..
See the documentation in FindMPI.cmake for more details on MPI build configuration options.
If you have problems, you may want to build PnMPI in debug mode. You can do this
by supplying an additional parameter to cmake
, e.g.:
cmake \
-DCMAKE_INSTALL_PREFIX=/path/to/install/destination \
-DCMAKE_BUILD_TYPE=Debug \
..
The extra/build
directory contains a few sample invocations of CMake that have
been successfully used on LLNL systems.
By default PnMPI is configured to work with C/C++ and Fortran codes. However, on systems where Fortran is not available, the system should auto-detect this and not build the Fortran libraries and test cases. It can also be manually turned off by adding
-DENABLE_FORTRAN=OFF
to the cmake
configuration command.
The PnMPI distribution contains test cases for C and Fortran that allow you to test the correct linkage.
If you want to change the default build configuration, you can enable / disable
features by adding the following flags to the cmake
configuration command.
-DBUILD_DOC=ON
: Generate the public code documentation and man-pages.
PnMPI developers may enable BUILD_DOC_INTERNAL
instead for additional
PnMPI internal documentation. (requires Doxygen and help2man)-DENABLE_DEBUG=OFF
: Disable PnMPI on-demand debug logging. Disable this
option only for performance optimization.-DENABLE_MODULES=OFF
: Don't build the built-in modules.-DENABLE_TESTING=ON
: Build the test cases and enable the test
target.-DENABLE_THREAD_SAFETY=OFF
: Build PnMPI without thread safety. PnMPI will
limit the MPI threading level if this option is enabled, so it still will be
thread safe. Disable this option only for performance optimization.-DENABLE_PNMPIZE=OFF
: Don't build the PnMPI invocation tool and don't run
any tests for it. This option may be required when testing PnMPI at a
platform that doesn't support the execvp()
syscall.-DENABLE_ADEPT_UTILS=ON
: Enable support for adept-utils, to check the
module's symbols for their origin. (Not supported with all compilers.)-DENABLE_COVERAGE=ON
option to enable code coverage. You can
generate coverage reports with the gcov
and lcov
targets. Read their
README
for more information.-DSANITIZE_ADDRESS=ON
. Read their
README
for more information.When configuring PnMPI in cross-compiled environments (such as Blue Gene/Q
systems), it is necessary to provide a matching tool chain file. Many toolchain
files are included in CMake, additional example files that allow the compilation
on certain LC machines can be found in cmakemodules/Platform
and
cmakemodules/Toolchain
.
For example, to configure PnMPI for a BG/Q machine using the GNU compliler
suite, add the following to the cmake
configuration command:
-DCMAKE_TOOLCHAIN_FILE=../cmakemodules/Toolchain/BlueGeneQ-gnu.cmake
You may need to modify the toolchain file for your system.
Once you've installed, all your PnMPI files and executables should be in
<CMAKE_INSTALL_PREFIX>
, the path specified during configuration. Roughly, the
install tree looks like this:
bin/
pnmpi PnMPI invocation tool
pnmpi-patch Library patching utility
lib/
libpnmpi[f].[so,a] PnMPI runtime libraries
pnmpi-modules/ System-installed tool modules
cmake/ Build files for external modules
include/
pnmpi/ PnMPI header directory
debug_io.h PnMPI module debug print functions.
hooks.h PnMPI module hook definitions.
service.h PnMPI module service functions.
...
pnmpi.h PnMPI main header
pnmpimod.h PnMPI module support (legacy)
pnmpi-config.h CMake generated configuration file
share/
cmake/ CMake files to support tool module builds
Test programs are not installed, but in the tests/src folder of the build directory, there should also be test programs built with PnMPI. See below for details on running these to test your PnMPI installation.
You will need to set one environment variable to run PnMPI:
PNMPI_LIB_PATH
should be set to the full path to the PnMPI modules
directory. If not set, the default path of your installation directory will
be used.PNMPI_CONF
should be set to the full to the path of the PnMPI
configuration file to be used.PNMPI_BE_SILENT
will silence the PnMPI banner. For benchmark purposes you
should also disable ENABLE_DEBUG
in your CMake configuration. Note:
Warnings on errors still will be printed.To run PnMPI in front of any application (that is dynamically linked to MPI),
you may use the bin/pnmpi
tool. It will setup the environment and preloads
PnMPI for you. For a list of supported arguments, invoke it with the --help
flag.
:~$ pnmpi --help
Usage: pnmpi [OPTION...] utility [utility options]
P^nMPI -- Virtualization Layer for the MPI Profiling Interface
-c, --config=FILE Configuration file
-q, -s, --quiet, --silent Don't produce any output
-?, --help Give this help list
--usage Give a short usage message
-V, --version Print program version
...
:~$ mpiexec -np 2 pnmpi -c pnmpi.conf a.out
_____ _ __ __ __ _____ _____
| __ \ | '_ \ | \/ || __ \ |_ _|
| |__) || | | || \ / || |__) | | |
| ___/ |_| |_|| |\/| || ___/ | |
| | | | | || | _| |_
|_| |_| |_||_| |_____|
Application:
MPI interface: C
Global settings:
Pcontrol: 5
Loaded modules:
Stack default:
sample1 (Pcontrol: 1)
Stack foo:
...
Note: The PnMPI invocation tool is not compatible with all platforms (e.g.
BlueGene/Q), as it requires the execvp()
function, which might not be
supported.
By default, the build adds the paths of all dependency libraries to the rpath of
the installed PnMPI library. This is the preferred behavior on LLNL systems,
where many packages are installed and LD_LIBRARY_PATH
usage can become
confusing.
If you are installing on a system where you do NOT want dependent libraries
added to your RPATH, e.g. if you expect all of PnMPI's dependencies to be found
in system paths, then you can build without rpath additions using this option to
cmake
:
-DCMAKE_INSTALL_RPATH_USE_LINK_PATH=FALSE
This will add only the PnMPI installation lib directory to the rpath of the PnMPI library.
PnMPI supports multiple ways to use it with an MPI application:
If your application is dynamically linked against the MPI library, you may use
PnMPI by simply preloading it via LD_PRELOAD
(or DYLD_INSERT_LIBRARIES
at
macOS).
Instead of manually preloading, you may use the PnMPI invocation tool in
bin/pnmpi
.
Instead of linking your application against the MPI library, you may link against the static MPI library.
Note: By default the linker will only link functions used by your code, so most of the API functions would not get linked into your binary. PnMPI implements a helper function to force the linker to link all required functions into the binary. However there might be complications, if not all functions wrapped by the modules are used by the application. You should tell the linker to link the whole PnMPI archive explicitly:
mpicc main.o -Wl,--whole-archive pnmpi.a -Wl,--no-whole-archive -o test
Note: The linker option --whole-archive
is not available at mac OS.
PnMPI supports two different kind of tool modules:
Among the former are modules that have been created independently of PnMPI and are just based on the PMPI interface. To use a transparent module in PnMPI the user has to perform two steps:
pnmpi-patch
utility, which is included
with the PnMPI distribution.Usage:
pnmpi-patch <original tool (in)> <patched tool (out)>
e.g.:
pnmpi-patch my-module.so my-pnmpi-module.so
After that, copy the tool in one of the directories listed in $PNMPI_LIB_PATH
so that PnMPI can pick it up. Note that all of this is handled automatically
by the CMake build files included with PnMPI (see below for more information).
The second option is the use of PnMPI specific modules: these modules also rely on the PMPI interface, but explicitly use some of the PnMPI features (i.e., they won't run outside of PnMPI). These modules include the PnMPI header files, which describe the interface that modules can use. In short, the interface offers an ability to register a module and after that use a publish/subscribe interface to offer/use services in other modules.
Note: also PnMPI specific modules have to be patched using the utility described above, if they don't use the XMPI interface instead of the PMPI.
This package includes a set of modules that can be used both to create other tools using their services and as templates for new modules.
The source for all modules is stored in separate directories inside the
src/modules/
directory. There are:
sample: A set of example modules that show how to wrap send and receive operations.
empty: A transparent module that simply wraps all calls without executing any code. This can be used to test overhead or as a sample for new modules.
status: This PnMPI specific module offers a service to add extra data to each status object to store additional information.
requests: This PnMPI specific module offers a service to track asynchronous communication requests. It relies on the status module.
datatype: This PnMPI specific module tracks all datatype creations and provides an iterator to walk any datatype.
comm: This PnMPI specific module abstracts all communication in a few simple callback routines. It can be used to write quick prototype tools that intercept all communication operations independent of the originating MPI call. This infrastructure can be used by creating submodules: two such submodules are included: an empty one that can be used as a template and one that prints every message. Note: this module relies on the status, requests, and datatype modules. A more detailed description on how to implemented submodules is included in the comm directory as a separate README.
limit-threading
This PnMPI specific module limits the MPI threading level to the value set in
the PNMPI_THREADING_LEVEL
environment variable. It may be used to check the
behaviour of an application, if the MPI environment doesn't support a specific
threading level.
metrics-counter This PnMPI specific module counts the MPI call invocations. Add the module at the top of your config file to count how often each rank invoked which MPI call or in front of a specific module to count how often invocations reached this module.
metrics-timing This PnMPI specific module measures the time MPI call invocations take. It has two different operation modes:
simple: Add it in front of a module and it will measure the time of all following modules.
module metrics-timing
module sample1
advanced: Add the timing module before and after the modules you want to measure and it will only measure the time of these, but not the following modules.
module metrics-timing
module sample1
module sample2
module metrics-timing
module empty
Measuring MPI_Pcontrol
is available in advanced mode, only. To measure
MPI_Pcontrol
calls both metrics-timing
invocations need to be pcontrol
enabled:
module metrics-timing
pcontrol on # May be ignored if metrics-timing is the first module.
module sample1
module metrics-timing
pcontrol on
wait-for-debugger
This module prints the PID of each rank before executing the application. If
the WAIT_AT_STARTUP
environment variable is set to a numeric value, the
execution will be delayed up to value
seconds, so you may attach with a
debugger in that time.
Note: All modules must be compiled with the same MPI as PnMPI is build with. Modules should only be linked to their required libraries, except PnMPI and MPI, as these routines will be provided by PnMPI when the module will be loaded.
PnMPI installs CMake build files with its distribution to allow external
projects to quickly build MPI tool modules. The build files allow external
tools to use PnMPI, the pnmpi-patch
utility, PnMPI's wrapper generator, and
PnMPI's dependency libraries.
To create a new PnMPI module, simply create a new project that looks something like this:
my-project/
CMakeLists.txt
foo.c
wrapper.w
Assume that wrapper.w
is a wrapper generator input file that will generate
another file called wrapper.c
, which contains MPI interceptor functions for
the tool library. foo.c
is additional code needed for the tool library.
CMakeLists.txt
is the CMake build file.
Your CMakeLists.txt
file should start with something like this:
ect(my-module C)
e_minimum_required(VERSION 2.8.11.2)
_package(PnMPI REQUIRED)
_package(MPI REQUIRED)
wrapped_file(wrapper.c wrapper.w)
i_add_pmpi_module(foo foo.c wrapper.c)
all(TARGETS foo DESTINATION ${PnMPI_MODULES_DIR})
ude_directories(
MPI_INCLUDE_PATH}
PnMPI_INCLUDE_DIR}
CMAKE_CURRENT_SOURCE_DIR})
project()
and cmake_minimum_required()
are standard. These tell the build
the name of the project and what version of CMake it will need.
find_package(PnMPI)
will try to find PnMPI in system locations, and if
it is not found, it will fail because the REQUIRED
option has been
specified.
find_package(MPI)
locates the system MPI installation so that you can use
its variables in the CMakeLists.txt
file.
add_wrapped_file()
tells the build that wrapper.c
is generated from
wrapper.w
, and it adds rules and dependencies to the build to automatically
run the wrapper generator when wrapper.c
is needed.
Finally, pnmpi_add_pmpi_module()
function's much like CMake's builtin
add_library()
command, but it ensures that the PnMPI library is built as a
loadable module and that the library is properly patched.
If your module uses the XMPI interface (i.e. doesn't need to be patched), use
pnmpi_add_xmpi_module()
instead, so it will build the module but don't patch
it.
install()
functions just as it does in a standard CMake build, but here we
install to the PnMPI_MODULES_DIR
instead of an install-relative location.
This variable is set if PnMPI is found by find_package(PnMPI)
, and it allows
users to install directly into the system PnMPI installation.
Finally, include_directories()
functions as it would in a normal CMake
build.
Once you've make your CMakeLists.txt
file like this, you can build your PnMPI
module like so:
cd my-module
mkdir $SYS_TYPE && cd $SYS_TYPE
cmake ..
make -j8
make -j8 install
This should find PnMPI on your system and build your module, assuming that you have your environment set up correctly.
If your module is not thread safe or is only able to process a limited amount of
threading, it should limit the required threading level in the MPI_Init_thread
wrapper:
MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
(required > MPI_THREAD_SINGLE)
required = MPI_THREAD_SINGLE;
turn XMPI_Init_thread(argc, argv, required, provided);
At different points hooks will be called in all loaded modules. These can be
used to trigger some functionality at a given time. All hooks have the return
type void
and are defined in pnmpi/hooks.h
, which should be included for
type safety. These following hooks will be called in all modules:
PNMPI_RegistrationPoint
: This hook will be called just after the module has
been loaded. It may be used to register the name of the module, services
provided by the module, etc.PNMPI_UnregistrationPoint
: This hook will be called just before the module
will be unloaded. It may be used e.g. to free allocated memory.PNMPI_Init
: This hook will be called just after all modules have been
registered to initialize the module. It may be used to initialize the module's
to code and is communicate with other modules.PNMPI_Fini
: This hook will be called just before all modules will be
unregistered. The modules may communicate with each other and should execute
some final steps in here.Note: You can use PNMPI_Service_CallHook()
to call custom hooks in your
modules. Just pass a custom hook name as first parameter.
For a detailed description, see the Doxygen docs or man-pages for these functions.
Modules may interact with the PnMPI core and other modules with the module
service functions defined in pnmpi/service.h
. For a detailed description about
these functions, see the Doxygen docs for the service header or the man-pages.
Modules may print debug messages, warnings and errors with the PnMPI API
functions PNMPI_Debug
, PNMPI_Warning
and PNMPI_Warning
defined in
pnmpi/debug_io.h
. PnMPI will add additional informations like rank or line
numbers to the printed messages.
For a detailed description, see the Doxygen docs or man-pages for these functions.
If PnMPI is build with ENABLE_DEBUG
, PnMPI includes debug print functions,
that can be dynamically enabled. To control it, the environment variable
PNMPI_DBGLEVEL
can be set to any combination of the following debug levels:
0x01
- Messages about PnMPI initialization (before MPI is initialized).0x02
- Messages about module loading.0x04
- Messages about MPI call entry and exit.NOTE: The first two levels should be enabled for single-rank executions only, as their output can't be limited to a single rank and thus will be printed on all ranks.
Additionally, the printouts can be restricted to a single node by setting the
variable PNMPI_DBGNODE
to an MPI rank.
You may set the above options in the PnMPI invocation tool. Use the --debug
option to enable a specific debug level and --debug-node
to limit the debug
output to a single rank.
The PnMPI distribution includes test cases (in C and Fortran). They can be used to experiment with the basic PnMPI functionalities and to test the system setup. The following describes the C version (the F77 version works similarly):
Change into the tests/src directory.
The program test-mpi.c, which only initializes and then finalizes MPI, was compiled into three binaries:
testbin-binary_mpi_c-preload
(plain MPI code)testbin-binary_mpi_c-dynamic
(linked dynamically against PnMPI)testbin-binary_mpi_c-static
(linked statically against PnMPI)Executing the *-preload
binary will not print any output, but the binaries
linked against PnMPI will print the PnMPI header, indicating PnMPI is loaded
before MPI.
PnMPI is configured through a configuration file that lists all modules to be
load by PnMPI as well as optional arguments. The name for this file can be
specified by the environment variable PNMPI_CONF
. If this variable is not
set or the file specified can not be found, PnMPI looks for a file called
.pnmpi_conf
in the current working directory, and if not found, in the
user's home directory.
A simple configuration file may look as follows:
module sample1 module sample2 module sample3 module sample4
(plus some additional lines starting with #
, which indicates comments)
This configuration causes these four modules to be loaded in the specified
order. PnMPI will look for the corresponding modules (.so shared library
files) in PNMPI_LIB_PATH
.
Running the testbin-binary_mpi_sendrecv
(a simple test sending messages
between the ranks) will load all four modules in the specified order and
intercept all MPI calls included in these modules:
The program output (for 2 nodes) will be:
_ _ __ __ __ _____ _____
_ \ | '_ \ | \/ || __ \ |_ _|
_) || | | || \ / || |__) | | |
__/ |_| |_|| |\/| || ___/ | |
| | | || | _| |_
|_| |_||_| |_____|
ication:
interface: C
al settings:
trol: 5
ed modules:
k default:
mple1 (Pcontrol: 1)
mple2
mple3
mple4
PER 1: Before recv
PER 1: Before send
PER 2: Before send
PER 4: Before send
PER 4: After send
PER 2: After send
PER 1: After send
PER 1: Before recv
PER 3: Before recv
PER 4: Before recv
PER 3: Before recv
PER 4: Before recv
PER 4: After recv
PER 3: After recv
PER 1: After recv
1 from rank 1.
PER 1: Before send
PER 2: Before send
PER 4: Before send
PER 4: After send
PER 4: After recv
PER 3: After recv
PER 1: After recv
1 from rank 0.
PER 2: After send
PER 1: After send
When running on a BG/P systems, it is necessary to explicitly export some environment variables. Here is an example:
mpirun -np 4 -exp_env LD_LIBRARY_PATH -exp_env PNMPI_LIB_PATH \
-cwd $PWD testbin-binary_mpi_sendrecv
MPI_Pcontrol
The MPI standard defines the MPI_Pcontrol
, which does not have any direct
effect (it is implemented as a dummy call inside of MPI), but that can be
replaced by PMPI to accepts additional information from MPI applications (e.g.,
turn on/off data collection or markers for main iterations). The information is
used by a PMPI tool linked to the application. When it is used with PnMPI the
user must therefore decide which tool an MPI_Pcontrol
call is directed to.
By default PnMPI will direct MPI_Pcontrol
calls to first module in the tool
stack only. If this is not the desired effect, users can turn on and off which
module Pcontrols reach by adding pcontrol on
and pcontrol off
into the
configuration file in a separate line following the corresponding module
specification. Note that PnMPI allows that MPI_Pcontrol
calls are sent to
multiple modules.
In addition, the general behavior of Pcontrols can be specified with a global
option at the beginning of the configuration file. This option is called
globalpcontrol
and can take one of the following arguments:
int
: Only deliver the first argument, but ignore the variable list arguments
(default)pmpi
: Forward the variable list argumentspnmpi
: Requires the application to specify a specific format in the variable
argument list known to PnMPI (level must be PNMPI_PCONTROL_LEVEL
)mixed
level == PNMPI_PCONTROL_LEVEL
.level != PNMPI_PCONTROL_LEVEL
.typed <tlevel> <type>
: Forward the first argument in any case, the second
only if the level matches <tlevel>
. In this case, assume that the second
argument is of type <type>
.The PnMPI internal format for Pcontrol arguments is as follows:
level (same semantics as for MPI_Pcontrol itself)
type = PNMPI_PCONTROL_SINGLE or PNMPI_PCONTROL_MULTIPLE
(target one or more modules) |
PNMPI_PCONTROL_VARG or PNMPI_PCONTROL_PTR
(arguments as vargs or one pointer)
mod = target module (if SINGLE)
modnum = number of modules (if MULTIPLE)
*mods = pointer to array of modules
size = length of all variable arguments (if VARG)
*buf = pointer to argument block (if PTR)
Forwarding the variable argument list as done in pmpi
and mixed
is only
implemented in a highly experimental version and disabled by default. To enable,
compile PnMPI with the flag EXPERIMENTAL_UNWIND
and link PnMPI with the
libunwind library. Note that this is not extensively tested and not portable
across platforms.
More documentation on PnMPI can be found in the following two published articles:
Martin Schulz and Bronis R. de Supinski. PnMPI Tools: A Whole Lot Greater Than the Sum of Their Parts. Supercomputing 2007. November 2007, Reno, NV, USA.
Available here.
Martin Schulz and Bronis R. de Supinski. A Flexible and Dynamic Infrastructure for MPI Tool Interoperability. International Conference on Parallel Processing (ICPP). August 2005, Columbus, OH, USA. Published by IEEE Press
Available here.
For more information or in case of questions, please contact Martin Schulz or file an issue.
Copyright © 2008-2018 Lawrence Livermore National Security, LLC.
Copyright © 2011-2016 ZIH, Technische Universitaet Dresden, Federal
Republic of Germany
Copyright © 2013-2018 RWTH Aachen University, Federal Republic of Germany
All rights reserved - please read the information in the LICENSE file.