Usage
Writing MPI-Parallel Tests
To create an MPI-parallel test, mark its test function with the
mpi mark:
import pytest


@pytest.mark.mpi(ranks=2)
def test_with_mpi(mpi_ranks):  # pylint: disable=unused-argument
    """Simple passing test"""
    assert True  # replace with actual test code
The number of MPI processes to be used for the test must be set via the
required ranks argument. All MPI tests need to have an mpi_ranks
parameter as shown in the example.
For any test carrying the mpi mark, pytest-isolate-mpi will
launch an MPI job with the requested number of processes. In this MPI
job, a pytest session runs only this particular test. Each MPI process
produces its own test report, which is collected in the main process. To
distinguish the reports from each MPI process, pytest-isolate-mpi
extends the node IDs of the test reports with the rank the report
originates from. For instance, the test above would
result in the following output (with --verbose passed to pytest):
============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.4, pluggy-1.5.0 -- /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/envs/stable/bin/python
cachedir: .pytest_cache
rootdir: /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/checkouts/stable/examples
configfile: pytest.ini
plugins: isolate-mpi-0.3, cov-6.0.0
collecting ... collected 1 item

test_basic.py::test_with_mpi[2]
test_basic.py::test_with_mpi[2][rank=0] PASSED [100%]
test_basic.py::test_with_mpi[2][rank=1] PASSED [200%]

============================== 2 passed in 0.69s ===============================
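When post-processing such reports, the rank can be recovered from the extended node ID. The following is a minimal sketch that assumes the suffix format seen in the sample output above (`[rank=<n>]` at the end of the node ID); `split_rank` is a hypothetical helper and not part of pytest-isolate-mpi.

```python
import re

# Suffix format inferred from the sample output: "<node id>[rank=<n>]".
_RANK_SUFFIX = re.compile(r"^(?P<base>.+)\[rank=(?P<rank>\d+)\]$")


def split_rank(nodeid):
    """Return (base node ID, rank), or (nodeid, None) without a rank suffix."""
    match = _RANK_SUFFIX.match(nodeid)
    if match is None:
        return nodeid, None
    return match.group("base"), int(match.group("rank"))


print(split_rank("test_basic.py::test_with_mpi[2][rank=1]"))
# ('test_basic.py::test_with_mpi[2]', 1)
```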
By having a dedicated report for each MPI process, failing ranks can be easily identified:
import pytest


@pytest.mark.mpi(ranks=2)
def test_one_failing_rank(mpi_ranks, comm):  # pylint: disable=unused-argument
    """In case of just one process failing an assert, the test counts
    as failed and the outputs are gathered from the processes."""
    assert comm.rank != 0
This test will always fail on MPI process 0:
============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.4, pluggy-1.5.0 -- /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/envs/stable/bin/python
cachedir: .pytest_cache
rootdir: /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/checkouts/stable/examples
configfile: pytest.ini
plugins: isolate-mpi-0.3, cov-6.0.0
collecting ... collected 1 item

test_one_failing_rank.py::test_one_failing_rank[2]
test_one_failing_rank.py::test_one_failing_rank[2][rank=0] FAILED [100%]
test_one_failing_rank.py::test_one_failing_rank[2][rank=1] PASSED [200%]

=================================== FAILURES ===================================
_______________________ test_one_failing_rank[2][rank=0] _______________________

mpi_ranks = 2, comm = <mpi4py.MPI.Intracomm object at 0x7f61a5073720>

    @pytest.mark.mpi(ranks=2)
    def test_one_failing_rank(mpi_ranks, comm):  # pylint: disable=unused-argument
        """In case of just one process failing an assert, the test counts
        as failed and the outputs are gathered from the processes."""
>       assert comm.rank != 0
E       assert 0 != 0
E        +  where 0 = <mpi4py.MPI.Intracomm object at 0x7f61a5073720>.rank

test_one_failing_rank.py:8: AssertionError
=========================== short test summary info ============================
FAILED test_one_failing_rank.py::test_one_failing_rank[2][rank=0] - assert 0 ...
========================= 1 failed, 1 passed in 0.66s ==========================
All tests not marked with the mpi mark are executed as usual in the
main pytest session.
Parametrizing the Number of MPI Processes
By passing a list as the ranks argument of the mpi mark, a test is
run multiple times, once for each requested number of MPI processes:
import pytest


@pytest.mark.mpi(ranks=[1, 2, 3])
def test_number_of_processes_matches_ranks(mpi_ranks, comm):
    """Simple test that checks whether we run on multiple processes."""
    assert comm.size == mpi_ranks
Here, for each parametrization a matching number of test reports is produced:
============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.4, pluggy-1.5.0 -- /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/envs/stable/bin/python
cachedir: .pytest_cache
rootdir: /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/checkouts/stable/examples
configfile: pytest.ini
plugins: isolate-mpi-0.3, cov-6.0.0
collecting ... collected 3 items
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[1]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[1][rank=0] PASSED [ 33%]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[2]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[2][rank=0] PASSED [ 66%]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[2][rank=1] PASSED [100%]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[3]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[3][rank=0] PASSED [133%]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[3][rank=1] PASSED [166%]
test_number_of_processes_matches_ranks.py::test_number_of_processes_matches_ranks[3][rank=2] PASSED [200%]
============================== 6 passed in 2.06s ===============================
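The relation between the ranks list and the number of reports can be sketched in a few lines. This is only an illustration of the naming pattern visible in the output above; `expected_report_ids` and the module name `test_mod.py` are hypothetical and not part of pytest-isolate-mpi.

```python
def expected_report_ids(test_id, ranks):
    """List the per-rank report IDs for each requested process count."""
    counts = [ranks] if isinstance(ranks, int) else list(ranks)
    return [f"{test_id}[{n}][rank={r}]" for n in counts for r in range(n)]


ids = expected_report_ids("test_mod.py::test_sizes", [1, 2, 3])
print(len(ids))  # 1 + 2 + 3 = 6 reports
```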
Enforcing a Maximum Runtime for MPI Tests
pytest-isolate-mpi allows setting a maximum runtime for MPI-parallel
tests with the timeout argument of the mpi mark:
import pytest


@pytest.mark.mpi(ranks=2, timeout=10, unit="s")
def test_mpi_deadlock(mpi_ranks, comm):  # pylint: disable=unused-argument
    """Only the first process enters the barrier; all others move on
    and complete the test. This leads to a deadlock, which
    pytest-isolate-mpi handles with timeouts."""
    if comm.rank == 0:
        comm.Barrier()
timeout sets the maximum allowed runtime before the test is
forcefully terminated. With the optional unit argument, one can set
the time unit for the duration. Supported units are "s" for seconds,
"m" for minutes, and "h" for hours. If not specified explicitly,
the default unit is seconds.
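As a rough illustration of how a timeout and its unit combine into a duration, consider the following sketch. `timeout_in_seconds` is a hypothetical helper written for this example; it is not the plugin's implementation.

```python
# Units as documented above: seconds, minutes, hours.
_SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600}


def timeout_in_seconds(timeout, unit="s"):
    """Convert a (timeout, unit) pair to seconds; the unit defaults to seconds."""
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError(f"unsupported unit {unit!r}, expected one of s, m, h")
    return timeout * _SECONDS_PER_UNIT[unit]


print(timeout_in_seconds(10))       # 10
print(timeout_in_seconds(2, "m"))   # 120
```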
By setting a timeout for an MPI-parallel test, deadlocks in this test will no longer prevent the completion of the test suite:
============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.4, pluggy-1.5.0 -- /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/envs/stable/bin/python
cachedir: .pytest_cache
rootdir: /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/checkouts/stable/examples
configfile: pytest.ini
plugins: isolate-mpi-0.3, cov-6.0.0
collecting ... collected 1 item
test_mpi_deadlock.py::test_mpi_deadlock[2]
test_mpi_deadlock.py::test_mpi_deadlock[2][rank=1] PASSED [100%]
test_mpi_deadlock.py::test_mpi_deadlock[2] FAILED [200%]
=================================== FAILURES ===================================
_____________________________ test_mpi_deadlock[2] _____________________________
Timeout occurred for test_mpi_deadlock.py::test_mpi_deadlock[2]: exceeded run time limit of 10s.
=========================== short test summary info ============================
FAILED test_mpi_deadlock.py::test_mpi_deadlock[2]
========================= 1 failed, 1 passed in 10.02s =========================
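The general idea of terminating a hanging child process after a time limit can be sketched with the standard library. This is an assumption-laden illustration of the technique, not how pytest-isolate-mpi is implemented; `run_with_timeout` is a hypothetical helper, and a sleeping Python process stands in for a deadlocked MPI job.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return its exit code, or None if it hit the time limit."""
    try:
        # subprocess.run kills the child and raises once the timeout expires.
        completed = subprocess.run(command, timeout=timeout_seconds, check=False)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Stand-in for a deadlocked job: a process that sleeps far past the limit.
hanging = [sys.executable, "-c", "import time; time.sleep(60)"]
print(run_with_timeout(hanging, timeout_seconds=1))  # None
```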
MPI Fixtures
pytest-isolate-mpi offers a selection of fixtures for the
development of MPI-parallel tests:
- comm
The MPI communicator available for the MPI-parallel test, i.e.
mpi4py.MPI.COMM_WORLD.
- mpi_tmpdir
Wraps pytest's built-in tmpdir fixture such that it can be used under MPI from all MPI processes.
- mpi_tmp_path
Wraps pytest's built-in tmp_path fixture such that it can be used under MPI from all MPI processes. See also
pytest_isolate_mpi.fixtures.mpi_tmp_path_fixture().
Customization
Command Line Options
The behavior of pytest-isolate-mpi can be customized via the
following command line arguments to pytest:
- --no-mpi-isolation
Run tests without MPI and/or process isolation. This is particularly useful for debugging parallel test cases. Normally, when pytest is run in a debugger, breakpoints in parallel tests would not trigger because of the process isolation.
- --verbose-mpi
Include detailed MPI information in output.
- --mpi-default-test-timeout
Sets a default test timeout for all MPI-isolated tests. This timeout can be overridden per test via the timeout argument of the mpi marker, see Enforcing a Maximum Runtime for MPI Tests. Defaults to no timeout if not specified.
- --mpi-default-test-timeout-unit
Sets a default test timeout unit for all MPI-isolated tests. This unit can be overridden per test via the unit argument of the mpi marker, see Enforcing a Maximum Runtime for MPI Tests. Defaults to s for seconds if not specified. The other valid choices are m for minutes and h for hours.
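The override behavior described for the two timeout options can be summarized in a small sketch. It assumes that timeout and unit fall back to their defaults independently of each other, which the text above does not state explicitly; `effective_timeout` is a hypothetical helper, not part of the plugin.

```python
def effective_timeout(mark_timeout=None, mark_unit=None,
                      default_timeout=None, default_unit="s"):
    """Resolve the timeout for one test: mark arguments win over defaults.

    Returns (timeout, unit), or None if no timeout applies.
    Assumption: timeout and unit are resolved independently.
    """
    timeout = mark_timeout if mark_timeout is not None else default_timeout
    if timeout is None:
        return None  # neither the mark nor the command line set a timeout
    unit = mark_unit if mark_unit is not None else default_unit
    return timeout, unit


print(effective_timeout(default_timeout=5, default_unit="m"))   # (5, 'm')
print(effective_timeout(mark_timeout=10, mark_unit="s",
                        default_timeout=5, default_unit="m"))   # (10, 's')
```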
Configuration
pytest-isolate-mpi can be configured through the pytest
configuration file:
- mpi_executable
The MPI executable to launch the forked MPI environment with. If none is given, pytest-isolate-mpi tries mpirun and mpiexec.
- mpi_option_for_processes
The command line option of the MPI executable indicating the number of processes, such that pytest-isolate-mpi can launch the MPI environment with the appropriate number of processes as defined in the mpi mark. Defaults to -n.
- mpi_command_line_args
Additional command line arguments to run the MPI executable with. By default, none are given.
For example, the following pytest.ini causes tests marked
with @pytest.mark.mpi(ranks=2) to be launched by Slurm's srun on
two compute nodes with 128 processes each:
# pytest.ini
[pytest]
mpi_executable = srun
mpi_option_for_processes = -N
mpi_command_line_args = --ntasks-per-node 128 --account <MySlurmAccount>
When running Slurm with multiple compute nodes, make sure that $TMPDIR
is set to a single directory outside the compute nodes, e.g. a directory
on /scratch or /lustre.
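How the three configuration values might combine into a launch command is sketched below. The order of arguments, the helper names, the account name `my-account`, and the launched program `python -m pytest` are all assumptions made for illustration; the actual command assembled by pytest-isolate-mpi may differ.

```python
import shlex
import shutil


def find_mpi_executable(configured=None, candidates=("mpirun", "mpiexec")):
    """Use the configured executable, else the first candidate found on PATH."""
    if configured:
        return configured
    for name in candidates:
        if shutil.which(name):
            return name
    return None


def build_launch_command(executable, ranks, option_for_processes="-n",
                         extra_args="", program=()):
    """Assemble: <executable> <option> <ranks> <extra args> <program>."""
    return [executable, option_for_processes, str(ranks),
            *shlex.split(extra_args), *program]


cmd = build_launch_command(
    find_mpi_executable("srun"), 2, "-N",
    "--ntasks-per-node 128 --account my-account",  # hypothetical account name
    ["python", "-m", "pytest"],                    # hypothetical program
)
print(shlex.join(cmd))
```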
Limitations
Reports for Crashed MPI Tests
If a Pytest session running a single MPI-parallel test exits
prematurely, it may fail to write its test report to its predetermined
location. In this case, pytest-isolate-mpi can no longer provide a
per-process test report for the failed ranks. Instead,
pytest-isolate-mpi will produce the output of mpirun
which will contain the full output of all parallel-run Pytest sessions
and mpirun itself:
============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.4, pluggy-1.5.0 -- /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/envs/stable/bin/python
cachedir: .pytest_cache
rootdir: /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/checkouts/stable/examples
configfile: pytest.ini
plugins: isolate-mpi-0.3, cov-6.0.0
collecting ... collected 1 item
test_one_aborting_rank.py::test_one_aborting_rank[2]
test_one_aborting_rank.py::test_one_aborting_rank[2][rank=1] PASSED [100%]
test_one_aborting_rank.py::test_one_aborting_rank[2] FAILED [200%]
=================================== FAILURES ===================================
__________________________ test_one_aborting_rank[2] ___________________________
At least one MPI process has exited prematurely.
------------------------------- Captured stdout --------------------------------
============================= test session starts ==============================
collecting ... ============================= test session starts ==============================
collecting ...
collected 1 item
test_one_aborting_rank.py::test_one_aborting_rank[2]
collected 1 item
test_one_aborting_rank.py::test_one_aborting_rank[2]
test_one_aborting_rank.py::test_one_aborting_rank[2][rank=1] PASSED [100%]
============================== 1 passed in 0.24s ===============================
------------------------------- Captured stderr --------------------------------
--------------------------------------------------------------------------
WARNING: Open MPI tried to bind a process but failed. This is a
warning only; your job will continue, though performance may
be degraded.
Local host: build-27296104-project-1100620-pytest-isolate-mpi
Application name: /home/docs/checkouts/readthedocs.org/user_builds/pytest-isolate-mpi/envs/stable/bin/python
Error message: failed to bind memory
Location: ../../../../../../orte/mca/rtc/hwloc/rtc_hwloc.c:447
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[15417,1],0]
Exit code: 127
--------------------------------------------------------------------------
[build-27296104-project-1100620-pytest-isolate-mpi:01442] 1 more process has sent help message help-orte-odls-default.txt / memory not bound
[build-27296104-project-1100620-pytest-isolate-mpi:01442] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
=========================== short test summary info ============================
FAILED test_one_aborting_rank.py::test_one_aborting_rank[2]
========================= 1 failed, 1 passed in 1.60s ==========================
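The fallback described above can be illustrated with a small, self-contained sketch: a child process that exits before writing its report leaves only its captured output behind. The file name `report.txt`, the `collect` helper, and the child script are hypothetical and do not reflect the plugin's internals.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Child script: prints some output, then either crashes or writes a report.
CHILD = """
import os, sys
report_path, crash = sys.argv[1], sys.argv[2] == "crash"
print("child output")
sys.stdout.flush()
if crash:
    os._exit(127)  # exit before the report is written
with open(report_path, "w") as handle:
    handle.write("PASSED")
"""


def collect(crash):
    """Return the child's report, or its captured output if no report exists."""
    with tempfile.TemporaryDirectory() as tmp:
        report = Path(tmp) / "report.txt"
        mode = "crash" if crash else "ok"
        done = subprocess.run(
            [sys.executable, "-c", CHILD, str(report), mode],
            capture_output=True, text=True, check=False,
        )
        if report.exists():
            return report.read_text()
        return f"process exited prematurely (code {done.returncode}):\n{done.stdout}"


print(collect(crash=False))
print(collect(crash=True))
```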
Fixture Scopes
Pytest allows reusing fixtures between tests with the help of fixture
scopes. Since pytest-isolate-mpi executes each MPI-parallel test
in a Pytest sub-session, support for fixture scopes other than the
default function scope is limited for MPI-parallel tests:
- session: pytest-isolate-mpi will store the result of session-scoped fixture functions in a cache file. This file will be read back when the fixture is requested by subsequent tests. The file is managed per MPI communicator size and rank, so each MPI process caches its own dedicated fixture. Sharing fixtures between tests of differently sized communicators, or between non-MPI and MPI tests, is not possible. Fixtures are serialized with the pickle module. Please note that not all Python objects support pickling.
- class, module, and package: Fixtures for these scopes are re-created for each MPI-parallel test. Such fixtures effectively behave as if they were function-scoped.
For non-MPI tests, fixture scopes behave as usual even if
pytest-isolate-mpi is employed in the project.
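The caching scheme for session-scoped fixtures can be pictured with the following sketch. The cache file naming, the `cached_fixture` helper, and the `make_mesh` factory are assumptions for illustration only; they show the idea of one pickle file per communicator size and rank, not the plugin's actual code.

```python
import pickle
import tempfile
from pathlib import Path


def cached_fixture(cache_dir, name, size, rank, factory):
    """Load a fixture value from its per-(size, rank) cache file, or create it."""
    cache_file = Path(cache_dir) / f"{name}-size{size}-rank{rank}.pickle"
    if cache_file.exists():
        with cache_file.open("rb") as handle:
            return pickle.load(handle)
    value = factory()  # must return a picklable object
    with cache_file.open("wb") as handle:
        pickle.dump(value, handle)
    return value


calls = []


def make_mesh():
    """Hypothetical expensive fixture; records how often it is built."""
    calls.append(1)
    return {"cells": 1024}


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_fixture(cache_dir, "mesh", 2, 0, make_mesh)   # built
    second = cached_fixture(cache_dir, "mesh", 2, 0, make_mesh)  # read from cache
    other = cached_fixture(cache_dir, "mesh", 3, 0, make_mesh)   # other size: rebuilt

print(len(calls))  # 2
```

Because the cache is keyed by communicator size and rank, the third call does not see the value stored by the first two, which mirrors the restriction on sharing fixtures between differently sized communicators.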
Percentage of Completed Tests During Pytest Run
As pytest-isolate-mpi produces one test report per MPI process
while not increasing the test count, the reported percentages for test
run completion are incorrect and can exceed 100%.
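A simple model reproduces the percentages seen in the sample outputs of this page: progress is the number of reports received so far divided by the number of collected items. This is an illustration consistent with those outputs, not a copy of pytest's internal logic; `progress_percentages` is a hypothetical helper.

```python
def progress_percentages(collected_items, reports):
    """Progress after each report, as reports seen / collected items (in %)."""
    return [100 * n // collected_items for n in range(1, reports + 1)]


print(progress_percentages(1, 2))  # [100, 200]
print(progress_percentages(3, 6))  # [33, 66, 100, 133, 166, 200]
```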
Troubleshooting
Test Collection Fails with function uses no argument 'mpi_ranks'
pytest-isolate-mpi parametrizes all MPI tests with regard to the
chosen number of MPI processes. As such, all tests marked using the
pytest.mark.mpi() marker must accept the argument mpi_ranks,
even if the test makes no use of this information:
import pytest


@pytest.mark.mpi(ranks=2)
def test_pass(mpi_ranks):  # Argument required
    assert True
If at least one MPI test misses this argument, the test collection fails.
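To spot offending tests before collection, one could check function signatures with the standard library. The helper `accepts_mpi_ranks` and the two sample functions are hypothetical; this is a sketch of the check, not a feature of pytest-isolate-mpi.

```python
import inspect


def accepts_mpi_ranks(test_function):
    """Return True if the function declares an mpi_ranks parameter."""
    return "mpi_ranks" in inspect.signature(test_function).parameters


def good_example(mpi_ranks, comm):  # declares mpi_ranks: collection succeeds
    assert True


def bad_example(comm):  # misses mpi_ranks: would break collection if marked
    assert True


print(accepts_mpi_ranks(good_example))  # True
print(accepts_mpi_ranks(bad_example))   # False
```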