My mpi4py (3.1.5) installation with Open MPI (4.1.4) on Python 3.8 and Ubuntu 20.04 suddenly stopped working today. Whenever I run any Python code that imports mpi4py, I get the following error:

[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 572
[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 172
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value A system-required executable either could not be found or was not executable by this user (-126) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "A system-required executable either could not be found or was not executable by this user" (-126) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[juanMS:15643] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

This is really frustrating because I did not make any system changes or package updates. I have tried removing all MPI packages from my system and uninstalling mpi4py from my Python venv:

sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common

pip uninstall mpi4py

I have tried this multiple times, and the same error keeps appearing. As far as I can tell, nothing is wrong with my Open MPI installation itself, since a simple test like this works fine:

mpirun -np 4 hostname 

I have found virtually no help online, so I'm hoping someone here can guide me in the right direction!

EDIT --------------------------------------------------------------

Example Python script to reproduce the error with mpi4py 3.1.5, Open MPI 4.1.4, and Python 3.8 on Ubuntu 20.04:

from mpi4py import MPI
import sys

def print_hello(rank, size, name):
    msg = "Hello World! I am process {0} of {1} on {2}.\n"
    sys.stdout.write(msg.format(rank, size, name))

if __name__ == "__main__":
    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    name = MPI.Get_processor_name()

    print_hello(rank, size, name)

As far as I have tested, mpi4py 3.1.5 works with Open MPI 4.0.x and MPICH 3.3.2.
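
When comparing MPI versions like this, it can help to see which compiler wrapper the installed mpi4py was built with. The following is a minimal sketch, not part of the original post: the helper name `mpi4py_build_info` is hypothetical, and it relies on `mpi4py.get_config()`, which reports the build-time configuration. It imports only the top-level `mpi4py` package, not `mpi4py.MPI`, so it should not trigger the failing `MPI_Init_thread` call.

```python
import importlib


def mpi4py_build_info():
    """Return mpi4py's version and build configuration, or None if not installed.

    Hypothetical helper for illustration. Only the top-level package is
    imported, so MPI itself is not initialized here.
    """
    try:
        mpi4py = importlib.import_module("mpi4py")
    except ImportError:
        return None
    return {
        "version": mpi4py.__version__,
        # get_config() returns a dict of build settings (for example the
        # compiler wrapper recorded when mpi4py was built).
        "config": mpi4py.get_config(),
    }


if __name__ == "__main__":
    info = mpi4py_build_info()
    if info is None:
        print("mpi4py is not installed in this environment")
    else:
        print("mpi4py version:", info["version"])
        for key, value in info["config"].items():
            print(f"  {key} = {value}")
```

If the reported compiler wrapper belongs to an MPI installation that has since been removed or replaced, rebuilding mpi4py against the currently installed MPI is a reasonable next step.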

  • I assume you found and tried the "solutions" in at least the top 3 results? Commented Mar 19, 2024 at 16:40
  • Correct, I had no success with that. The debian package for libopenmpi-dev comes with openmpi-bin and openmpi-common. I seem to have had success with removing all traces of openmpi and building mpi4py with just MPICH (3.3.2) installed on my machine. Commented Mar 19, 2024 at 21:29
  • Can you add a minimal code sample to get that message? Commented Mar 19, 2024 at 22:31
  • I have added a sample Python script to the original post for others to test, along with the MPI versions I have found to be compatible. Commented Mar 21, 2024 at 0:20
  • Was there a particular reason for you to install mpi4py=3.1.5 via pip, and not use the distro-provided 3.0.3-4build2? Using the latter your test-code works as designed. Commented Mar 21, 2024 at 0:27

1 Answer

I had a similar issue and solved it using:

sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
sudo pip uninstall mpi4py

Followed by:

sudo apt install libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
sudo pip install mpi4py
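
After reinstalling, a quick sanity check is to confirm that the Open MPI executables resolve on `PATH` from inside the same Python environment. This is a sketch under an assumption: my understanding is that when a script is started without `mpirun` (a "singleton" launch), Open MPI 4.x tries to start its `orted` helper, and the "system-required executable" message in `ess_singleton_module.c` can appear when that helper cannot be located. The function name `find_mpi_executables` is hypothetical.

```python
import shutil


def find_mpi_executables(names=("mpirun", "orted")):
    """Map each executable name to its resolved path, or None if not on PATH.

    Hypothetical helper for illustration; the default names are the Open MPI
    launcher and its helper daemon.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_mpi_executables().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If `mpirun` is found but `orted` is not, or the two resolve to different installation prefixes, that points to a mixed or partially removed MPI installation rather than a problem in mpi4py itself.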