
Recently I was trying to compile and run my MPI code on a single machine (Ubuntu 12.04, 64-bit, Core i7-2670QM). I installed MPICH2 version 1.2 using the following configuration:

./configure --prefix=/opt/mpich2 --enable-f77 --enable-fc --enable-cxx --with-device=ch3:sock --with-pm=mpd CC=icc CXX=icpc F77=ifort FC=ifort 2>&1 | tee configure.log

The installation went fine and I got mpd working well; I tested mpd with the examples and everything works perfectly.

I compile my code with mpif77 because, for some reason, mpif90 was not created when I built MPICH2. Even so, the code compiles without errors using mpif77.

The flags I'm using to compile the code are:

For the compiler:

LN_FLAGS= -lm -larpack -lsparskit -lfftw3 -lrt -llapack -lblas

For the MPI linker:

LN_FLAGS_MPI= $(LN_FLAGS) -I$(MPIHOME)/include -L$(MPIHOME) $(MPIHOME)/lib/libmpich.a -lfmpich -lopa -lmpe

The problem appears when I try to run the code on my machine:

First I invoke mpd as:

mpd &

and then run the code as:

mpirun -np 4 ./code_mpi

I tried several variations, such as:

mpiexec -np 4 ./code_mpi
mpirun -n 2 ./code_mpi
mpiexec -n 2 ./code_mpi

All of them result in the same error:

Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_2]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_1]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
rank 2 in job 1  ubuntu_38132   caused collective abort of all ranks
  exit status of rank 2: killed by signal 9 
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_3]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_0]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
rank 1 in job 1  ubuntu_38132   caused collective abort of all ranks
  exit status of rank 1: return code 1 

I have spent almost two weeks trying to solve this problem, because I really need to run this code on my personal computer so I can work at home. I appreciate any help!


Here is how I initialize the MPI library:

subroutine init()
integer                      :: provided
call mpi_init(mpi_err)
call mpi_comm_rank(mpi_comm_world,rank,mpi_err)
call mpi_comm_size(mpi_comm_world,an_proc,mpi_err)
call MPI_BARRIER(MPI_COMM_WORLD,mpi_err)
end subroutine init
  • You shouldn't need to start mpd first; and can you compile/run a simple MPI "Hello world" successfully, e.g. slac.stanford.edu/comp/unix/farm/mpi.html (a minimal sketch of such a test is given after these comments)? Commented Oct 19, 2012 at 11:14
  • Show us the part of the code where you initialise the MPI library, e.g. the part that contains the calls to MPI_INIT and MPI_COMM_RANK. Commented Oct 19, 2012 at 11:33
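
For reference, the kind of minimal MPI "Hello world" test the first comment suggests could look like the sketch below (the program and variable names are only illustrative). If even this small test aborts the same way, the problem is in the installation rather than in the application code:

      program hello_mpi
      implicit none
      include 'mpif.h'
      integer :: rank, nprocs, ierr
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
      call mpi_comm_size(MPI_COMM_WORLD, nprocs, ierr)
      print *, 'Hello from rank', rank, 'of', nprocs
      call mpi_finalize(ierr)
      end program hello_mpi

It can be built and launched with the same tools used above, e.g. mpif77 hello.f -o hello and then mpiexec -n 2 ./hello.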

1 Answer


The problem is that your subroutine has no idea what mpi_comm_world is. This integer value is defined in the mpif.h header (or in the mpi module for f90). As your code is written, mpi_comm_world is just an implicitly typed, uninitialized integer with no association to the actual MPI_COMM_WORLD communicator handle provided by MPI, which is why MPI_Comm_rank complains about an invalid communicator.

Generally, it's best to use implicit none in your code, which will catch these kinds of errors at compile time. Try the following:

subroutine init()
!use mpi   !This one is for f90
implicit none
include 'mpif.h'   !use this for f77
integer  :: provided,rank,an_proc,ierr
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world,rank,ierr)
call mpi_comm_size(mpi_comm_world,an_proc,ierr)
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
end subroutine init
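
One thing to keep in mind: with the declarations above, rank and an_proc are local to init and are lost when the subroutine returns. If the rest of the code relies on them (as the original subroutine suggests), one possible arrangement, shown here only as a sketch with a made-up module name, is to keep that state in a module (compiled as free-form f90 source):

module mpi_state
  implicit none
  integer :: rank, an_proc, mpi_err        ! shared MPI state, visible wherever the module is used
contains
  subroutine init()
    include 'mpif.h'                       ! same header form as in the answer above
    call mpi_init(mpi_err)
    call mpi_comm_rank(MPI_COMM_WORLD, rank, mpi_err)
    call mpi_comm_size(MPI_COMM_WORLD, an_proc, mpi_err)
    call MPI_BARRIER(MPI_COMM_WORLD, mpi_err)
  end subroutine init
end module mpi_state

The main program then does use mpi_state and call init() before any other MPI call.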

1 Comment

Really strange, because when I run it on a cluster I don't get this error. I made the changes and the error continues...
