
I am a mechanical engineer looking to run CFD simulations on HPC machines. Currently, I am programming my procedures within the framework of PARAMESH, a Fortran library that uses MPI calls to handle data and workload distribution during parallel execution.

During parallel runs the application either produces invalid results, with data not being transmitted between processes, or terminates with various errors. The bizarre thing is that both the location of the failure in the source code and the error itself depend on the number of processes. My suspicion is that the problem lies with my MPI setup or with the mpirun invocation, since the library is still used for numerical simulations at other research institutions.

As I am familiar only with the basics of MPI and not its intricacies, I hope you can provide the insight I lack. Maybe I am overlooking something basic that leads to these failures.

Here is a summary of the errors, depending on the process count N, when running mpirun -np N --hostfile hostfile --oversubscribe --report-bindings --display-map --display-allocation lsp (the hostfile contains a single line with the hostname and max_slots = 1):

N = 1: no errors;

N = 2: the application terminates normally, but the result arrays contain invalid data

  • Data from process 0 not transmitted correctly to process 1

N = 4:

malloc(): invalid size (unsorted)
Program received signal SIGABRT: Process abort signal.

the failure occurs on a standard Fortran allocate statement, allocate (x(size(y, 2), size(z, 3)), stat = istat); a defensive version of this pattern is sketched after the list below

  • the allocatable variable is stored in a separate module with the save attribute
  • the allocatable variable has been deallocated on all processors beforehand
  • the integer values determining the array size are correct
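
For reference, a defensive version of this allocation pattern with the status actually checked might look like the sketch below (the module layout, array ranks, and error handling are my assumptions; the same checking applies to the deallocate case at N = 6). Note that a malloc(): invalid size abort on an otherwise valid allocate typically means the heap was corrupted earlier, not that the statement itself is wrong.

! Sketch only: x, y, z and istat as in the post; shapes are assumed.
module work_arrays
  implicit none
  real, allocatable, save :: x(:, :)
contains
  subroutine alloc_x(y, z)
    real, intent(in) :: y(:, :), z(:, :, :)
    integer :: istat
    character(len=128) :: emsg
    ! Release any previous allocation, then allocate with full checking.
    if (allocated(x)) deallocate (x)
    allocate (x(size(y, 2), size(z, 3)), stat=istat, errmsg=emsg)
    if (istat /= 0) then
      print *, 'allocate failed: ', trim(emsg)
      stop 1
    end if
  end subroutine alloc_x
end module work_arrays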

N = 6:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

the failure occurs on a standard Fortran deallocate statement, if (allocated(x)) deallocate(x)

  • the allocatable variable is local to the subroutine
  • the variable has been allocated on all processors prior to the deallocation

N = 8, 10, 12, 14, 16, 32: multiple errors

free(): invalid size
Program received signal SIGSEGV: Segmentation fault - invalid memory reference

malloc(): unaligned tcache chunk detected
Program received signal SIGABRT: Process abort signal. 

the failure occurs on call MPI_ALLTOALL (commatrix_recv, 1, MPI_INTEGER, commatrix_send, 1, MPI_INTEGER, MPI_COMM_WORLD, ierror); an isolated test of this collective is sketched after the list below

  • in a previous call to MPI_ALLTOALL, commatrix_send is allocated on all processors with valid values
  • directly before the failing call, print *, allocated (commatrix_send) produces no output and print *, commatrix_send prints no values at all, as if the two statements were not there; there is also no error message from either. Is this a memory leak?
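
An isolated test of this collective, exchanging one INTEGER per rank as in the failing call, could look like the following minimal sketch (the buffer names are taken from the post, but the sketch uses the conventional send/receive argument order; the key constraint is that both buffers must hold at least nprocs elements each, or the call reads and writes out of bounds):

! Standalone sketch: each rank sends its id to every other rank.
program alltoall_check
  use mpi
  implicit none
  integer :: nprocs, mype, ierror
  integer, allocatable :: commatrix_send(:), commatrix_recv(:)

  call MPI_INIT(ierror)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierror)
  call MPI_COMM_RANK(MPI_COMM_WORLD, mype, ierror)

  allocate (commatrix_send(nprocs), commatrix_recv(nprocs))
  commatrix_send = mype
  call MPI_ALLTOALL(commatrix_send, 1, MPI_INTEGER, &
                    commatrix_recv, 1, MPI_INTEGER, &
                    MPI_COMM_WORLD, ierror)
  print *, 'rank', mype, 'received', commatrix_recv

  deallocate (commatrix_send, commatrix_recv)
  call MPI_FINALIZE(ierror)
end program alltoall_check

If a standalone program like this runs cleanly at the failing process counts, the collective and the Open MPI installation are unlikely to be at fault, which would point back at memory corruption elsewhere in the application.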

I also tried running mpirun --use-hwthread-cpus --oversubscribe --report-bindings --display-map --display-allocation lsp and mpirun --use-hwthread-cpus --report-bindings --display-map --display-allocation lsp, which resulted in the same errors as in the cases N = 8, 10, 12, 14, 16, 32.
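
One basic mismatch worth ruling out (also raised in the comments below) is the launcher and the linked MPI library coming from different installations. A quick check, assuming the executable is named lsp as in the commands above:

# All three should point into the same prefix (/usr/local per ompi_info).
which mpirun mpifort
mpirun --version
ldd ./lsp | grep -i mpi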

Thanks in advance


I am running the application on my desktop:

  • AMD Ryzen 9 7950X 16-Core Processor
  • 64 GB of RAM
  • Ubuntu 22.04.3 LTS
  • Open MPI 4.1.5
  • GCC 11.4.0

My Open MPI installation:

                 Package: Open MPI root@Workstation-001 Distribution
                Open MPI: 4.1.5
  Open MPI repo revision: v4.1.5
   Open MPI release date: Feb 23, 2023
                Open RTE: 4.1.5
  Open RTE repo revision: v4.1.5
   Open RTE release date: Feb 23, 2023
                    OPAL: 4.1.5
      OPAL repo revision: v4.1.5
       OPAL release date: Feb 23, 2023
                 MPI API: 3.1.0
            Ident string: 4.1.5
                  Prefix: /usr/local
 Configured architecture: x86_64-pc-linux-gnu
          Configure host: Workstation-001
           Configured by: root
           Configured on: Thu Jul 27 15:08:14 UTC 2023
          Configure host: Workstation-001
  Configure command line: '--enable-mem-debug' '--enable-mem-profile'
                          '--enable-picky' '--enable-debug' '--enable-timing'
                          '--enable-ipv6' '--enable-peruse'
                          '--enable-mpi-fortran' '--enable-mpi-cxx'
                          '--enable-mpi1-compatibility'
                          '--enable-grequest-extensions' '--enable-spc'
                          '--enable-cxx-exceptions' '--enable-event-debug'
                Built by: root
                Built on: Thu 27 Jul 15:12:22 UTC 2023
              Built host: Workstation-001
              C bindings: yes
            C++ bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
                          limitations in the gfortran compiler and/or Open
                          MPI, does not support the following: array
                          subsections, direct passthru (where possible) to
                          underlying Open MPI's C functionality
  Fort mpi_f08 subarrays: no
           Java bindings: no
  Wrapper compiler rpath: runpath
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
  C compiler family name: GNU
      C compiler version: 11.3.0
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
           Fort compiler: gfortran
       Fort compiler abs: /usr/bin/gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
          Fort PROTECTED: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
         Fort USE...ONLY: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
           C++ profiling: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          C++ exceptions: yes
          Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
                          OMPI progress: no, ORTE progress: yes, Event lib:
                          yes)
           Sparse Groups: no
  Internal debug support: yes
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: yes
Memory debugging support: yes
              dl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
       MPI_WTIME support: native
     Symbol vis. support: yes
   Host topology support: yes
            IPv6 support: yes
      MPI1 compatibility: yes
          MPI extensions: affinity, cuda, pcollreq
   FT Checkpoint support: no (checkpoint thread: no)
   C/R Enabled Debugging: no
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
           MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v4.1.5)
           MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v4.1.5)
           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA btl: self (MCA v2.1.0, API v3.1.0, Component v4.1.5)
                 MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.1.5)
                 MCA btl: tcp (MCA v2.1.0, API v3.1.0, Component v4.1.5)
            MCA compress: bzip (MCA v2.1.0, API v2.0.0, Component v4.1.5)
            MCA compress: gzip (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA crs: none (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                  MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA event: libevent2022 (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA hwloc: hwloc201 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                  MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
                  MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
         MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v4.1.5)
         MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v4.1.5)
              MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v4.1.5)
             MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
                          v4.1.5)
                MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA pmix: pmix3x (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v4.1.5)
              MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v4.1.5)
           MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v4.1.5)
              MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component
                          v4.1.5)
              MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component
                          v4.1.5)
              MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component
                          v4.1.5)
              MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component
                          v4.1.5)
                 MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component
                          v4.1.5)
                 MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA ess: env (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v4.1.5)
               MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v4.1.5)
             MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA odls: default (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA odls: pspawn (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
                 MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA regx: naive (MCA v2.1.0, API v1.0.0, Component v4.1.5)
                MCA regx: fwd (MCA v2.1.0, API v1.0.0, Component v4.1.5)
                MCA regx: reverse (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA rml: oob (MCA v2.1.0, API v3.0.0, Component v4.1.5)
              MCA routed: radix (MCA v2.1.0, API v3.0.0, Component v4.1.5)
              MCA routed: binomial (MCA v2.1.0, API v3.0.0, Component v4.1.5)
              MCA routed: direct (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v4.1.5)
              MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v4.1.5)
              MCA schizo: orte (MCA v2.1.0, API v1.0.0, Component v4.1.5)
              MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v4.1.5)
              MCA schizo: jsm (MCA v2.1.0, API v1.0.0, Component v4.1.5)
              MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA state: tool (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA state: orted (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA state: app (MCA v2.1.0, API v1.0.0, Component v4.1.5)
               MCA state: novm (MCA v2.1.0, API v1.0.0, Component v4.1.5)
                 MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: self (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: adapt (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: monitoring (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
                MCA coll: han (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
               MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v4.1.5)
               MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                  MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                  MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                  MCA io: romio321 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                  MCA op: avx (MCA v2.1.0, API v1.0.0, Component v4.1.5)
                 MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
                          v4.1.5)
                 MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v4.1.5)
                 MCA pml: v (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
                 MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                 MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v4.1.5)
            MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
            MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)
            MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
                MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
                          v4.1.5)
                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v4.1.5)
           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
                          v4.1.5)

UPDATE #1 Compiler Flags

FC := mpifort
FFLAGS := -cpp -freal-4-real-8 -fdefault-real-8 -fdefault-double-8 -ffloat-store -Wpedantic -Wall -Wextra -I$(HDRDIR) -I/usr/local/hdf5/include -Jbin/mod

# C compiler options. Check C compiler flags.
CC := mpicc
CFLAGS := -g -I$(HDRDIR) -I/usr/local/hdf5/include

# Libraries
LIBS := -lz -lm -lc -L/usr/local/hdf5/lib/ -lhdf5
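
For hunting memory corruption, a stricter debug build may help; the variant below is my suggestion, not part of the PARAMESH build system (-fcheck=all traps out-of-bounds accesses at the point of corruption, -g and -fbacktrace produce usable stack traces when the SIGSEGV/SIGABRT fires):

# Hypothetical debug variant of the flags above (not from the project).
FFLAGS_DEBUG := $(FFLAGS) -g -O0 -fbacktrace -fcheck=all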

UPDATE #2

REAL TYPE

PARAMESH can handle 4-byte and 8-byte REAL types, selected by setting #define REAL4 or #define REAL8 in a header file. Depending on that choice, a global integer variable amr_mpi_real is defined (see below) and is accessible to all processes. This variable is used in all MPI calls in which a REAL datatype appears.

#ifdef REAL8
      amr_mpi_real = MPI_DOUBLE_PRECISION
#else
      amr_mpi_real = MPI_REAL
#endif

However, type declaration statements in PARAMESH do not specify the KIND attribute of the declared variables. Instead, the PARAMESH manual states that the compiler flags must account for the chosen REAL type; in my case -fdefault-real-8 -fdefault-double-8, since I set #define REAL8.
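
Since a silent mismatch between amr_mpi_real and the promoted default REAL would corrupt every transfer, a small runtime check along the following lines (my sketch, not PARAMESH code) can confirm that the two agree:

! Sketch only: abort early if the MPI datatype and the default REAL
! promoted by the compiler flags do not have the same byte size.
subroutine check_real_kind(amr_mpi_real)
  use mpi
  implicit none
  integer, intent(in) :: amr_mpi_real
  integer :: nbytes_mpi, ierr
  real :: dummy
  call MPI_Type_size(amr_mpi_real, nbytes_mpi, ierr)
  if (nbytes_mpi /= storage_size(dummy) / 8) then
    print *, 'REAL mismatch: MPI type is', nbytes_mpi, &
             ' bytes, default REAL is', storage_size(dummy) / 8, ' bytes'
    call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if
end subroutine check_real_kind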

DATA REDISTRIBUTION

Regarding data redistribution, PARAMESH uses MPI_Type_vector, MPI_Type_create_hvector and MPI_IRECV calls, for which the chosen REAL type also matters. Depending on the type specified in the header file (REAL4 or REAL8), nbytes takes on a different value, which is then used to specify the byte stride of the vector types.

I put together a snippet of the subroutine responsible for data redistribution; the data object to be redistributed is the array real, dimension (:, :, :, :, :), allocatable :: UNK. After allocation, the size of UNK is constant across all processes at (nvar, nxb, nyb, nzb, maxblocks).

Do you see any possible pitfalls with the code displayed below?

#ifdef REAL8
      nbytes = 8
#else
      nbytes = 4
#endif
.
.
.
allocate (unk_test(nvar, nxb, nyb, nzb))
do i = 1, 4
  udim_tot(i) = size (unk, dim = i)
  udim(i) = size (unk_test, dim = i)
end do
deallocate (unk_test)

call MPI_TYPE_VECTOR ( &
  & udim(2), &
  & udim(1), &
  & udim_tot(1), &
  & amr_mpi_real, &
  & type1, &
  & ierr &
  & )

call MPI_Type_create_hvector ( &
  & udim(3), &
  & 1, &
  & int (udim_tot(1) * udim_tot(2) * nbytes, MPI_ADDRESS_KIND), &
  & type1, &
  & type2, &
  & ierr &
  & )

call MPI_Type_create_hvector ( &
  & udim(4), &
  & 1, &
  & int (udim_tot(1) * udim_tot(2) * udim_tot(3) * nbytes, MPI_ADDRESS_KIND), &
  & type2, &
  & type3, &
  & ierr &
  & )

unk_int_type = type3

call MPI_TYPE_COMMIT (unk_int_type, ierr)
.
.
.
do lb = 1, new_lnblocks
  if (.Not. newchild(lb)) Then
    if (old_loc(2, lb) /= mype) Then
      nrecv = nrecv + 1
      call MPI_IRECV ( &
        & unk(1, is_unk, js_unk, ks_unk, lb), &
        & 1, &
        & unk_int_type, &
        & old_loc(2, lb), &
        & lb, &
        & MPI_COMM_WORLD, &
        & reqr(nrecv), &
        & ierr &
        & )

    end if
  end if
end do
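
One way to make the committed type's footprint visible (a diagnostic sketch of mine, not PARAMESH code) is to query its size and extent right after the commit; the extent must fit inside unk starting at the receive offset unk(1, is_unk, js_unk, ks_unk, lb), and both values stand or fall with nbytes matching the true byte size of the promoted REAL:

! Sketch only: print what the derived type actually spans in memory.
integer :: type_size
integer(kind=MPI_ADDRESS_KIND) :: lb_type, extent
call MPI_Type_size(unk_int_type, type_size, ierr)
call MPI_Type_get_extent(unk_int_type, lb_type, extent, ierr)
print *, 'unk_int_type size:', type_size, ' extent:', extent, ' bytes'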
  • There really isn't enough here to say anything for certain - it could be any number of things, and without code displaying the above symptoms it will be extremely difficult for us to say much. But I will say I doubt it is the MPI implementation, my guess would be a bug or bugs in the code to do with describing how the data is distributed amongst the processes. But I stress that is a guess. Commented Feb 7, 2024 at 16:30
  • @IanBush Thanks for your reply! The source code is thousands of lines long and distributed across multiple subroutines, but I think I have accurately pinpointed the problem to two locations. In both cases, the source code itself is not the problem: if (allocated(x)) deallocate (x) can't produce an error, as the condition for successful deallocation is checked; and the array commatrix_send is missing completely at some point, although it is only deallocated at the very end of the program. This is why I presume it is a problem with Open MPI on my machine. Commented Feb 7, 2024 at 16:46
  • Make sure the library used by the app matches the mpirun command. Commented Feb 7, 2024 at 16:46
  • @GillesGouaillardet I assume you are referring to the compiler and the command used to run the application. I only have OpenMPI with GCC on my machine; I run the app using mpirun. Compiler flags have been included in the post. Maybe I have missed something there. Commented Feb 7, 2024 at 16:59
  • "if (allocated(x)) deallocate (x) can't produce an error". I wish I had your faith. If you have written to a memory location you shouldn't have anything can happen. You are compiling wih all debugging flags turned on? Especially are you using -fcheck=all? Commented Feb 7, 2024 at 17:01
