0 votes
by (120 points)
Hello, I need help with running mpiexec across multiple machines. I have defined a set of machines in a host file and set up the network (firewall, port redirection, ...) successfully. I can start openCARP on a single machine (remote or local) with multiple processes, with the count given either by the -n argument or by the host file, but as soon as I try to use multiple machines I get an error. As long as the process count given by -np fits on the first machine it works, but as soon as the process count reaches machine 2 it fails.
Maybe someone has a hint or a solution. (WSL/Ubuntu 24.04 LTS, openCARP 15)
Thanks in advance.

Here's my call:

mpiexec -f ~/mpi_hosts -np 2 -bootstrap ssh -bootstrap-exec "$HOME/mpi_ssh_bootstrap.sh" openCARP +F <...>

My mpi_hosts file looks like this (I tried different process counts):

<ipAddr1>:1
<ipAddr2>:1
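
The placement implied by such a host file can be sketched in a few lines of Python. This is only a rough model, assuming the launcher fills each host up to its `host:slots` count in file order and wraps around once all slots are taken; `parse_hosts` and `place_ranks` are hypothetical helper names, and `node1`/`node2` stand in for the elided addresses.

```python
def parse_hosts(text):
    """Parse 'host:slots' lines; a line without ':slots' is assumed to mean 1 slot."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, sep, slots = line.partition(":")
        hosts.append((name, int(slots) if sep else 1))
    return hosts


def place_ranks(hosts, nprocs):
    """Assign ranks to hosts in file order, wrapping around (assumed policy)."""
    slots = [name for name, count in hosts for _ in range(count)]
    if not slots:
        raise ValueError("host file defines no slots")
    return [slots[rank % len(slots)] for rank in range(nprocs)]


if __name__ == "__main__":
    # Placeholder names; the real addresses are elided in the post.
    print(place_ranks(parse_hosts("node1:1\nnode2:1\n"), 2))
    print(place_ranks(parse_hosts("node1:4\nnode2:4\n"), 5))
```

Under this model, `-np 2` with one slot per host already places rank 1 on the second machine, while a larger slot count on the first host keeps small jobs entirely local.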

And the error message:

[0]PETSC ERROR: --------------------- Error Message -----------------------------------------------------
[0]PETSC ERROR: General MPI error
[0]PETSC ERROR: MPI error 1 Invalid buffer pointer Ignore the following value 22
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.17.1, Apr 28, 2022
[0]PETSC ERROR: openCARP on a  named <...> by <...>
[0]PETSC ERROR: Configure options PETSC_ARCH=docker-opt --prefix=/usr/local/lib/opencarp/lib/petsc --download-mpich --download-fblaslapack --download-metis --download-parmetis --download-hypre --with-debugging=0 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2
[0]PETSC ERROR: #1 PetscOptionsGetenv() at /tmp/petsc-3.17.1/src/sys/utils/pdisplay.c:62
[0]PETSC ERROR: #2 PetscStrreplace() at /tmp/petsc-3.17.1/src/sys/utils/str.c:1119
[0]PETSC ERROR: #3 PetscOptionsFilename() at /tmp/petsc-3.17.1/src/sys/objects/options.c:396
[0]PETSC ERROR: #4 PetscOptionsInsertFile() at /tmp/petsc-3.17.1/src/sys/objects/options.c:621
[0]PETSC ERROR: #5 PetscOptionsInsert() at /tmp/petsc-3.17.1/src/sys/objects/options.c:839
[0]PETSC ERROR: #6 PetscInitialize_Common() at /tmp/petsc-3.17.1/src/sys/objects/pinit.c:933
[0]PETSC ERROR: #7 PetscInitialize() at /tmp/petsc-3.17.1/src/sys/objects/pinit.c:1224
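
For reading a trace like this, it can help to print the frames as a call chain. The sketch below is a small Python helper written for this post, not part of PETSc or openCARP; `parse_frames` and `call_chain` are hypothetical names, and the regular expression only covers the frame format shown above. It relies on frame #1 being the innermost call, where the error was raised.

```python
import re

# Matches lines such as:
# [0]PETSC ERROR: #1 PetscOptionsGetenv() at /tmp/.../pdisplay.c:62
FRAME = re.compile(r"^\[(\d+)\]PETSC ERROR: #(\d+) (\w+)\(\) at (\S+):(\d+)$")


def parse_frames(log):
    """Return (rank, depth, function, path, line) tuples for each stack frame."""
    frames = []
    for line in log.splitlines():
        match = FRAME.match(line.strip())
        if match:
            rank, depth, func, path, lineno = match.groups()
            frames.append((int(rank), int(depth), func, path, int(lineno)))
    return frames


def call_chain(frames, rank=0):
    """Join the function names of one rank from outermost to innermost call."""
    ordered = sorted((f for f in frames if f[0] == rank),
                     key=lambda f: f[1], reverse=True)
    return " -> ".join(f[2] for f in ordered)


if __name__ == "__main__":
    sample = """\
[0]PETSC ERROR: #1 PetscOptionsGetenv() at /tmp/petsc-3.17.1/src/sys/utils/pdisplay.c:62
[0]PETSC ERROR: #2 PetscStrreplace() at /tmp/petsc-3.17.1/src/sys/utils/str.c:1119
[0]PETSC ERROR: #7 PetscInitialize() at /tmp/petsc-3.17.1/src/sys/objects/pinit.c:1224
"""
    print(call_chain(parse_frames(sample)))
```

Read this way, the trace above starts in PetscInitialize() and ends in PetscOptionsGetenv(), so the error is raised while PETSc handles its options at startup.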

1 Answer

0 votes
by (19.1k points)

Can you run other MPI programs across several of those machines?

One thing that caught my attention is your PETSc configure option "PETSC_ARCH=docker-opt" which seems to target Docker. Are you trying to run on multiple machines from within Docker?

by (120 points)
Thanks for your answer, Axel, and sorry for the late response.
I tested the mpiexec command with the same setup and a sample program, and everything works as expected. I don't do anything explicitly with Docker on any of the machines used. The message comes from somewhere deep inside.

mpirun -n 7 -f ~/mpi_hosts -bootstrap ssh -bootstrap-exec "$HOME/mpi_ssh_bootstrap.sh" ./mpich_hello_world
Warning: Permanently added '[ip_addr1]:2222' (ED25519) to the list of known hosts.
Warning: Permanently added '[ip_addr2]:2222' (ED25519) to the list of known hosts.
Hello world from processor host_name1, rank 0 out of 1 processors
Hello world from processor host_name1, rank 0 out of 1 processors
Hello world from processor host_name1, rank 0 out of 1 processors
Hello world from processor host_name1, rank 0 out of 1 processors
Hello world from processor host_name2, rank 0 out of 1 processors
Hello world from processor host_name2, rank 0 out of 1 processors
Hello world from processor host_name2, rank 0 out of 1 processors

IP addresses and host names are replaced by aliases.
The mpi_hosts file for this example was set up accordingly.
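
A hello-world run like this shows that processes start on both hosts, but the process count alone does not show that the ranks share one communicator. If the lines above are reproduced as printed, every process reports rank 0 out of 1, whereas a job in which all seven processes share one MPI_COMM_WORLD would be expected to print ranks 0 to 6 out of 7. One common reason for such output is a launcher that belongs to a different MPI installation than the one the program was linked against, though that is only a guess here. The check can be sketched in Python; `summarize` is a hypothetical helper and the pattern only matches the message format shown above.

```python
import re

# Matches the "rank R out of N processors" part of each hello-world line.
LINE = re.compile(r"rank (\d+) out of (\d+) processors")


def summarize(output, expected):
    """Report how many processes answered and whether they share one world."""
    pairs = [(int(rank), int(size)) for rank, size in LINE.findall(output)]
    sizes = {size for _, size in pairs}
    ranks = sorted(rank for rank, _ in pairs)
    return {
        "processes": len(pairs),
        "sizes": sorted(sizes),
        # True only if every process sees `expected` peers and ranks are 0..expected-1.
        "shared_world": sizes == {expected} and ranks == list(range(expected)),
    }


if __name__ == "__main__":
    # Seven lines shaped like the output in the post (host names are aliases).
    sample = "\n".join(
        ["Hello world from processor host_name1, rank 0 out of 1 processors"] * 4
        + ["Hello world from processor host_name2, rank 0 out of 1 processors"] * 3
    )
    print(summarize(sample, expected=7))
```

If `shared_world` comes out false for a simple test program, comparing which `mpiexec` is first on the PATH with the MPI library the program was built against would be a reasonable next step.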
by (19.1k points)
How did you install openCARP on this machine?

Building with Spack? Building from scratch?
by (120 points)
I downloaded and installed the deb package
by (120 points)
It seems to be an issue with the WSL/Ubuntu setup on my machines; in a pure Linux environment it works. Anyway, any help is appreciated. Thanks for your time.