0 votes
ago by (390 points)
Hi all,

I am running exactly the same simulation (same mesh, parameters, initial state, etc.) in three different environments:

- my PC: older version of openCARP, --np=16

- HPC cluster: latest spack openCARP, --np=64

- HPC cluster: latest spack openCARP, --np=96

All three results look similar at first, but they diverge substantially over time and are very different by the end of the 5 s simulation. Watching vm.igb, the reentrant wave appears to be fastest with np=96; I was able to track the waves across all three simulations for the first couple of ms before they fully diverged. The ECGs I recorded also look different.

Is this a known issue?

Best,

Jakub
ago by (390 points)

openCARP versions:

- my PC:

(I am not sure how to get the version, so here is the git hash from parameters.par)

CARP GIT commit hash: 2271a3cccd7137f1e28c043c10adbd80480f1462

- HPC cluster:

-- linux-ubuntu22.04-zen2 / %c,cxx=gcc@11.4.0 -------------------

jfta4kp opencarp@19.0+carputils~ipo+meshtool build_system=cmake build_type=Release commit=9accae0522f4108774ed8eb1df90b8ef0afd4c8e generator=make

1 Answer

0 votes
ago by (1.8k points)
Hey Jakub,

No, this is not a known issue. The openCARP testing framework tests parallel solutions against serial solutions for most general use cases. If something were inherently wrong with the parallelism, the testing pipeline would very likely flag it. A slight deviation (~1e-6) between the solutions is expected.
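To make "slight deviation" concrete, here is a minimal sketch of how two traces (for example the same node's Vm from two runs) could be compared against that tolerance. It assumes the traces are already loaded as plain lists of floats and that ~1e-6 is meant as an absolute difference; the function names and the sample values are made up for illustration and are not part of openCARP.

```python
def max_abs_diff(reference, candidate):
    """Return the largest absolute pointwise difference between two traces."""
    if len(reference) != len(candidate):
        raise ValueError("traces must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def within_tolerance(reference, candidate, tol=1e-6):
    """True if the two traces agree to within tol everywhere (absolute)."""
    return max_abs_diff(reference, candidate) <= tol


# Hypothetical sample values, not taken from any real simulation.
serial = [-85.0, -84.2, 10.5, 20.1]
parallel = [-85.0, -84.2000004, 10.5000002, 20.1]   # round-off sized noise
drifted = [-85.0, -80.0, 2.0, 25.0]                 # a genuinely different solution

print(within_tolerance(serial, parallel))
print(within_tolerance(serial, drifted))
```

If the difference is orders of magnitude above the tolerance early in the run, something other than round-off is probably involved.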

To help you further, we would need more information about your simulation.

Best,

Tobias
ago by (390 points)
Hi Tobias,

I took the liberty of finding your KIT email address online and have shared a OneDrive folder with you containing the simulation outputs and docs.

Do you need any extra information from me, such as package versions?

Best,

Jakub
ago by (1.8k points)
Got it, thanks! I will take a look.
ago by (1.8k points)
I looked at the files and also ran simulations starting from the state.1000.0.roe file for 1000 ms. The results look the same with np 4 and np 8, apart from the expected floating-point differences.
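As a general illustration of why the process count alone can change the last digits (this is not a description of openCARP's internals): floating-point addition is not associative, so summing the same values in different groupings can give slightly different results. The sketch below mimics a reduction where each "rank" sums its own chunk first; the function name and the chunking are assumptions made for the example.

```python
def sum_in_chunks(chunks):
    """Sum each chunk separately, then add up the partial sums.

    This imitates a parallel reduction: every chunk stands for the values
    owned by one process. Explicit loops are used so the order of the
    additions is exactly what is written here.
    """
    partials = []
    for chunk in chunks:
        partial = 0.0
        for value in chunk:
            partial += value
        partials.append(partial)

    total = 0.0
    for partial in partials:
        total += partial
    return total


values = [0.1, 0.2, 0.3]
one_rank = sum_in_chunks([values])                    # (0.1 + 0.2) + 0.3
two_ranks = sum_in_chunks([values[:1], values[1:]])   # 0.1 + (0.2 + 0.3)

print(one_rank == two_ranks)        # the two groupings do not agree exactly
print(abs(one_rank - two_ranks))    # but the gap is at round-off level
```

Differences of this size are harmless on their own.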

Something I noticed is the following line in your HPC logs for the simulation starting from the tissue_init/state.2.0.roe file:

L3 : Number of nonmatching nodes: 235

The np 64 and np 96 logs show different numbers of nonmatching nodes. I got the same warning when I ran the simulations locally with different numbers.

This raises the question: has your mesh changed in any way since you created the state.2.0.roe file? Nonmatching nodes mean that your imp_region IDs have changed compared to when the original state was saved. Those nodes then get different initial conditions, because they fall back to the default state. Since the number of nonmatching nodes differs between your runs, I suspect this is why your solutions drift apart.
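To check this across several runs, the warning can be pulled out of each log with a few lines of Python. This is only a sketch: it assumes the message always has the same wording as the line quoted above and that the log contents have already been read into strings. The helper name and the second sample log are made up for the example.

```python
import re

# Matches the warning as quoted above, e.g.
# "L3 : Number of nonmatching nodes: 235"
NONMATCHING = re.compile(r"Number of nonmatching nodes:\s*(\d+)")


def nonmatching_counts(log_text):
    """Return every nonmatching-node count reported in a log, in order."""
    return [int(match.group(1)) for match in NONMATCHING.finditer(log_text)]


# First sample reuses the quoted log line; the second is a hypothetical
# log that does not contain the warning at all.
log_with_warning = "L3 : Number of nonmatching nodes: 235\n"
log_without_warning = "some unrelated log line\n"

print(nonmatching_counts(log_with_warning))
print(nonmatching_counts(log_without_warning))
```

A state file that matches the mesh should give an empty list for every run; any run that reports a count did not start from exactly the saved state.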