[Opm] OPM Flow multi-node simulations stuck at domain decomposition step

Yogi Pandey yogi.pandey at oracle.com
Tue Mar 10 21:15:19 UTC 2020



I am trying to run OPM Flow simulations on multiple nodes. I have built OPM Flow from source on Oracle Linux 7 OS (binary compatible with RHEL) with:

- GCC-8.3.1
- openmpi-4.0.2 (built from source)
- boost-1.72.0 (built from source)
- cmake-3.16.4 (built from source)
- parmetis-4.0.3 (built from source)
- dune-2.6.0: dune-common, dune-geometry, dune-grid, dune-istl (built from source)
- Zoltan-3.83 (built from source)
- OPM Flow modules are built using the following commands:
    - sudo make


For the Norne data set, the content of the input file (params) is as follows:








The simulation is being run on 4 nodes with 32 processors each, using the following command:

mpirun --display-map -mca btl self -x UCX_TLS=rc,self,sm -x HCOLL_ENABLE_MCAST_ALL=0 -mca coll_hcoll_enable 0 -x UCX_IB_TRAFFIC_CLASS=105 -x UCX_IB_GID_INDEX=3 --cpu-set 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35 -np 144 --hostfile /etc/opt/rdma/hostfile /mnt/nfs-share/etc/opm-flow/opm-simulators/build/bin/flow --parameter-file=/mnt/nfs-share/data/norne/params


The simulation gets stuck indefinitely at the domain decomposition step. I am able to complete a parallel run on up to 3 nodes, but it always gets stuck on 4 nodes.


I have also created some customized simulation decks with about 11 million cells, to rule out the possibility that the small number of cells in the Norne model is the cause. With those decks, however, the simulation gets stuck as soon as I scale from 1 node to 2 nodes. Can someone please help me understand what might be causing this?


Thank you,


