[Opm] OPM Flow multi-node simulations stuck at domain decomposition step
Markus Blatt
markus at dr-blatt.de
Wed Mar 11 10:03:49 UTC 2020
On Tue, Mar 10, 2020 at 02:15:19PM -0700, Yogi Pandey wrote:
> All,
> I am trying to run OPM Flow simulations on multiple nodes. I have built OPM Flow from source on Oracle Linux 7 OS (binary compatible with RHEL) with:
>
> [...]
>
> . OPM Flow modules are built using the following commands:
>
> o cmake -DCMAKE_BUILD_TYPE=Release -DUSE_MPI=ON -DUSE_OPENMP=ON -DBLAS_LIBRARIES=/usr/lib64 -DCMAKE_INSTALL_PREFIX=/usr/local ..
>
> o sudo make
>
>
>
> For the Norne data set, the following is the input file (params) content:
>
> ecl-deck-file-name=NORNE_ATW2013.DATA
> [...]
>
> The simulation is being run on 4 nodes with 32 processors each using the following command:
>
> mpirun --display-map -mca btl self -x UCX_TLS=rc,self,sm -x HCOLL_ENABLE_MCAST_ALL=0 -mca coll_hcoll_enable 0 -x UCX_IB_TRAFFIC_CLASS=105 -x UCX_IB_GID_INDEX=3 --cpu-set 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35 -np 144 --hostfile /etc/opt/rdma/hostfile /mnt/nfs-share/etc/opm-flow/opm-simulators/build/bin/flow --parameter-file=/mnt/nfs-share/data/norne/params
>
>
>
> The simulation gets stuck indefinitely at the domain decomposition step. I am able to finish a parallel run on up to 3 nodes, but it always gets stuck at 4 nodes.
>
>
>
> I have also created some customized simulation decks with about 11 million cells to rule out the small number of cells in the Norne model as a reason, but the simulation gets stuck as soon as I scale from 1 node to 2 nodes. Can someone please help me understand what might be causing it?
>
>
Which version of OPM are you using? If you are using the release, then chances are that you are simply running out of available memory. You can check that with top or htop on one of the machines: look for the kswapd process popping up.
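
If watching top interactively is inconvenient, the same check can be sketched as a small script that reads /proc/meminfo on a compute node while flow is running. This is only a rough sketch: the helper names swap_used_kb and mem_available_kb are made up for illustration, and it assumes a Linux kernel that reports the MemAvailable field. Growing swap usage together with little available memory would point to the out-of-memory explanation.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OPM): both read /proc/meminfo-style
# text on stdin and print a single value in kB.

# Swap currently in use = SwapTotal - SwapFree.
swap_used_kb() {
    awk '/^SwapTotal:/ {t = $2} /^SwapFree:/ {f = $2} END {print t - f}'
}

# Kernel estimate of memory available without swapping
# (assumes the MemAvailable field is present).
mem_available_kb() {
    awk '/^MemAvailable:/ {print $2}'
}

# Example use on a compute node while the simulation runs:
#   swap_used_kb < /proc/meminfo
#   mem_available_kb < /proc/meminfo
#   ps -eo pid,comm,pcpu | grep kswapd    # is kswapd busy?
```

Running these a few times during the domain decomposition step shows whether swap usage keeps climbing on that node.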
>
--
Dr. Markus Blatt
OPM-OP AS