[Opm] OPM Flow multi-node simulations stuck at domain decomposition step

Atgeirr Rasmussen Atgeirr.Rasmussen at sintef.no
Wed Mar 11 11:45:41 UTC 2020


Hi Yogi,

The parallel initialization in Flow has changed a lot recently. Could you check out the previous release (2019.10) and see whether the same problem occurs there?

Atgeirr
________________________________
From: Opm <opm-bounces at opm-project.org> on behalf of Markus Blatt <markus at dr-blatt.de>
Sent: Wednesday, 11 March 2020 11:08
To: opm at opm-project.org <opm at opm-project.org>
Subject: Re: [Opm] OPM Flow multi-node simulations stuck at domain decomposition step

Hi Yogi,

On Tue, Mar 10, 2020 at 02:15:19PM -0700, Yogi Pandey wrote:
> The simulation is being run on 4 nodes with 32 processors each, using the following command:
>
> mpirun --display-map -mca btl self -x UCX_TLS=rc,self,sm -x HCOLL_ENABLE_MCAST_ALL=0 -mca coll_hcoll_enable 0 -x UCX_IB_TRAFFIC_CLASS=105 -x UCX_IB_GID_INDEX=3 --cpu-set 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35 -np 144 --hostfile /etc/opt/rdma/hostfile /mnt/nfs-share/etc/opm-flow/opm-simulators/build/bin/flow --parameter-file=/mnt/nfs-share/data/norne/params
>

Out of curiosity: is there a special reason why --cpu-set runs up to 35 when there are only 32 CPUs per node? Might you be oversubscribing a node?
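
The arithmetic behind this question can be checked with a small sketch. It assumes that ranks are spread evenly over the nodes and that cores are numbered 0 to 31 on each node (no hyperthreads counted). Neither assumption is confirmed in the thread, and the function name check_rank_placement is made up for illustration.

```python
def check_rank_placement(total_ranks, nodes, cores_per_node, cpu_set):
    """Compare requested MPI ranks and CPU ids against the per-node core count."""
    # Ceiling division: ranks on the fullest node, assuming an even spread.
    ranks_per_node = -(-total_ranks // nodes)
    # CPU ids that would not exist if cores are numbered 0..cores_per_node-1
    # (an assumption; hyperthreading would change this numbering).
    invalid_cpus = [c for c in cpu_set if c >= cores_per_node]
    return {
        "ranks_per_node": ranks_per_node,
        "oversubscribed": ranks_per_node > cores_per_node,
        "invalid_cpus": invalid_cpus,
    }


# Values taken from the quoted command: -np 144, 4 nodes,
# 32 processors each, --cpu-set 0,...,35.
report = check_rank_placement(144, 4, 32, list(range(36)))
print(report)
```

Under these assumptions, -np 144 on 4 nodes means 36 ranks per node, which matches the 36 entries in --cpu-set but exceeds the stated 32 processors per node.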

Markus

--
Dr. Markus Blatt
OPM-OP AS
_______________________________________________
Opm mailing list
Opm at opm-project.org
https://opm-project.org/cgi-bin/mailman/listinfo/opm