[Opm] ParMETIS error on HPC
Atgeirr Rasmussen
Atgeirr.Rasmussen at sintef.no
Tue Oct 20 09:43:14 UTC 2020
Hi Antoine!
Our partitioning scheme starts with the whole graph on a single process, so indeed this would be a "bad" starting partition. The partitioning we end up with does not seem any worse for it, though for very large process counts this serial start could become a bottleneck.
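To make the message itself concrete: ParMETIS emits that "Poor initial vertex distribution" complaint when some ranks own no vertices in the graph distribution it is handed. Below is a minimal, self-contained illustration (not OPM code, just the plain ParMETIS C API called from C++) of a "whole graph on rank 0" input that triggers exactly that check when run on more than one process:

// Minimal illustration (not OPM code): what a "whole graph on one rank"
// input looks like to ParMETIS. Every rank except 0 gets an empty vertex
// range in vtxdist, which is exactly what the "Poor initial vertex
// distribution" check complains about.
#include <mpi.h>
#include <parmetis.h>
#include <vector>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    const idx_t n = 4; // tiny 4-vertex chain graph, stored entirely on rank 0

    // vtxdist[p] .. vtxdist[p+1] is the global vertex range owned by rank p.
    // Here it is {0, n, n, ..., n}: rank 0 owns everything, the rest nothing.
    std::vector<idx_t> vtxdist(size + 1, n);
    vtxdist[0] = 0;

    // CSR adjacency of the chain 0-1-2-3; only rank 0 has local vertices.
    std::vector<idx_t> xadj{0}, adjncy{0};
    if (rank == 0) {
        xadj   = {0, 1, 3, 5, 6};
        adjncy = {1, 0, 2, 1, 3, 2};
    }

    idx_t wgtflag = 0, numflag = 0, ncon = 1, nparts = size, edgecut = 0;
    std::vector<real_t> tpwgts(ncon * nparts, real_t(1) / nparts);
    std::vector<real_t> ubvec(ncon, real_t(1.05));
    idx_t options[3] = {0, 0, 0};
    std::vector<idx_t> part(rank == 0 ? n : 1);

    // Ranks with an empty vertex range trigger the
    // "PARMETIS ERROR: Poor initial vertex distribution. ..." message.
    ParMETIS_V3_PartKway(vtxdist.data(), xadj.data(), adjncy.data(),
                         nullptr, nullptr, &wgtflag, &numflag, &ncon,
                         &nparts, tpwgts.data(), ubvec.data(), options,
                         &edgecut, part.data(), &comm);

    MPI_Finalize();
    return 0;
}

So the message by itself only tells you that whoever called ParMETIS handed it a distribution in which some ranks were empty; it refers to the input distribution of that graph, not to the load-balancing table printed by Flow.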
I am a little confused, though, as OPM Flow uses Zoltan for partitioning, not ParMETIS; this is because ParMETIS is not open source. However, if you do have access to ParMETIS, I believe you can configure the dune-istl parallel linear solvers (which are in turn used by OPM Flow) to use ParMETIS (or the PTScotch workalike library) for redistribution of coarse systems within the algebraic multigrid (AMG) solver. That is not the default linear solver for OPM Flow, though, so I am a bit at a loss as to where those ParMETIS messages come from! Did you run with the default linear solver or not? I assume that the simulation actually runs?
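For what it is worth, dune-istl only calls into ParMETIS when it was detected at configure time; its repartitioning code is guarded by the HAVE_PARMETIS preprocessor macro. Here is a tiny sketch of that guard pattern (my own illustration, not the actual dune-istl source), which you could also compile inside your Dune module to see what your generated config.h says:

// Hedged sketch: compile inside a Dune module so that config.h is the one
// generated at configure time. dune-istl guards its ParMETIS calls with
// this same HAVE_PARMETIS macro.
#include <config.h>
#include <iostream>

int main()
{
#if HAVE_PARMETIS
    std::cout << "This build was configured with ParMETIS support.\n";
#else
    std::cout << "No ParMETIS support was found at configure time.\n";
#endif
    return 0;
}

One caveat from memory: in the Dune build system HAVE_PARMETIS may only expand to true for targets that explicitly get the ParMETIS flags added in CMake, so treat a negative result from such a check with some care.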
Atgeirr
________________________________
From: Opm <opm-bounces at opm-project.org> on behalf of Antoine B Jacquey <ajacquey at mit.edu>
Sent: 19 October 2020 14:22
To: opm at opm-project.org <opm at opm-project.org>
Subject: [Opm] ParMETIS error on HPC
Hi OPM community,
I recently compiled OPM Flow on a local cluster in my institute. I linked to the ParMETIS library during configuration to make use of mesh partitioning when using a large number of MPI processes.
When I run a flow simulation, it seems that the mesh is partitioned automatically. Here is part of the output I get for a simulation with 8 MPI processes:
Load balancing distributes 300000 active cells on 8 processes as follows:
rank   owned cells   overlap cells   total cells
--------------------------------------------------
   0         36960            2760         39720
   1         40110            3720         43830
   2         38100            4110         42210
   3         38940            2250         41190
   4         36600            2280         38880
   5         33660            3690         37350
   6         37800            3690         41490
   7         37830            2730         40560
--------------------------------------------------
 sum        300000           25230        325230
The problem occurs when I use a larger number of MPI processes (here, 27 MPI processes). The mesh is again partitioned:
Load balancing distributes 1012500 active cells on 27 processes as follows:
rank   owned cells   overlap cells   total cells
--------------------------------------------------
   0         40230            6390         46620
   1         40185            5175         45360
   2         40635            4050         44685
   3         40230            6255         46485
   4         40905            5850         46755
   5         39825            6030         45855
   6         37035            2610         39645
   7         36945            5625         42570
   8         40680            4185         44865
   9         35835            5460         41295
  10         41250            6765         48015
  11         39825            5310         45135
  12         36855            2655         39510
  13         32850            3690         36540
  14         38790            5400         44190
  15         36540            5625         42165
  16         30105            3105         33210
  17         40320            5400         45720
  18         35685            4185         39870
  19         39465            5490         44955
  20         20160            1800         21960
  21         39915            4860         44775
  22         40050            6165         46215
  23         34020            2475         36495
  24         39645            6345         45990
  25         36990            6570         43560
  26         37530            4005         41535
--------------------------------------------------
 sum       1012500          131475       1143975
But during the first time step calculation, I get the following errors:
Time step 0, stepsize 1 days, at day 0/7, date = 01-Jan-2015
Switching control mode for well INJ from RATE to BHP on rank 20
Switching control mode for well INJ from BHP to RATE on rank 20
PARMETIS ERROR: Poor initial vertex distribution. Processor 2 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 4 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 6 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 8 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 12 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 14 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 16 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 18 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 20 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 0 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 10 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 22 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 26 has no vertices assigned to it!
PARMETIS ERROR: Poor initial vertex distribution. Processor 24 has no vertices assigned to it!
Does anyone know what this error means? Is it caused by a bad mesh partitioning, or is it due to something else?
I would appreciate any advice or tips on how to solve this issue.
Thank you in advance.
Best,
Antoine