[Opm] ParMETIS error on HPC
Antoine B Jacquey
ajacquey at mit.edu
Tue Oct 20 13:05:36 UTC 2020
Hi Atgeirr,
I use AMG as the preconditioner (--use-amg=true). When the ParMETIS errors occur, the simulation crashes.
I indeed linked to the ParMETIS library when configuring DUNE.
Do you actually advise using PTScotch instead of ParMETIS? I could try recompiling DUNE + OPM with PTScotch to see if the simulation runs with that configuration.
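For reference, this is roughly how I would point the DUNE configure step at PTScotch instead of ParMETIS. This is only a sketch: the install prefix /opt/ptscotch is a placeholder for the cluster's actual module path, and the exact CMake variable names (PTScotch_ROOT, and the standard CMAKE_DISABLE_FIND_PACKAGE_<Pkg> switch applied to ParMETIS) should be checked against the find modules shipped in dune-common/dune-istl:

```
# dune.opts -- build options passed to dunecontrol (paths are placeholders)
CMAKE_FLAGS="
  -DCMAKE_BUILD_TYPE=Release
  -DPTScotch_ROOT=/opt/ptscotch
  -DCMAKE_DISABLE_FIND_PACKAGE_ParMETIS=TRUE
"
```

and then rebuild the stack with something like `./dune-common/bin/dunecontrol --opts=dune.opts all`, so that ParMETIS is not picked up at all and the AMG coarse-system redistribution falls back to PTScotch.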
Thanks for your answer.
Antoine
> On Oct 20, 2020, at 05:43, Atgeirr Rasmussen <Atgeirr.Rasmussen at sintef.no> wrote:
>
> Hi Antoine!
>
> Our partitioning scheme starts with the whole graph on a single process, so indeed this would be a "bad" starting partition. The partitioning we end up with does not seem any worse though, although for very large process counts, this could become a bottleneck.
>
> I am a little confused, though: OPM Flow uses Zoltan for partitioning, not ParMETIS, since ParMETIS is not open source. However, if you do have access to ParMETIS, I believe you can configure the dune-istl parallel linear solvers (which OPM Flow uses in turn) to use ParMETIS (or the PTScotch workalike library) to redistribute coarse systems within the algebraic multigrid (AMG) solver. That is not the default linear solver for OPM Flow, though, so I am at a loss as to where those ParMETIS messages come from. Did you run with the default linear solver or not? I assume that the simulation actually runs?
>
> Atgeirr
> ________________________________
> From: Opm <opm-bounces at opm-project.org> on behalf of Antoine B Jacquey <ajacquey at mit.edu>
> Sent: 19 October 2020 14:22
> To: opm at opm-project.org <opm at opm-project.org>
> Subject: [Opm] ParMETIS error on HPC
>
> Hi OPM community,
>
> I recently compiled OPM Flow on a local cluster at my institute. I linked against the ParMETIS library during configuration to make use of mesh partitioning when running with a large number of MPI processes.
> When I run a flow simulation, it seems that the mesh is partitioned automatically. Here is part of the output I get for a simulation with 8 MPI processes:
>
> Load balancing distributes 300000 active cells on 8 processes as follows:
>   rank   owned cells   overlap cells   total cells
>   --------------------------------------------------
>      0         36960            2760         39720
>      1         40110            3720         43830
>      2         38100            4110         42210
>      3         38940            2250         41190
>      4         36600            2280         38880
>      5         33660            3690         37350
>      6         37800            3690         41490
>      7         37830            2730         40560
>   --------------------------------------------------
>    sum        300000           25230        325230
>
> The problem occurs when I use a larger number of MPI processes (here 27). The mesh is still partitioned:
>
> Load balancing distributes 1012500 active cells on 27 processes as follows:
>   rank   owned cells   overlap cells   total cells
>   --------------------------------------------------
>      0         40230            6390         46620
>      1         40185            5175         45360
>      2         40635            4050         44685
>      3         40230            6255         46485
>      4         40905            5850         46755
>      5         39825            6030         45855
>      6         37035            2610         39645
>      7         36945            5625         42570
>      8         40680            4185         44865
>      9         35835            5460         41295
>     10         41250            6765         48015
>     11         39825            5310         45135
>     12         36855            2655         39510
>     13         32850            3690         36540
>     14         38790            5400         44190
>     15         36540            5625         42165
>     16         30105            3105         33210
>     17         40320            5400         45720
>     18         35685            4185         39870
>     19         39465            5490         44955
>     20         20160            1800         21960
>     21         39915            4860         44775
>     22         40050            6165         46215
>     23         34020            2475         36495
>     24         39645            6345         45990
>     25         36990            6570         43560
>     26         37530            4005         41535
>   --------------------------------------------------
>    sum       1012500          131475       1143975
>
> But during the first time step calculation, I get the following errors:
>
> Time step 0, stepsize 1 days, at day 0/7, date = 01-Jan-2015
> Switching control mode for well INJ from RATE to BHP on rank 20
> Switching control mode for well INJ from BHP to RATE on rank 20
> PARMETIS ERROR: Poor initial vertex distribution. Processor 2 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 4 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 6 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 8 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 12 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 14 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 16 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 18 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 20 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 0 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 10 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 22 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 26 has no vertices assigned to it!
> PARMETIS ERROR: Poor initial vertex distribution. Processor 24 has no vertices assigned to it!
>
> Does anyone know what this error means? Is it caused by a bad mesh partitioning, or by something else?
>
> I would appreciate any advice or tips for solving this issue.
> Thank you in advance.
>
> Best,
>
> Antoine
> _______________________________________________
> Opm mailing list
> Opm at opm-project.org
> https://opm-project.org/cgi-bin/mailman/listinfo/opm