[Opm] File formats

Tue May 19 05:22:51 UTC 2020

My take on this is:

 1. Yes, I see the value of a transposed file format. However, that value
    is quite limited until the format is supported by post-processing
    tools. The feasibility of implementing and using the format in
    post-processing tools should therefore be an important criterion.
 2. I *think* Petrel / eclrun / Eclipse has some functionality in this
    regard. If that is a file format we can be compatible with, it would
    make a lot of sense to do so.
 3. In addition to HDF5, I would consider looking into Parquet, which is
    at least a much newer format than HDF5.

Here is an extensive file-format comparison: 
https://indico.cern.ch/event/613842/contributions/2585787/attachments/1463230/2260889/pivarski-data-formats.pdf 

On 5/18/20 5:51 PM, Alf Birger Rustad wrote:
> Dear community,
>
> We are at a crossroads with respect to file formats, and I hope you are motivated to help us arrive at the best solution. We need better load-on-demand performance for summary files than what is currently possible with the default Eclipse format for summary files. Currently you will find an implementation in opm-common that simply transposes the summary vectors, while still using the same Fortran77 binary format. That approach has three main drawbacks. One is that it is not supported by any post-processing application (yet).
> The second is that it can only be created from a finished simulation, so you need to wait for simulations to finish before you get the performant result file.

For any traditional column-oriented file format I think you will need to
write out the file in full, i.e. I think this limitation will apply
regardless of format. Using a database format might resolve this, or at
least handle the appending transparently, but that is maybe a bit overkill?
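
To illustrate the trade-off, here is a minimal sketch in Python. It
assumes a toy layout where all values sit in one flat list; it is not the
actual Eclipse or opm-common format, and all function names are
hypothetical.

```python
def read_vector_row_major(data, n_vectors, n_steps, k):
    """Time-step-major layout: vector k needs one strided access per step."""
    return [data[step * n_vectors + k] for step in range(n_steps)]


def read_vector_transposed(data, n_steps, k):
    """Transposed (vector-major) layout: vector k is one contiguous slice."""
    return data[k * n_steps:(k + 1) * n_steps]


def append_step_row_major(data, new_values):
    """Adding a time step is a pure append at the end of the buffer."""
    return data + list(new_values)


def append_step_transposed(data, n_vectors, n_steps, new_values):
    """Adding a time step moves every vector after the first one,
    so the whole buffer has to be rebuilt."""
    out = []
    for k in range(n_vectors):
        out.extend(data[k * n_steps:(k + 1) * n_steps])
        out.append(new_values[k])
    return out
```

In this toy model the transposed layout makes reading a single vector
cheap, but each new time step forces a full rewrite, which is why the
file is easiest to produce once the simulation has finished.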

> The third being that it is not suited for parallel processing, so forget about each process writing out its part.

For the summary files that is not so relevant, because the final 
calculation of summary properties like WWCT = WWPR / (WWPR + WOPR) is 
only done on the IO rank anyway.
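
As a small sketch of that kind of derived quantity (Python, with a
hypothetical function name; returning 0.0 when the well produces no
liquid is my assumption, not necessarily what the simulator does):

```python
def water_cut(wwpr, wopr):
    """Return WWCT = WWPR / (WWPR + WOPR).

    Assumption: report 0.0 when the total liquid rate is zero,
    to avoid dividing by zero for a shut-in well.
    """
    liquid = wwpr + wopr
    if liquid == 0.0:
        return 0.0
    return wwpr / liquid
```

The ratio only needs the two complete rates for the well, so it can be
evaluated in one place once those rates are available.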

Joakim