Updates to variable-based encoding #1

Merged
18 changes: 9 additions & 9 deletions FORMAT_ADIOS.md
@@ -33,27 +33,27 @@ Output from `bpls -A` for a boolean attribute `pybool` stored in the location of
There is no convention yet for a unique representation of ADIOS2 variables with boolean type.
Thus, implementations should cast the data to and from `unsigned char` instead.

## `stepBased` Encoding of Iterations
## `variableBased` Encoding of Iterations

The `iterationEncoding` mode `stepBased` must be implemented via ADIOS steps.
The `iterationEncoding` mode `variableBased` must be implemented via ADIOS steps.

## Datasets

An openPMD **data set** is represented by a group prefix that contains an ADIOS variable `__data__`.
An openPMD **data set** is represented by an ADIOS `Variable` at the location where it would usually be stored.

**attributes** are defined further below and can also appear at the dataset's **group** prefix level.

## Attributes

openPMD **attributes** are stored as ADIOS `Variables` at the location where they would usually be stored.
openPMD **attributes** are stored as ADIOS `Attributes` at the location where they would usually be stored.

Example for a mesh record `E` with record component `x` and attributes `unitDimension` and `unitSI`:
```
double /data/meshes/E/unitDimension 10*{7}
double /data/meshes/E/x/__data__ 10*{1000}
double /data/meshes/E/x/position 10*{1}
double /data/meshes/E/x/unitSI 10*scalar
double /data/meshes/E/unitDimension attr = {1, 1, -3, -1, 0, 0, 0}
double /data/meshes/E/x {128, 2048, 128}
double /data/meshes/E/x/position attr = {0.5, 0.5, 0.5}
double /data/meshes/E/x/unitSI attr = 1.22627e+13
```

This example uses `stepBased` iteration encoding, but other iteration encodings would work similarly with their respective `basePath` prefix.
This example uses `variableBased` iteration encoding, but other iteration encodings would work similarly with their respective `basePath` prefix.
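
As a rough illustration of the new layout, the sketch below splits a `bpls -A`-style listing into variables and attributes by looking for the `attr` marker. The helper name `split_listing` and the parsing rule are assumptions made for this example; they are not part of ADIOS2 or of any openPMD implementation.

```python
def split_listing(lines):
    """Split bpls-style lines into (variables, attributes) dicts keyed by path.

    Assumption: an attribute line looks like '<type> <path> attr = <value>'
    and a variable line looks like '<type> <path> <shape>'.
    """
    variables, attributes = {}, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split off the type and the path; the remainder is shape or value.
        _dtype, path, rest = line.split(None, 2)
        if rest.startswith("attr"):
            # Keep everything after the '=' as the attribute's printed value.
            attributes[path] = rest.partition("=")[2].strip()
        else:
            variables[path] = rest.strip()
    return variables, attributes


listing = [
    "double /data/meshes/E/unitDimension attr = {1, 1, -3, -1, 0, 0, 0}",
    "double /data/meshes/E/x {128, 2048, 128}",
    "double /data/meshes/E/x/position attr = {0.5, 0.5, 0.5}",
    "double /data/meshes/E/x/unitSI attr = 1.22627e+13",
]
variables, attributes = split_listing(listing)
```

With the four new lines from the example, only `/data/meshes/E/x` ends up in `variables`; the other three paths are attributes.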

24 changes: 16 additions & 8 deletions STANDARD.md
@@ -90,7 +90,7 @@ Each file's *root* group (path `/`) must at least contain the attributes:
- allowed values:
- see *Iterations and Time Series* below
- for `fileBased` and `groupBased`, this is fixed to `/data/%T/`
- for `stepBased` this is fixed to `/data/`
- for `variableBased` this is fixed to `/data/`
- note: all the data that is formatted according to the present
standard (i.e. both the meshes and the particles) is to be
stored within a path of the form given by `basePath` (e.g. in
@@ -214,9 +214,9 @@ Each file's *root* group (path `/`) must further define the attributes:
is another `open/close` call necessary to access other
iterations
- allowed values:
- `fileBased` (multiple files)
- `groupBased` (one file)
- `stepBased` (one file with internal encoding for iterations, if supported by the data format)
- `fileBased` (multiple files; one iteration per file)
- `groupBased` (one file; iterations use groups in that file)
- `variableBased` (one file; if the data format supports storing multiple iterations in the same variables and attributes)

- `iterationFormat`
- type: *(string)*
@@ -233,13 +233,13 @@ Each file's *root* group (path `/`) must further define the attributes:
- `filename_%T.h5` (without file system directories)
- for `groupBased`: (fixed value)
- `/data/%T/` (must be equal to and encoded in the `basePath`)
- for `stepBased`: (fixed value)
- for `variableBased`: (fixed value)
- data-format internal convention
- *slowest varying index* of data
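
The fixed `basePath` values listed in this diff can be summarized in a small lookup. This is a minimal sketch, assuming `%T` expands to the plain iteration number; the names `BASE_PATHS` and `iteration_group` are hypothetical and only serve the illustration.

```python
# Fixed basePath values per iterationEncoding, as listed in the diff.
BASE_PATHS = {
    "fileBased": "/data/%T/",
    "groupBased": "/data/%T/",
    "variableBased": "/data/",
}


def iteration_group(encoding, iteration):
    """Return the group path that holds the data of one iteration."""
    try:
        base_path = BASE_PATHS[encoding]
    except KeyError:
        raise ValueError(f"unknown iterationEncoding: {encoding!r}") from None
    # Assumption: %T is replaced by the undecorated iteration number.
    # variableBased has no %T, so every iteration shares the same group.
    return base_path.replace("%T", str(iteration))
```

For example, `iteration_group("groupBased", 100)` gives `/data/100/`, while every iteration under `variableBased` resolves to `/data/`, which is why that encoding needs a separate way to tell iterations apart.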

### `stepBased` Encoding of Iterations
### `variableBased` Encoding of Iterations

In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the *root* group (path `/`) must contain an additional variable once `stepBased` is chosen for `iterationEncoding`:
In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the iteration base path (default: path `/data`) must contain an additional variable once `variableBased` is chosen for `iterationEncoding`:

- `snapshot`
- type: 1-dimensional array containing N *(int)* elements, where N is the number of updates/steps in the data format
@@ -248,11 +248,19 @@ In order to correlate openPMD iterations with an index of data-format internal u
- advice to implementers: an openPMD iteration might be spread over multiple updates/steps, but not vice versa.
In such a scenario, an individual openPMD record's update/step must appear exactly once per iteration.

Notes:

* In implementations without support for storing multiple versions of datasets/attributes, the variable-based encoding of iterations may still be used for storage of a single iteration.
In that case, the `snapshot` attribute is optional and defaults to zero (0).
* In implementations with support for storing multiple versions of datasets/attributes, the `snapshot` attribute may optionally be used in group-based encoding to associate openPMD iterations with IO steps.
In group-based encoding, there is still only one instance of this attribute globally (`/data/snapshot`).
In consequence, the attribute shall only be written if modifiable attributes are supported by the implementation.
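
One way a reader could interpret `snapshot` is sketched below. It assumes that element `i` names the openPMD iteration stored in update/step `i`, and that a missing `snapshot` means a single step holding iteration 0, following the first note above. The function name `steps_by_iteration` is hypothetical.

```python
def steps_by_iteration(snapshot=None):
    """Group step indices by the openPMD iteration they belong to.

    Assumption: snapshot[i] is the iteration stored in step i. When
    snapshot is absent, a single step holding iteration 0 is assumed.
    """
    if snapshot is None:
        snapshot = [0]
    groups = {}
    for step, iteration in enumerate(snapshot):
        # An iteration may be spread over several steps, so collect them all.
        groups.setdefault(iteration, []).append(step)
    return groups
```

Each step maps to exactly one iteration here, while an iteration may collect several steps, which matches the "but not vice versa" advice to implementers.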


Required Attributes for the `basePath`
--------------------------------------

In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`stepBased`) should have attributes that describe the current time and the last time step.
In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`variableBased`) should have attributes that describe the current time and the last step.

- `time`
- type: *(floatX)*