
Gather scatter #52 (Draft)
pmarguinaud wants to merge 58 commits into develop
Conversation

@pmarguinaud (Contributor)

This is a proposal for improving the current gather/scatter methods.

Some minor modifications have to be applied to Atlas:

  • fields and views of type char
  • atlas::vector dimensioned and indexed by size_t
  • a size method added to distribution

I have assumed that for a distributed field, jloc1 < jloc2 <=> jglo1 < jglo2, where jglo is the global index and jloc the local index (on the current task), but I can do without this assumption.

The proposal, although not finalized yet:

  • processes fields of any type and any rank
  • distributes fields with multiple dimensions (multiple levels or variables) evenly
  • could be extended to handle fields with NPROMA/NGPBLKS dimensions
  • is much faster than the current implementation: on my reference test case (2.5 km global grid, 300 fields), the gather takes 25 s versus 3 minutes on 4 Rome nodes

The proposal includes two new features (these should be implemented as new methods on views):

  • convert any view to a view of byte elements, simply by creating a view with an extra dimension
  • drop a dimension of a view by fixing its index, as in Fortran: if ZFLD is an array of rank 4, it is possible to derive rank-3 arrays (e.g. ZFLD(:,:,4,:)) simply by setting the value of one dimension

These templates are included in arrayViewHelpers.h; they make it possible to recursively build a list of atlas::array::ArrayView<byte,2> views from any view (any type, any rank). The first dimension of the ArrayView<byte,2> is the fastest varying dimension of the original view, and the second dimension has a length equal to the size of the element type.
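To make the byte-view idea concrete, here is a minimal sketch in plain C++ (hypothetical names, not the actual arrayViewHelpers.h code), assuming a contiguous source view: an array of n elements of type T becomes a 2D byte array of shape (n, sizeof(T)).

#include <cstddef>
#include <cstdio>

// Sketch: view a contiguous 1D array of T as a 2D byte array of shape
// (n, sizeof (T)); (i, b) addresses byte b of element i.
template <typename T>
struct ByteView2D
{
  std::byte * data;
  std::size_t n; // number of elements in the original view

  std::byte & operator() (std::size_t i, std::size_t b)
  {
    return data[i * sizeof (T) + b];
  }
};

int main ()
{
  double fld[4] = {1.0, 2.0, 3.0, 4.0};
  ByteView2D<double> bv{reinterpret_cast<std::byte *> (fld), 4};
  // Copy element 0 into element 3 byte by byte, with no knowledge of the
  // element type: this is what lets a gather/scatter handle any type.
  for (std::size_t b = 0; b < sizeof (double); b++)
    bv(3, b) = bv(0, b);
  std::printf ("%g\n", fld[3]); // prints 1
}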

The GatherScatter class handles only atlas::array::ArrayView<byte,2> views, which keeps the code quite simple, as all type and rank issues have already been resolved. It could be even simpler if we could assume that the fastest varying dimension of Atlas fields has a stride of 1 (this should always be the case); we could then restrict ourselves to atlas::array::ArrayView<byte,1> views.

Some further possible improvements:

  • overlap buffer packing/unpacking with communications
  • several dimensions could be parallelized over (the number of MPI tasks, the number of fields, the size of the global and distributed fields); choosing the best dimension for OpenMP depends on the test case; see the sketch below
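As a sketch of the trade-off in the second bullet (hypothetical layout: nfld byte fields of ngptot elements each, packed field-major into a send buffer), the question is which loop, or both via collapse, to hand to OpenMP:

#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical packing loop; parallelizing over jfld, over jloc, or over
// both (collapse) performs differently depending on the test case.
void pack (const std::vector<const std::byte *> & flds, std::size_t ngptot,
           std::size_t elsize, std::byte * buf)
{
  const std::size_t nfld = flds.size ();
  #pragma omp parallel for collapse(2)
  for (std::size_t jfld = 0; jfld < nfld; jfld++)
    for (std::size_t jloc = 0; jloc < ngptot; jloc++)
      std::memcpy (buf + (jfld * ngptot + jloc) * elsize,
                   flds[jfld] + jloc * elsize, elsize);
}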

@pmarguinaud (Contributor, Author) commented May 26, 2020

Hello Willem,

I have looked at the slicer function, which is nice for people working with views of a known rank, but I cannot see how I could use it to reduce a view with N dimensions to a list of views of dimension 1. I have to do that recursively, but I cannot write code such as:

auto slice = ZFLD.slice( Range::all(), Range::all(), 4, Range::all() );

because I need to handle a view of an arbitrary rank, and the code above deals with views of rank 4 only.

Hence I thought I could create a function which removes a dimension from a view, whatever its rank:

auto w = dropDimension (v, 4, 2);
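For illustration, the mechanism can be sketched with a hand-rolled runtime-rank view (this is only an illustration, not the real helper, which works on Atlas views; the argument order is my reading of the call above):

#include <cstddef>
#include <vector>

// A minimal strided view whose rank is a runtime quantity, so a dimension
// can be removed whatever the rank, unlike slice(), whose rank is fixed
// at compile time.
struct DynView
{
  double * data;
  std::vector<std::size_t> shape;
  std::vector<std::ptrdiff_t> strides;
};

// Remove dimension `dim` by pinning it at index `ind`: the result has
// rank N-1, and its origin is shifted by ind * strides[dim].
DynView dropDimension (const DynView & v, std::size_t dim, std::size_t ind)
{
  DynView w;
  w.data = v.data + static_cast<std::ptrdiff_t> (ind) * v.strides[dim];
  for (std::size_t d = 0; d < v.shape.size (); d++)
    if (d != dim)
      {
        w.shape.push_back (v.shape[d]);
        w.strides.push_back (v.strides[d]);
      }
  return w;
}

Applying this repeatedly until rank 1 yields the list of 1D views.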

Please look at dropDimension (arrayViewHelpers.h) and reconsider this point, or tell me how to address my basic need, which is to create a list of views of dimension 1 from a view of any rank.

Philippe

@pmarguinaud (Contributor, Author)

Hello again,

This is the code you wrote:

! Fortran
type(atlas_Field) :: field_GFL_4
field_GFL_4 = atlas_Field( GFL( 1:NPROMA, 1:NLEV, 4, 1:NBLK ) )

In this case, the fastest varying dimension is 1:NPROMA; it is indeed contiguous, with a stride of 1. This is what I would like to assume: that fields involved in I/O have a fastest varying dimension which is contiguous.
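In code, the assumption amounts to something like this (a sketch against a generic view type; the stride accessor mirrors what Atlas views expose, but this is not the literal Atlas API):

// The fastest varying (innermost) dimension must have stride 1;
// for the Fortran field above that dimension is 1:NPROMA.
template <typename ViewT>
bool innermostContiguous (const ViewT & v, int rank)
{
  return v.stride (rank - 1) == 1;
}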

Is it reasonable to make this assumption or not?

Philippe

@pmarguinaud (Contributor, Author) commented May 26, 2020

Hello again,

What you ask for:

FunctionSpace structured_columns = functionspace::StructuredColumns( grid ,
                                        option::levels(nlev));
Field wind = structured_columns.createField<double>("wind",option::variables(2));
    // shape={ ngp(partition), nlev=100, nvar=2} ,  rank=3
Field u_global = structured_columns.createField<double>("u",option::global());
   // shape={ ngp(GLOBAL), nlev=100 }  , rank=2

gather( wind.slice( Range::all(), Range::all(), 0 ).data(), .... , u_global.data(), ... );
    // ugly API that needs to manually provide strides and shapes etc...

This looks very feasible to me with what I propose. In fact, the ioFieldDescriptor lets the user distribute the fields in any possible way: it would be possible to choose a target processor for any of the 1:NGPTOT or 1:NPROMA/1:NGPBLKS fields.
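As a purely hypothetical illustration of that degree of freedom, the descriptor could carry an explicit owner task per global grid point, rather than deriving it from a fixed rule:

#include <cstddef>
#include <vector>

// Hypothetical: one owner task per global grid point; any mapping is
// possible, contiguous chunks being just one choice.
std::vector<int> owners (std::size_t ngptotg, int ntasks)
{
  std::vector<int> o (ngptotg);
  for (std::size_t j = 0; j < ngptotg; j++)
    o[j] = static_cast<int> (j * static_cast<std::size_t> (ntasks) / ngptotg);
  return o;
}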

Philippe

@wdeconinck (Member)

You are completely right about the slicing :) I was considering this as well after I wrote my reply, but still wondered whether it would be useful or whether you were already aware of it. It will be good to have.
Note however that the return type of a slice is LocalView. The semantic difference is that an ArrayView gets created from an Array, which contains either a NativeDataStore or a GridToolsDataStore. The latter is GPU-aware, and the ArrayView can then be either on host or device. A LocalView is more straightforward, created as a slice from an ArrayView, but it has exactly the same API.

This will perhaps mess up the API of gather/scatter, which expects an ArrayView. Perhaps it can be templated to accept both LocalView and ArrayView.
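Something like this could be the shape of it (a signature sketch only, not the actual Atlas API):

// Since LocalView and ArrayView expose the same accessors, the
// gather/scatter entry points could be templated on the view type
// instead of naming ArrayView explicitly.
template <typename LocalViewT, typename GlobalViewT>
void gather (const LocalViewT & local, GlobalViewT & global)
{
  // pack `local` into byte buffers, communicate, unpack into `global`;
  // only shape()/stride()/operator() are needed, and both types provide them.
}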

@pmarguinaud (Contributor, Author)

I have to admit that I did not have this in mind at all:

auto T_global = Field( "T", array::make_datatype<double>(), array::make_shape(nlev, ngp_global));
auto T_local  = Field( "T", array::make_datatype<double>(), array::make_shape(ngp_local, nlev));

scatter(T_global, T_local);
gather(T_local,T_global);

I do not think this happens in ARPEGE/IFS for grid-point fields; we only do that kind of thing for spectral data, and spectral data requires a slightly different treatment.

Achieving something like that is probably possible; I need to think about it. It is clear we would need to specify, for each field, which dimension is the grid index.

In any case, I think the assumption I suggested making (i.e. a contiguous fastest varying dimension) would not survive it.

Philippe

@pmarguinaud (Contributor, Author) commented May 27, 2020

Hello Willem,

I can make this code work:

auto T_global = Field( "T", array::make_datatype<double>(), array::make_shape(nlev, ngp_global));
auto T_local  = Field( "T", array::make_datatype<double>(), array::make_shape(ngp_local, nlev));

scatter(T_global, T_local);
gather(T_local,T_global);

It is sufficient that the user explicitly states which dimension is the grid index; I can then drop all the other dimensions while reducing the view to a list of 1D views. The same goes for the NPROMA/NGPBLKS dimensions; I can easily add a blocking dimension. Note that fields without NPROMA (that is, with a dimension 1:NGPTOT) are just a special case (a single block, with NPROMA=NGPTOT).
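A sketch of the index mapping (hypothetical names): a flat grid index jloc in 0:NGPTOT-1 maps to a (jlon, jblk) pair of an NPROMA x NGPBLKS layout, and the unblocked case falls out as a single block.

#include <cstddef>

// Hypothetical blocked-layout mapping.
struct Blocking
{
  std::size_t nproma;
  std::size_t jlon (std::size_t jloc) const { return jloc % nproma; }
  std::size_t jblk (std::size_t jloc) const { return jloc / nproma; }
};

// A field without NPROMA (dimension 1:NGPTOT) is the special case
// Blocking{ngptot}: a single block with NPROMA = NGPTOT.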

To be more explicit, an ioFieldDesc object is essentially this:

class ioFieldDesc
{
  atlas::array::ArrayView<byte,2> _v;    // view of a grid section: (1:NGPTOT) x (1:sizeof (double) or sizeof (long), etc.)
  const std::vector<atlas::idx_t> _ind;  // values of the dimensions we had to drop to build this view
  const atlas::Field & _f;               // reference to the original field (access to its metadata)
};

ioFieldDesc objects therefore hold a chunk of field data and associated metadata (for fields with a level, the _ind member would hold the level index).

I do not know a lot about the difference between ArrayView and LocalView, but I am pretty sure we can sort out the problem using templates.

I have also made this assumption:

I have assumed that for a distributed field, jloc1 < jloc2 <=> jglo1 < jglo2, where jglo is the global index and jloc the local index (on the current task), but I can do without this assumption.

This simplifies things (I do not need a functionspace anymore, but only the distribution object). I would like you to comment on that.

You also raise the issue of metadata. Metadata are small and of unpredictable size; it would therefore be better not to pack them with the field data, but to distribute them using an allgather instead. The situation would be different for an I/O server, but we can look at that later.

Please also keep in mind that we are breaking the MPI standard (packing double/long data as bytes), but this allows much more flexibility.
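Concretely, the pattern in question looks like this (sketch):

#include <mpi.h>

// Typed (double) data sent as raw bytes: MPI can no longer check or
// convert the representation, and the receiver must know the actual
// type out of band.
void send_as_bytes (const double * fld, int n, int dest, MPI_Comm comm)
{
  MPI_Send (fld, n * static_cast<int> (sizeof (double)), MPI_BYTE, dest,
            /* tag = */ 0, comm);
}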

@wdeconinck (Member)

  • Very nice that the transposition could work! A use case would be to distribute horizontal levels obtained e.g. from GRIB into a different (transposed or NPROMA-blocked) configuration.

  • The assumption that jloc1 < jloc2 <=> jglo1 < jglo2 is not guaranteed. Especially as I have added mesh renumbering algorithms (Hilbert space filling curve).

  • It would be nice if we did not need to pass a functionspace. Currently one GatherScatter object exists per functionspace. Could we not pass the extra information contained within the functionspace to the GatherScatter setup? We can assume that all fields will share the same functionspace.

  • Why is casting to MPI_CHAR breaking the standard? We are at least not doing any reductions, only message passing.

@pmarguinaud (Contributor, Author)

I will provide some tests/examples with NPROMA/NGPBLKS and NGPTOT/NFLEVG. It will take a few days.

@wdeconinck (Member)

> I will provide some tests/examples with NPROMA/NGPBLKS and NGPTOT/NFLEVG. It will take a few days.

Thank you Philippe, having this functionality will be really great.

@pmarguinaud (Contributor, Author)

Hello Willem,

I have added two new test cases (I use Fortran order to describe them):

  • (1:NFLEVG,1:NGPTOT,1:NFLDS) <-> list of (1:NGPTOTG) fields

  • (1:NPROMA,1:NFLEVG,1:NGPBLKS) <-> list of (1:NGPTOTG) fields

Both of them are included in test_gatherscatter.cc. Please look at them and see whether you find them sensible.

There are other points we should sort out:

  • the LocalView/ArrayView distinction; it is not 100% clear to me, but you said I have to do something about it
  • the assumption jloc1 < jloc2 <=> jglo1 < jglo2, and how I can do without it; maybe an example where it does not hold would help me

Regards,

Philippe

@wdeconinck wdeconinck added Status: In Progress This is currently being worked on Type: Enhancement New feature or request labels Jun 12, 2020
wdeconinck added a commit to wdeconinck/atlas that referenced this pull request Jan 8, 2021
…cal-64 to develop

* commit 'b26965ddfd3c31feecb07578de61ae20a2742d2d':
  ATLAS-317 Prevent integer overflows if ATLAS_BITS_LOCAL=64
  ATLAS-317 Support ATLAS_BITS_LOCAL=64 in tests