Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New data structure for multidimensional data (+some bugfixes) #1286

Merged
merged 55 commits into from
May 29, 2021

Conversation

maxaehle
Copy link
Contributor

@maxaehle maxaehle commented May 11, 2021

Edit: Besides introducing the NdFlattener data structure, this PR applies the following bugfixes:

  • correct retrieval of the wall roughness in a multizone setup, the original intention of this PR
  • the non-MPI substitute for SU2_MPI::Allgatherv and Alltoallv now takes the displacement into account
  • fix compilation with GCC v11.1 fails #1293, adding a missing #include statement

Below is the original PR description.

Proposed Changes

I think that currently the roughness at the closest wall is retrieved in a wrong way for multizone setups. The current structure is like this:

def CGeometry::ComputeWallDistance:
    for each iZone:
         construct an ADT which stores the markers of iZone
         for each jZone:
             call CPhysicalGeometry::SetWallDistance to jZone geometry which \
             reduces the stored wall distances of jZone's points according to iZone's markers in the ADT and\
             sets wall roughnesses of jZone's points according to data from jZone's geometry

The issue I perceive here is that the closest wall in the ADT, identified by the markerID, belongs to iZone. iZone's geometry and config must therefore be an additional argument to CPhysicalGeometry::SetWallDistance. We must not use the wall roughness from jZone's geometry and config associated to the markerID, as it is currently done.

Related Work

?

PR Checklist

Put an X by all that apply. You can fill this out after submitting the PR. If you have any questions, don't hesitate to ask! We want to help. These are a guide for you to know what the reviewers will be looking for in your contribution.

  • [X ] I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags, or simply --warnlevel=2 when using meson).
  • [X ] My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.
  • I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp) , if necessary.

CodeFactor:  "virtual" is redundant since function is already
declared as "override"
@maxaehle
Copy link
Contributor Author

Ok, now CPoint stores information on the closest wall element location in vectors ClosestWall_{Rank,Zone,Marker,Elem}. CPoint::SetWallRoughness performs the lookup, inserting the location into the map (rank,zone,marker)->(roughness) that is provided to CPoint::SetWallRoughness as an argument of a new templated type NdFlattener.

I wrote the new file Common/include/ndflattener.hpp some time ago and I think it is a nice generalization of the

custom type with interface (rank, [zone,] marker)

you're referring to. In short, it allows to

  • serialize pointer-to-pointer-to... arrays into 1D arrays, as many as there are indices. The array corresponding to the last index stores the data and the others store indices.
  • And to MPI_Allgatherv them, adding a new index for the rank.

Using it detaches a bit from the original goal of this PR (which was to fix some illegal reads reported by valgrind, because the wrong array of roughnesses was accessed) but I think it's a useful structure. Probably I should also add some tests.

resolves the warning:
comparison between signed and unsigned integer expression
@maxaehle
Copy link
Contributor Author

I found only one testcase for rough walls, in TestCases/incomp_rans/rough_flatplate/rough_flatplate_incomp.cfg. It's not contained in the regression tests.

Copy link
Member

@pcarruscag pcarruscag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Max, I think this is a nice addition but if you have time, it would be good to make it more obvious how to use it (see below).

Common/src/geometry/CGeometry.cpp Outdated Show resolved Hide resolved
Common/src/geometry/CGeometry.cpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/ndflattener.hpp Outdated Show resolved Hide resolved
Common/src/geometry/dual_grid/CPoint.cpp Outdated Show resolved Hide resolved
@maxaehle maxaehle changed the title Fix retrieval of wallroughness in multizone setup Data structure for communication of list-of-list-... data (+fix retrieval of wallroughness in multizone setup) May 15, 2021
Copy link
Member

@pcarruscag pcarruscag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, another batch of comments, it looks like there is some room for simplification.

Common/src/geometry/CGeometry.cpp Outdated Show resolved Hide resolved
Common/include/fem/fem_geometry_structure.hpp Outdated Show resolved Hide resolved
Common/include/geometry/dual_grid/CPoint.hpp Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Common/include/toolboxes/ndflattener.hpp Outdated Show resolved Hide resolved
Comment on lines 160 to 161
* - To communicate su2double data, just call ND_MPI_Environment().
* - To communicate unsigned long data, call Nd_MPI_Environment(MPI_UNSIGNED_LONG).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

maxaehle and others added 3 commits May 25, 2021 14:12
which turn the NdFlattener type to const (for N>1) or return a
const reference/pointer (for N==1).
Now the ndflattener_tests.cpp should NOT compile.
Copy link
Member

@pcarruscag pcarruscag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UB sanitizer had nothing to say, so I think the cases that changed might just be due to solar flares or something.

let's see what happens in the CI pipeline
Copy link
Contributor

@TobiKattmann TobiKattmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright @maxaehle, thanks for this escalation of a bug fix 💐
I really did not understand much of the implementation details tbh so my question are more of a general nature. That said, the implementation looks fine otherwise and I appreciate the extensive commenting. I forego the approval because I simply feel I didn't understand enough to do so.

You have a Usage section and Implementation details in the head documentation of ndflattener.hpp and I would love to see a When/why to use | What is my purpose section. So as far as I understand the ndflattener can be used to (deep)copy specified data that is behind multidimensional (pointer-to-pointer-... ) access (e.g. rank,zone,marker) into a new datastructure that inherits the same access pattern to the outside and internally is a set of a 1D arrays. The data to be copied might be part a small part of a larger class like geometry so we would also "extract" the data from such (more complicated, and less contiguous) structure while keeping the multidimensional acces logic.

So it is always useful when i just want to use very specific data which is 'hidden' in a larger class/dataset but need to keep the accessing logic. I can extract the data with the ndflattener at a top level and pass just this data I need the a method which needs it. Like that I dont have to pass all kinds of class pointers down.

let me know it any of this makes sense 🐌

su2activevector Wall_Distance; /*!< \brief Distance to the nearest wall. */
su2vector<int> ClosestWall_Rank; /*!< \brief Rank of process holding the closest wall element. */
su2vector<unsigned short> ClosestWall_Zone; /*!< \brief Zone index of closest wall element. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I already have problems to understand the basics: Why is the whole WallRoughness thing not a matter of every zone by themselves? I have no real clue of how wall roughness is intended to work but I imagined that in each (fluid-)zone there is an added viscosity (or sth else, idk) added to nodes in a certain vicinity (also as function of wall distance) to a wall with wall roughness. Now I don't know how other zones would influence that. ClosestWall_Rank (and Marker and Elem), sure ... but ClosestWall_Zone, why?
plz help 🏳️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code cites

Aupoix, B. and Spalart, P. R., "Extensions of the Spalart-Allmaras Turbulence Model to Account for Wall Roughness,"
International Journal of Heat and Fluid Flow, Vol. 24, 2003, pp. 454-462.
See https://turbmodels.larc.nasa.gov/spalart.html#sarough for detailed explanation.

The TMR page says that you require the " conventional Nikuradse sand roughness scale height" at the wall. In the old implementation each marker is allowed to have its specific roughness, and for integration at one point, the roughness of the closest marker is chosen. As I just wanted to fix the bug and I'm not familiar with the theory, I kept this level of generality.

I encountered the bug in setup with two fluid zones connected by a sliding mesh interface.

The point is that in CPhysicalGeometry::SetWallDistance, the call to WallADT::DetermineNearestElement gives the closest wall marker in the iZone (which is not the zone in which this CPhysicalGeometry lives in, generally). It is therefore wrong to access the wall roughness array of this CPhysicalGeometry with the marker ID that belongs to a different zone. In my case, valgrind reported an invalid read, because there were different numbers of markers in both zones.

Maybe the dependency on the rank is irrelevant, I don't know if all markers have the same marker IDs in all ranks... I don't care ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't. iMarker only makes sense locally and different ranks may have different boundaries.
If you try to do any MPI as part of a boundary condition you will have a bad time.

auto rankID = ClosestWall_Rank[iPoint];
auto zoneID = ClosestWall_Zone[iPoint];
auto markerID = ClosestWall_Marker[iPoint];
if(rankID >= 0){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in which case can a rank < 0? Or is this a pure safety

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mainly safety. The ClosestWall_Rank is initialized by -1. If there is no viscous wall, it will stay -1. Right now this code is only called from CGeometry::ComputeWallDistance, and only if there is a viscous wall, so there is no immediate danger IMO. But I think we should keep it safe.

Comment on lines 100 to 103
* std::cout << nd_global[rank][zone][marker];
* auto nd_g_rank = nd_global[rank];
* nd_g_rank[zone][marker] += 1.0;
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you maybe want to add expected output for an arbitrary example? Like (e.g.) the above cout returns 2 and after the last line if you call were to call the cout-line it would output 3
The IndexAccumulator is understandable in what is does though anyway imo 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a second output line with example value. This does also demonstrate that writing to the IndexAccumulator does not invalidate it (in case somebody asks).

Comment on lines 164 to 166
* To form such a structure, we typically iterate twice through the pointer-to-pointer-...
* array: The first time we get to know how much space to reserve in each layer, then we
* allocate it and fill it with data during the second iteration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the Nd-flattener copies data in its own datastructure or does it just provides a different way to access what is already there?
Edit: From the unittest (further down) it becomes apparent that the former (copying) is the case

Comment on lines 804 to 809
void SetRoughness(su2double val_roughness_i, su2double val_roughness_j) {
if(val_roughness_i!=0.0) SU2_MPI::Error("val_roughness_i!=0.0",CURRENT_FUNCTION);
if(val_roughness_j!=0.0) SU2_MPI::Error("val_roughness_j!=0.0",CURRENT_FUNCTION);
roughness_i = val_roughness_i;
roughness_j = val_roughness_j;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is going here? To me it is not exactly clear why a SetRoughness function would throw if the roughness value is unequal to zero. I could imagine somehow why one would error in the opposite case of val_roughness_i==0.0. Maybe also not just print the conditional but a short explanation why that is bad thing such that a user could derive some action to take (if applicable)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just a temporary addition in 1244638 to see whether it works as intended ;) There is no roughness in the regression tests yet, so these errors should never occur; yet, they do. I will further investigate this and report in the main discussion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so just temporary debugging output/error 👍

int rank; SU2_MPI::Comm_rank(SU2_MPI::GetComm(), &rank);
int size; SU2_MPI::Comm_size(SU2_MPI::GetComm(), &size);

/*-- Provide non-flat array --*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SU2 consistent documentation would be

/*---

not

/*--

...just kidding :D dont change this

Comments on when/why to use NdFlattener,
and some other changes.
Comment on lines 504 to 505
indices.reserve(nNodes + 1);
indices[0] = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this use of vector::reserve is not correct, reserve changes the capacity of the vector but not its size.
Technically indices[0] is out of bounds (not in memory but in "logic").
Whether this is the problem or not IDK.

resize instead of reserve?
Comment on lines +42 to +49
* # What is an NdFlattener?
*
* NdFlattener is a container to store a deep copy of multidimensional data in a single object that provides a
* generic []...[] interface. The data and their structure are stored in few contiguous arrays, allowing for an
* efficient MPI gathering operation.
*
* # Which problem does it solve?
* In SU2, there is a lot of multidimensional data: data depending on K>1 indices, like a wall property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for added parts 👍
A lot of work for the wall roughness feature that is not used (to my knowledge) but hopefully finds some good usage in the future 🔮 it certainly is one the better documented code bits :)

before, possibly non-zero displacements were ignored
@maxaehle
Copy link
Contributor Author

8372309 fixed the MPI_Allgatherv function used in a non-MPI build. Before, it ignored the displacement (which is 1 for the index arrays). This was revealed by experiments with the CI pipeline in #1292. Now the tests succeed.

fixes bug #1293: compilation with GCC 11 fails
@maxaehle maxaehle changed the title Data structure for communication of list-of-list-... data (+fix retrieval of wallroughness in multizone setup) New data structure for multidimensional data (+some bugfixes) May 28, 2021
@maxaehle maxaehle linked an issue May 28, 2021 that may be closed by this pull request
@pcarruscag pcarruscag merged commit 1d3dc1a into develop May 29, 2021
@pcarruscag pcarruscag deleted the fix_wallroughness_multizone branch May 29, 2021 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compilation with GCC v11.1 fails
3 participants