Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize twtstight for arbitrary beta0 and code refactoring #5174

Open
wants to merge 4 commits into
base: dev
Choose a base branch
from

Conversation

BeyondEspresso
Copy link
Member

The symmetry scales dY and dT are used to maintain longterm numerical stability by exploiting a spatio-temporal symmetry in the twtstight laser pulse. There has been a bug in dY and dT. Since their equations were unnecessarily convoluted, I also refactored them in the fix. The equations now pass the dY-dT-symmetry test.

@chillenzer
Copy link
Contributor

Without having had much more than a quick glance on it, would it be an option to add that test you were talking about as a unit/end-to-end test?

@BeyondEspresso
Copy link
Member Author

Yes, that should be quite simple.

@chillenzer
Copy link
Contributor

Great, thanks!

@chillenzer chillenzer marked this pull request as draft October 18, 2024 10:26
@BeyondEspresso
Copy link
Member Author

BeyondEspresso commented Oct 18, 2024

@chillenzer @steindev The small PR has grown quite a bit. After applying the original updates I noticed that the twtstight-implementation does not work for beta0 != 1.0, which is not the normal usecase, but it is in the interface and actually nice to have. I updated the laser equations to now also include beta0. I tested the new implementation that compared to my model implementation correctly generates the old (beta0=1.0) and now also new results (beta0 != 1.0). While applying the changes, I seized the opportunity for some reductions in boiler plate code that still existed from previous, now outdated, twts implementations.

@BeyondEspresso
Copy link
Member Author

@chillenzer Please leave the draft-tag for now until Monday. It is late and it is a good idea to have some time pass,

@BeyondEspresso
Copy link
Member Author

@chillenzer I actually have further minor changes to make the PR more round. I will submit them soon.

@BeyondEspresso BeyondEspresso force-pushed the topic-bugfix-twtstight-dy-dt-lengths branch from 8881fc8 to cbeff6b Compare October 27, 2024 15:28
@BeyondEspresso BeyondEspresso changed the title Fix dY and dT symmetry scales in twtstight Generalize twtstight for arbitrary beta0 and code refactoring Oct 27, 2024
@BeyondEspresso
Copy link
Member Author

BeyondEspresso commented Oct 27, 2024

@chillenzer @steindev OK, I rebased my PR and completed my generalization and refactoring. Additionally, I successfully tested the code in an actual PIConGPU run. The draft-tag can be removed now.

@chillenzer chillenzer marked this pull request as ready for review October 28, 2024 09:28
Copy link
Contributor

@chillenzer chillenzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very well annotated, thanks for the great guidance. Still there are a few things to discuss. Also, I can't say anything about the actual physics formula. Needs a check from @steindev .

*
* For moving window simulations, POSITION for the YMax side can be located outside the initially
* simulated volume. In this case, parts of the generation surface outside of the currently simulated
* volume is are treated as if they had zero incident field and it is user's responsibility to apply a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* volume is are treated as if they had zero incident field and it is user's responsibility to apply a
* volume are treated as if they had zero incident field and it is user's responsibility to apply a

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change this.

auto const phiT = float_T(math::abs(phi));
float_T sinPhi;
float_T cosPhi;
pmacc::math::sincos(phiT, sinPhi, cosPhi);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my ignorance but could you please quickly explain if math and pmacc::math are different namespaces. Also, given that those are frequently used, a using directive might be beneficial for this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the development of this code, namespaces have changed. I guess that this is a remainder that can be removed now. I'll check.

@@ -262,15 +242,12 @@ namespace picongpu
/* To avoid underflows in computation, fields are set to zero
* before and after the respective TWTS pulse envelope.
*/
if(math::abs(y - z * math::tan(phiT / float_T(2.0)) - (cspeed * t)) > (numSigmas * tauG * cspeed))
if(math::abs(y - z * tanAlpha - (beta0 * cspeed * t)) > (numSigmas * tauG * cspeed))
return float_T(0.0);

/* Calculating shortcuts for speeding up field calculation */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're perfectly fine to assume that the compiler would be capable of figuring this out for itself and will happily ignore if these are present or not, using its own heuristics instead. The only benefit of these could be simplifying formulae later on. You can decide for yourself if for any or all of these this purpose is fulfilled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this comment meant for a different line? Without this if-statement, the code produces NaNs due to underflow. We already were severely bitten by this. The function of this statement corresponds to the "laser window" of a standard Gaussian laser.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, always forget how badly GH renders comments on a single line. This was supposed to be a comment on "Calculating shortcuts for speeding up field calculation".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite happy with the three functions being mostly duplications of one another with a few switches of sin and cos. It is well annotated and I could follow along quite well but it would be a huge pain to change every definition three times (as I believe you have experienced during this refactoring). Could we make this more elegant? One option could be to extract some common methods for computing the basics at the top, potentially using structured bindings or a shallow struct for the return value. Alternatively, one could think about merging the functions computing and returning all three components (if they are not needed separately).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree, as I also spend some thought on it. However, I always postponed it until later. Let us discuss the most suitable approach offline (or in a VC call).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chillenzer @BrianMarre I just successfully compiled a small proof of concept for the structured bindings within twtstight using 3 variables on both NVIDIA and AMD GPUs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skimming through this, the same comments apply as for the B field. Even more so, I think we should extract common functions, so we don't have a 6-fold code duplication of round about 70 lines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. Let us discuss this before I make changes.

@chillenzer
Copy link
Contributor

@BrianMarre should probably follow the developments here after we've done a similar refactoring with his code.

@chillenzer
Copy link
Contributor

Without having had much more than a quick glance on it, would it be an option to add that test you were talking about as a unit/end-to-end test?

Have you actually added said test yet or is this still on your todo list?

@BeyondEspresso
Copy link
Member Author

@chillenzer Yes, adding the test is still on my todo list.

@psychocoderHPC psychocoderHPC added this to the 0.8.0 / Next stable milestone Oct 30, 2024
@psychocoderHPC psychocoderHPC added the refactoring code change to improve performance or to unify a concept but does not change public API label Oct 30, 2024
@BeyondEspresso
Copy link
Member Author

@psychocoderHPC @chillenzer Ok, here is the refactored version. To reduce duplicate code I moved the implementation into the new template class TWTSTight and specialized for the respective E- and B-Field implementation. In the math, I also did a separation of complex and float-numbers via reordering and brackets according to @psychocoderHPC suggestions. Due to the new struct, I renamed twtstight.tpp to TWTSTight.tpp. In total this results in a code reduction by almost 600 LOCs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring code change to improve performance or to unify a concept but does not change public API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants