
Restructure workflow to facilitate workflow partitioning for efficient streaming execution #163

Merged
SimonHeybrock merged 12 commits into main from streaming-workflow on Sep 25, 2024

Conversation

@SimonHeybrock (Member) commented Aug 29, 2024

There are two structural changes here:

  1. Split the wavelength reduction (giving the new ReducedQ) from the normalization. This is essential for streamed processing since we need to accumulate data (numerator and denominator) before normalization, but as late as possible in the workflow. The wavelength reduction is relatively expensive since it performs binned-data operations, e.g., to handle wavelength bands. Furthermore, accumulating event data must be avoided since it would quickly exhaust memory (see the sketch below this list).
  2. Split the detector term from the monitor term in the computation of the denominator. As the detector term (solid angle and direct beam function) is static, it can be pre-computed. This is important since it involves a relatively costly reduction over many pixels into Q-bins.
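
A minimal sketch of the accumulate-then-normalize pattern that change 1 enables is shown below. It is plain NumPy rather than the actual ess.sans API, and `reduce_chunk_to_q`, `streamed_iofq`, and the chunk iteration are hypothetical stand-ins:

import numpy as np

def reduce_chunk_to_q(events: np.ndarray, nbins: int = 100) -> np.ndarray:
    # Hypothetical stand-in for the expensive wavelength reduction producing ReducedQ:
    # histogram the chunk's events into Q-bins so only dense data needs to be kept.
    return np.histogram(events, bins=nbins, range=(0.0, 1.0))[0].astype(float)

def streamed_iofq(chunks, denominator_chunks, nbins: int = 100) -> np.ndarray:
    numerator = np.zeros(nbins)
    denominator = np.zeros(nbins)
    for events, denom in zip(chunks, denominator_chunks):
        # Accumulate dense numerator and denominator terms; the event data of a chunk
        # can be dropped immediately afterwards, so memory use stays bounded.
        numerator += reduce_chunk_to_q(events, nbins)
        denominator += denom
    # Normalize only once, as late as possible, on the accumulated terms.
    return numerator / denominator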

Base automatically changed from ess-reduce-nexus-workflow to main August 29, 2024 08:26
@SimonHeybrock marked this pull request as ready for review September 4, 2024 03:18
src/ess/sans/conversions.py (outdated review comments, resolved)
src/ess/sans/normalization.py (outdated review comment, resolved)
:py:func:`iofq_norm_wavelength_term_sample` or
:py:func:`iofq_norm_wavelength_term_background`.
Keeping the monitor term separate from the detector term allows us to compute
the latter only once when repeatedly processing chunks of events in streamed data.
Member

I guess this is assuming that the pixel positions, and thus the solid angle, remain the same throughout the run.
This is fine I think, as we are no longer computing the beam center as part of the workflow.

That said, could the beam center be refined as more and more signal is collected?

Member Author

My current idea is to use the stream-processing feature for component position updates, i.e., one level higher than the Sciline workflow.
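
For illustration, a rough sketch of pre-computing the static detector term once and reusing it per chunk, assuming pixel positions (and thus the solid angle) stay fixed during the run; `detector_term`, `denominator_for_chunk`, and the array shapes are hypothetical, not the actual ess.sans providers:

import numpy as np

def detector_term(solid_angle: np.ndarray, direct_beam: np.ndarray) -> np.ndarray:
    # Static part of the denominator: per-pixel solid angle combined with the
    # direct-beam function, computed a single time before streaming starts.
    return np.outer(solid_angle, direct_beam)  # shape: (pixel, wavelength)

static_term = detector_term(np.ones(4), np.ones(3))

def denominator_for_chunk(monitor_term: np.ndarray) -> np.ndarray:
    # Per chunk, only the cheap monitor term needs to be multiplied in;
    # it broadcasts along the pixel axis of the pre-computed static term.
    return static_term * monitor_term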


The denominator is then simply:
:math:`M_{\\lambda} T_{\\lambda} D_{\\lambda} \\Omega_{R}`,
which is equivalent to ``wavelength_term * solid_angle``.
Member

I think wavelength_term here was referring to variable names in the code. Should we update or rewrite so that we don't have to change the docstring if we change variable names?

Member Author

 graph_no_grav = pipeline.compute(ElasticCoordTransformGraph)
 pipeline[CorrectForGravity] = True
 data_with_grav = (
-    pipeline.compute(CleanWavelengthMasked[SampleRun, Numerator])
+    pipeline.compute(CleanWavelength[SampleRun, Numerator])
     .flatten(to='pixel')
     .hist(wavelength=sc.linspace('wavelength', 1.0, 12.0, 101, unit='angstrom'))
 )
Member

Is it worth adding a test which sort of simulates the streaming? Basically make two chunks of data that are accumulated? I'm not sure what you would test though? Just that it doesn't fail?

Or does this belong more in the data streaming (i.e. in a different repository)?

Member Author

In any case, it does not belong in this PR. And since streaming was possible even before the restructure (it just recomputed more than strictly necessary), such a test would require mocking some workflow components and adding call counters. Tests of that kind exist in the ess.reduce.streaming module.
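
Roughly, such a test feeds a small number of chunks through an accumulator while counting calls to the expensive step; the sketch below is generic pytest-style code with hypothetical helpers and does not use the actual ess.reduce.streaming API:

import numpy as np

def test_two_chunks_accumulate_and_reduce_runs_once_per_chunk():
    calls = {'reduce': 0}

    def counting_reduce(chunk: np.ndarray) -> np.ndarray:
        # Wrap the (hypothetical) expensive reduction with a call counter.
        calls['reduce'] += 1
        return np.histogram(chunk, bins=10, range=(0.0, 1.0))[0].astype(float)

    accumulated = np.zeros(10)
    for chunk in (np.random.rand(100), np.random.rand(100)):
        accumulated += counting_reduce(chunk)

    assert calls['reduce'] == 2
    assert accumulated.sum() == 200.0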

@SimonHeybrock merged commit 8826f6b into main Sep 25, 2024
4 checks passed
@SimonHeybrock deleted the streaming-workflow branch September 25, 2024 08:44