Investigate generating simulated data #194

callumforrester · 2023-09-21T14:15:04Z

Making an issue to document our various thoughts on this...

Current State

The Eiger detector in tickit-devices includes a single frame, taken by a real Eiger, which it outputs repeatedly. The frame is pre-compressed, so the simulation does not even need to understand bslz4.

Use Cases

@DominicOram to comment...

Fixed Output

Simulated detectors should be able to output a series of predetermined frames that look like real data (and probably are real data originally) and can be piped into the same analysis pipelines that take real data as a form of end-to-end system validation. The emphasis here is on performance, since a tickit detector is probably already slower than the real thing and we don't want to slow it down further.
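To make the fixed-output case concrete, here is a minimal sketch of replaying predetermined frames in a loop. The name `replay_frames` and the placeholder byte payloads are illustrative assumptions, not part of tickit or tickit-devices. The frames are treated as opaque, already-compressed blobs, so nothing is computed per frame.

```python
from itertools import cycle, islice
from typing import Iterator, Sequence


def replay_frames(frames: Sequence[bytes]) -> Iterator[bytes]:
    """Return an iterator over the predetermined frames that loops forever.

    Hypothetical helper: the frames are opaque, pre-compressed payloads,
    so no decoding or re-encoding happens while they are replayed.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    return cycle(frames)


# Placeholder payloads standing in for real captured frames.
frames = [b"frame-0", b"frame-1", b"frame-2"]
first_five = list(islice(replay_frames(frames), 5))
```

Because the iterator only hands back references to existing byte strings, the per-frame cost should stay small compared with the rest of the simulation.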

Custom Output

There should be a facility to customize the detector data. At the expense of speed we may wish to output random data to test the system further, or vary the data quality for testing with an adaptive scan.

N.B. tickit is not a physics simulator. Its job is not to do the maths that shapes the beam or works out how it is scattered by a sample, etc. There are technique-specific packages for this, such as Geant4.

Design Ideas

The simplest possible design is to include a facility for generating data inside each detector, possibly following a protocol or ABC for interoperability. The config could then select or compose different data sources and generation methods.
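As a rough sketch of what such a protocol might look like, the following uses `typing.Protocol`. Every name here (`FrameSource`, `next_frame`, `ConstantSource`, `RoundRobinSource`) is a hypothetical illustration and not an existing tickit interface. It shows one source that returns a fixed frame and one that composes other sources.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class FrameSource(Protocol):
    """Hypothetical interface: anything that can produce the next frame."""

    def next_frame(self) -> bytes: ...


@dataclass
class ConstantSource:
    """Always returns the same pre-built frame."""

    frame: bytes

    def next_frame(self) -> bytes:
        return self.frame


class RoundRobinSource:
    """Composes several sources, taking one frame from each in turn."""

    def __init__(self, sources: Sequence[FrameSource]) -> None:
        if not sources:
            raise ValueError("at least one source is required")
        self._sources = list(sources)
        self._index = 0

    def next_frame(self) -> bytes:
        source = self._sources[self._index]
        self._index = (self._index + 1) % len(self._sources)
        return source.next_frame()


# Placeholder payloads; a detector would only ever call next_frame().
source = RoundRobinSource([ConstantSource(b"good"), ConstantSource(b"bad")])
sequence = [source.next_frame() for _ in range(3)]
```

A detector written against `FrameSource` would not need to know whether frames come from a file, a function, or a composition of both, which is the interoperability the protocol is meant to buy.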

We could also take it out into the tickit graph, i.e. making separate "devices" to produce data and wiring them to detectors in many composable ways. The below examples show various possible levels of granularity.

Unsure if the design would require any framework changes. @abbiemery to comment...

Data Sources

Below are some potential ideas; they may or may not be good ones...

Existing Data File

A detector could stream data out of an HDF5 file, probably captured by the real thing at some point.

Data Simulation Framework

Sirepo is a synchrotron beam data simulation framework developed at NSLS-II. It already has integration with bluesky/ophyd; similar integration could be done with tickit.
Alternatively, it could be used to pre-generate data for the "Existing Data File" case.

Python Function

It would be nice to be able to write an arbitrary Python function that returns a numpy array to generate frames, as the ultimate level of customization.
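For illustration only, one possible shape for such a function is sketched below. The signature (frame index in, 2D array out) and the names `FrameFunction` and `make_noise_frames` are assumptions, and Poisson noise is just a stand-in for whatever the user wants to generate.

```python
from typing import Callable, Tuple

import numpy as np

# Hypothetical signature: frame index in, 2D image out.
FrameFunction = Callable[[int], np.ndarray]


def make_noise_frames(
    shape: Tuple[int, int], mean_counts: float, seed: int = 0
) -> FrameFunction:
    """Build a frame function that returns Poisson-distributed counts."""
    rng = np.random.default_rng(seed)

    def generate(frame_index: int) -> np.ndarray:
        # frame_index is unused in this sketch; a real function could use
        # it to vary the data over the course of a scan.
        return rng.poisson(mean_counts, size=shape).astype(np.uint32)

    return generate


generate = make_noise_frames((4, 6), mean_counts=5.0)
frame = generate(0)
```

Seeding the generator keeps the output reproducible between test runs, and changing `mean_counts` is one crude way to vary the apparent data quality.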

DominicOram · 2023-10-02T14:22:38Z

I'm going to strongly advocate for just the detector spitting out a fixed output based on an existing data file. It's very simple and it will give us a lot. The specific file it spits out should be runtime configurable so that we can have some tests with good data and some with bad.

callumforrester · 2023-10-02T14:45:41Z

Indeed, not saying we should support all of these, but I think we can avoid designing them out.

callumforrester added the "enhancement" (New feature or request) label · Sep 21, 2023
