Investigate generating simulated data #194

Open
callumforrester opened this issue Sep 21, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@callumforrester
Contributor

Making an issue to document our various thoughts on this...

Current State

The Eiger detector in tickit-devices includes a single frame, captured from a real Eiger, which it just spits out repeatedly. The frame has been pre-compressed, so the simulation doesn't even need to understand bslz4.

Use Cases

@DominicOram to comment...

Fixed Output

Simulated detectors should be able to output a series of predetermined frames that look like real data (and probably are real data originally) and can be piped into the same analysis pipelines that take real data as a form of end-to-end system validation. The emphasis here is on performance, since a tickit detector is probably already slower than the real thing and we don't want to slow it down further.

Custom Output

There should be a facility to customize the detector data. At the expense of speed we may wish to output random data to test the system further, or vary the data quality for testing with an adaptive scan.

N.B. tickit is not a physics simulator. Its job is not to do the maths that shapes the beam or works out how it is scattered by a sample, etc. There are technique-specific packages for this, such as Geant4.

Design Ideas

The simplest possible design is to include a facility for generating data inside each detector, possibly following a protocol or ABC for interoperability. You can potentially change/compose different data sources/generation methods in the config.
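A minimal sketch of what such a protocol might look like, using only the standard library. Every name here (`FrameSource`, `FixedFrames`, `Interleaved`, `take`) is hypothetical and not part of tickit or tickit-devices; frames are modelled as opaque `bytes` on the assumption that they are already compressed, as with the current Eiger frame.

```python
from itertools import cycle, islice
from typing import Iterator, List, Protocol, Sequence


class FrameSource(Protocol):
    """Hypothetical interface: anything that can yield encoded frames."""

    def frames(self) -> Iterator[bytes]:
        ...


class FixedFrames:
    """Replays predetermined, already-compressed frames in an endless loop."""

    def __init__(self, blobs: Sequence[bytes]) -> None:
        if not blobs:
            raise ValueError("need at least one frame")
        self._blobs = list(blobs)

    def frames(self) -> Iterator[bytes]:
        return cycle(self._blobs)


class Interleaved:
    """Composes sources by taking one frame from each in turn."""

    def __init__(self, *sources: FrameSource) -> None:
        self._sources = sources

    def frames(self) -> Iterator[bytes]:
        iterators = [source.frames() for source in self._sources]
        while iterators:
            for iterator in iterators:
                try:
                    yield next(iterator)
                except StopIteration:
                    # Stop as soon as any one source is exhausted.
                    return


def take(source: FrameSource, count: int) -> List[bytes]:
    """Collect the first `count` frames from a source."""
    return list(islice(source.frames(), count))
```

A detector would only depend on `FrameSource`, so which concrete source it gets (and how sources are composed) could be decided in the config, for example `take(Interleaved(good, bad), 5)` to mix good and bad frames.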

We could also take it out into the tickit graph, i.e. making separate "devices" to produce data and wiring them to detectors in many composable ways. The below examples show various possible levels of granularity.

image

Unsure if the design would require any framework changes. @abbiemery to comment...

Data Sources

Below are some potential ideas; they may or may not be good ones...

Existing Data File

A detector could stream data out of an HDF5 file, probably captured by the real thing at some point.
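As a rough sketch of that idea, the generator below replays frames along the first axis of a dataset, one frame at a time. The function names and the `dataset_path` argument are assumptions for illustration, and the layout of a real capture file may differ. `h5py` is imported lazily, so `replay` itself works on any indexable sequence.

```python
from typing import Any, Iterator


def replay(dataset: Any, loop: bool = True) -> Iterator[Any]:
    """Yield frames along the first axis of an indexable dataset.

    Works with anything that supports len() and integer indexing, such as
    an h5py.Dataset or a NumPy array. Frames are read one at a time, so a
    large file is never loaded into memory in full.
    """
    count = len(dataset)
    if count == 0:
        return
    index = 0
    while True:
        yield dataset[index]
        index += 1
        if index == count:
            if not loop:
                return
            index = 0  # wrap around and start again from the first frame


def replay_hdf5(path: str, dataset_path: str, loop: bool = True) -> Iterator[Any]:
    """Open an HDF5 file and replay one of its datasets frame by frame."""
    import h5py  # third-party; imported here so replay() works without it

    # The file stays open for as long as the generator is being consumed.
    with h5py.File(path, "r") as handle:
        yield from replay(handle[dataset_path], loop=loop)
```

Because `path` is just an argument, the file to replay can be chosen at runtime, for example one capture with good data and another with bad.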

Data Simulation Framework

Sirepo is a synchrotron beam data simulation framework developed at NSLS-II. It already has integration with bluesky/ophyd, and similar integration could be done with tickit.
Alternatively, it could be used to pre-generate data for the "Existing Data File" case.

Python Function

It would be nice to be able to write an arbitrary Python function that returns a NumPy array to generate frames, as the ultimate level of customization.
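For illustration only, such a function might look like the sketch below. The `FrameFunction` signature (frame index in, image out), the `noisy_spot` example and its parameters are all assumptions, not an existing tickit API, and the "spot" is a toy pattern rather than any kind of physics model.

```python
from typing import Callable, Iterator, Tuple

import numpy as np

# Hypothetical signature: frame index in, 2D image out.
FrameFunction = Callable[[int], np.ndarray]


def noisy_spot(
    index: int,
    shape: Tuple[int, int] = (64, 64),
    signal: float = 200.0,
    background: float = 5.0,
) -> np.ndarray:
    """Gaussian spot on a flat background with Poisson counting noise.

    Seeding the generator from the frame index makes each frame reproducible.
    """
    rng = np.random.default_rng(index)
    rows, cols = np.indices(shape)
    radius_sq = (rows - shape[0] / 2) ** 2 + (cols - shape[1] / 2) ** 2
    expected = background + signal * np.exp(-radius_sq / (2 * 4.0**2))
    return rng.poisson(expected).astype(np.uint16)


def generate(function: FrameFunction, count: int) -> Iterator[np.ndarray]:
    """Call the user-supplied function once per frame."""
    for index in range(count):
        yield function(index)
```

Lowering `signal` (for example with `functools.partial(noisy_spot, signal=20.0)`) gives a crude way to vary data quality for something like an adaptive scan test.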

@callumforrester callumforrester added the enhancement New feature or request label Sep 21, 2023
@DominicOram
Collaborator

I'm going to strongly advocate for just the detector spitting out a fixed output based on an existing data file. It's very simple and it will give us a lot. The specific file it spits out should be runtime configurable so that we can have some tests with good data and some with bad.

@callumforrester
Contributor Author

Indeed. I'm not saying we should support all of these, but I think we can avoid designing them out.
