Per-dataset data check summary #873

Open
IevgenVovk opened this issue Jan 18, 2022 · 2 comments

@IevgenVovk

Issue: Presently the data check scripts work on a nightly basis, displaying all the runs taken on a given date. There is no elegant way to select runs by (1) known run numbers, (2) observation target name, or (3) data (run) files already selected or copied for the analysis. This complicates the data check for analyzers working on a specific source whose data are scattered over many observations.

Proposal: at least two options are possible:

  • provide an interface that lets analyzers run the data check toolchain offline on a given set of run files (this may not be feasible if subsystems other than the camera are to be included);
  • provide a script that downloads the available data check output based on the input data (above) and then runs longterm_dl1_check.py.
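The selection step behind the second option could look roughly like the sketch below. It is only an illustration: `RunRecord`, `select_runs`, `runs_from_filenames` and `group_by_night` are hypothetical names, not part of the existing data check code, and the `Run<digits>` file-name pattern is an assumption. It shows the three selection criteria from the issue and how the selected runs map back to the nights whose data check output would have to be fetched.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical description of one run (not an existing lstchain class)."""
    run: int
    night: str   # observation date, e.g. "20200118"
    source: str  # observation target name


def runs_from_filenames(filenames):
    """Extract run numbers from file names, assuming they contain 'Run<digits>'."""
    runs = set()
    for name in filenames:
        match = re.search(r"Run(\d+)", name)
        if match:
            runs.add(int(match.group(1)))
    return runs


def select_runs(records, run_numbers=None, source=None, filenames=None):
    """Keep the records that satisfy every criterion that was given."""
    wanted = set(run_numbers) if run_numbers is not None else None
    if filenames is not None:
        from_files = runs_from_filenames(filenames)
        wanted = from_files if wanted is None else wanted & from_files
    selected = []
    for rec in records:
        if wanted is not None and rec.run not in wanted:
            continue
        if source is not None and rec.source.lower() != source.lower():
            continue
        selected.append(rec)
    return selected


def group_by_night(records):
    """Map each night to the sorted run numbers selected for it."""
    nights = defaultdict(list)
    for rec in records:
        nights[rec.night].append(rec.run)
    return {night: sorted(runs) for night, runs in nights.items()}
```

The output of `group_by_night` would tell a download script which nightly data check files are needed before longterm_dl1_check.py is run on them; how that script is invoked is left out here on purpose.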
@moralejo
Collaborator

  • If we want the interactive HTML as output, we should first make longterm_dl1_check.py work properly for very long lists of input files (as of now, the run-number sliders in the resulting HTML file jump a few runs at a time). This approach would have the advantage that one could also use the resulting log file, which reports runs with parameters beyond some predefined limits, to select the 'good quality' data sample.

  • We might as well make a dedicated script for the purpose of data selection, loading all of the nightwise DL1 check hdf5 files (the files are light; the existing ones are at most ~3 MB per night) and afterwards picking the run numbers which belong to the sample.
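A minimal sketch of the second idea (load everything, then pick the runs of the sample) is given below. The layout of the nightwise files is not described here, so the reader is passed in as `load_rows`, a hypothetical callable that returns one dict per run for a given file; the column name `runnumber` is likewise an assumption.

```python
def pick_sample_rows(night_files, sample_runs, load_rows, run_column="runnumber"):
    """Collect the per-run rows of all nightly files that belong to the sample.

    load_rows(path) is a user-supplied reader (hypothetical) returning an
    iterable of dict-like rows, one per run, for one nightly file.
    """
    wanted = set(sample_runs)
    picked = []
    for path in night_files:
        for row in load_rows(path):
            if int(row[run_column]) in wanted:
                picked.append(dict(row))
    picked.sort(key=lambda row: row[run_column])
    return picked


def missing_runs(picked, sample_runs, run_column="runnumber"):
    """Run numbers of the sample for which no data check row was found."""
    found = {int(row[run_column]) for row in picked}
    return sorted(set(sample_runs) - found)
```

Since the files are small, reading every night and filtering afterwards keeps the logic simple; `missing_runs` flags sample runs that have no data check entry at all.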

@moralejo
Collaborator

This is addressed by: #880

Also, #881 adds the necessary information to the (light) DL1 data check h5 files so that they can be used for the same purpose (this also allows runs to be selected based on the DL1 data check quantities).
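Selecting runs on data check quantities could then be sketched as follows. This is not the code from #880 or #881; `within_limits` and `split_by_quality` are hypothetical helpers, and the quantity names and limits would have to come from the actual files and from the analyzer.

```python
def within_limits(row, limits):
    """True if every limited quantity is present in the row and inside [low, high].

    Missing values and NaN count as failing, since NaN compares false.
    """
    for name, (low, high) in limits.items():
        value = row.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


def split_by_quality(rows, limits, run_column="runnumber"):
    """Return (good_runs, bad_runs) according to the given limits."""
    good, bad = [], []
    for row in rows:
        target = good if within_limits(row, limits) else bad
        target.append(row[run_column])
    return sorted(good), sorted(bad)
```

With rows given as one dict per run, `split_by_quality(rows, {"some_quantity": (low, high)})` would give the 'good quality' run list directly, with `some_quantity` standing in for whichever data check quantity is used.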
