Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add calibration to provenance tracking #74

Merged
merged 33 commits into from
Jan 14, 2022
Merged

Add calibration to provenance tracking #74

merged 33 commits into from
Jan 14, 2022

Conversation

Bultako
Copy link
Collaborator

@Bultako Bultako commented Jan 10, 2022

This PR extends the provenance products files and graph produced to previous calibration steps (as shown in example prov graph calibration_to_dl2_01808_prov.pdf). It adds to the provenance tracking the calibration processes that create pedestal, time and charge calibration files used in already tracked process ro_to_dl1, as well as the check plots files produced. Consequently prov products files are now renamed:

  • calibration_to_dl1_runID_prov.pdf
  • calibration_to_dl2_runID_prov.pdf

For the moment, the calibration processes tracked are those involved in the next day processing, those using external lstchain onsite calibration scripts, only input and output files relative to those scripts are tracked skipping the rest of parameters hard coded in those scripts and outside the scope of lstosa. In the future, provenance tracking for massive reprocessing will be added. See #54 (comment) for more details.

It also improves and refactor existing provenance tracking code.

@codecov
Copy link

codecov bot commented Jan 10, 2022

Codecov Report

Merging #74 (217fc17) into main (29f6cb1) will increase coverage by 0.12%.
The diff coverage is 98.46%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #74      +/-   ##
==========================================
+ Coverage   82.05%   82.18%   +0.12%     
==========================================
  Files          39       39              
  Lines        4202     4265      +63     
==========================================
+ Hits         3448     3505      +57     
- Misses        754      760       +6     
Impacted Files Coverage Δ
osa/scripts/calibration_pipeline.py 89.28% <87.50%> (-6.95%) ⬇️
osa/provenance/capture.py 71.64% <94.44%> (+0.08%) ⬆️
osa/job.py 79.23% <100.00%> (-1.17%) ⬇️
osa/provenance/io.py 76.12% <100.00%> (ø)
osa/provenance/utils.py 96.36% <100.00%> (+1.03%) ⬆️
osa/scripts/closer.py 75.27% <100.00%> (+0.27%) ⬆️
osa/scripts/provprocess.py 94.42% <100.00%> (+<0.01%) ⬆️
osa/scripts/simulate_processing.py 94.35% <100.00%> (+0.09%) ⬆️
osa/scripts/tests/test_osa_scripts.py 98.77% <100.00%> (+1.05%) ⬆️
osa/utils/cliopts.py 78.73% <100.00%> (+0.32%) ⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29f6cb1...217fc17. Read the comment docs.

@morcuended
Copy link
Member

Thanks, @Bultako. I'll have a look.

"""
Create a DRS4 pedestal file for baseline correction.

Parameters
----------
drs4_pedestal_run_id : str
String with run number of the pedestal run
pedcal_run_id : str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why adding this argument here? Is it needed for provenance?

Copy link
Collaborator Author

@Bultako Bultako Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
It is needed to link both two calibration tasks to a single calibration workflow identified with a label of two numbers, namely drs4_pedestal_run_id-pedcal_run_id.

@morcuended
Copy link
Member

Looks good to me. I'm going to test it in the container.

@Bultako Bultako force-pushed the prov branch 3 times, most recently from 8f60e52 to efc9a93 Compare January 11, 2022 11:38
@Bultako
Copy link
Collaborator Author

Bultako commented Jan 11, 2022

I think we have to define a boundary on which provenance info lstosa should keep. I'm tempted to say that all info defined outside lstosa should not be kept, so external scripts used like those from lstchain would appear as blackboxes with inputs and outputs files that may be also connected to other tasks in lstosa. The reason why to do this is to simplify maintenance. We could also decide to add the detailed provenance info happening in these external boxes, in that case there is now a lot of info missing, and any modif in those lstchain scripts used will need to modify the code for provenance tracking in lstosa.

@morcuended
Copy link
Member

I think we have to define a boundary on which provenance info lstosa should keep. I'm tempted to say that all info defined outside lstosa should not be kept, so external scripts used like those from lstchain would appear as blackboxes with inputs and outputs files that may be also connected to other tasks in lstosa. The reason why to do this is to simplify maintenance. We could also decide to add the detailed provenance info happening in these external boxes, in that case there is now a lot of info missing, and any modif in those lstchain scripts used will need to modify the code for provenance tracking in lstosa.

Agreed. At least for now, we should at least keep track of which calibration files are used in r0_to_dl1 (which we have been doing so far). The information on the production of calibration files produced by osa is limited to the drs4 pedestal and calibration files and corresponding run numbers. We have no straight access to the time file (although in lstosa we look for this file in the same way for using it in r0 to dl1) nor the factor systematic file.

@morcuended morcuended merged commit 364cfdc into main Jan 14, 2022
@morcuended morcuended deleted the prov branch January 14, 2022 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants