signal extractor SlidingWindowMaxSum #1568

jsitarek · 2020-12-29T15:25:31Z

A new, simple signal extractor, SlidingWindowMaxSum: slightly slower, but with better accuracy (in particular for weak pulses).

It maximizes the sum on "width" consecutive slices
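For readers who want to see the idea in isolation, here is a minimal NumPy sketch of "maximize the sum over `width` consecutive slices". It is not the numba implementation from this PR; the function name `sliding_window_max_sum`, its return values, and the toy waveforms are assumptions made for illustration only.

```python
import numpy as np


def sliding_window_max_sum(waveforms, width):
    """Return the largest sum over `width` consecutive samples for each pixel.

    waveforms: array of shape (n_pixels, n_samples)
    width: number of consecutive samples in the integration window
    Returns (charge, start): the maximal window sum and the index of the
    first sample of that window, one entry per pixel.
    """
    waveforms = np.asarray(waveforms, dtype=float)
    n_samples = waveforms.shape[-1]
    if not 1 <= width <= n_samples:
        raise ValueError("width must be between 1 and the number of samples")

    # Prefix sums with a leading zero: cumsum[:, k] = sum of the first k samples.
    cumsum = np.concatenate(
        [np.zeros((waveforms.shape[0], 1)), np.cumsum(waveforms, axis=1)], axis=1
    )
    # Sum of samples [i, i + width) for every valid start index i.
    window_sums = cumsum[:, width:] - cumsum[:, :-width]

    start = np.argmax(window_sums, axis=1)
    charge = window_sums[np.arange(len(start)), start]
    return charge, start


# Two toy pixels with made-up sample values.
waveforms = np.array([[0, 1, 5, 6, 2, 0, 0], [3, 0, 0, 0, 1, 1, 2]])
charge, start = sliding_window_max_sum(waveforms, width=3)
```

Unlike a window placed around the single highest sample, this picks the window position from the integrated charge itself, which is presumably where the gain for weak pulses comes from.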

Some speed tests, using 3 trials of 1000 events of LST1 data:
only r0 ==> r1 calibration: 14.575s, 14.465s, 14.550s
LocalPeakWindowSum (current extractor): 19.853s, 20.188s, 20.698s
SlidingWindowMaxSum (new code): 21.731s, 20.813s, 21.132s

One feature can still be improved: the correction for the signal outside of the integration window. The current code reuses the LocalPeakWindowSum approach, which assumes that the shift is half of the total window. That is correct only if the pulse is symmetric (which is not really the case).
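One way to avoid the symmetric-pulse assumption, sketched here under the assumption that the reference pulse is sampled with the same binning as the readout: slide the window over the reference pulse itself and take the best-placed window as the captured fraction. The function name `window_correction` and the toy pulse values are hypothetical, and this is not necessarily how the PR ends up computing the correction.

```python
import numpy as np


def window_correction(reference_pulse, width):
    """Factor that scales a windowed charge up to the full pulse integral."""
    pulse = np.asarray(reference_pulse, dtype=float)
    # Sum over every window of `width` consecutive samples.
    window_sums = np.convolve(pulse, np.ones(width), mode="valid")
    # Fraction of the pulse captured by the best-placed window.
    fraction = window_sums.max() / pulse.sum()
    return 1.0 / fraction


# Asymmetric toy pulse: fast rise, slow tail (values are made up).
pulse = [0.0, 1.0, 4.0, 3.0, 1.5, 0.5, 0.0]
correction = window_correction(pulse, width=3)
```

For this toy pulse the best 3-sample window holds 8.5 of the total 10, so the correction is 10 / 8.5; no shift relative to the peak has to be assumed.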

codecov · 2020-12-30T08:45:29Z

Codecov Report

Merging #1568 (fd9e917) into master (09931e2) will decrease coverage by 0.04%.
The diff coverage is 84.88%.

@@            Coverage Diff             @@
##           master    #1568      +/-   ##
==========================================
- Coverage   90.80%   90.76%   -0.05%     
==========================================
  Files         192      191       -1     
  Lines       14006    14060      +54     
==========================================
+ Hits        12718    12761      +43     
- Misses       1288     1299      +11

Impacted Files                                         Coverage Δ
ctapipe/image/extractor.py                             82.75% <63.88%> (-3.02%) ⬇️
ctapipe/image/tests/test_extractor.py                  100.00% <100.00%> (ø)
...pipe/image/tests/test_sliding_window_correction.py  100.00% <100.00%> (ø)
ctapipe/reco/__init__.py                               100.00% <100.00%> (ø)
ctapipe/instrument/atmosphere.py                       90.90% <0.00%> (-9.10%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 09931e2...7798779.

jsitarek · 2020-12-30T11:54:50Z

I made a few iterations to address the issues pointed out by the codacy and coverage checks.
codacy reports a missing argument in one of the functions, but I think this is just an artifact of the parallelization, since the same type of code as in extract_around_peak is used.
The coverage check claims that most of the extract_sliding_window function is not tested; however, this function is explicitly tested in test_extractor.py.

I think the code is ready for review.

kosack

Looks good to me.

In fact, I think a similar implementation could be used to speed up the TwoPass "MARS-like" method (which also uses a sliding window in the first pass).

maxnoe · 2021-01-05T16:22:14Z

> coverage check claims that most of the extract_sliding_window function is not tested, however this function is explicitly tested in test_extractor.py.

This is because the code that actually runs is the compiled numba code, not the python function. Unfortunately that means that numba functions do not report coverage correctly.

jsitarek · 2021-01-05T21:22:29Z

thank you @kosack for the approval and @maxnoe for the explanation about numba

kosack · 2021-01-06T13:03:57Z

By the way, for the Numba code coverage issue, see #1400

maxnoe · 2021-01-18T11:14:06Z

ctapipe/image/extractor.py

+        This method is decorated with @lru_cache to ensure it is only
+        calculated once per telescope.
+
+        WARNING: TO BE DONE properly, the current code reuses the function of


Could you implement this directly here? It does not sound too complicated.

The main reason why I did not do so is that this feature does not seem to be used (at least in LST), so I did not have a proper setup to test it. But I can look into making some dummy pulse shape and testing on it.

that would be great

You have the reference pulse shape in the CameraDescription. I guess it's a fairly small effect though, and the correction doesn't really matter much except to get the cleaning thresholds in the same units for all cameras.

Sorry for the delay, I restarted working on this.
CameraDescription.readout is where the code is taking the pulse shape from. However this is not really reliable.
If I execute the following code

import numpy as np
import astropy.units as u
import matplotlib.pyplot as plt
from ctapipe.instrument import SubarrayDescription, TelescopeDescription

plt.ion()

subarray = SubarrayDescription(
    "LST1",
    tel_positions={1: np.zeros(3) * u.m},
    tel_descriptions={
        1: TelescopeDescription.from_name(optics_name="LST", camera_name="LSTCam"),
    },
)
readout = subarray.tel[1].camera.readout
# The original snippet is cut off after "reference_pulse_shape["; index 0 (first channel) is assumed here.
pulse_shape = readout.reference_pulse_shape[0]
dt = readout.reference_pulse_sample_width
xs = np.arange(len(pulse_shape)) * dt
plt.plot(xs, pulse_shape)

I get the following figure:

which is a much broader pulse than it should be.

The calculation of the correction factor would be much simpler if the pulse shape in this class had the same binning as the actual readout. This is the case in the above example, and one would be tempted to take it for granted, since the shape is taken from the "readout" object, which has the binning embedded. However, in the first tests that I was doing in lstchain, when the array was being read from the data, the pulse shapes were actually a delta function with an SSC-like sampling, so obviously it cannot be taken for granted.

I will change the code to use a simple conversion and rounding of the sampling to make it work in this more general case as well, but the whole issue of the LST pulse shape deserves a separate issue.
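A rough sketch of what such a "conversion and rounding of the sampling" could look like: the reference pulse is summed into bins that are a whole number of reference samples wide, with that number obtained by rounding the ratio of the two sample widths. The function name `rebin_reference_pulse`, the plain-float nanosecond arguments, and the choice to drop an incomplete trailing bin are all assumptions for illustration; the PR's actual code may differ.

```python
import numpy as np


def rebin_reference_pulse(pulse, ref_sample_width_ns, readout_sample_width_ns):
    """Sum a finely sampled pulse into bins of roughly one readout sample."""
    pulse = np.asarray(pulse, dtype=float)
    # Round the ratio of the two sample widths to a whole number of samples.
    ratio = max(1, int(round(readout_sample_width_ns / ref_sample_width_ns)))
    # Drop trailing samples that do not fill a complete readout bin.
    n_bins = len(pulse) // ratio
    return pulse[: n_bins * ratio].reshape(n_bins, ratio).sum(axis=1)


# Toy reference pulse in 0.25 ns steps, rebinned to a 1 ns readout.
fine_pulse = np.array([0, 1, 2, 4, 6, 5, 3, 2, 1, 0.5, 0.3])
coarse_pulse = rebin_reference_pulse(fine_pulse, 0.25, 1.0)
```

Once the pulse is on the readout binning, the integration window width can be applied to it directly in units of readout samples.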

EDIT: I forgot to mention that there seems to be only one reference pulse shape in the CameraDescription, while in reality we should have HG and LG

The from_name() methods load up a file from ctapipe-extra (which is now a directory on a server rather than a package), and are just meant for unit-testing purposes. Currently everything in ctapipe-extra is from PROD3 or even PROD2 simulations, so quite out of date for real analysis. In the future I want to clean that up and have an option to select which "prod" to use, but there has not been manpower for that (see e.g. #738).

If you load real data from a SimTel file or something else supported, the correct waveform should be loaded into the instrument model that you get from source.subarray.

E.g. if you do:

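import matplotlib.pyplot as plt
from ctapipe.io import EventSource

with EventSource("some_prod5_sim.simtel.gz") as source:
    readout = source.subarray.tel[2].camera.readout
    # plot the pulse amplitude (one curve per channel) against the sample time
    plt.plot(readout.reference_pulse_sample_time, readout.reference_pulse_shape.T)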

You will get the "latest" pulse that is defined in Prod5

Note that if you don't want to always load up a file, you can take any reference prod5 file, for example, and run

ctapipe-dump-instrument --input=some_reference_file.simtel.gz

And it will dump a bunch of FITS files, including the camera geometry and readout definitions, to the local directory. You can then set the environment variable CTAPIPE_SVC_PATH to the directory where those files are, and ctapipe will use them when you run the from_name() functions instead of the defaults. (By default it first searches all paths listed in the ":"-separated list in CTAPIPE_SVC_PATH; if it doesn't find the file there, it downloads the default file from the dataserver, which as I said is a bit out of date.)
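The lookup order described above can be sketched with the standard library alone. `find_in_search_path` is a hypothetical helper written for this illustration and is not part of ctapipe's API; it only shows the "first match in a separator-delimited path list, otherwise fall back" logic.

```python
import os
from pathlib import Path


def find_in_search_path(filename, env_var="CTAPIPE_SVC_PATH"):
    """Return the first existing file named `filename` in a path list.

    Hypothetical helper for illustration only; it is not ctapipe's API.
    The list is split on os.pathsep, which is ":" on POSIX systems.
    Returns None when no directory in the list contains the file, which is
    the point where a download of the default file would be attempted.
    """
    for directory in os.environ.get(env_var, "").split(os.pathsep):
        if not directory:
            continue
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None
```

Because the first match wins, putting the directory with the dumped instrument files at the front of the list makes those files take precedence over anything later in it.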

Update: I found the actual problem... In fact, the camera readout definition for LSTCam does not even exist in the ctapipe-extra directory on the dataserver. It seems the default behavior is to just return some dummy pulse shape if the file is not found, which has no real meaning (you should see a logger warning message if logging is set up)... Clearly this is not good behavior (I think it was there to prevent tests from failing until we updated the testing files, which never happened).

here is a comparison with prod3b:

This is also why you get only 1 reference pulse shape if you use from_name()...
I had already opened an issue about this, but obviously forgot: See #1450

kosack · 2021-02-02T10:04:29Z

I opened a PR with at least a temporary fix to the pulse shape problem.
Until that is accepted, you can also do the following as a hack while testing:

from ctapipe.utils import datasets
datasets.DEFAULT_URL = "http://cccta-dataserver.in2p3.fr/data/ctapipe-extra/v0.3.2/"

jsitarek · 2021-02-02T10:59:57Z

thanks a lot this is really helpful

- improved the calculation of the correction for not full integration of a pulse in SlidingWindowMaxSum extractor
- added another test (temporarily in a separate file because of PR cta-observatory#1588) with testing this correction for LST pulse shape

jsitarek · 2021-02-02T14:59:22Z

I made this correction properly, tested it using test_extractor.py, and also made an extra test (using the pulse shapes from @kosack's suggestion), so I made the commit, which, as you can see, fails:

ctapipe/image/tests/test_concentration.py::test_concentration FAILED [ 27%]
I've run the same test on my machine, but it passes.

I've run
git pull upstream master
to check if the current changes are in conflict with some other changes made in the meantime (I doubt it because I only changed the code of the new extractor which is not yet used anywhere)

and test_concentration.py still works on my PC.
I checked the other tests, and
tests/test_reducer.py::test_tailcuts_data_volume_reducer FAILED
now also fails on my PC; however, I do not see how this has anything to do with my commit, since a different extractor is used in that test.

I suspect that some other commit made in the meantime broke those tests.
How should we proceed?

kosack · 2021-02-02T15:07:14Z

> I made this correction properly, tested it using the test_extractor.py, and also made an extra test (using the pulse shapes from @kosack suggestion), so I made the commit, which as you can see however fails:

See #1588, I have exactly the same problem. Not sure what the solution is. So far I don't see why it is happening, except for a small change in pixel area (which I still don't understand, as the pixel distances are the same, and I compared the computation to past versions of ctapipe, and it is identical).

kosack · 2021-02-02T15:12:43Z

ctapipe/image/tests/test_sliding_window_correction.py

+
+from ctapipe.utils import datasets
+
+datasets.DEFAULT_URL = "http://cccta-dataserver.in2p3.fr/data/ctapipe-extra/v0.3.2/"


Shouldn't really set this globally, since it affects all other tests (and definitely should not be committed when we merge the PR). Perhaps for now, you might want to use the monkeypatch test fixture instead, something like:

def test_xxx(monkeypatch):
    with monkeypatch.context() as m:
        m.setattr(datasets, "DEFAULT_URL", "http://cccta-dataserver.in2p3.fr/data/ctapipe-extra/v0.3.2/")
        # the rest of the test

see https://docs.pytest.org/en/stable/monkeypatch.html

Hmm, I put this test explicitly into a separate file to avoid affecting other files with a global variable, because I thought that each file is run independently.
Either way, it is now modified as you suggested.

jsitarek · 2021-02-03T10:55:18Z

the failing tests were solved in the other PR, I did some small updates to solve the codacy issues, there are three left:

Instance of 'int' has no 'tel' member
self.window_width.tel[telid]

No value for argument 'sum_' in function call
charge, peak_time = extract_sliding_window(

Instance of 'int' has no 'tel' member
waveforms, self.window_width.tel[telid], self.sampling_rate[telid]

The 1st and 3rd are somewhat strange, because window_width is in fact not an int but an IntTelescopeParameter; while investigating this, I corrected how the width of the integration window was changed in the test file.
The 2nd one is, I guess, also not a problem but a feature of guvectorize.

so, can we merge this PR?

jsitarek · 2021-02-11T08:03:26Z

Hi @maxnoe @kosack
Please let me know if you want any additional modifications in this PR, or if you can give it already green light for merging

maxnoe · 2021-02-11T15:05:55Z

ctapipe/image/tests/test_sliding_window_correction.py

+def test_sw_pulse_lst():
+    """
+    Test function of sliding window extractor for LST camera pulse shape with
+the correction for the integration window completeness


wrong indentation here

corrected, I'm not sure why precommit and flake8 did not catch this.
thx @maxnoe for reapproval
@kosack I also need it from you
and then can one of you merge the PR? I do not have permissions for doing it myself.

Yes, will merge as soon as both approvals are there

It is valid code and since it's a test it is probably also not checked by the documentation build

a new signal extractor, slightly slower, but with better accuracy (in particular for weak pulses): SlidingWindowMaxSum It maximizes the sum on "width" consecutive slices

beab6d5

jsitarek requested review from HealthyPear and watsonjj as code owners December 29, 2020 15:25

added missing apply_integration_correction variable to SlidingWindowMaxSum class

4adb6ae

jsitarek added 4 commits December 30, 2020 11:11

adding a standard test of the newly added single extractor SlidingWindowMaxSum

3cc405f

added the test for the extract_sliding_window function that is used by SlidingWindowMaxSum extractor

1849d7d

fixing a few small issues from codacy scan

7e2844a

fixing two trailing whitespaces

c9c327e

kosack previously approved these changes Jan 5, 2021

maxnoe reviewed Jan 18, 2021

kosack added this to the v0.10.2 milestone Jan 21, 2021

kosack mentioned this pull request Feb 2, 2021

bump version of ctapipe-extra directory to v0.3.2 #1588

Merged

jsitarek dismissed kosack’s stale review via 439c203 February 2, 2021 14:16

Merge branch 'master' of https://github.com/cta-observatory/ctapipe into implement_SlidingWindowMaxSum

093dca3

kosack reviewed Feb 2, 2021

jsitarek added 3 commits February 2, 2021 16:21

moved the dirty fix of the LST pulse shape to monkeypatch

d3528e6

solving codacy warnings in PR 1588

8a2bce9

resolving codacy issues

84effe4

- swapped imports

jsitarek requested review from maxnoe and kosack February 3, 2021 10:49

maxnoe modified the milestones: v0.10.2, v0.11.0 Feb 4, 2021

maxnoe reviewed Feb 11, 2021

jsitarek added 3 commits February 11, 2021 11:41

Merge remote-tracking branch 'upstream/master' into implement_SlidingWindowMaxSum

69749bf

follow up of PR 1568

bf2d2e4

removed a monkeypatch that is unnecessary after PR 1451

fixed an error left in the previous commit that would fail the test

ae70502

kosack previously approved these changes Feb 11, 2021

maxnoe previously approved these changes Feb 11, 2021

maxnoe reviewed Feb 11, 2021

corrected indentation

7798779

jsitarek dismissed stale reviews from maxnoe and kosack via 7798779 February 11, 2021 15:19

maxnoe approved these changes Feb 11, 2021

kosack approved these changes Feb 11, 2021

kosack merged commit 8f5c793 into cta-observatory:master Feb 11, 2021

jsitarek deleted the implement_SlidingWindowMaxSum branch February 11, 2021 17:13
