signal extractor SlidingWindowMaxSum #1568
… particular for weak pulses): SlidingWindowMaxSum It maximizes the sum on "width" consecutive slices
Codecov Report
@@ Coverage Diff @@
## master #1568 +/- ##
==========================================
- Coverage 90.80% 90.76% -0.05%
==========================================
Files 192 191 -1
Lines 14006 14060 +54
==========================================
+ Hits 12718 12761 +43
- Misses 1288 1299 +11
I made a few iterations to address the issues flagged by the codacy and coverage checks. I think the code is ready for review,
Looks good to me.
In fact, I think a similar implementation could be used to speed up the TwoPass "MARS-like" method (which also uses a sliding window in the first pass).
This is because the code that actually runs is the compiled numba code, not the Python function. Unfortunately, that means that numba functions do not report coverage correctly.
By the way, for the Numba code coverage issue, see #1400
ctapipe/image/extractor.py (outdated)
    This method is decorated with @lru_cache to ensure it is only
    calculated once per telescope.

    WARNING: TO BE DONE properly, the current code reuses the function of
Could you implement this directly here? It does not sound too complicated.
The main reason I did not do so is that this feature does not seem to be used (at least in LST), so I did not have a proper set-up to test it. But I can look into making some dummy pulse shape and testing on it.
that would be great
You have the reference pulse shape in the CameraDescription. I guess it's a fairly small effect though, and the correction doesn't really matter much except to get the cleaning thresholds in the same units for all cameras.
Sorry for the delay, I restarted working on this.
CameraDescription.readout is where the code is taking the pulse shape from. However this is not really reliable.
If I execute the following code
    import astropy.units as u
    import matplotlib.pyplot as plt
    import numpy as np

    from ctapipe.instrument import SubarrayDescription, TelescopeDescription

    plt.ion()

    # Build a one-telescope subarray from the default (from_name) descriptions
    subarray = SubarrayDescription(
        "LST1",
        tel_positions={1: np.zeros(3) * u.m},
        tel_descriptions={
            1: TelescopeDescription.from_name(
                optics_name="LST", camera_name="LSTCam"
            ),
        },
    )

    readout = subarray.tel[1].camera.readout
    pulse_shape = readout.reference_pulse_shape[0]  # first channel (index assumed)
    dt = readout.reference_pulse_sample_width
    xs = np.arange(len(pulse_shape)) * dt
    plt.plot(xs, pulse_shape)
I get the following figure:
[figure: reference pulse shape vs. time]
which is a much broader pulse than it should be.
The calculation of the correction factor would be much simpler if the pulse shape in this class had the same binning as the actual readout. This is the case in the above example, and one would assume it can be taken for granted, since the shape is taken from the "readout" object, which has the binning embedded. However, in the first tests that I was doing in lstchain, where the array was read from the data, the pulse shapes were actually a delta function with an SSC-like sampling, so obviously it cannot be taken for granted.
I will change the code to use a simple conversion and rounding of sampling to make it work also in this more general case, but the whole issue of the LST pulse shape deserves a separate "issue"
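As an illustration of what "a simple conversion and rounding of sampling" could look like, here is a minimal sketch (not the code in this PR). It assumes the reference pulse is sampled at least as finely as the readout; the function name `rebin_to_readout` and the sample widths are made up for the example.

```python
import numpy as np


def rebin_to_readout(ref_pulse, ref_width_ns, readout_width_ns):
    """Sum groups of reference samples so each bin spans ~one readout sample."""
    ref_pulse = np.asarray(ref_pulse, dtype=float)
    # Number of reference samples per readout sample, rounded to an integer.
    ratio = max(1, int(round(readout_width_ns / ref_width_ns)))
    # Drop trailing samples that do not fill a complete readout bin.
    n_bins = len(ref_pulse) // ratio
    trimmed = ref_pulse[: n_bins * ratio]
    return trimmed.reshape(n_bins, ratio).sum(axis=1)


# Hypothetical numbers: 0.25 ns reference sampling, 1 ns readout sampling.
fine = np.arange(12, dtype=float)
coarse = rebin_to_readout(fine, ref_width_ns=0.25, readout_width_ns=1.0)
```

When the two samplings are already equal, the ratio is 1 and the pulse is returned unchanged, so the same code path covers both cases mentioned above.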
EDIT: I forgot to mention that there seems to be only one reference pulse shape in the CameraDescription, while in reality we should have HG and LG
The from_name() methods load up a file from ctapipe-extra (which is now a directory on a server rather than a package), and are just meant for unit-testing purposes. Currently everything in ctapipe-extra is from PROD3 or even PROD2 simulations, so quite out of date for real analysis. In the future I want to clean that up and have an option to select which "prod" to use, but there has not been manpower for that (see e.g. #738).
If you load real data from a SimTel file or something else supported, the correct waveform should be loaded into the instrument model that you get from source.subarray.
E.g. if you do:
    import matplotlib.pyplot as plt
    from ctapipe.io import EventSource

    with EventSource("some_prod5_sim.simtel.gz") as source:
        readout = source.subarray.tel[2].camera.readout
        # first channel of the reference pulse vs. its sample times
        plt.plot(readout.reference_pulse_sample_time, readout.reference_pulse_shape[0])
You will get the "latest" pulse that is defined in Prod5
Note that if you don't want to always load up a file, you can take any reference prod5 file, for example, and run
    ctapipe-dump-instrument --input=some_reference_file.simtel.gz
and it will dump a bunch of FITS files, including the camera geometry and readout definitions, to the local directory. You can then setenv CTAPIPE_SVC_PATH=[directory where those files are], and ctapipe will use that when you run the from_name() functions instead of the defaults (by default it searches all paths listed in the ":"-separated list in CTAPIPE_SVC_PATH first; then, if it doesn't find the file there, it downloads the default file from the dataserver, which, as I said, is a bit out of date).
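The lookup order described above can be sketched roughly as follows. This is only an illustration of the idea, not ctapipe's actual implementation: the helper name `find_in_search_path` and the file name used in the example are hypothetical, and the download fallback is left out.

```python
import os
from pathlib import Path


def find_in_search_path(filename, env_var="CTAPIPE_SVC_PATH"):
    """Return the first existing match for `filename`, or None.

    Hypothetical helper illustrating the lookup order only.
    """
    # The environment variable holds a ":"-separated list of directories.
    for directory in os.environ.get(env_var, "").split(":"):
        if not directory:
            continue
        candidate = Path(directory) / filename
        if candidate.exists():
            return candidate
    # A real implementation would fall back to downloading the default here.
    return None
```

For example, with `CTAPIPE_SVC_PATH=/data/prod5_instrument:/data/other`, a call like `find_in_search_path("example.readout.fits.gz")` would return the copy in the first directory that contains such a file.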
Update: I found the actual problem... In fact, the camera readout definition for LSTCam does not even exist in the ctapipe-extra directory on the dataserver. It seems the default behavior is to just return some dummy pulse shape if the file is not found, which has no real meaning (you should see a logger warning message if logging is set up)... Clearly this is not good behavior (I think it was there to prevent tests from failing until we updated the testing files, which never happened).
This is also why you get only 1 reference pulse shape if you use from_name()...
I had already opened an issue about this, but obviously forgot: See #1450
I opened a PR with at least a temporary fix to the pulse shape problem.
    from ctapipe.utils import datasets
    datasets.DEFAULT_URL = "http://cccta-dataserver.in2p3.fr/data/ctapipe-extra/v0.3.2/"
Thanks a lot, this is really helpful.
- improved the calculation of the correction for not full integration of a pulse in SlidingWindowMaxSum extractor
- added another test (temporarily in a separate file because of PR cta-observatory#1588) with testing this correction for LST pulse shape
…nto implement_SlidingWindowMaxSum
I made this correction properly, tested it using test_extractor.py, and also added an extra test (using the pulse shapes from @kosack's suggestion). I then made the commit, which, as you can see, fails:
    ctapipe/image/tests/test_concentration.py::test_concentration FAILED [ 27%]
I've run it and test_concentration.py still works on my PC. I suspect that some other commit made in the meantime broke those tests.
See #1588, I have exactly the same problem. Not sure what the solution is. So far I don't see why it is happening, except for a small change in pixel area (which I still don't understand, as the pixel distances are the same, and I compared the computation to past versions of ctapipe, and it is identical).
    from ctapipe.utils import datasets

    datasets.DEFAULT_URL = "http://cccta-dataserver.in2p3.fr/data/ctapipe-extra/v0.3.2/"
Shouldn't really set this globally, since it affects all other tests (and definitely should not be committed when we merge the PR). Perhaps for now, you might want to use the monkeypatch test fixture instead, something like:
    from ctapipe.utils import datasets

    def test_xxx(monkeypatch):
        with monkeypatch.context() as m:
            m.setattr(datasets, "DEFAULT_URL", "http://cccta-dataserver.in2p3.fr/data/ctapipe-extra/v0.3.2/")
            # the rest of the test
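To illustrate why a scoped patch is safer than a module-level assignment: pytest normally imports all test files into one Python process, so a module attribute changed in one file stays changed for tests in other files. The sketch below uses `unittest.mock.patch.object` from the standard library as a stand-in for pytest's `monkeypatch`, and a `SimpleNamespace` as a stand-in for `ctapipe.utils.datasets`; the URLs are placeholders.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for the `ctapipe.utils.datasets` module (assumption for illustration).
datasets = SimpleNamespace(DEFAULT_URL="http://example.org/default/")


def current_url():
    return datasets.DEFAULT_URL


# Scoped patch: the override is only visible inside the `with` block.
with mock.patch.object(datasets, "DEFAULT_URL", "http://example.org/patched/"):
    inside = current_url()
after_scoped = current_url()

# Global assignment: the override stays for everything that runs afterwards.
datasets.DEFAULT_URL = "http://example.org/patched/"
after_global = current_url()
```

After the `with` block the default URL is restored, whereas after the plain assignment every later caller sees the patched value.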
Hmm, I put this test explicitly into a separate file to avoid affecting other files with a global variable, because I thought that each file is run independently.
Either way, it is now modified as you suggested.
The failing tests were solved in the other PR. I made some small updates to solve the codacy issues; there are three left:
- Instance of 'int' has no 'tel' member
- No value for argument 'sum_' in function call
- Instance of 'int' has no 'tel' member
The 1st and 3rd are somewhat strange, because window_width is in fact not an int but an IntTelescopeParameter; while investigating this, though, I corrected how the width of the integration window was changed in the test file. So, can we merge this PR?
removed a monkeypatch that is unnecessary after PR 1451
    def test_sw_pulse_lst():
    """
    Test function of sliding window extractor for LST camera pulse shape with
    the correction for the integration window completeness
wrong indentation here
Yes, will merge as soon as both approvals are there
It is valid code and since it's a test it is probably also not checked by the documentation build
A new simple signal extractor, slightly slower but more accurate (in particular for weak pulses): SlidingWindowMaxSum.
It maximizes the sum over "width" consecutive slices.
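A minimal NumPy sketch of this idea, for illustration only (it is not the numba implementation in this PR): for each pixel, sum every run of `width` consecutive samples and keep the largest sum. The function name `sliding_window_max_sum` and the toy waveform are assumptions made for the example.

```python
import numpy as np


def sliding_window_max_sum(waveforms, width):
    """Return (max_sum, start_index) of the best window for each pixel.

    waveforms: array of shape (n_pixels, n_samples)
    width: number of consecutive samples to integrate
    """
    waveforms = np.asarray(waveforms, dtype=float)
    n_samples = waveforms.shape[-1]
    if not 1 <= width <= n_samples:
        raise ValueError("width must be between 1 and n_samples")
    # Cumulative sum with a leading zero, so that the sum of
    # samples [i, i + width) is csum[i + width] - csum[i].
    csum = np.concatenate(
        [np.zeros(waveforms.shape[:-1] + (1,)), np.cumsum(waveforms, axis=-1)],
        axis=-1,
    )
    window_sums = csum[..., width:] - csum[..., :-width]
    start = np.argmax(window_sums, axis=-1)
    max_sum = np.take_along_axis(window_sums, start[..., None], axis=-1)[..., 0]
    return max_sum, start


# Toy example: one pixel with a small pulse on top of a flat baseline.
waveform = np.array([[0.0, 1.0, 0.0, 2.0, 5.0, 3.0, 0.0, 1.0]])
charge, start = sliding_window_max_sum(waveform, width=3)
```

In the toy waveform the best 3-sample window covers the samples 2, 5, 3, so the window does not have to be centred on the peak sample.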
Some speed tests, using 3 trials of 1000 events of LST1 data:
- only r0 ==> r1 calibration: 14.575s, 14.465s, 14.550s
- LocalPeakWindowSum (current extractor): 19.853s, 20.188s, 20.698s
- MaxWindowSum (new code): 21.731s, 20.813s, 21.132s
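To put "slightly slower" into numbers, the timings above can be averaged and compared. This is just arithmetic on the quoted values; subtracting the calibration-only time to isolate the extraction step is an assumption about how the timings decompose.

```python
import numpy as np

# Timings in seconds, copied from the three trials above.
calib_only = np.array([14.575, 14.465, 14.550])
local_peak = np.array([19.853, 20.188, 20.698])
sliding_max = np.array([21.731, 20.813, 21.132])

# Relative slowdown of the full chain (calibration + extraction).
total_slowdown = sliding_max.mean() / local_peak.mean() - 1

# Relative slowdown of the extraction step alone, after subtracting
# the r0 -> r1 calibration time common to both.
extract_old = local_peak.mean() - calib_only.mean()
extract_new = sliding_max.mean() - calib_only.mean()
extract_slowdown = extract_new / extract_old - 1

print(f"full chain: {total_slowdown:.1%}, extraction only: {extract_slowdown:.1%}")
```

With these three trials the full chain is about 5% slower and the extraction step alone about 17% slower, with a spread between trials that is not negligible.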
One feature can still be improved, namely the correction for the signal outside of the integration window. The current code reuses the LocalPeakWindowSum approach, assuming that the shift is half of the total window, which is correct only if the pulse is symmetric (which is not really the case).
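One way to avoid the symmetry assumption, sketched here under the assumption that the reference pulse is already sampled with the readout binning: take the ratio of the full pulse integral to the largest `width`-sample sum of the reference pulse. The function name `window_correction` and the pulse values are hypothetical, and this is not necessarily how the PR ends up computing it.

```python
import numpy as np


def window_correction(reference_pulse, width):
    """Correction factor = total pulse integral / largest `width`-sample sum.

    Assumes `reference_pulse` is already sampled with the readout binning.
    """
    pulse = np.asarray(reference_pulse, dtype=float)
    # Sum of every run of `width` consecutive samples.
    window_sums = np.convolve(pulse, np.ones(width), mode="valid")
    return pulse.sum() / window_sums.max()


# Hypothetical asymmetric pulse: fast rise, slow tail.
pulse = np.array([0.0, 1.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.0])
corr = window_correction(pulse, width=3)
```

The extracted charge would then be multiplied by this factor. Because the window is chosen by maximum sum on the reference pulse as well, the factor stays consistent with what the extractor does on data, whatever the pulse asymmetry.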