Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HillasReconstructor in case of too few telescopes #994

Merged
merged 6 commits into from
Jun 5, 2019

Conversation

LukasNickel
Copy link
Member

@mackaiver , @maxnoe and me faced an issue where the HillasReconstructor would fail even though there were two telescopes left after cleaning. It turned out that you could encounter situations where one of the hillas ellipses had a width of np.NaN because there were only a few pixels remaining which are all aligned.
This would render the weights useless and leave the h_max estimation with a singular matrix.

We decided that the HillasReconstructor should check for valid telescopes (that is telescopes with a defined width of the hillas ellipse) and that we would not even try to reconstruct other events (which should be very few if any).

Is there any reason against this approach?

@LukasNickel LukasNickel requested a review from kbruegge February 28, 2019 13:14
@vuillaut
Copy link
Member

Hi.

Checking for valid telescopes is indeed a good approach. Crucial parameters can fail for many reasons.

I also remember suggesting in some discussion about Hillas parameters that the width should be equal to 0 and not nan in case of aligned pixels.

@LukasNickel
Copy link
Member Author

I think u refer to #772, right?

if len(hillas_dict) < 2:

# stereoscopy needs at least two telescopes with valid widths
valid_telescopes = sum([1 if not np.isnan(hillas_dict[x]['width'].value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could directly build a new dict here with all the valid telescopes and only use those later.

E.g. valid_hillas = {tel_id: hillas for tel_id, hillas in hillas_dict.items() if np.isfinite(hillas['width'].value}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be the cleaner approach for now but it kind of comes back to #772.
Do we still use these images by adjusting the weights or do we not use them at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, thanks for digging it out.
My feeling is that 3 aligned pixels still give information about the direction and should not simply be thrown out - but I have not done a study on that ;-)
I see the problem with weights, however, and have no clear fix to handle this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's nice to keep track of what telescopes were dropped. Generally, i think a real pipeline would use a CutFlow to select telescopes before Hillas parameterization (e.g. with a minimum width and length or number of surviving pixels), but the code should still do reasonable things without that separate step, so testing for at least NaN is good. It should only skip telescopes if there is a real problem, and leave it up to the user to make any looser cuts for robustness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and by real problem I mean e.g. a singular matrix. In that case you could still reconstruct a direction if possible, and set the h_max to NaN

@thomasgas
Copy link

Just a question: what happens in case of we have 3 telescopes triggered and one of them has a NaN width? is h_max well reconstructed in this case?
I would also add a comment above those lines saying something like "This is done in case the width is NaN and h_max cannot be reconstructed"...or something like this...so that in the future we forget why we put it there.

@codecov
Copy link

codecov bot commented Feb 28, 2019

Codecov Report

Merging #994 into master will increase coverage by 0.05%.
The diff coverage is 95.12%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #994      +/-   ##
=========================================
+ Coverage   84.14%   84.2%   +0.05%     
=========================================
  Files         181     181              
  Lines       10907   10946      +39     
=========================================
+ Hits         9178    9217      +39     
  Misses       1729    1729
Impacted Files Coverage Δ
ctapipe/reco/HillasReconstructor.py 97.45% <100%> (+1.02%) ⬆️
ctapipe/reco/tests/test_HillasReconstructor.py 94.44% <94.11%> (-0.23%) ⬇️
ctapipe/core/container.py 69.3% <0%> (+0.99%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2efcfa0...8450731. Read the comment docs.

@LukasNickel
Copy link
Member Author

Just a question: what happens in case of we have 3 telescopes triggered and one of them has a NaN width? is h_max well reconstructed in this case?
I would also add a comment above those lines saying something like "This is done in case the width is NaN and h_max cannot be reconstructed"...or something like this...so that in the future we forget why we put it there.

Im not sure actually. If we only go with the valid telescopes as @maxnoe suggested above we would obviously avoid that (potential) problem

@thomasgas
Copy link

Im not sure actually. If we only go with the valid telescopes as @maxnoe suggested above we would obviously avoid that (potential) problem

I agree we should treat those images in a certain way (and we can discuss about this).
I'm just wondering that happens to those passing this condition even though the width in one on them might be NaN.
In any case I would approve this PR: if you could also have a look at what happens to the reconstruction if we use those images by setting the width to a certain small value (which of course should make sense) instead of NaN, it would be great!

@LukasNickel
Copy link
Member Author

Im not sure actually. If we only go with the valid telescopes as @maxnoe suggested above we would obviously avoid that (potential) problem

I agree we should treat those images in a certain way (and we can discuss about this).
I'm just wondering that happens to those passing this condition even though the width in one on them might be NaN.
In any case I would approve this PR: if you could also have a look at what happens to the reconstruction if we use those images by setting the width to a certain small value (which of course should make sense) instead of NaN, it would be great!

I will try to find some time for this :)

@kosack
Copy link
Contributor

kosack commented Feb 28, 2019

Reiterating my comments from above, we should avoid putting any "cuts" in the code, other than absolute failure cases (e.g. NaN or Inf preventing a sensible value). The user should explicitly control all cuts before the hillas code is executed (e.g they might replace the telescope dict with a subset after rejecting telescopes for any reason)

@kosack
Copy link
Contributor

kosack commented Feb 28, 2019

And we must make sure any ignored telescopes are recorded somehow, so we don't get the case of events that are marked as 3 tel in the output that only really used 2 telescopes (which will throw off any benchmarks we do)

@maxnoe
Copy link
Member

maxnoe commented Feb 28, 2019

Reiterating my comments from above, we should avoid putting any "cuts" in the code, other than absolute failure cases (e.g. NaN or Inf preventing a sensible value). The user should explicitly control all cuts before the hillas code is executed (e.g they might replace the telescope dict with a subset after rejecting telescopes for any reason
)

That would mean to fail loudly, if any of the telescopes contain unusable data.
I like that a lot! This makes it much easier, to design systems, that keep track where and why telescope data was discarded.

So this means, the Reconstructor should fail early and load, if any of its input data is invalid.

@LukasNickel
Copy link
Member Author

@kpfrang
Did you actually succeed with other weighting methods for the planes?

@kosack
Copy link
Contributor

kosack commented Mar 4, 2019

So this means, the Reconstructor should fail early and load, if any of its input data is invalid.

Yes, I think that's cleaner. I worry about adding any "hidden" cuts inside classes like Reconstructors, since it will clearly lead to some unexpected behavior later when people forget they are there (e.g. expecting the telescope multiplicity to be what the reconstructor used, when sometimes it is not).

@kbruegge
Copy link
Member

kbruegge commented Mar 4, 2019

So this should fail as soon as there any nans in the hillas_dict. Makes sense to me. 👍

@thomasgas
Copy link

I would like to solve this PR asap since I just run into this same problem today.
I had 28 telescopes triggered in a huge event and with just one of them having a width = 0 the reconstruction was giving nan. How to solve this:

  • discard all telescopes with width = 0 (should be done in the cleaning and not hard-coded then);
  • if width = 0 (which is unlikely with more than 3 pixels remaining...), use @dneise suggestion (Corrupted weighting in HillasReconstructor for small images #772 (comment)) to use a minimum value for width from the pixel geometry;
  • discard all telescopes with less than 5 (or configurable number of) pixels remaining after cleaning

We leave this selection to the user (replace width or apply some cleaning) and decide that in the HillasReconstructor we keep only those telescopes not breaking the reconstruction? Of course if the reconstructor drops a telescope, this should be reported to the user (with explanations).

Since my problem is with width = 0 and not width = NaN, I would add the 0-case to the exclusion of a telescope from the reconstruction because asking for isnan doesn't solve my problems.

- Hillas Reconstroctur now fails in any of these cases:
	- len(hillas_dict) < 2
	- any width is np.nan
	- any width is 0
thomasgas
thomasgas previously approved these changes Mar 11, 2019
.format(len(hillas_dict)))

# check for np.nan or 0 width's as these screw up weights
if any([np.isnan(hillas_dict[tel]['width'].value) for tel in hillas_dict]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better in one loop and use a generator expression?

any(np.isnan(h.width.value) or h.width.value == 0 for h in hillas_dict.values())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine with me. I wanted to seperate the error messages but we can combine them if you feel like we dont gain anything with it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to print two different messages in case of w=0 or w=NaN.
Ok to use one generator expression per each case.

@kpfrang
Copy link
Member

kpfrang commented Mar 11, 2019

@kpfrang
Did you actually succeed with other weighting methods for the planes?

What I did was investigating what we gain using weights from lookup tables for the distance of closest approach between the semi-major axis of the image and the true direction projected in the camera.

Besides improving the overall reconstruction, it also helped to save those images. For this investigation I always used width = 0 for such images (back then this was not always consistent). The LUTs were able to deal with 0 as one of the parameters was the ratio width / length. If width = NaN however, the LUTs obviously would also fail to come up with a weight for those images.

I did not dig too deep into this, but I didn't find a problem in the reconstruction when I kept those images.

@thomasgas
Copy link

What about using those weights to try to fix the reconstruction for just 2 LSTs? Would it solve the issue that @jjlk showed in #967?

Copy link
Contributor

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry this has sat around for so long - I hadn't noticed it was ready. Still, it's missing a unit test to check that the exception is thrown. Could you add one? Then it's ready

- there are less than 2 telescopes for the given event
- any width is 0
- any width is nan

Implementation is mostly copied from the test_reconstruction test.
@LukasNickel LukasNickel force-pushed the fix_hillas_reconstructor branch from 0572d33 to 8450731 Compare May 22, 2019 11:38
@LukasNickel LukasNickel requested a review from kosack May 22, 2019 12:12
@HealthyPear
Copy link
Member

Hi all,

regarding this (still open) discussion, now that the protopipe code is out, you could give a look to event_preparer.py.

I couldn't yet elaborate on what Julien left me, but I believe that from line #133 of that file there is code very similar to what you suggest - perhaps it can be possible to get inspired from it.

@kosack kosack merged commit 1480c0c into cta-observatory:master Jun 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants