Chunk interpolation to select calibration data #2634

Open
wants to merge 31 commits into
base: main

Conversation

ctoennis

I will need a method to select calibration data for the star tracker. I made some slides to describe how it is supposed to work:
https://docs.google.com/presentation/d/1oxIcYSQvGnU7IQYy3fGdcv0qXiLpvaXR9YtmnDesj4Y/edit?usp=sharing

@ctoennis ctoennis requested review from maxnoe and kosack October 31, 2024 11:27
@ctoennis ctoennis self-assigned this Oct 31, 2024


Passed

Analysis Details

1 Issue

  • Bug 0 Bugs
  • Vulnerability 0 Vulnerabilities
  • Code Smell 1 Code Smell

Coverage and Duplications

  • Coverage 98.00% Coverage (94.30% Estimated after merge)
  • Duplications 0.00% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

@ctoennis
Author

@maxnoe @kosack Can you have another look to see whether anything else needs to be changed?

@mexanick
Contributor

@kosack @maxnoe this PR is needed to complete the pointing calibration (for the variance calibration application), can we advance it?

@ctoennis
Author

I am a bit stuck here with one of the tests. test_hdf5 is failing in pytest because the data from the file is not loaded correctly. When I look in the test at the x and y values the interpolators hold, I get wrong values. However, if I try to do the same outside of pytest, it works. I used this code to test it myself:

import astropy.units as u
import numpy as np
import tables
from astropy.table import Table
from astropy.time import Time

from functools import partial
from ctapipe.core import Component, traits

from ctapipe.monitoring.interpolation import PointingInterpolator

from ctapipe.io import write_table

t0 = Time("2022-01-01T00:00:00")

table = Table(
    {"time": t0 + np.arange(0.0, 10.1, 2.0) * u.s, "azimuth": np.radians(np.linspace(0.0, 10.0, 6)) * u.rad, "altitude": np.radians(>)

path = "pointing.h5"

write_table(table, path, "/dl0/monitoring/telescope/pointing/tel_001")

with tables.open_file(path) as h5file:
    interpolator = PointingInterpolator(h5file)
    t = t0 + 1 * u.s
    alt, az = interpolator(tel_id=1, time=t)
    print(interpolator._interpolators[1]["alt"].y,interpolator._interpolators[1]["alt"].x)
    print(alt,az)

Does anyone have an idea what is wrong here?

@maxnoe
Member

maxnoe commented Dec 4, 2024

Can also be done in a follow-up PR of course.

@mexanick
Contributor

mexanick commented Dec 4, 2024

This looks good now. A remaining question would be if you want to add specific ChunkInterpolators (like PointingInterpolator for LinearInterpolator)?

I.e. CalibrationInterpolator, FlatFieldInterpolator, PedestalInterpolator etc.

I'd consider having a factory then. I think just "CalibrationInterpolator" won't make much sense, but specific ones like "FFInterpolator" may. I'd address it in another PR.

@maxnoe
Member

maxnoe commented Dec 4, 2024

Yes, factory makes sense for this
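
For the follow-up PR, a minimal sketch of what such a factory could look like. Everything here is hypothetical and not existing ctapipe API: the registry, the `make_interpolator` helper, the short kind names, and the stand-in classes are only there to illustrate the idea.

```python
# Hypothetical sketch: none of these names are existing ctapipe API.


class ChunkInterpolator:
    """Stand-in for the base class discussed in this PR."""


class FlatFieldInterpolator(ChunkInterpolator):
    """Stand-in for a flat-field specialization."""


class PedestalInterpolator(ChunkInterpolator):
    """Stand-in for a pedestal specialization."""


# Registry keyed by a short calibration name (assumed naming scheme).
_REGISTRY = {
    "flatfield": FlatFieldInterpolator,
    "pedestal": PedestalInterpolator,
}


def make_interpolator(kind: str) -> ChunkInterpolator:
    """Return a new interpolator instance for the given calibration kind."""
    try:
        cls = _REGISTRY[kind]
    except KeyError:
        raise ValueError(f"Unknown interpolator kind: {kind!r}") from None
    return cls()
```

Keeping the mapping in one place would mean a new specialization only needs a new subclass and one registry entry.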

mexanick
mexanick previously approved these changes Dec 4, 2024
@mexanick mexanick requested a review from maxnoe December 4, 2024 11:05
super().__init__(**kwargs)
self._interpolators = {}
self.required_columns = ["start_time", "end_time"]
self.expected_units = {}
Member

Why are these instance variables? And why are they empty?

This is not in line with how the other class works.

It seems this class variable is unused even.

Contributor

Here required_columns should become a class attribute, but the interpolators and expected units should remain instance variables, to allow the creation of multiple instances. We will have to see in a future PR whether we want further specialization (e.g. a factory of VarNameInterpolators), which may change this (they would basically become singletons).

Contributor

The expected units are unused because the quantity is dimensionless; we may want to actually enforce this through u.dimensionless_unscaled.

Author

We had implemented it so that, for columns that are not supposed to have a unit, we set the expected unit to None and check whether the actual unit is equivalent. I did it that way here.

def __init__(self, h5file: None | tables.File = None, **kwargs: Any) -> None:
    super().__init__(**kwargs)
    self._interpolators = {}
    self.required_columns = ["start_time", "end_time"]
Contributor

The required start and end time columns should be class attributes and should be frozen; they are mandatory. You can copy them to the instance and extend them with the value column(s) when __call__ is invoked.
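
A minimal sketch of that pattern, assuming a simplified `__call__` that only receives column names. The class name, the `__call__` signature, and the `relative_gain` column used below are hypothetical and do not match the real interpolator.

```python
class ChunkInterpolatorSketch:
    """Hypothetical stand-in, not the ctapipe class."""

    # Mandatory columns: a class attribute, frozen so no instance can alter it.
    required_columns = frozenset({"start_time", "end_time"})

    def __call__(self, table_columns, value_columns):
        """Check a table's column names and return the full set that is needed."""
        # Build a per-call copy extended by the requested value columns.
        needed = set(self.required_columns) | set(value_columns)
        missing = needed - set(table_columns)
        if missing:
            raise ValueError(f"Missing required column(s): {sorted(missing)}")
        return needed
```

For example, calling an instance with the table columns `["start_time", "end_time", "relative_gain"]` and the value columns `["relative_gain"]` returns the three names, while the class-level `required_columns` stays unchanged.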

Author

As the functions that use required_columns are in the parent class and always look up that name, it makes more sense to have it as a modifiable instance variable.

Member

No. The original design idea was to have the required_columns frozen per "final" class. I.e. MonitoringInterpolator requires altitude and azimuth columns.

This is due to the configuration system in ctapipe, which works on class name basis, not instances.

Author

So, the most sensible way to keep the required units and columns as class variables is to have pedestal and flatfield interpolators that inherit from the ChunkInterpolator. The ChunkInterpolator now has no required columns or units, but rather the subclasses have those variables. If we use the FlatFieldInterpolator we know we will always look for a column with relative gain factors with no unit, similarly we know what data a PedestalInterpolator will need.

I made those classes, and if we want to use the Chunk interpolation later for some other data we can add another subclass.
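
Roughly, the layout looks like the following sketch. It is simplified and not the code in this PR: the class names carry a `Sketch` suffix, the column names `relative_gain` and `pedestal` are placeholders, units are plain strings with None meaning "no unit", the unitless pedestal entry is an assumption, and `check_table` is a hypothetical stand-in for the real table check.

```python
class ChunkInterpolatorSketch:
    """Hypothetical base class: declares no columns or units of its own."""

    required_columns = frozenset()
    expected_units = {}

    @classmethod
    def check_table(cls, column_units):
        """Validate a mapping of column name -> unit string (None = no unit)."""
        missing = cls.required_columns - set(column_units)
        if missing:
            raise ValueError(f"Missing column(s): {sorted(missing)}")
        for name, expected in cls.expected_units.items():
            if column_units[name] != expected:
                raise ValueError(
                    f"Column {name!r} has unit {column_units[name]!r}, "
                    f"expected {expected!r}"
                )


class FlatFieldInterpolatorSketch(ChunkInterpolatorSketch):
    # Placeholder column name; relative gain factors carry no unit.
    required_columns = frozenset({"start_time", "end_time", "relative_gain"})
    expected_units = {"relative_gain": None}


class PedestalInterpolatorSketch(ChunkInterpolatorSketch):
    # Placeholder column name; the unitless entry is an assumption.
    required_columns = frozenset({"start_time", "end_time", "pedestal"})
    expected_units = {"pedestal": None}
```

Because columns and units live on the classes, they stay fixed per "final" class, which is what the class-name-based configuration needs.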

self.required_columns.update(columns)
self.required_columns = set(self.required_columns)
for col in columns:
    self.expected_units[col] = None
Author

Here I set the unit of the new columns to None, which is then enforced in the next line by _check_tables. This way we ensure the values have no unit.


for column in self.columns:
    self.values[tel_id][column] = input_table[column]
    self.start_time[tel_id][column] = input_table["start_time"].to_value("mjd")
Member

Why store start_time and end_time per column?

raise ValueError(
f"Column '{column}' not found in interpolators for tel_id {tel_id}"
)
result[column] = self._interpolators[tel_id][column](mjd)
Member

Why use self._interpolators, why keep that around at all?

Why not just call self._interpolate_chunk(tel_id, column, mjd)?

Author

read_table checks if _interpolators has already been set up for the given tel_id. _check_interpolators in MonitoringInterpolator also does that check and adds data from hdf5file if the interpolator is not set. I can move column to be an argument of _interpolate_chunk though.

Member

That's because you didn't move _check_interpolators to the LinearInterpolator but kept it in MonitoringInterpolator. The MonitoringInterpolator still contains more than just the abstract interface.

mexanick
mexanick previously approved these changes Dec 11, 2024
@mexanick mexanick requested a review from maxnoe December 12, 2024 09:59
@mexanick mexanick self-requested a review January 16, 2025 09:41
@mexanick
Contributor

@maxnoe can you please take a look? I believe all your comments have been addressed.

self.values[tel_id] = {}
self.start_time[tel_id] = input_table["start_time"].to_value("mjd")
self.end_time[tel_id] = input_table["end_time"].to_value("mjd")
self._interpolators[tel_id] = partial(self._interpolate_chunk, tel_id)
Member

Why are we still storing a lambda / partial here in _interpolators? Why not just call self._interpolate_chunk directly in __call__?
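
To illustrate the suggestion, a minimal sketch in which `__call__` dispatches straight to `_interpolate_chunk` and the chunk times are stored once per telescope. It uses plain lists instead of tables; the class name, `add_table`, and the selection rule (last chunk in table order whose interval contains the time, NaN otherwise) are assumptions, not the behaviour of the code in this PR.

```python
import math


class ChunkLookupSketch:
    """Hypothetical stand-in for a chunk interpolator, using plain lists."""

    def __init__(self):
        self.start_time = {}  # tel_id -> list of chunk start times (MJD)
        self.end_time = {}    # tel_id -> list of chunk end times (MJD)
        self.values = {}      # tel_id -> {column: list of values per chunk}

    def add_table(self, tel_id, start, end, columns):
        """Store chunk boundaries once per telescope, values per column."""
        self.start_time[tel_id] = list(start)
        self.end_time[tel_id] = list(end)
        self.values[tel_id] = {name: list(vals) for name, vals in columns.items()}

    def __call__(self, tel_id, mjd, column):
        if tel_id not in self.values:
            raise KeyError(f"No table available for tel_id {tel_id}")
        # Call the method directly; no per-telescope partial is kept around.
        return self._interpolate_chunk(tel_id, column, mjd)

    def _interpolate_chunk(self, tel_id, column, mjd):
        """Return the value of the last chunk whose interval contains mjd."""
        result = math.nan
        chunks = zip(
            self.start_time[tel_id],
            self.end_time[tel_id],
            self.values[tel_id][column],
        )
        for start, end, value in chunks:
            if start <= mjd <= end:
                result = value
        return result
```

With this shape the "is the table loaded" check only needs to look at `self.values`, so a separate `_interpolators` dict would not be required.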

4 participants