Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implements add_column_transform_regexp feature #1690

Merged
merged 5 commits into from
May 11, 2021

Conversation

kosack
Copy link
Contributor

@kosack kosack commented May 3, 2021

This allows one to add column transforms by patterns that match the table name and column name, using re.fullmatch. It adds a new function add_column_transform_regexp(), so the original API does not change.

It also adds an internal helper function _realize_regexp_transforms() that converts transform patterns into explicit transform dicts that are applied as before. This way there is no speed impact compared to the original implementation, as this function is only called during schema generation.

This feature is needed in order to make the DL2 output of #1673 work if we split tables by tel_id, since it allows one to add column transforms to all tables for each algorithm without a huge and complex loop over all possible names.

Technical note: we could replace add_column_transform() with this implementation, i.e. to always treat all transforms as regexps and hide this as an implementation detail, but I decided on simply adding a separate API function for adding by regexp, since otherwise it could make schema generation slow in cases where there are a lot of tables (like DL1 files, etc).

This allows one to add column transforms by patterns that match the table name and column name, using re.fullmatch.  It adds a new function `add_column_transform_regexp()`, so the original API does not change.

It also adds an internal helper function `_realize_regexp_transforms()` that converts transform patterns into explicit transform dicts that are applied as before.  This way there is no speed impact compared to the original implementation, as this function is only called during schema generation.
@codecov
Copy link

codecov bot commented May 3, 2021

Codecov Report

Merging #1690 (86e088a) into master (8836a96) will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1690      +/-   ##
==========================================
+ Coverage   90.52%   90.57%   +0.04%     
==========================================
  Files         186      186              
  Lines       13998    14065      +67     
==========================================
+ Hits        12672    12739      +67     
  Misses       1326     1326              
Impacted Files Coverage Δ
ctapipe/io/hdf5tableio.py 96.92% <100.00%> (+0.01%) ⬆️
ctapipe/io/tableio.py 93.33% <100.00%> (+0.96%) ⬆️
ctapipe/io/tests/test_hdf5.py 98.47% <100.00%> (+0.06%) ⬆️
ctapipe/io/tests/test_astropy_helpers.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8836a96...86e088a. Read the comment docs.

@kosack kosack requested a review from maxnoe May 4, 2021 14:39
Copy link
Member

@maxnoe maxnoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kosack kosack requested a review from HealthyPear May 6, 2021 12:08
@maxnoe maxnoe added this to the v0.12.0 milestone May 10, 2021
@maxnoe maxnoe merged commit b30ea3b into cta-observatory:master May 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants