Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add method to transform catalog to band-named columns #442

Merged
merged 3 commits into from
Sep 10, 2024

Conversation

mwcraig
Copy link
Contributor

@mwcraig mwcraig commented Sep 8, 2024

This adds a method for "un-tidying" the data in a catalog to get table in which separate passbands are columns. This briefly made me wonder if going with a tidy format for the catalog was a good decision, but I think it was. This ensures some uniformity across catalogs, hopefully.

This PR is part of the prep for bringing back magnitude transforms.

This adds a method for "un-tidying" the data in a catalog to get an
arrangement in which separate passbands are columns.
@mwcraig mwcraig added this to the 2.0.0-beta milestone Sep 8, 2024
@mwcraig mwcraig requested a review from JuanCab September 8, 2024 20:43
Copy link

codecov bot commented Sep 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.26%. Comparing base (e5422ea) to head (7a719a4).
Report is 124 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #442      +/-   ##
==========================================
+ Coverage   80.18%   80.26%   +0.08%     
==========================================
  Files          30       30              
  Lines        3921     3938      +17     
==========================================
+ Hits         3144     3161      +17     
  Misses        777      777              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@JuanCab JuanCab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as long as it passes the tests.

``mag_error_``. If the catalog already has columns with these names, they will
be overwritten. The input catalog will not be changed.
"""
catalog_passbands = set(self["passband"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, a use for set that isn't students trying to de-duplicate items in a list by looking up answer in Stack Overflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣

Comment on lines +1009 to +1018
# This makes a MultiIndex for the columns -- "mag" and "mag_error" are the
# top level, and the passbands are the second level.
df = df.pivot(
columns="passband", index=["id", "ra", "dec"], values=["mag", "mag_error"]
)

# The column names are a MultiIndex, so we flatten them to either "mag_band"
# or "mag_error_band", where "band" is the passband name.
df.columns = df.columns.to_series().str.join("_")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to stare at this a bit to see what it is doing. I think I understand it now. Your creating a pivoted table that lists magnitude and magnitude error with passband a a column, then using that to build mag_band and magerr_band columns... Still a bit of head scratcher for me, but I understand it has to be done this way.

Comment on lines +877 to +881
assert set(new_cat["mag_V"]) == set(apass_input["Vmag"])
assert set(new_cat["mag_error_V"]) == set(apass_input["e_Vmag"])
assert set(new_cat["mag_SI"]) == set(apass_input["i_mag"])
assert set(new_cat["mag_error_SI"]) == set(apass_input["e_i_mag"])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a reasonable way to check that the new catalog format holds the same data as the original.

@mwcraig mwcraig merged commit 754f24f into feder-observatory:main Sep 10, 2024
18 checks passed
@mwcraig mwcraig deleted the add-passband-column-format branch September 10, 2024 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants