Mosviz Retrieve SourceID from meta #1851

duytnguyendtn · 2022-11-16T18:04:49Z

Description

This PR fixes a bug found by a user with a private dataset where Mosviz fails to find and populate the SOURCEID column, despite SOURCEID being registered in the metadata plugin. The main fix introduces a new, preferred method of finding the SOURCEID: searching through the .meta attached to the data object rather than re-reading the file and searching through the header. This requires glue-viz/glue-jupyter#336 to ensure the .meta is available before the metadata is parsed.

This new strategy requires the data to be parsed and loaded first, to populate the data collection and associated .meta entries. Only after all the data is loaded do we search for the metadata. Because the data already has its metadata attached to it, there is no need to reopen or pass around HDUs anymore.
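The load-first, search-afterwards order described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `FakeData`, `collect_from_meta`, and the `"Unspecified"` fallback are hypothetical names standing in for the real data objects and helpers, and the SOURCEID values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class FakeData:
    """Hypothetical stand-in for a loaded data object that carries a .meta dict."""
    label: str
    meta: dict = field(default_factory=dict)


def collect_from_meta(data_collection, key, default="Unspecified"):
    """Return one value per loaded entry, read from its attached .meta."""
    return [data.meta.get(key, default) for data in data_collection]


# Step 1: everything is loaded first, so each entry already has its .meta.
data_collection = [
    FakeData("1D Spectrum 0", {"SOURCEID": 2315}),
    FakeData("1D Spectrum 1", {"SOURCEID": 1125}),
    FakeData("1D Spectrum 2", {}),  # keyword missing from this entry
]

# Step 2: only afterwards is the metadata searched; no file is reopened.
source_ids = collect_from_meta(data_collection, "SOURCEID")
print(source_ids)
```

Because the lookup only touches objects that are already in memory, nothing file-related (paths, open HDU lists) has to be threaded through the parser.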

This PR also cleans up the metadata detection by explicitly requiring the parsers to specify the data_type of the data's metadata being parsed. We were kind of already requiring this, but in a convoluted way (relying on a spectra and an sp1d flag: if spectra and sp1d were both true, it was 1D spectra; if spectra was true but not sp1d, it was 2D spectra; and if neither was true, it was an image 😖).
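To make the difference concrete, here is a rough sketch of the two approaches. The function names and the "1D Spectra" and "Images" labels are assumptions for illustration (only "2D Spectra" appears in the diff discussed in this PR); this is not the actual parser code.

```python
def legacy_data_type(spectra=False, sp1d=False):
    """Decode the old flag pair into a data type label (hypothetical helper)."""
    if spectra and sp1d:
        return "1D Spectra"
    if spectra:
        return "2D Spectra"
    # Note: spectra=False with sp1d=True has no stated meaning in the old
    # scheme; this sketch arbitrarily treats it as an image too.
    return "Images"


# Labels assumed for illustration; only "2D Spectra" is confirmed by the diff.
VALID_DATA_TYPES = ("1D Spectra", "2D Spectra", "Images")


def explicit_data_type(data_type):
    """Accept only a data type that the caller names outright."""
    if data_type not in VALID_DATA_TYPES:
        raise ValueError(
            f"data_type must be one of {VALID_DATA_TYPES}, got {data_type!r}"
        )
    return data_type
```

With the explicit form, a typo or an unsupported type fails loudly at the call site instead of silently falling through to the image branch.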

As a stretch goal, I also switched both the NIRISS and Generic/NIRSpec parsers to use this new metadata parsing strategy. Notice how the metadata block at the bottom looks almost identical to the mos_metadata_parser used for the generic/NIRSpec strategy.... how interesting.... 😉 (Future PR)

Tests are failing because they need this upstream patch. Please install glue-viz/glue-jupyter#336 when doing your local testing!

Blocked by

Allow empty Data to be loaded into the Table Viewer glue-viz/glue-jupyter#336

Need to bump glue-jupyter minversion in setup.cfg after upstream fix is merged and released.

Change log entry

Is a change log needed? If yes, is it added to CHANGES.rst? If you want to avoid merge conflicts,
list the proposed change log here for review and add to CHANGES.rst before merge. If no, maintainer
should add a no-changelog-entry-needed label.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
Do the proposed changes follow the STScI Style Guides?
Are tests added/updated as required? If so, do they follow the STScI Style Guides?
Are docs added/updated as required? If so, do they follow the STScI Style Guides?
Did the CI pass? If not, are the failures related?
Is a milestone set? Set this to bugfix milestone if this is a bug fix and needs to be released ASAP; otherwise, set this to the next major release milestone.
After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

jdaviz/configs/mosviz/plugins/parsers.py

pllim · 2022-11-17T16:36:32Z

Since this needs glue-viz/glue-jupyter#336, should we convert this back to draft until that patch is merged and released? Then this PR would also need to bump glue-jupyter minversion.

Also since the original bug was found in private, how should we really review this?

duytnguyendtn · 2022-11-17T16:40:51Z

should we convert this back to draft until that patch is merged and released?

This is the first PR I've authored that requires an upstream fix. I opened this for review because I figured we could review it while we are getting the upstream minversion bump ready. But I'm happy to move this back to draft if that's the policy!

original bug was found in private, how should we really review this?

The data was from our Viz Stress Test notes. See JDAT-2736 for the internal data.

pllim · 2022-11-17T16:45:23Z

Moving this to draft PR status would prevent accidental merge even after someone approves it.

duytnguyendtn · 2022-11-17T16:46:07Z

Got it; moved back to draft

camipacifici · 2023-01-05T02:59:20Z

The glue-jupyter patch has been merged. Can this be resumed? @duytnguyendtn

Remove unnecessary book keeping
fix hdu parser
Consolidate meta/hdu sourceid methods
Switch meta parser to common sourceid finder
Initialize table at config load
Generic parser: parse image data before image metadata to make image metadata available for metadata parser
Explicitly define which data type is being parsed
Parse IDs from proper data type
Fix meta parser detection
Make meta parser language generic to data type
Rely on 1D spectra entirely for identifier info
Parse 1D spectra metadata first to put Identifier as first column
Rewrite sourceid meta finder to be generic for any keyword
Meta parser use generic metadata searcher rather than hdu
Refactor NIRISS parser to start using generic metadata finder
Codestyle
Cleanup
Remove expected warning due to increased robustness
Modify tests to expect mos table as first data object
codestyle
Change test to expect mos table to be first
Modify linking assumption to use first real data for Mosviz (accomodate mos table being dc[0])
Change test to expect mos table to be first
Simplify iterable check
Change test to expect mos table to be first
Change nirspec tests to expect new load order
Codestyle
Increase sourceid fallback robustness
Add Docstrings
Revive and move hdu parsing to source id by hdu method

duytnguyendtn · 2023-01-10T17:18:26Z

Almost ready for review. A note for @rosteen: this new strategy seems to work out of the box for the level 2 and NIRCam test case you included in #1835, because it actually searches the data.meta of every loaded entry for the sourceid; hence, there is no need to manually specify the repeat number. Specifically:

https://github.com/rosteen/jdaviz/blob/30452e47d19bf8b06b509c52d5b45fc2fa631ce2/jdaviz/configs/mosviz/plugins/parsers.py#L191-L198

I did my best to gracefully remove it from this PR, but you may want to double check that I didn't misinterpret something or remove something that's actually necessary.

codecov · 2023-01-10T17:23:36Z

Codecov Report

Base: 91.81% // Head: 91.78% // Decreases project coverage by -0.02% ⚠️

Coverage data is based on head (9186dab) compared to base (ffd343e).
Patch coverage: 92.85% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1851      +/-   ##
==========================================
- Coverage   91.81%   91.78%   -0.03%     
==========================================
  Files         140      140              
  Lines       15045    15066      +21     
==========================================
+ Hits        13814    13829      +15     
- Misses       1231     1237       +6

Impacted Files	Coverage Δ
jdaviz/configs/mosviz/helper.py	`87.38% <81.81%> (+0.08%)`	⬆️
jdaviz/configs/mosviz/plugins/parsers.py	`89.83% <90.66%> (-0.85%)`	⬇️
jdaviz/app.py	`94.18% <100.00%> (-0.12%)`	⬇️
jdaviz/configs/mosviz/tests/test_data_loading.py	`100.00% <100.00%> (ø)`
jdaviz/configs/mosviz/tests/test_parsers.py	`99.07% <100.00%> (ø)`

duytnguyendtn · 2023-01-10T20:25:57Z

Okay, I think I've cleared this PR as ready for review! Most of the changes are infrastructure behind-the-scenes; the coverage dropped a tiny bit, but weirdly seems unrelated to this PR (the patch coverage is passing)

javerbukh

Code looks good and the application runs well, nice work! I did have some difficulty testing all aspects of mosviz because of existing bugs/recent changes in main (slit overlay not appearing, images switching to a black screen after row select), but everything else works well and I think this is definitely an improvement over how the metaparser operated previously.

rosteen

Left a couple comments, still need to actually test it. Looks like a good improvement overall.

rosteen · 2023-01-13T14:38:41Z

jdaviz/configs/mosviz/plugins/parsers.py

-            _add_to_table(app, filters_gratings, "Filter/Grating")
-        elif spectra and sp1d:
-            _add_to_table(app, names, "Identifier")
+        # Search all given keys to see if they exist. Return the first hit


So is the use case for this "the metadata I want might be under one of multiple keys" rather than "give me the metadata for every key in this list"?

I think that's correct? Definitely the first part is correct. The "Identifier" field is the easiest to trace the logic path through; we previously pulled the Identifier through either "SOURCEID" or in some cases "OBJECT". Rather than manually hardcoding that loop, you can just provide this method a list of ['SOURCEID', 'OBJECT'] and it will search those keys for a value.

On top of that, the order of the list you provide also sets the priority of what is returned. In the above example, if both SOURCEID and OBJECT are present, then it will return SOURCEID.
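As a rough illustration of the "first hit wins" lookup described in this reply (a sketch only: `first_matching_value` and the `"Unspecified"` fallback are hypothetical names, not the function in this PR, and the metadata values are made up):

```python
def first_matching_value(meta, keys, fallback="Unspecified"):
    """Return the value of the first key in `keys` that is present in `meta`.

    The order of `keys` is the priority order. A single string is treated
    as a one-element list (an assumption made for convenience here).
    """
    if isinstance(keys, str):
        keys = [keys]
    for key in keys:
        if key in meta:
            return meta[key]
    return fallback


# Both keywords present: SOURCEID wins because it is listed first.
both = {"OBJECT": "my-target", "SOURCEID": 42}
print(first_matching_value(both, ["SOURCEID", "OBJECT"]))

# Only OBJECT present: the search falls through to the second key.
only_object = {"OBJECT": "my-target"}
print(first_matching_value(only_object, ["SOURCEID", "OBJECT"]))
```

Swapping the list to `["OBJECT", "SOURCEID"]` would flip the priority without touching the lookup itself.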

Cool, just checking that my understanding was correct.

rosteen · 2023-01-13T14:47:32Z

jdaviz/configs/mosviz/plugins/parsers.py

+            filters = query_metadata_by_component(app, "FILTER", "2D Spectra", FALLBACK_NAME)
+            gratings = query_metadata_by_component(app, "GRATING", "2D Spectra", FALLBACK_NAME)

-    if np.all([isinstance(x, fits.HDUList) for x in data_obj]):
+            filters_gratings = [(f+'/'+g) for f, g in zip(filters, gratings)]


I noticed when working on getting NIRCam to load that there is also some of this information in the "PUPIL" header keyword. It seems like the different instruments put different information in FILTER/GRATING/PUPIL - maybe this would be a good opportunity to get all three and display whichever ones are populated in their own separate columns. I think it should be as simple as adding pupil = query_metadata... and having three separate add_to_table later.

This is a point I've been thinking of as well. We have historically combined Filter/Grating, but I've been thinking of whether it would be useful to split this out to separate columns in the table. (AKA, why are we even combining them in the first place?). This would also give room to include another Pupil field as well, and yes it would be as simple as what you describe above.

I hesitated on doing that here, as I wanted this PR to mainly keep the same functionality while demonstrating the new infrastructure. I'd advocate a separate discussion with a PO/scientist to confirm whether this is what we want to do, and a separate PR to make those changes, so that this infrastructure PR doesn't become even more bloated than it already is.
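For reference, the follow-up idea from this thread (separate FILTER, GRATING, and PUPIL columns, showing only the ones that are populated) might look roughly like the sketch below. This PR does not implement it; `build_columns` is a hypothetical helper and the header values are illustrative only.

```python
def build_columns(metas, keywords=("FILTER", "GRATING", "PUPIL")):
    """Return {keyword: [value per entry]} for keywords populated at least once.

    Entries that lack a keyword get None in that column.
    """
    columns = {}
    for keyword in keywords:
        values = [meta.get(keyword) for meta in metas]
        if any(value is not None for value in values):
            columns[keyword] = values
    return columns


# Illustrative headers: one instrument fills FILTER/GRATING, another FILTER/PUPIL.
grating_metas = [
    {"FILTER": "F170LP", "GRATING": "G235M"},
    {"FILTER": "F170LP", "GRATING": "G235M"},
]
pupil_metas = [
    {"FILTER": "F444W", "PUPIL": "GRISMR"},
]

print(build_columns(grating_metas))  # no PUPIL column is produced
print(build_columns(pupil_metas))    # no GRATING column is produced
```

Each resulting column could then be handed to its own table-filling call, which is the "three separate add_to_table" shape suggested above.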

Alright, that would be a small follow-up PR anyway. Let me test this and make sure it looks good in-notebook, hopefully I'll approve shortly.

If in reality I actually coded this half as elegantly as I architected it, it should hopefully be a very small change.

rosteen · 2023-01-13T14:50:03Z

jdaviz/configs/mosviz/plugins/parsers.py

+        meta_filters = query_metadata_by_component(app, 'FILTER', "2D Spectra")
+        _add_to_table(app, meta_filters, "Filter/Grating")


I think I'm not understanding the context that these would be called in vs the add_to_table calls up in mos_meta_parser - do they not conflict?

Aha isn't that interesting... 😉 From the description:

As a stretch goal, I also switched both the NIRISS and Generic/NIRSpec parsers to use this new metadata parsing strategy. Notice how the metadata block at the bottom looks almost identical to the mos_metadata_parser used for the generic/NIRSpec strategy.... how interesting.... 😉 (Future PR)

The NIRISS parser here never used the mos_meta_parser and still doesn't here. It used to have its own parsing logic. In this PR, I gutted out the NIRISS meta logic and replaced it with code that looks STRANGELY close to the mos_meta_parser as you discovered here. It's almost like we might be able to combine them soon~!

Got it 😄

rosteen

I confirmed that the table populates for NIRCam/NIRISS/NIRSpec, approving.

duytnguyendtn · 2023-01-13T16:48:37Z

Thanks for the eyes Jesse and Ricky!

duytnguyendtn added mosviz Upstream fix required labels Nov 16, 2022

pllim added this to the 3.2 milestone Nov 16, 2022

pllim reviewed Nov 16, 2022

duytnguyendtn force-pushed the metameta branch from 1eae366 to 1288021 Compare November 17, 2022 15:08

duytnguyendtn marked this pull request as ready for review November 17, 2022 16:07

duytnguyendtn requested review from rosteen, javerbukh, ojustino, kecnry and bmorris3 as code owners November 17, 2022 16:07

duytnguyendtn requested a review from pllim November 17, 2022 16:10

duytnguyendtn marked this pull request as draft November 17, 2022 16:45

pllim mentioned this pull request Nov 18, 2022

Update Mosviz parser to load level 2 data #1835

Merged

rosteen modified the milestones: 3.2, 3.3 Jan 4, 2023

duytnguyendtn force-pushed the metameta branch from 5d96012 to 74af722 Compare January 10, 2023 17:02

duytnguyendtn added 2 commits January 10, 2023 12:06

Codestyle

decdf83

Raise error for security audit

5786ccc

Changelog

9186dab

duytnguyendtn marked this pull request as ready for review January 10, 2023 20:21

javerbukh approved these changes Jan 12, 2023

rosteen reviewed Jan 13, 2023

rosteen approved these changes Jan 13, 2023

duytnguyendtn merged commit 10085eb into spacetelescope:main Jan 13, 2023

Jdaviz-Triage-Bot mentioned this pull request Apr 13, 2023

Refactor Mosviz parser #2151

Open
