Disable 1D linking and Simultaneous Row Plotting in Mosviz #1790

duytnguyendtn · 2022-10-28T02:44:46Z

Description

You may be asking yourself, why in {insert deity name of choice}'s name would you want that!? I think a picture is worth 1000 words (even if it's a picture of words :P)

You're reading that right; 200 targets, which used to take 1 hour and 11 minutes (!!) loaded in under two minutes.

This PR is part of a 3 part strategy in changing the linking strategy in Mosviz, developed with @camipacifici. In Mosviz, data navigation is mainly intended to be done with the Table Viewer. I recall us having discussions about removing the data dropdowns from Mosviz because of this. Though it may have been stopped due to a desire to plot multiple 1D spectrum in the same viewer.

After some profiling, I discovered supporting multi-plotting 1D spectrum was a VAST majority of the load time issues with Mosviz parser (in the GLASS dataset's case, 97.5% of the load time!). This is because the linking tree grows much faster than N for N number of targets, since we need to back link each target to each other one. This PR disables the intra 1D spectra linking to vastly improve loading times. It then hides other rows' data and disables the ability to load them simultaneously:

What about the multi-plotting spectrum? That's the last part of this 3-part strategy: (EDIT: Modified to include #1790 (comment))

Disable 1D linking
Investigate whether batch linking produces a similar benefit to removing linking entirely (Disable 1D linking and Simultaneous Row Plotting in Mosviz #1790 (comment))
Re-enable the data dropdown with filtering to only the current row's data
Benchmark the remaining two minutes to see whether lazy data loading could help
If the above comes back true, investigate lazy data loading to further improve data loading/parsing time
Investigate Dynamic linking of data, and re-enable intra-row/simultaneous-row plotting

I propose we investigate the possibility of dynamic linking of data. That being we only link data together IF and WHEN the user actually requests to show the data at the same time. Effectively, when the user clicks on the data dropdown and tries to add a second spectrum to the spectrum viewer, link AT THAT TIME, not in advance. That way, we only end up with links that we actually need. Overall, this will result in a link tree that is "sub-optimal" as compared to one created with "batch loading". But what we sacrifice in optimal linking, we gain in efficiency and parsing speed. I have no idea if this is possible, as it's just an idea I came up with much past my bedtime. But here we are!

I got the green light from @camipacifici to proceed with 1 and 2 now, and investigate 3 later. So I invite eyes and comments!

Change log entry

Is a change log needed? If yes, is it added to CHANGES.rst? If you want to avoid merge conflicts,
list the proposed change log here for review and add to CHANGES.rst before merge. If no, maintainer
should add a no-changelog-entry-needed label.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
Do the proposed changes follow the STScI Style Guides?
Are tests added/updated as required? If so, do they follow the STScI Style Guides?
Are docs added/updated as required? If so, do they follow the STScI Style Guides?
Did the CI pass? If not, are the failures related?
Is a milestone set? Set this to bugfix milestone if this is a bug fix and needs to be released ASAP; otherwise, set this to the next major release milestone.
After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

codecov · 2022-10-28T02:57:54Z

Codecov Report

Base: 88.10% // Head: 88.13% // Increases project coverage by +0.02% 🎉

Coverage data is based on head (cfa9e14) compared to base (9e4acb8).
Patch coverage: 100.00% of modified lines in pull request are covered.

❗ Current head cfa9e14 differs from pull request most recent head f85fdde. Consider uploading reports for the commit f85fdde to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1790      +/-   ##
==========================================
+ Coverage   88.10%   88.13%   +0.02%     
==========================================
  Files          95       95              
  Lines       10241    10254      +13     
==========================================
+ Hits         9023     9037      +14     
+ Misses       1218     1217       -1

Impacted Files	Coverage Δ
jdaviz/configs/mosviz/plugins/parsers.py	`91.11% <ø> (-0.13%)`	⬇️
jdaviz/app.py	`94.25% <100.00%> (+0.03%)`	⬆️
jdaviz/configs/cubeviz/helper.py	`98.41% <0.00%> (+0.05%)`	⬆️
jdaviz/configs/mosviz/helper.py	`86.95% <0.00%> (+0.05%)`	⬆️
jdaviz/configs/specviz/helper.py	`58.13% <0.00%> (+0.18%)`	⬆️
jdaviz/configs/specviz/plugins/parsers.py	`90.52% <0.00%> (+1.50%)`	⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

kecnry · 2022-10-28T12:56:52Z

I don't think disabling the data menu entirely is the way to go (won't be able to toggle the visibility of subsets or models, etc)... what if we instead skipped entries in the data menu from other rows? Even then, that would disable the use-case of comparing spectra from different targets - do we want to do that?

duytnguyendtn · 2022-10-28T14:00:53Z

Even then, that would disable the use-case of comparing spectra from different targets - do we want to do that?

Sorry for not being clear. This is what I meant by "multi-plotting spectra." This is exactly what Cami and I are proposing right now, since without 1D linking, this isn't possible. Cami and I agreed the "flipping through the table viewer" use case is probably more likely than the desire to compare spectra between targets, so we should remove the ability to do that from the UI, at least temporarily until we can figure out a way to allow it without crippling parse times.

Good point on preserving the data dropdown for models and other "non data artifacts." I agree with your strategy of trying to remove entries from other rows; I'll investigate

duytnguyendtn · 2022-10-28T15:22:01Z

Thanks team for the feedback this morning! Here's a summary of next steps:

Investigate whether batch linking produces a similar benefit to removing linking entirely
Re-enable the data dropdown with filtering to only the current row's data
Benchmark the remaining two minutes to see whether lazy data loading could help
If the above comes back true, investigate lazy data loading to further improve data loading/parsing time

duytnguyendtn · 2022-10-28T22:21:08Z

Ok so after a thorough archaeological expedition, I wanted to loop back on the first point:

Investigate whether batch linking produces a similar benefit to removing linking entirely

I swore I remembered at the time that this was already implemented, but I wanted to be sure before I embarrassed myself. So in #762 (comment), @ojustino does a good job describing the current strategy. Let me try to describe what I understand of the linking logic at this point:

The linking logic is already self contained within an entirely separate parser:

@data_parser_registry("mosviz-link-data")
def link_data_in_table(app, data_obj=None):

Linking is already deferred in each of the other parsers for the image, 1D, and 2D data. Only after those parsers have completed their work, does this link parser step in.

In this linking parser, we already employ with app.data_collection.delay_link_manager_update(): to stop the automatic linking. We instead define the links manually by linking each 1D spectra to the first spectrum with a series of LinkSame's. NOTICE: We aren't defining all the interlinks from each spectra to each other, only to the first spectrum. Also note that we aren't officially registering the link JUST YET. The logic appends all the LinkSame's into a list containing all the pending links to be registered, and then ONLY AT THE END, applies all of them in a single add_link() call. It's only when we add the links at the end to glue that the rest of the links are discovered and applied.

So this mosviz-link-parser I think is already our attempt to batch link. Searching through the glue docs, I can't find any other interpretation of "batch linking." Considering Tom's approval in #762 (comment), I think this produces the most efficient linking tree possible. I tried searching for some glue linking mechanism to provide an array of IDs and tell Glue "These are all the same thing," rather than creating separate LinkSame's in a loop, but I couldn't find any. I think we're at the point to acknowledge we just have a lot of data to link and now need to reconsider which links we want to intentionally make, rather than blind linking everything

Please let me know if I'm misinterpreting "batch linking" and someone has a secret that I've been out of the loop on!

pllim · 2022-10-28T23:58:19Z

Looks like batch linking to me...

jdaviz/jdaviz/configs/mosviz/plugins/parsers.py

Lines 117 to 120 in c319c30

    
           # Optimize linking speed through a) delaying link manager updates with a 
        
           # context manager, b) handling intra-row linkage of 1D and 2D spectra in a 
        
           # loop, and c) handling inter-row linkage after that in one fell swoop. 
        
           with app.data_collection.delay_link_manager_update():

jdaviz/components/viewer_data_select.vue

javerbukh · 2022-11-04T14:25:49Z

I'm testing the NIRISS example notebook and getting the following error when clicking a different row, anyone else? Traceback ends with:

File ~/Documents/jdaviz_dev/jdaviz/jdaviz/app.py:1031, in Application.add_data_to_viewer(self, viewer_reference, data_label, visible, clear_other_data)
   1013 """
   1014 Plots a data set from the data collection in the specific viewer.
   1015 
   (...)
   1027     defined data set.
   1028 """
   1029 if 'mosviz_row' in self.state.settings:
   1030     if not ((self.get_viewer("table-viewer").row_selection_in_progress) and
-> 1031             (self.data_collection[data_label].meta['mosviz_row'] !=
   1032                 self.state.settings['mosviz_row'])):
   1033         raise NotImplementedError("Intra-row plotting not supported. "
   1034                                   "Please use the table viewer to change rows")
   1036 viewer_item = self._get_viewer_item(viewer_reference)

KeyError: 'mosviz_row'

duytnguyendtn · 2022-11-04T15:21:13Z

Fixed row selection in 4012700! You'll still see this traceback in NIRISS datasets when manually calling add_data_to_viewer with a different row's spectra due to #1805. But I think this can be handled as a separate bug

javerbukh · 2022-11-04T15:52:03Z

Works well! I did want to ask if one interaction behaves as expected: if I use gaussian smooth on a 1d spectrum, new data is added called "... smooth stddev x". If I make this data invisible or remove it from the viewer, then click a different row, then return to the first row, that "... smooth stddev x" data is now visible and in the viewer. Is this the behavior we would expect? If not, is it a big enough issue to handle in this PR or instead should we create a new issue for follow up? If it's a simple enough fix I think we should handle it in this PR but I'm interested to hear thoughts from the team and @camipacifici.

duytnguyendtn · 2022-11-04T15:53:52Z

@javerbukh Great eyes! That's yet another bug that Cami and I have been talking about, and plan to address. I was able to reproduce this error in main so it doesn't appear connected to the changes here

duytnguyendtn · 2022-11-04T15:55:23Z

That particular bug is #1816. I don't immediately know where this behavior is coming from, so it will require at least a little investigative work. I'd advocate for keeping it a separate effort

javerbukh

@duytnguyendtn Ah ok, in that case I think this is good to get in. Nice work!

CHANGES.rst

jdaviz/app.py

jdaviz/configs/mosviz/tests/test_data_loading.py

Co-authored-by: P. L. Lim <[email protected]>

pllim

LGTM. Since @kecnry commented before, do you want an approval from him as well?

duytnguyendtn · 2022-11-04T18:54:50Z

@kecnry gave me a silent approval over slack, just couldn't do a full review. I'm happy to move forward here! Thanks Kyle, Jesse, Pey-Lian and Cami for your efforts/thoughts through this PR!

kecnry · 2022-11-07T13:38:00Z

If I make this data invisible or remove it from the viewer, then click a different row, then return to the first row, that "... smooth stddev x" data is now visible and in the viewer.

I don't immediately know where this behavior is coming from, so it will require at least a little investigative work.

The row changing removes/adds data to the viewer. When data is removed from the viewer, all associated layers are destroyed/forgotten, so if we want any of the layer information to be restored when returning to the row, we will need to store that somewhere. This same thing happens if you manually remove and add the same data entry into a viewer (all layer options are lost) which was the motivation for changing the check box in the data menu to control visibility instead of loaded-ness... but that solution obviously won't work for mosviz, so if the expected behavior is that visibility and/or layer state persists, we'll need to write that logic.

duytnguyendtn requested review from rosteen, javerbukh, ojustino, pllim, kecnry and bmorris3 as code owners October 28, 2022 02:44

github-actions bot added the mosviz label Oct 28, 2022

pllim added this to the 3.2 milestone Oct 28, 2022

duytnguyendtn added 4 commits November 2, 2022 13:47

Disable data viewer in mosviz

01e7ec1

Remove 1D linking from Mosviz Parser

b1a592c

Changelog

2ad5a61

Only show current row data in data dropdown

668a2bb

duytnguyendtn force-pushed the mo_linking_mo_problems branch from 8690ac2 to 668a2bb Compare November 2, 2022 19:19

kecnry reviewed Nov 2, 2022

View reviewed changes

jdaviz/components/viewer_data_select.vue Outdated Show resolved Hide resolved

duytnguyendtn added 3 commits November 3, 2022 11:26

Keep unloaded row-associated data in other dropdown

a0f5181

Prevent user from adding another row's data

85718b5

Codestyle

9931f49

duytnguyendtn changed the title ~~Disable data dropdowns and 1D linking in Mosviz~~ Disable 1D linking and Simultaneous Row Plotting in Mosviz Nov 3, 2022

duytnguyendtn added the Ready for final review label Nov 3, 2022

duytnguyendtn requested review from kecnry and camipacifici November 3, 2022 20:12

duytnguyendtn added 2 commits November 3, 2022 16:20

Changelog

761c4f1

Make intra-row test more reliable

3451041

Fix row detection logic

4012700

javerbukh approved these changes Nov 4, 2022

View reviewed changes

pllim reviewed Nov 4, 2022

View reviewed changes

CHANGES.rst Outdated Show resolved Hide resolved

jdaviz/app.py Outdated Show resolved Hide resolved

jdaviz/configs/mosviz/tests/test_data_loading.py Outdated Show resolved Hide resolved

duytnguyendtn and others added 2 commits November 4, 2022 14:30

Remove debug statement

cfa9e14

Co-authored-by: P. L. Lim <[email protected]>

Combine intra-row catch conditions

f85fdde

Co-authored-by: P. L. Lim <[email protected]>

pllim approved these changes Nov 4, 2022

View reviewed changes

duytnguyendtn merged commit 60fc1bd into spacetelescope:main Nov 4, 2022

duytnguyendtn deleted the mo_linking_mo_problems branch November 4, 2022 19:13

kecnry mentioned this pull request Nov 14, 2022

Move changelog for failed backport 1836 #1840

Merged

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disable 1D linking and Simultaneous Row Plotting in Mosviz #1790

Disable 1D linking and Simultaneous Row Plotting in Mosviz #1790

duytnguyendtn commented Oct 28, 2022 •

edited by pllim

Loading

codecov bot commented Oct 28, 2022 •

edited

Loading

kecnry commented Oct 28, 2022

duytnguyendtn commented Oct 28, 2022 •

edited

Loading

duytnguyendtn commented Oct 28, 2022

duytnguyendtn commented Oct 28, 2022

pllim commented Oct 28, 2022

javerbukh commented Nov 4, 2022 •

edited

Loading

duytnguyendtn commented Nov 4, 2022

javerbukh commented Nov 4, 2022

duytnguyendtn commented Nov 4, 2022

duytnguyendtn commented Nov 4, 2022 •

edited

Loading

javerbukh left a comment

pllim left a comment

duytnguyendtn commented Nov 4, 2022

kecnry commented Nov 7, 2022

Disable 1D linking and Simultaneous Row Plotting in Mosviz #1790

Disable 1D linking and Simultaneous Row Plotting in Mosviz #1790

Conversation

duytnguyendtn commented Oct 28, 2022 • edited by pllim Loading

Description

Change log entry

Checklist for package maintainer(s)

codecov bot commented Oct 28, 2022 • edited Loading

Codecov Report

kecnry commented Oct 28, 2022

duytnguyendtn commented Oct 28, 2022 • edited Loading

duytnguyendtn commented Oct 28, 2022

duytnguyendtn commented Oct 28, 2022

pllim commented Oct 28, 2022

javerbukh commented Nov 4, 2022 • edited Loading

duytnguyendtn commented Nov 4, 2022

javerbukh commented Nov 4, 2022

duytnguyendtn commented Nov 4, 2022

duytnguyendtn commented Nov 4, 2022 • edited Loading

javerbukh left a comment

Choose a reason for hiding this comment

pllim left a comment

Choose a reason for hiding this comment

duytnguyendtn commented Nov 4, 2022

kecnry commented Nov 7, 2022

duytnguyendtn commented Oct 28, 2022 •

edited by pllim

Loading

codecov bot commented Oct 28, 2022 •

edited

Loading

duytnguyendtn commented Oct 28, 2022 •

edited

Loading

javerbukh commented Nov 4, 2022 •

edited

Loading

duytnguyendtn commented Nov 4, 2022 •

edited

Loading