Cubeviz slider performance improvements #1550

duytnguyendtn · 2022-08-08T14:42:31Z

Description

This PR attempts to improve the performance of the cubeviz slider by using the helper rather than passing messages around.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
Do the proposed changes follow the STScI Style Guides?
Are tests added/updated as required? If so, do they follow the STScI Style Guides?
Are docs added/updated as required? If so, do they follow the STScI Style Guides?
Did the CI pass? If not, are the failures related?
Is a change log needed? If yes, is it added to CHANGES.rst?
Is a milestone set?
After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

pllim · 2022-08-08T16:03:25Z

Did you benchmark the improvement?

codecov · 2022-08-08T16:11:58Z

Codecov Report

Merging #1550 (1a66a10) into main (c523cf6) will decrease coverage by 0.04%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##             main    #1550      +/-   ##
==========================================
- Coverage   85.48%   85.44%   -0.05%     
==========================================
  Files          93       94       +1     
  Lines        8749     9054     +305     
==========================================
+ Hits         7479     7736     +257     
- Misses       1270     1318      +48

Impacted Files	Coverage Δ
jdaviz/core/events.py	`93.44% <ø> (+1.33%)`	⬆️
jdaviz/configs/cubeviz/plugins/tools.py	`89.23% <50.00%> (+1.35%)`	⬆️
jdaviz/configs/cubeviz/helper.py	`96.07% <100.00%> (+1.63%)`	⬆️
jdaviz/configs/specviz2d/plugins/parsers.py	`35.48% <0.00%> (-51.62%)`	⬇️
jdaviz/app.py	`91.79% <0.00%> (-0.60%)`	⬇️
...igs/specviz/plugins/line_analysis/line_analysis.py	`97.68% <0.00%> (-0.34%)`	⬇️
jdaviz/core/template_mixin.py	`91.25% <0.00%> (-0.10%)`	⬇️
jdaviz/configs/specviz2d/plugins/__init__.py	`100.00% <0.00%> (ø)`
...plugins/spectral_extraction/spectral_extraction.py	`88.23% <0.00%> (ø)`
...nfigs/default/plugins/plot_options/plot_options.py	`99.21% <0.00%> (+0.03%)`	⬆️
... and 4 more

duytnguyendtn · 2022-08-09T14:11:08Z

Just about to mark this PR as ready for review; here's all I've learned from profiling Cubeviz:

There is a divide in what I was able to profile. I couldn't find a way to profile the front-end part, which goes from the actual mouse click to the underlying Jdaviz code. The best description of this section, I think, came from Kyle: "This is where, for example, the browser has to wait a moment to determine if the user has double-clicked." The profiling I was able to do covers everything from receiving the instruction to change wavelength to actually changing the wavelength.

So after testing the two techniques for selecting the wavelength, the time it took to select 200 random wavelengths dropped from 53.577 seconds to 51.470 seconds, an improvement of about 4%. We're really scraping the bottom of the barrel here without touching the actual strategy itself (which would be more than 3 points of effort 😅).

Attached below are the top calls from the profiler for 100 runs:

         4212627 function calls (4180466 primitive calls) in 28.430 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     9808    9.218    0.001    9.228    0.001 {built-in method numpy.array}
      100    2.946    0.029   19.719    0.197 array.py:431(compute_statistic)
     6868    2.903    0.000    2.903    0.000 {method 'reduce' of 'numpy.ufunc' objects}
      100    2.121    0.021   10.511    0.105 array.py:399(nansum_with_nan_for_empty)
     2730    1.448    0.001    1.448    0.001 _methods.py:106(_clip_dep_invoke_with_casting)
      100    1.443    0.014    4.927    0.049 nanfunctions.py:68(_replace_nan)
26404/15366    1.439    0.000   11.116    0.001 {built-in method numpy.core._multiarray_umath.implement_array_function}
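
For reference, a listing in this format can be produced with the standard-library profiler. This is a minimal sketch, not the script used for this PR: `select_wavelength_stub` is a hypothetical stand-in for the real call into the Cubeviz helper, and the spectral axis values are made up.

```python
import cProfile
import io
import pstats
import random


def select_wavelength_stub(wavelength, x_all):
    """Hypothetical stand-in for the helper call being profiled."""
    return min(range(len(x_all)), key=lambda i: abs(x_all[i] - wavelength))


def profile_random_selections(n_runs=100, seed=0):
    """Profile n_runs random wavelength selections; return the report text."""
    rng = random.Random(seed)
    x_all = [4.0 + 0.001 * i for i in range(2000)]  # made-up spectral axis

    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_runs):
        select_wavelength_stub(rng.uniform(x_all[0], x_all[-1]), x_all)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sorting by "tottime" is what the report header calls "internal time"
    stats.sort_stats("tottime").print_stats(7)
    return stream.getvalue()


report = profile_random_selections()
print(report)
```

Sorting by `tottime` rather than `cumtime` surfaces the functions that burn time themselves, which is how `numpy.array` and `compute_statistic` stand out in the listing above.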

What's actually taking so long is converting the wavelength input to an actual slice index. To do this, select_wavelength needs to grab the whole spectral_axis and find the closest wavelength to the selected value. But to get the spectral_axis, it has to grab the full data and therefore create a Spectrum1D EACH TIME it is called. The true killer here is that it isn't just grabbing the data from disk; it must actually COMPUTE the data, since the spectrum is autogenerated/autocollapsed. If it matters, the specific thing that's taking so long is glue needing to compute the statistic (the auto-collapsed spectrum) each time, but I think the issue is more our current strategy to begin with. I don't immediately know how to solve this, but I've already exhausted the 3 points on this ticket, so I'd recommend starting here for more improvements next time. Possible suggestions:

Cache the Spectrum1D, or maybe even just the spectral axis, and return it each time it's requested. This might be the easiest option.
Find some way to calculate the spectral_axis without needing to generate the flux. Since the actual computation of the statistic isn't really necessary here, it could theoretically be skipped (we only need the x-axis, not the y-axis).
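
The first suggestion could look roughly like the following. This is only a sketch of the caching idea: `SpectralAxisCache`, `compute_axis` and `invalidate` are hypothetical names, not jdaviz API, and a real cache would have to be invalidated whenever the reference data or the display units change.

```python
class SpectralAxisCache:
    """Hypothetical cache: compute the spectral axis once, then reuse it."""

    def __init__(self, compute_axis):
        self._compute_axis = compute_axis  # expensive callable (assumed)
        self._axis = None

    def get(self):
        # Only hit the expensive path when nothing is cached yet
        if self._axis is None:
            self._axis = list(self._compute_axis())
        return self._axis

    def invalidate(self):
        """Call when the underlying data or units change."""
        self._axis = None


calls = []


def expensive_axis():
    calls.append(1)  # count how often the expensive path runs
    return [4.0 + 0.5 * i for i in range(5)]


cache = SpectralAxisCache(expensive_axis)
for _ in range(200):
    axis = cache.get()
```

With this shape, 200 selections trigger a single computation of the axis; the cost moves to deciding when `invalidate` has to be called.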

kecnry · 2022-08-09T18:27:57Z

@duytnguyendtn - that is very useful investigative work! I agree that caching might be quite useful, but I'm just curious if accessing the marks object for the displayed spectrum would be sufficiently faster. Something like the following in select_wavelength (which assumes that the first Lines or LinesGL entry, not a subclass of either, corresponds to the reference data):

x_all = [m for m in sv.figure.marks if m.__class__.__name__ in ['Lines', 'LinesGL']][0].x

instead of the existing

x_all = self.app.get_viewer('spectrum-viewer').data()[0].spectral_axis.value
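
To make the class-name filter concrete, here is a small self-contained sketch. The `Lines`, `SliceIndicator` and `Scatter` classes are stand-ins written for this example (not the real bqplot or jdaviz classes), and `nearest_index` is a plain-Python equivalent of the `np.argmin` lookup, used here only to avoid a NumPy dependency.

```python
class Lines:
    """Stand-in for the plotted-line mark class (not the real one)."""

    def __init__(self, x):
        self.x = x


class SliceIndicator(Lines):
    """Hypothetical subclass, e.g. an overlay drawn on top of the spectrum."""


class Scatter:
    """Stand-in for an unrelated mark type."""

    def __init__(self, x):
        self.x = x


def first_line_x(marks):
    # Compare the exact class name so that subclasses are skipped
    for mark in marks:
        if mark.__class__.__name__ in ('Lines', 'LinesGL'):
            return mark.x
    raise ValueError("no Lines/LinesGL mark found")


def nearest_index(x_all, wavelength):
    # Index of the x value closest to the requested wavelength
    return min(range(len(x_all)), key=lambda i: abs(x_all[i] - wavelength))


marks = [SliceIndicator([4.6]), Scatter([0.0]), Lines([4.0, 4.5, 5.0, 5.5])]
x_all = first_line_x(marks)
index = nearest_index(x_all, 4.6)
```

Matching on `__class__.__name__` instead of `isinstance` is what lets the subclass at the front of the list be ignored; the price is that the lookup depends on the ordering of the marks.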

duytnguyendtn · 2022-08-09T18:45:49Z

@kecnry You're a genius!!!

         3682352 function calls (3654486 primitive calls) in 7.079 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2580    1.425    0.001    1.425    0.001 _methods.py:106(_clip_dep_invoke_with_casting)
      516    0.617    0.001    3.701    0.007 composite_array.py:78(__call__)
      258    0.263    0.001    0.265    0.001 component.py:82(__getitem__)
     5399    0.252    0.000    0.252    0.000 {built-in method builtins.dir}
      516    0.251    0.000    0.251    0.000 {method 'take' of 'numpy.ndarray' objects}
22390/12242    0.230    0.000    2.190    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
   591379    0.228    0.000    0.425    0.000 {built-in method builtins.getattr}

Co-authored-by: Kyle Conroy <[email protected]>

pllim · 2022-08-09T18:55:24Z

Can we replace this

jdaviz/jdaviz/configs/cubeviz/helper.py

Lines 88 to 90 in f7ee912

    
        x_all = self.app.get_viewer('spectrum-viewer').data()[0].spectral_axis.value
        index = np.argmin(abs(x_all - wavelength))
        return self.select_slice(int(index))

with this?

wavelength = wavelength * wave_unit  # We store the unit somewhere, right?
index = self.app.data_collection[0].coords.spectral_wcs.world_to_pixel(wavelength)  # Assume [0] is always FLUX
return self.select_slice(int(index))

The WCS is already there. The only thing I am not sure about is whether spectral_wcs is always available as a property. It is not specified in APE 14. If not, we have to find a different way to query that WCS, but the point is you don't need to collapse any spectrum to begin with.
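
One detail worth keeping in mind with a WCS-based lookup: a world-to-pixel transform generally returns a fractional pixel, and `int()` truncates toward zero, so rounding first is closer to what the `argmin` search does. The sketch below illustrates this with a hypothetical linear axis; `linear_world_to_pixel`, `crval` and `cdelt` are assumptions made for the example and stand in for whatever WCS object is actually available.

```python
def linear_world_to_pixel(wavelength, crval, cdelt):
    """Fractional 0-based pixel for a linear axis: world = crval + cdelt * pixel.

    Hypothetical stand-in for a WCS world_to_pixel call; crval is the
    wavelength at pixel 0 and cdelt the step per pixel (both assumed).
    """
    return (wavelength - crval) / cdelt


def nearest_slice(wavelength, crval, cdelt, n_slices):
    pixel = linear_world_to_pixel(wavelength, crval, cdelt)
    index = int(round(pixel))  # int(pixel) alone would truncate toward zero
    return max(0, min(n_slices - 1, index))  # clamp to the valid slice range


# Axis values are 4.0, 4.5, 5.0, 5.5; 4.9 is closest to index 2
pixel = linear_world_to_pixel(4.9, crval=4.0, cdelt=0.5)
truncated = int(pixel)
nearest = nearest_slice(4.9, crval=4.0, cdelt=0.5, n_slices=4)
```

Here truncation lands on the slice at 4.5 while rounding lands on the slice at 5.0, which is the one the existing nearest-value search would pick. The clamp also covers wavelengths outside the cube.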

kecnry

Noticeably snappier on the default cube! Thanks for tracking this down!

pllim · 2022-08-09T18:56:37Z

Ah, I just saw Kyle had the same idea but different solution. I guess that works too if we can trust the marks.

pllim · 2022-08-09T18:58:34Z

jdaviz/configs/cubeviz/plugins/tools.py

@@ -41,8 +41,7 @@ def on_mouse_event(self, data):
            # throttle to 200ms
            return

-        msg = SliceSelectWavelengthMessage(wavelength=data['domain']['x'], sender=self)


Do we even need SliceSelectWavelengthMessage with this removal? Should we remove this message from the code base altogether?

@kecnry , any reason we still need to listen to SliceSelectWavelengthMessage in app at all?

Quickly looking through the code, it looks like the slider was the only thing that used it. I'll remove it.

Agreed, if it's not used anywhere else, let's get rid of it here.

kecnry · 2022-08-09T18:58:48Z

I think since we don't allow unloading the reference data from the UI (technically the user could manually remove it from the API), then the marks should be reliable. But trying both and seeing which performs better never hurts! For the marks approach, it might eventually be worth caching the marks entry so we don't have to do that ugly loop to find it though...

pllim · 2022-08-09T18:59:09Z

jdaviz/configs/cubeviz/helper.py

@@ -85,7 +85,10 @@ def select_wavelength(self, wavelength):
            wavelength = float(wavelength.wavelength)
        if not isinstance(wavelength, (int, float)):
            raise TypeError("wavelength must be a float or int")
-        x_all = self.app.get_viewer('spectrum-viewer').data()[0].spectral_axis.value
+        # Retrieve the x slices from the spectrum viewer's marks
+        x_all = [m for m in self.app.get_viewer('spectrum-viewer').figure.marks


I am not familiar with marks. Are they always in the unit we think they are in, especially with all the unit conversion going on via plugin, etc.?

The marks are in the plotted units, the click/drag event is guaranteed to be in plotted units, and the select_wavelength method says that it assumes the user input for wavelength is provided in plotted units, so I think we're safe (for now).

I think the marks will always be in the same units of the spectrum viewer, since they are the literal marks that are plotted on screen. When using the slice tool, it will naturally request the wavelength in units of the viewer it's acting on, so I think that should be consistent? @kecnry is that a fair statement?

Yes, I think that's fair. The only possible problem I can see would be when the user calls the API manually, but the API docs state that no unit conversion is done, so I think that can be kicked down the road to the unit conversion refactor (if at all).

pllim · 2022-08-09T19:00:25Z

This is basically a performance bug fix, so I think we need a change log.

CHANGES.rst

Co-authored-by: P. L. Lim <[email protected]>

jdaviz/core/events.py

pllim

LGTM. Thanks!

duytnguyendtn and others added 2 commits August 5, 2022 16:38

Set wavelength directly rather than msg

68ae507

Remove unneeded message

ea03ca3

duytnguyendtn added this to the 2.9 milestone Aug 8, 2022

github-actions bot added the cubeviz label Aug 8, 2022

duytnguyendtn added the no-changelog-entry-needed label and removed the cubeviz label Aug 8, 2022

duytnguyendtn marked this pull request as ready for review August 9, 2022 14:11

duytnguyendtn requested review from rosteen, javerbukh, ojustino, pllim and kecnry as code owners August 9, 2022 14:11

duytnguyendtn and others added 2 commits August 9, 2022 14:49

Retrieve x slices from viewer marks rather than data

498bcb0

Co-authored-by: Kyle Conroy <[email protected]>

Codestyle

2dbfbde

kecnry approved these changes Aug 9, 2022

pllim reviewed Aug 9, 2022

pllim added the bug, cubeviz and performance labels and removed the no-changelog-entry-needed label Aug 9, 2022

Add Changelog

10dae01

pllim reviewed Aug 9, 2022

CHANGES.rst (outdated, resolved)

duytnguyendtn and others added 2 commits August 9, 2022 15:12

Strip metric from changelog

05d1234

Co-authored-by: P. L. Lim <[email protected]>

Remove SlideSelectWavelengthMessage

e71de80

kecnry reviewed Aug 9, 2022

jdaviz/core/events.py (outdated, resolved)

pllim approved these changes Aug 9, 2022

Remove straggler property

1a66a10

pllim merged commit aa86ecb into spacetelescope:main Aug 9, 2022

duytnguyendtn deleted the slideperf branch August 9, 2022 20:05

pllim mentioned this pull request Aug 16, 2022

Spatial-Spectral highlighting #1528

Merged

9 tasks

kecnry mentioned this pull request Aug 26, 2022

future-proof slicing logic for glue-jupyter as_steps support #1599

Merged

9 tasks
