`test_plot_cdfs` test is extremely slow #298

EwoutH · 2023-10-29T12:48:39Z

This test takes extremely long, on the order of minutes. I wouldn't be surprised if it accounts for over half of all testing time.

EMAworkbench/test/test_analysis/test_regional_sa.py

Lines 14 to 23 in b76b487

    
    class Test(unittest.TestCase):
        def test_plot_cdfs(self):
            x, outcomes = utilities.load_flu_data()
            y = outcomes["deceased population region 1"][:, -1] > 1000000

            regional_sa.plot_cdfs(x, y)
            regional_sa.plot_cdfs(x, y, ccdf=True)

            x = x.drop("scenario", axis=1)
            regional_sa.plot_cdfs(x, y, ccdf=True)

quaquel · 2023-10-29T12:50:59Z

Should be easy to replace the data with a smaller dataset.
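One way to shrink the data for a test like this is to draw a fixed random subset of rows. The sketch below is only an illustration under assumptions: `subsample` is a hypothetical helper, not part of the workbench, and the DataFrame and boolean array are synthetic stand-ins for the experiments and the outcome vector used in the test.

```python
import numpy as np
import pandas as pd


def subsample(x, y, n, seed=0):
    """Return n randomly chosen rows of x with the matching entries of y.

    Hypothetical helper for illustration; x is a DataFrame, y a NumPy array.
    """
    rng = np.random.default_rng(seed)
    # Sample row positions without replacement so x and y stay aligned.
    idx = rng.choice(len(x), size=min(n, len(x)), replace=False)
    return x.iloc[idx].reset_index(drop=True), y[idx]


# Synthetic stand-in for the experiments and the boolean outcome of interest.
x = pd.DataFrame({"a": np.linspace(0.0, 1.0, 1000), "b": np.arange(1000)})
y = x["a"].to_numpy() > 0.5

x_small, y_small = subsample(x, y, 100)
```

A fixed seed keeps the test deterministic, which matters if the plots are ever compared against reference output.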

EwoutH · 2023-10-29T13:09:15Z

Performance profile:

Looks like matplotlib's autoscale_view is called a lot of times, which seems to be the most expensive operation.

For the test itself, a smaller dataset would indeed work.
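For anyone wanting to reproduce this kind of call-count analysis, the standard library's `cProfile` and `pstats` are enough. The sketch below is a toy: `leaf` and `plot_like` are hypothetical stand-ins for an expensive inner call and the plotting routine, not the actual matplotlib or workbench functions.

```python
import cProfile
import pstats


def leaf():
    """Hypothetical stand-in for an expensive inner call."""
    return sum(range(100))


def plot_like(n):
    """Hypothetical stand-in for a plotting routine that calls leaf() n times."""
    for _ in range(n):
        leaf()


profiler = cProfile.Profile()
profiler.enable()
plot_like(50)
profiler.disable()

stats = pstats.Stats(profiler)
# stats.stats maps (filename, lineno, funcname) to
# (primitive calls, total calls, total time, cumulative time, callers).
call_counts = {key[2]: value[1] for key, value in stats.stats.items()}
print(call_counts["leaf"])  # number of times leaf() was called
```

Calling `stats.sort_stats("cumulative").print_stats(10)` prints the ten entries with the highest cumulative time, which is usually the quickest way to spot a hot path.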

quaquel · 2023-10-29T13:40:29Z

No idea why autoscale_view is called so often. There are 63 calls to plot_individual_cdf, so that is close to 400 calls to autoscale per cdf (this is not the only place where autoscale might be called, but likely the dominant one). It's also strange to autoscale because at least the x-axis has a clearly defined hard-coded limit. It seems this arises from somewhere deep within matplotlib.

EwoutH · 2023-10-29T13:45:10Z

Here is the call tree:

It seems that, starting from plot_discrete_cdf (called 6 times), inner is the first function that is called 24024 times, and that count carries all the way down to autoscale_view.

quaquel · 2023-10-29T13:49:51Z

The calls to scatter in plot_discrete_cdf might be redundant and could be replaced by using marker in the plot command. Would need some checking, however.
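The idea can be sketched in plain matplotlib as follows. This is an assumption about the general pattern, not a copy of what plot_discrete_cdf does: the data are synthetic and the styling arguments are chosen only for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data standing in for a discrete CDF.
x = np.arange(5)
cdf = np.cumsum(np.full(5, 0.2))

fig, (ax_old, ax_new) = plt.subplots(1, 2)

# Pattern under discussion: a line plus a separate scatter for the points.
ax_old.plot(x, cdf, color="C0")
ax_old.scatter(x, cdf, color="C0")

# Possible replacement: one plot call that draws both the line and the markers.
ax_new.plot(x, cdf, color="C0", marker="o")

plt.close(fig)
```

The left axes end up with one line and one scatter collection, the right axes with a single line that carries its own markers, so there is one artist fewer per call.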

quaquel · 2023-10-31T15:29:58Z

Just ran a quick test. With scatter enabled, timeit gives

52 s ± 732 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With scatter disabled and replaced by the marker kwarg in plot, we get

621 ms ± 3.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

So, disabling scatter gives us an improvement of nearly two orders of magnitude, with no discernible difference in the resulting figure.

EwoutH · 2023-10-31T15:35:42Z

That's 2 orders of magnitude, or a 99% reduction in runtime. Seems worth it.
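The arithmetic behind those figures, using the two timings reported above, can be checked in a few lines. The variable names are just for illustration.

```python
import math

slow = 52.0   # seconds per loop with scatter enabled
fast = 0.621  # seconds per loop with the marker kwarg instead

speedup = slow / fast
orders_of_magnitude = math.log10(speedup)
reduction = 1.0 - fast / slow

print(f"speedup: {speedup:.1f}x")                         # speedup: 83.7x
print(f"orders of magnitude: {orders_of_magnitude:.2f}")  # orders of magnitude: 1.92
print(f"runtime reduction: {reduction:.1%}")              # runtime reduction: 98.8%
```

So the measured speedup is a little under two full orders of magnitude, and the reduction rounds to 99%.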

Are there any figures where the scatter is likely required to provide a correct/useful visual? If so we should test those.

quaquel · 2023-10-31T16:35:41Z

I did a visual comparison. The difference is really marginal.

EwoutH · 2023-10-31T17:11:25Z

Then the speedup sounds worth it!

EwoutH added the testing and performance labels Oct 29, 2023

EwoutH mentioned this issue Nov 1, 2023

Speed up of plot_discrete_cdfs by 2 orders of magnitude #306

Merged

quaquel closed this as completed in #306 Nov 3, 2023

EwoutH added this to the 2.5.0 milestone Nov 17, 2023
