Filter null values before plotting #8535

Illviljan · 2023-12-10T17:31:27Z

I noticed that seaborn's plot was responding much faster than xarray's version with the same data.
Turns out seaborn drops any nulls: https://github.com/mwaskom/seaborn/blob/056413d7393e3daec597d430c076e45938d53376/seaborn/relational.py#L399

Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst

max-sixty

Good idea!

headtr1ck · 2023-12-10T20:45:50Z

Isn't matplotlib dropping nans anyway? I never had any issues with nan values in my data...

Illviljan · 2023-12-10T21:07:06Z

No, they aren't dropped. That's what the test shows. Do you use an interactive version when you look at plots? It was painfully slow when I was zooming in and out.

I was able to reproduce the problem with a raw matplotlib version as well, removing the nans was indeed the solution.
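For reference, the raw-matplotlib workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `drop_null_points` and `scatter_without_nulls` are hypothetical, and `ax` is assumed to be a matplotlib `Axes` (or anything else with a `.scatter` method).

```python
import numpy as np


def drop_null_points(x, y):
    """Return x and y with every pair removed where either value is NaN."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep a point only if both of its coordinates are non-NaN.
    valid = ~(np.isnan(x) | np.isnan(y))
    return x[valid], y[valid]


def scatter_without_nulls(ax, x, y, **kwargs):
    """Scatter only the non-NaN points (hypothetical helper, not xarray API)."""
    x_valid, y_valid = drop_null_points(x, y)
    return ax.scatter(x_valid, y_valid, **kwargs)


# Small example: two of the five points contain a NaN and are filtered out.
x = np.array([0.0, 1.0, np.nan, 3.0, 4.0])
y = np.array([10.0, np.nan, 12.0, 13.0, 14.0])
x_valid, y_valid = drop_null_points(x, y)
```

With a real figure this would be called as `scatter_without_nulls(ax, x, y)` after `fig, ax = plt.subplots()`, so the NaN points never reach the scatter call.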

headtr1ck · 2023-12-10T21:08:27Z

> No, they aren't dropped. That's what the test shows. Do you use an interactive version when you look at plots? It was painfully slow when I was zooming in and out.
>
> I was able to reproduce the problem with a raw matplotlib version as well, removing the nans was indeed the solution.

OK, I never look at the plots interactively...
Then it's fine to merge this :)

dcherian · 2023-12-10T21:45:45Z

xarray/plot/dataarray_plot.py

@@ -948,6 +948,12 @@ def newplotfunc(
            size_ = kwargs.pop("_size", linewidth)
            size_r = _LINEWIDTH_RANGE

+        # Remove any nulls, .where(m, drop=True) doesn't work when m is a dask array,


This can only be done for scatter plots. A line plot would totally change.

Illviljan · 2023-12-10

Currently only scatter uses plot1d, so it should be fine for now.
Seaborn doesn't seem to filter NaNs on line plots, however.
I can't move it later in the chain, as this NaN trick is used to split on different dimensions (for a future line plot implementation), but I can move it to the if-check above.

NaNs are usually not wanted in line plots either, in my experience. Do you have a use case in mind?

Illviljan · 2023-12-12

Moved the null filter inside a scatter check.

dcherian · 2023-12-13

> NaNs are usually not wanted in line plots either, in my experience. Do you have a use case in mind?

Plenty. It's important to know where the gaps are. That's useful information. It is annoying that matplotlib will skip NaNs at the beginning and ends of a line plot.
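To illustrate the point about gaps, here is a small sketch (assuming NumPy only; `count_line_segments` is a hypothetical helper, not part of xarray or matplotlib). It counts the runs of consecutive non-NaN values, which roughly correspond to the separate pieces a line plot would draw, and shows that dropping the NaNs first collapses them into a single run.

```python
import numpy as np


def count_line_segments(y):
    """Count runs of consecutive non-NaN values in a 1D array."""
    valid = ~np.isnan(np.asarray(y, dtype=float))
    # A run starts wherever a valid value follows a NaN (or begins the array).
    previous_valid = np.concatenate(([False], valid[:-1]))
    starts = valid & ~previous_valid
    return int(starts.sum())


# Data with two gaps: the NaNs split the series into three runs.
y = np.array([1.0, 2.0, np.nan, np.nan, 5.0, 6.0, np.nan, 8.0])
segments_with_gaps = count_line_segments(y)

# After dropping the NaNs, the gaps are gone and only one run remains,
# so a line plot would connect points across the missing data.
segments_after_drop = count_line_segments(y[~np.isnan(y)])
```

This is why the filter is safe for scatter plots, where each point is drawn independently, but would change what a line plot shows.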


Illviljan added 4 commits December 10, 2023 18:10

Remove nulls when plotting.

c3c5efb

Update test_plot.py

76aa8df

Update test_plot.py

68b48f1

Update whats-new.rst

7041c26

Illviljan added the topic-plotting label Dec 10, 2023

Update test_plot.py

c059bd9

Illviljan marked this pull request as ready for review December 10, 2023 17:48

max-sixty approved these changes Dec 10, 2023


headtr1ck approved these changes Dec 10, 2023


headtr1ck added the plan to merge Final call for comments label Dec 10, 2023

dcherian reviewed Dec 10, 2023


dcherian removed the plan to merge Final call for comments label Dec 10, 2023

only filter on scatter plots as that is safe.

bdd4d9d

Illviljan added the plan to merge Final call for comments label Dec 12, 2023

Merge branch 'main' into plot1d_filter_null

483d538

dcherian merged commit 2971994 into pydata:main Dec 13, 2023
27 checks passed
