Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce Dask computations in DayNightCompositor #2617

Merged
merged 8 commits into from
Oct 27, 2023

Conversation

pnuu
Copy link
Member

@pnuu pnuu commented Oct 25, 2023

This PR gets rid of unnecessary Dask computations in DayNightCompositor. This is enforced with a custom scheduler for the compositor tests.

@pnuu pnuu added the bug label Oct 25, 2023
@pnuu pnuu self-assigned this Oct 25, 2023
@pnuu
Copy link
Member Author

pnuu commented Oct 25, 2023

The tests fail (at least) locally, even though I get identical results with real composites with the three composite types we have:

  natural_color_day_only:
    compositor: !!python/name:satpy.composites.DayNightCompositor
    standard_name: natural_color_day_only
    day_night: day_only
    include_alpha: true
    lim_low: 87.0
    lim_high: 93.0
    prerequisites:
      - name: natural_color

  night_microphysical_night_only:
    compositor: !!python/name:satpy.composites.DayNightCompositor
    standard_name: night_microphysical_night_only
    day_night: night_only
    include_alpha: True
    lim_low: 87.0
    lim_high: 87.0
    prerequisites:
      - name: night_microphysical

  vis06_with_ir108:
    compositor: !!python/name:satpy.composites.DayNightCompositor
    standard_name: vis06_with_ir108
    lim_low: 85.0
    lim_high: 88.0
    prerequisites:
      - vis06
      - ir108

My test script (adapted from the original by @gerritholl):

#!/usr/bin/env python

import dask.config
from satpy.tests.utils import CustomScheduler
from satpy import Scene
from glob import glob
seviri_files = glob("/home/lahtinep/data/satellite/geo/0deg/*202303080900*")
sc = Scene(filenames={"seviri_l1b_hrit": seviri_files})
with dask.config.set(scheduler=CustomScheduler(max_computes=1)):
    sc.load(["natural_color_day_only", "night_microphysical_night_only", "vis06_with_ir108"])
    ls = sc.resample("eurol")
    ls.save_datasets(base_dir="/tmp")

@pnuu
Copy link
Member Author

pnuu commented Oct 25, 2023

Three of the five failing tests are with keep_alpha: False, which my composites didn't include.

FAILED satpy/tests/test_composites.py::TestDayNightCompositor::test_day_only_area_without_alpha - AssertionError: 
FAILED satpy/tests/test_composites.py::TestDayNightCompositor::test_day_only_sza_without_alpha - AssertionError: 
FAILED satpy/tests/test_composites.py::TestDayNightCompositor::test_daynight_area - satpy.composites.IncompatibleAreas
FAILED satpy/tests/test_composites.py::TestDayNightCompositor::test_night_only_sza_without_alpha - AssertionError: 
FAILED satpy/tests/test_composites.py::TestColorizeCompositor::test_colorize_with_interpolation - AssertionError: 

@pnuu
Copy link
Member Author

pnuu commented Oct 25, 2023

There was a simple logic error when I converted the calls, still trying to figure out the IncompatibleAreas exception.

@pnuu
Copy link
Member Author

pnuu commented Oct 25, 2023

I'm a bit lost with the last failing test satpy/tests/test_composites.py::TestDayNightCompositor::test_daynight_area, I'll have another look tomorrow.

I've now added the context manager to make sure only one computation is triggered.

@djhoese
Copy link
Member

djhoese commented Oct 25, 2023

I'm a little concerned we have some magic conversion of DataArray to dask and/or dask to DataArray and/or DataArray to numpy and it was just getting hidden before and now is causing an error. My other worry about be the compositor overwriting DataArray.data = new_dask_array and it breaking things between tests.

@pnuu
Copy link
Member Author

pnuu commented Oct 26, 2023

I don't see any places in the compositor where DataArray.data are replaced.

The failure is in the only test where SZA data are not provided and both day and night data are used. This works with real data (I'm testing with SEVIRI HRIT), so I'm I guess the fake data in the test class is somehow wrong.

@pnuu
Copy link
Member Author

pnuu commented Oct 26, 2023

Ok, my real data test was using a composite where both day and night parts were single channels. With RGB sub-composites I get

RuntimeError: None of the requested datasets have been generated or could not be loaded. Requested composite inputs may need to have matching dimensions (eg. through resampling).

The DataArrays passed in the datalist have identical areas and shapes when passed tosuper(DayNightCompositor, self).call(data, **kwargs)`.

Switching the mock data in the test to 1D makes the failing test pass.

@pnuu
Copy link
Member Author

pnuu commented Oct 26, 2023

Found something that works, not entirely sure why: use .data when applying the weights to day and night composites, not the DataArray. I don't know why this makes any difference when comparing the areas, the attributes are handled separately in any case. So might be there's a try/except somewhere that masks the original exception.

@pnuu pnuu marked this pull request as ready for review October 26, 2023 08:01
@codecov
Copy link

codecov bot commented Oct 26, 2023

Codecov Report

Merging #2617 (3d6041a) into main (652a236) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #2617   +/-   ##
=======================================
  Coverage   95.18%   95.18%           
=======================================
  Files         354      354           
  Lines       51256    51270   +14     
=======================================
+ Hits        48789    48803   +14     
  Misses       2467     2467           
Flag Coverage Δ
behaviourtests 4.24% <11.84%> (+<0.01%) ⬆️
unittests 95.81% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
satpy/composites/__init__.py 91.25% <100.00%> (+0.02%) ⬆️
satpy/modifiers/_crefl_utils.py 95.13% <100.00%> (ø)
satpy/tests/test_composites.py 100.00% <100.00%> (ø)

@coveralls
Copy link

coveralls commented Oct 26, 2023

Pull Request Test Coverage Report for Build 6657105421

  • 75 of 75 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.001%) to 95.758%

Totals Coverage Status
Change from base Build 6621809349: 0.001%
Covered Lines: 48923
Relevant Lines: 51090

💛 - Coveralls

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@djhoese
Copy link
Member

djhoese commented Oct 26, 2023

This might be a case where type annotations would make it more clear what type of objects are being passed where and if we can expect them to work. I'll try to review this in detail this morning and merge if it all looks good. Thanks for tackling this @pnuu!

@djhoese
Copy link
Member

djhoese commented Oct 26, 2023

Ok I did some debugging into this. What happens is we call this method:

    def _mask_weights_with_data(self, weights, day_data, night_data):
        data_a = _get_single_channel(day_data)
        data_b = _get_single_channel(night_data)
        if "only" in self.day_night:
            mask = _get_weight_mask_for_single_side_product(data_a, data_b)
        else:
            mask = _get_weight_mask_for_daynight_product(weights, data_a, data_b)

The data_a and data_b both have .coords["bands"] = ["R"]. This results in mask also having bands = ["R"]. This then results in weights later on acquiring the bands = ["R"] coordinate too. Then when we finally apply the mask the first (R) band has .coords["bands"] = ["R"] but the G and B bands don't have this coordinate (because it wasn't equal). Finally, when the base compositor class concatenates the individual R, G, and B bands together it errors out because it is told to concat on the "bands" dimension but not all of the inputs have that coordinate. I'll see if I can come up with a clean and obvious solution to this.

And add a lot more type annotations
# Conflicts:
#	satpy/composites/__init__.py
#	satpy/tests/test_composites.py
@djhoese
Copy link
Member

djhoese commented Oct 26, 2023

@pnuu I ended up making more changes than I anticipated after adding and fixing some type annotations. Please review and merge if you're OK with everything.

@pnuu
Copy link
Member Author

pnuu commented Oct 27, 2023

Thank you for explaining what really was happening! Also thanks for the type annotations, some day I need to start looking at them my self...

@pnuu pnuu merged commit 86c075a into pytroll:main Oct 27, 2023
18 of 19 checks passed
@pnuu pnuu deleted the bugfix-dnc-compute branch October 27, 2023 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DayNightCompositor triggers early dask computation
4 participants