Change `start_time` and `end_time` handling in `combine_metadata` #2737

pnuu · 2024-02-01T13:17:45Z

Previously, the times of the datasets used in composites were averaged to get the final time values for the composite. With this PR, the start_time and end_time attributes instead use the earliest and latest values, respectively. In addition, for StaticImageCompositor, start_time and end_time now default to None if they are not available in the filename.

Closes wrong start_time with BackgroundCompositor #2630
Closes Using a static image alters time information #2734
Closes add more options to time handling in combine_metadata #2447
Closes Wrong start_time, end_time attributes after MultiScene.blend(blend_function=timeseries) #2427
Closes combine_metadata only supports the average of time attrs #1174
Closes combine metadata in MultiFiller #2446
Fully documented
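
The change described above can be illustrated with a minimal sketch. This is not Satpy's implementation: `combine_time_range` and `average_time` are hypothetical helpers, and skipping `None` values is an assumption made for the sketch.

```python
import datetime as dt


def combine_time_range(metadata_objects):
    """Return the earliest start_time and latest end_time, skipping None values."""
    starts = [md["start_time"] for md in metadata_objects if md.get("start_time") is not None]
    ends = [md["end_time"] for md in metadata_objects if md.get("end_time") is not None]
    return {
        "start_time": min(starts) if starts else None,
        "end_time": max(ends) if ends else None,
    }


def average_time(times):
    """Average datetimes, shown only to contrast with the min/max approach."""
    reference = min(times)
    offsets = [(t - reference).total_seconds() for t in times]
    return reference + dt.timedelta(seconds=sum(offsets) / len(offsets))


# Hypothetical inputs: a recent satellite dataset and an old background image
satellite = {"start_time": dt.datetime(2024, 2, 1, 12, 0), "end_time": dt.datetime(2024, 2, 1, 12, 15)}
background = {"start_time": dt.datetime(2020, 1, 1, 0, 0), "end_time": dt.datetime(2020, 1, 1, 0, 0)}

combined = combine_time_range([satellite, background])
averaged_start = average_time([satellite["start_time"], background["start_time"]])
```

With averaging, the start time lands somewhere between the two inputs and matches neither dataset. With min/max, the combined range spans both inputs.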

pnuu · 2024-02-01T14:32:26Z

Going through the failing tests. The others are easy to fix, but I'm not sure what the combine_times=False behaviour should be in satpy.multiscene.stack(). It looks like the average is used in that case, but would it be ok to just skip that option and always use min/max of start/end times?

pnuu · 2024-02-01T14:35:26Z

The satpy.multiscene.stack() option combine_times is not documented, so that would indicate that the deletion of it would be safe.

codecov · 2024-02-01T15:06:32Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (eb4ac0b) 95.40% compared to head (b8a47a9) 95.89%.
Report is 12 commits behind head on main.

Files	Patch %	Lines
satpy/dataset/metadata.py	97.14%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2737      +/-   ##
==========================================
+ Coverage   95.40%   95.89%   +0.48%     
==========================================
  Files         371      371              
  Lines       52825    52826       +1     
==========================================
+ Hits        50399    50656     +257     
+ Misses       2426     2170     -256

Flag	Coverage Δ
behaviourtests	`4.16% <9.75%> (+<0.01%)`	⬆️
unittests	`95.99% <98.78%> (-0.04%)`	⬇️

Flags with carried forward coverage won't be shown.

coveralls · 2024-02-01T15:23:02Z

Pull Request Test Coverage Report for Build 7897556037

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

81 of 82 (98.78%) changed or added relevant lines in 6 files are covered.
5 unchanged lines in 2 files lost coverage.
Overall coverage increased (+0.003%) to 95.971%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
satpy/dataset/metadata.py	34	35	97.14%

Files with Coverage Reduction	New Missed Lines	%
satpy/tests/test_readers.py	1	99.36%
satpy/readers/__init__.py	4	98.65%

Totals
Change from base Build 7726219656:	0.003%
Covered Lines:	50528
Relevant Lines:	52649

💛 - Coveralls

djhoese

Awesome job @pnuu! This looks good to me. It is obviously backwards incompatible, but I think this is the right path forward. I had a couple inline questions. The biggest one is what to do with time_parameters (see comment). Otherwise, I think the MultiScene removal of the combine_times makes sense. The only reason it existed was because it wasn't done in combine_metadata so if it happens in combine_metadata then great. It was only done in MultiScene because I wasn't sure if we wanted that behavior everywhere.

Some other concerns: What happens in Scene.save_datasets if start_time is None? I believe the default filename pattern includes {start_time:%Y%m%d_%H%M%S}. How do we want to handle that? Let it fail?
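
The concern about a `None` start time in a filename pattern can be reproduced with plain `str.format`. This is only a sketch: `PATTERN` and `format_filename` are hypothetical, the pattern is a stand-in built from the time format quoted above, and Satpy's own filename composition may behave differently.

```python
import datetime as dt

# Stand-in pattern using the time format quoted in the review; not Satpy's actual default
PATTERN = "{name}_{start_time:%Y%m%d_%H%M%S}.tif"


def format_filename(pattern, **attrs):
    """Fill the pattern, returning None if a value cannot be formatted."""
    try:
        return pattern.format(**attrs)
    except TypeError:
        # None does not accept a strftime-style format specification
        return None


with_time = format_filename(PATTERN, name="overview", start_time=dt.datetime(2024, 2, 1, 13, 17, 45))
without_time = format_filename(PATTERN, name="overview", start_time=None)
```

Here `with_time` is a regular filename, while `without_time` is `None` because formatting raised a `TypeError`.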

satpy/composites/__init__.py

satpy/dataset/metadata.py

pnuu · 2024-02-05T09:30:13Z

> What happens in Scene.save_datasets if start_time is None?
> How do we want to handle that? Let it fail?

The only case where this should happen is if the user is saving the plain data from the generic_image reader and the time was not available in the filename. I guess we could add some kind of error handling for this case, but I'm not sure it's worth the effort 🤔

djhoese · 2024-02-05T15:03:04Z

> The only case where this should happen is if the user is saving the plain data from the generic_image reader and the time was not available in the filename. I guess we could add some kind of error handling for this case, but I'm not sure it's worth the effort 🤔

Good point. I guess my only other fear would be odd situations with the MultiScene where you need to resample and have a static image, but the MultiScene wants to do something with ordering by start time...nah this shouldn't be a problem. Ok sounds good to not worry about it.

satpy/dataset/metadata.py

djhoese

I think this looks good. I think we should get @mraspaud, @sfinkens, @gerritholl, and @ameraner or @strandgren's opinions since this will affect all granule and segment based readers. I assume this group of developers have the widest experience with potential time-based edge cases.

djhoese · 2024-02-09T18:49:11Z

Oh @pnuu I think this min/max code (including the time_parameters method it calls) could be removed:

satpy/satpy/readers/file_handlers.py

Lines 115 to 118 in cd33a0a

    
    new_dict = self._combine(all_infos, min, "start_time", "start_orbit")
    new_dict.update(self._combine(all_infos, max, "end_time", "end_orbit"))
    new_dict.update(self._combine_orbital_parameters(all_infos))
    new_dict.update(self._combine_time_parameters(all_infos))
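
For readers unfamiliar with the pattern in the quoted lines, a helper of this kind might look like the sketch below. The `combine` function is a hypothetical stand-in, not the file handler's actual `_combine` method, and requiring a key to be present in every info dict is an assumption.

```python
import datetime as dt


def combine(infos, func, *keys):
    """Apply func to the values of each key that is present in every info dict."""
    combined = {}
    for key in keys:
        if all(key in info for info in infos):
            combined[key] = func(info[key] for info in infos)
    return combined


# Hypothetical per-file metadata for two consecutive granules
all_infos = [
    {"start_time": dt.datetime(2024, 2, 1, 12, 0), "end_time": dt.datetime(2024, 2, 1, 12, 5), "start_orbit": 101},
    {"start_time": dt.datetime(2024, 2, 1, 12, 5), "end_time": dt.datetime(2024, 2, 1, 12, 10), "start_orbit": 102},
]

new_dict = combine(all_infos, min, "start_time", "start_orbit")
new_dict.update(combine(all_infos, max, "end_time", "end_orbit"))
```

Because no granule carries `end_orbit` in this example, that key is simply left out of the result.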

sfinkens

Nice work, thanks!

satpy/dataset/metadata.py

pnuu · 2024-02-12T09:24:20Z

> Oh @pnuu I think this min/max code (including the time_parameters method it calls) could be removed:
>
> satpy/satpy/readers/file_handlers.py
>
> Lines 115 to 118 in cd33a0a
>
>     new_dict = self._combine(all_infos, min, "start_time", "start_orbit")
>     new_dict.update(self._combine(all_infos, max, "end_time", "end_orbit"))
>     new_dict.update(self._combine_orbital_parameters(all_infos))
>     new_dict.update(self._combine_time_parameters(all_infos))

Removed the duplicate handling of times and adjusted the file handler test to actually use datetimes.

ameraner

LGTM, thanks for sorting this out!
I don't see any problem regarding the segmented readers with this, since it seems to me that the previous behaviour in terms of min/max calculations for start, nominal and observation times is preserved. The segment sorting etc. is anyway not impacted by this, as it's based on chunk numbering from the single filehandlers.

Note: As discussed above, what still worries me a little bit is indeed the generic_image reader possibly returning datasets without a valid start_time... I think there are users that use satpy for simple operations like opening a geotiff, resample it and save it again, which could fail. Or maybe also applications like SIFT that may rely on a dataset having a start_time. But this goes outside the scope of this PR, which at least correctly fixed the composites misbehaviours.
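
The min/max handling of nominal and observation times mentioned in this review could be sketched as follows. The function name, the key names, and the rule of matching keys by their `start_time` / `end_time` suffix are assumptions for illustration, not a description of Satpy's code.

```python
import datetime as dt


def combine_time_parameters(metadata_objects):
    """Combine nested time_parameters: min for *start_time keys, max for *end_time keys."""
    combined = {}
    for md in metadata_objects:
        for key, value in md.get("time_parameters", {}).items():
            if value is None:
                continue  # skip missing times (an assumption of this sketch)
            if key not in combined:
                combined[key] = value
            elif key.endswith("start_time"):
                combined[key] = min(combined[key], value)
            elif key.endswith("end_time"):
                combined[key] = max(combined[key], value)
            # any other key keeps the first value that was seen
    return combined


# Hypothetical metadata for two segments of the same scan
segments = [
    {"time_parameters": {
        "nominal_start_time": dt.datetime(2024, 2, 1, 12, 0),
        "nominal_end_time": dt.datetime(2024, 2, 1, 12, 15),
        "observation_start_time": dt.datetime(2024, 2, 1, 12, 0, 10),
        "observation_end_time": dt.datetime(2024, 2, 1, 12, 6),
    }},
    {"time_parameters": {
        "nominal_start_time": dt.datetime(2024, 2, 1, 12, 0),
        "nominal_end_time": dt.datetime(2024, 2, 1, 12, 15),
        "observation_start_time": dt.datetime(2024, 2, 1, 12, 6),
        "observation_end_time": dt.datetime(2024, 2, 1, 12, 12, 30),
    }},
]

combined_params = combine_time_parameters(segments)
```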

gerritholl

Thanks for this work! There is a small risk that users will notice this backwards-incompatibility, so I have made a suggestion on explicitly mentioning in the documentation that the behaviour has changed, and (optionally) on raising a DeprecationWarning or similar if a user does still pass combine_times.

gerritholl · 2024-02-13T14:16:43Z

satpy/dataset/metadata.py

@@ -27,33 +27,37 @@
 from satpy.writers.utils import flatten_dict


-def combine_metadata(*metadata_objects, average_times=True):


Do we need a deprecation path, where a warning is raised if code passes average_times?

djhoese · 2024-02-13

I'm not sure anyone is actually using this, but the fact that it exists means we thought someone might want to control it, so I agree that it should be documented at the very least. A specific deprecation warning would be nice to have.

The changes in the multiscene code are also backwards incompatible, but very, very unlikely to be used by anyone except maybe Adam and Ernst. If I remember correctly the default behavior is preserved and was changed when the related kwarg was added to the multiscene stacking function. So my vote is no deprecation warning on the multiscene stuff, but a warning on the metadata.py average_times would be nice to have.

pnuu · 2024-02-13

I'll see about the deprecation warning, hopefully tomorrow.

pnuu · 2024-02-14

I added a UserWarning if someone tries to use the average_times kwarg.
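
A warning of this kind might be wired up roughly as in the sketch below. The function name `combine_metadata_sketch`, the `None` default, and the message text are assumptions; only the idea of emitting a `UserWarning` when `average_times` is passed comes from the thread.

```python
import warnings


def combine_metadata_sketch(*metadata_objects, average_times=None):
    """Accept the removed keyword only to warn about it."""
    if average_times is not None:
        warnings.warn(
            "'average_times' is no longer supported; start_time and end_time "
            "are combined with min and max instead.",
            UserWarning,
            stacklevel=2,
        )
    return list(metadata_objects)  # placeholder for the real combination logic


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    combine_metadata_sketch({"name": "a"}, average_times=True)  # warns
    combine_metadata_sketch({"name": "b"})  # silent

messages = [str(w.message) for w in caught if issubclass(w.category, UserWarning)]
```

Using `None` as the default lets the function tell "not passed" apart from an explicit `True` or `False`, so both old-style calls trigger the warning.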

satpy/dataset/metadata.py

djhoese · 2024-02-13T18:16:19Z

> Note: As discussed above, what still worries me a little bit is indeed the generic_image reader possibly returning datasets without a valid start_time... I think there are users that use satpy for simple operations like opening a geotiff, resample it and save it again, which could fail.

@ameraner good point, but skimming the changes in this PR again, I don't think the generic_image reader's behavior has changed at all. It was already returning a start_time of None and it was up to the user to override that...right?

ameraner · 2024-02-14T09:02:28Z

> @ameraner good point, but skimming the changes in this PR again, I don't think the generic_image reader's behavior has changed at all. It was already returning a start_time of None and it was up to the user to override that...right?

Yes, indeed. Changing that would be outside the scope of this PR, and I'm not sure what the best solution would be anyway (since giving a "dummy" start_time can mess up other calculations, as we see here).

gerritholl · 2024-02-14T09:14:00Z

> what still worries me a little bit is indeed the generic_image reader possibly returning datasets without a valid start_time...

I am not convinced that the generic_image reader should guarantee a (valid) start_time. We could possibly expose whatever times are in image metadata (such as EXIF headers) or file metadata, but the type of imagery read by the generic_image reader is too diverse for any of those to be generally valid as a start_time. Elsewhere in Satpy, start_time refers to the measurement time, not to the image creation time. Dealing with images that don't have a start_time would rather seem to be the responsibility of downstream users.

gerritholl

Thanks!

pnuu · 2024-02-14T09:14:47Z

> @ameraner good point, but skimming the changes in this PR again, I don't think the generic_image reader's behavior has changed at all. It was already returning a start_time of None and it was up to the user to override that...right?

> Yes, indeed. Changing that would be outside the scope of this PR, and I'm not sure what the best solution would be anyway (since giving a "dummy" start_time can mess up other calculations, as we see here).

This could be handled in the writer (in another PR) with a simple (pseudo-code)

import datetime as dt

# Pseudo-code: fall back to the current UTC time when start_time is missing
if self.start_time is None:
    self.start_time = dt.datetime.utcnow()
fname = self.compose_fname_from_stuff()

djhoese · 2024-02-14T16:02:22Z

> This could be handled in the writer (in another PR) with a simple (pseudo-code)

Eh, too much magic. If the start_time being None is a problem then the user should have to work around it. For example, if the filename generation is the problem then they should save it with a different filename template string.
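
The user-side workaround suggested here, choosing a filename template that does not need a start time, might look like the sketch below. `choose_filename` and both templates are hypothetical, and plain `str.format` stands in for Satpy's filename composition.

```python
import datetime as dt


def choose_filename(attrs):
    """Use a time-stamped template only when start_time is available."""
    if attrs.get("start_time") is None:
        template = "{name}.tif"
    else:
        template = "{name}_{start_time:%Y%m%d_%H%M%S}.tif"
    return template.format(**attrs)


static_name = choose_filename({"name": "static", "start_time": None})
timed_name = choose_filename({"name": "overview", "start_time": dt.datetime(2024, 2, 14, 16, 2, 22)})
```

In Satpy itself, the chosen template would presumably be passed through the `filename` argument of `Scene.save_datasets` rather than formatted by hand.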

mraspaud · 2024-02-15T12:13:45Z

Everybody seems happy about this one, merging

lahtinep added 3 commits February 1, 2024 14:48

Change time attribute averaging to min/max for start/end times

8319b4f

Change time attribute averaging to min/max for start/end times

782ac68

Allow None as time value

cf98aa2

pnuu added the bug, enhancement and component:compositors labels Feb 1, 2024

pnuu requested a review from gerritholl February 1, 2024 13:17

pnuu self-assigned this Feb 1, 2024

pnuu requested review from djhoese and mraspaud as code owners February 1, 2024 13:17

Update combine_metadata docstring

ab90428

pnuu requested a review from zxdawn February 1, 2024 13:30

lahtinep added 2 commits February 1, 2024 15:33

Merge branch 'main' into min-max-dataset-times

cbe2549

Do not include top-level non-time objects in shared_info as times

942be9c

Remove combine_times kwarg from multiscene.stack and default to its default behaviour

db2bf31

Remove obsolete private function

e0fe845

djhoese approved these changes Feb 2, 2024

satpy/composites/__init__.py Outdated

satpy/dataset/metadata.py Outdated

Remove unnecessary setting of start/end time attributes

6e1342f

lahtinep added 4 commits February 5, 2024 22:16

Remove assertions that don't hold when not loading an actual image

13168cf

Combine also values of 'time_parameters' dictionary items

f7e7570

Separate value combination to a new function

4e85a8f

Reword docstring of combine_metadata() to include time_parameters dict

91dbc8e

djhoese reviewed Feb 8, 2024

satpy/dataset/metadata.py Outdated

Clarify combine_metadata() docstring

9d5665c

djhoese reviewed Feb 9, 2024

satpy/dataset/metadata.py Outdated

Update satpy/dataset/metadata.py

83f6bc8

djhoese approved these changes Feb 9, 2024

djhoese requested review from sfinkens, strandgren and ameraner February 9, 2024 18:46

sfinkens approved these changes Feb 12, 2024

satpy/dataset/metadata.py Outdated

lahtinep added 3 commits February 12, 2024 11:23

Remove douple handling of start/end times and time parameters

b68920f

Use datetimes when testing times

d1c33a1

Refactor to remove unnecessary return

3e050b0

ameraner approved these changes Feb 13, 2024

gerritholl requested changes Feb 13, 2024

Update satpy/dataset/metadata.py

aee140a

Co-authored-by: Gerrit Holl <[email protected]>

djhoese reviewed Feb 13, 2024

satpy/dataset/metadata.py Outdated

pnuu and others added 2 commits February 13, 2024 20:40

Update satpy/dataset/metadata.py

52b2d41

Co-authored-by: David Hoese <[email protected]>

Add a warning when trying to use removed 'average_times' kwarg

b8a47a9

gerritholl approved these changes Feb 14, 2024

mraspaud merged commit 25d5357 into pytroll:main Feb 15, 2024
18 of 19 checks passed

pnuu deleted the min-max-dataset-times branch February 15, 2024 14:18

ameraner mentioned this pull request Aug 14, 2024

Scene.start_time computation fails for datasets/composites with start_time=None (and same for end_time) #2883

Open
