-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change start_time
and end_time
handling in combine_metadata
#2737
Conversation
Going through the failing tests. The others are easy to fix, but I'm not sure what |
The |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #2737 +/- ##
==========================================
+ Coverage 95.40% 95.89% +0.48%
==========================================
Files 371 371
Lines 52825 52826 +1
==========================================
+ Hits 50399 50656 +257
+ Misses 2426 2170 -256
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Pull Request Test Coverage Report for Build 7897556037Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome job @pnuu! This looks good to me. It is obviously backwards incompatible, but I think this is the right path forward. I had a couple inline questions. The biggest one is what to do with time_parameters
(see comment). Otherwise, I think the MultiScene removal of the combine_times makes sense. The only reason it existed was because it wasn't done in combine_metadata so if it happens in combine_metadata then great. It was only done in MultiScene because I wasn't sure if we wanted that behavior everywhere.
Some other concerns: What happens in Scene.save_datasets
if start_time is None? I believe the default filename pattern includes {start_time:%Y%m%d_%H%M%S}
. How do we want to handle that? Let it fail?
The only case this should happen if the user is saving the plain data from |
Good point. I guess my only other fear would be odd situations with the MultiScene where you need to resample and have a static image, but the MultiScene wants to do something with ordering by start time...nah this shouldn't be a problem. Ok sounds good to not worry about it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good. I think we should get @mraspaud, @sfinkens, @gerritholl, and @ameraner or @strandgren's opinions since this will affect all granule and segment based readers. I assume this group of developers have the widest experience with potential time-based edge cases.
Oh @pnuu I think this min/max code (including the time_parameters method it calls) could be removed: satpy/satpy/readers/file_handlers.py Lines 115 to 118 in cd33a0a
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks!
Removed the duplicate handling of times and adjusted the file handler test to actually use datetimes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for sorting this out!
I don't see any problem regarding the segmented readers with this, since it seems to me that the previous behaviour in terms of min/max calculations for start
, nominal
and observation
times is preserved. The segment sorting etc. is anyway not impacted by this, as it's based on chunk numbering from the single filehandlers.
Note: As discussed above, what still worries me a little bit is indeed the generic_image
reader possibly returning datasets without a valid start_time
... I think there are users that use satpy for simple operations like opening a geotiff, resample it and save it again, which could fail. Or maybe also applications like SIFT that may rely on a dataset having a start_time
. But this goes outside the scope of this PR, which at least correctly fixed the composites misbehaviours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work! There is a small risk that users will notice this backwards-incompatibility, so I have made a suggestion on explicitly mentioning in the documentation that the behaviour has changed, and (optionally) on raising a DeprecrationWarning
or similar if a user does still pass combine_times
.
@@ -27,33 +27,37 @@ | |||
from satpy.writers.utils import flatten_dict | |||
|
|||
|
|||
def combine_metadata(*metadata_objects, average_times=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a deprecation path, where a warning is raised if code passes average_times
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure any people are actually using this, but given that it exists means we thought someone might want to control it so I agree that it should be documented at the very least. A specific deprecation warning would be nice to have.
The changes in the multiscene code are also backwards incompatible, but very very unlikely to be used by anyone except maybe Adam and Ernst. If I remember correctly the default behavior is preserved and was changed when the related kwarg was added to the multiscene stacking function. So my vote is no deprecation warning on the multiscene stuff, but warning on the metadata.py average_times
would be nice to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see about the deprecation warning, hopefully tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a UserWarning
if someone tries to use the average_times
kwarg.
@ameraner good point, but skimming the changes in this PR again, I don't think the generic_image reader's behavior has changed at all. It was already returning a start_time of None and it was up to the user to override that...right? |
Co-authored-by: Gerrit Holl <[email protected]>
Co-authored-by: David Hoese <[email protected]>
Yes, indeed. Changing that would be outside the scope of this PR, and I'm not sure what the best solution would be anyway (since giving a "dummy" start_time can mess up other calculations, as we see here). |
I am not convinced that the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This could be handled in the writer (in another PR) with a simple (pseudo-code) if self.start_time is None:
self.start_time = dt.datetime.utcnow()
fname = self.compose_fname_from_stuff() |
Eh, too much magic. If the start_time being None is a problem then the user should have to work around it. For example, if the filename generation is the problem then they should save it with a different filename template string. |
Everybody seems happy about this one, merging |
The times of datasets used in composites were averaged to get the final time values for the composite. With this PR, the
start_time
andend_time
attributes are instead changed to use the earliest and latest values, respectively. In addition, forStaticImageCompositor
the defaultstart_time
andend_time
values are set toNone
if they are not available in the filename.MultiFiller
#2446