This repository has been archived by the owner on Aug 28, 2023. It is now read-only.

Cross Year Aggregations #274

Merged
merged 12 commits into from
Feb 23, 2017

@rmartz rmartz commented Feb 10, 2017

Overview

Introduces the ability to perform time aggregations across year boundaries, allowing winter measurements to be handled as a single segment using the timespan value offset_yearly.

Prerequisite for #169

Demo

"2097-2098": {
  "max": 30.30512817382882,
  "avg": 18.430609043666067,
  "min": -0.6856005859384425
},
"2051-2052": {
  "max": 22.797236328124978,
  "avg": 12.2029003906249,
  "min": -2.618150634765442
},
"2029-2030": {
  "max": 21.10369384765582,
  "avg": 11.077372000558032,
  "min": -15.550047607421233
},
"2022-2023": {
  "max": 17.06918457031156,
  "avg": 7.002739083426256,
  "min": -3.1465655517586018
},
"2099-2100": {
  "max": 27.82869995117114,
  "avg": 22.212894461495313,
  "min": 10.794453124999979
}

Notes

  • When performing a cross-year aggregation, partial-year data at the earliest and latest ends of the dataset is discarded, because there is no adjacent year's data to pair it with.
  • The query for cross-year aggregation is currently very slow, and requests may take 3-10 seconds to complete.
  • This completely removes the daily indicator aggregation that was only partially removed in Remove Daily Raw Data Indicators #172, because constructing the ISO-8601 format in Django querysets is not trivial.
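The first note can be illustrated with a minimal pure-Python sketch. It is not the PR's Django queryset code. It assumes the offset year starts on July 1, a cut-over date the PR does not state, and the names `OFFSET_MONTH`, `offset_start_year`, `offset_year_label` and `aggregate_offset_yearly` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Assumption: the offset year starts on July 1 so that a whole winter falls
# inside one span. The PR does not state the actual cut-over date.
OFFSET_MONTH = 7


def offset_start_year(day):
    """Return the calendar year in which the offset year containing day begins."""
    return day.year if day.month >= OFFSET_MONTH else day.year - 1


def offset_year_label(day):
    """Return a label such as '2029-2030' for the offset year containing day."""
    start = offset_start_year(day)
    return "{}-{}".format(start, start + 1)


def aggregate_offset_yearly(readings, first_year, last_year):
    """Group (date, value) pairs by offset year and compute min/avg/max.

    Spans that would need data before first_year or after last_year are
    dropped, since only part of such a span could be observed.
    """
    buckets = defaultdict(list)
    for day, value in readings:
        start = offset_start_year(day)
        if start < first_year or start + 1 > last_year:
            continue  # partial span at the edge of the dataset
        buckets[offset_year_label(day)].append(value)
    return {
        label: {"min": min(vals), "avg": sum(vals) / len(vals), "max": max(vals)}
        for label, vals in buckets.items()
    }
```

With data covering only 2006 and 2007, readings from January 2006 and August 2007 would be dropped, and only the span labelled 2006-2007 would be reported.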

Testing Instructions

  • Ensure tests still pass
  • Run two requests for min_low_temperature side by side, one for time_aggregation=offset_yearly and the other for time_aggregation=yearly
  • Pick years from the yearly response and compare their min value to the min in offset_yearly for the winters ending or starting with that year
    • In many instances, the winter's min should match the min of its starting year; in others, the min of its ending year
    • In some instances, the min for the winter span may be a completely different value, higher or lower. The coldest points of two consecutive years in the yearly aggregation can skip an entire winter by occurring in late winter of the first year and early winter of the second.
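The comparison described above could be scripted roughly as follows. This is only a sketch: it assumes both responses have been reduced to dictionaries keyed by label (`"2030"` for yearly, `"2029-2030"` for offset_yearly) with a `"min"` entry, as in the Demo excerpt, and `classify_winter_mins` is a hypothetical helper, not part of the API.

```python
import math


def classify_winter_mins(yearly, offset_yearly):
    """Report which calendar year, if any, shares each winter's min value."""
    report = {}
    for label, stats in offset_yearly.items():
        start, end = label.split("-")
        winter_min = stats["min"]
        if start in yearly and math.isclose(winter_min, yearly[start]["min"]):
            report[label] = "starting year"
        elif end in yearly and math.isclose(winter_min, yearly[end]["min"]):
            report[label] = "ending year"
        else:
            # The coldest day of both calendar years fell in other winters.
            report[label] = "neither"
    return report


# Made-up values for illustration only, not real indicator output.
yearly = {"2029": {"min": -15.0}, "2030": {"min": -9.0}, "2031": {"min": -12.0}}
offset_yearly = {"2029-2030": {"min": -9.0}, "2030-2031": {"min": -4.0}}
report = classify_winter_mins(yearly, offset_yearly)
```

In this made-up data the 2030-2031 winter matches neither year: the 2030 min falls in the preceding winter and the 2031 min in the following one.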

Checklist

  • Do tests pass?
  • Does this PR require an update to the API Documentation?

Connects #198

@rmartz rmartz force-pushed the feature/cross-year-aggregations branch from 4b9e84b to ce24d26 Compare February 15, 2017 21:35
@rmartz rmartz requested a review from KlaasH February 15, 2017 21:35
@rmartz rmartz changed the title [WIP] Cross Year Aggregations Cross Year Aggregations Feb 15, 2017

@KlaasH KlaasH left a comment


Small issue: I'm getting 2006-2007 and 2099-2100 values from http://localhost:8080/api/climate-data/31/RCP85/indicator/frost_days?time_aggregation=offset_yearly (i.e. it's not excluding spans that are missing one side of the data)

Other than that, just a few small comments below.


@classmethod
def make_ranges(cls, label):
    """ Takes the values of get_intervals and wraps them in CaseRange objects

Stale comment. This one doesn't use get_intervals.

""" QuerysetGenerator based on a list of period lengths

Assumes that the periods are consecutive, and that each period takes place immediately
following the previous period.

Is it worth adding an __init__ with an assert that the intervals add up to a year? It would just be protecting against errors on our part, but it seems like the sort of error that could crop up and it would be nice for it to present in an obvious way if it does.


That's a good thought. I think I'll try to do that in a unit test, so we don't have to wait for a runtime error to detect a class definition problem.
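A unit test along those lines might look like the sketch below. The classes here are hypothetical stand-ins, since the real generator classes and the attribute holding their period lengths are not shown in this thread, and the check assumes lengths are defined for a 365-day year.

```python
import unittest


class PeriodLengthGenerator:
    """Hypothetical stand-in for a generator defined by a list of period lengths."""
    lengths = None


class MonthlyGenerator(PeriodLengthGenerator):
    lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


class QuarterlyGenerator(PeriodLengthGenerator):
    lengths = [90, 91, 92, 92]


def all_subclasses(cls):
    """Collect every direct and indirect subclass of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


class PeriodLengthTestCase(unittest.TestCase):
    def test_lengths_cover_one_year(self):
        # Assumption: lengths describe a non-leap year of 365 days.
        for generator in all_subclasses(PeriodLengthGenerator):
            self.assertEqual(sum(generator.lengths), 365, generator.__name__)
```

Walking the subclasses means a newly added generator is checked without anyone having to remember to extend the test.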



class CustomQuerysetGenerator(QuerysetGenerator):
    custom_spans = None

Maybe add a docstring note here that "Custom" means "custom within a year".


@rmartz rmartz left a comment

2006 and 2100 should be valid dates; they're included in the GDDP data file list for RCP45 and RCP85. Is the data for those intervals coming back weird? If it's getting cut off, an indicator like heating_degree_days will be very low because it only has data for half the winter.


@rmartz rmartz force-pushed the feature/cross-year-aggregations branch from ddf3e8d to 1cea353 Compare February 20, 2017 21:43
@rmartz rmartz force-pushed the feature/cross-year-aggregations branch from 1cea353 to dc44877 Compare February 20, 2017 22:08
rmartz commented Feb 20, 2017

The edge-year problem should be fixed. It turns out some scenarios have data through 2100 and others only through 2099, so we had to add logic that detects the year boundaries for a given data source and filters for the days we can use within them. A bit complex, but thankfully not as bad as feared.
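As a rough illustration of that boundary detection, and not the code merged here, the sketch below derives a usable date window from the years a data source actually covers. The July 1 cut-over and the names `OFFSET_MONTH`, `usable_window` and `in_window` are assumptions made for the example.

```python
from datetime import date

# Assumption: offset years run from July 1 to the following June 30.
OFFSET_MONTH = 7


def usable_window(available_years):
    """Return the half-open [start, end) date range covering only whole offset years."""
    first, last = min(available_years), max(available_years)
    return date(first, OFFSET_MONTH, 1), date(last, OFFSET_MONTH, 1)


def in_window(day, window):
    """True if day falls inside a complete offset year for this data source."""
    start, end = window
    return start <= day < end


# One source ends in 2099 and the other in 2100, as described above.
window_to_2099 = usable_window(range(2006, 2100))
window_to_2100 = usable_window(range(2006, 2101))
```

A day in December 2099 is usable only for the source that extends to 2100, because the other source has no data for the second half of that winter.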


@KlaasH KlaasH left a comment


Well, I've somehow destroyed or lost track of my database container, so re-testing that last little change would be a major production. But it looks good. 👍

@rmartz rmartz merged commit 9aff403 into develop Feb 23, 2017
@rmartz rmartz deleted the feature/cross-year-aggregations branch February 23, 2017 15:51
@rmartz rmartz removed the in review label Feb 23, 2017
@rmartz rmartz mentioned this pull request May 23, 2017
3 tasks
ddohler pushed a commit that referenced this pull request Aug 21, 2023