This repository has been archived by the owner on Aug 28, 2023. It is now read-only.

Cross-year Aggregations #198

Closed
rmartz opened this issue Dec 27, 2016 · 0 comments

rmartz commented Dec 27, 2016

For #169, we would like to be able to aggregate across years. This is not a simple task, but it should be workable with two modifications made together:

Single-stage aggregation keys

Within our indicator system, we perform two steps to filter and aggregate by time. In the first step we filter by year, and then within each year we annotate each data point with a sub-key that groups results within a common time aggregation.

We should streamline this and aggregate only by the annotated sharding key, which would be defined as the key used in the final result. That way, database results could look like this:

{'key': '2056-Q4', 'model': 17, 'value': 297}

Time aggregation would only have to be handled when deciding how to construct the key. Once database results are returned, we can treat the key as a black box.
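A minimal sketch of the single-stage idea, assuming quarterly aggregation and in-memory rows rather than a database query. The names `quarterly_key` and `aggregate` are hypothetical and not part of the existing codebase; the point is that only the key function knows about time, and everything downstream groups by an opaque string.

```python
from collections import defaultdict
from datetime import date


def quarterly_key(d):
    """Build the final result key (e.g. '2056-Q4') directly from a date."""
    quarter = (d.month - 1) // 3 + 1
    return f"{d.year}-Q{quarter}"


def aggregate(rows, key_fn):
    """Sum values per (key, model).

    rows is an iterable of (date, model, value) tuples. The key is never
    parsed here; it is only used for grouping.
    """
    totals = defaultdict(int)
    for d, model, value in rows:
        totals[(key_fn(d), model)] += value
    return [
        {'key': key, 'model': model, 'value': total}
        for (key, model), total in sorted(totals.items())
    ]


# Hypothetical sample data for illustration only.
rows = [
    (date(2056, 10, 1), 17, 100),
    (date(2056, 12, 31), 17, 197),
    (date(2056, 1, 5), 17, 5),
]
results = aggregate(rows, quarterly_key)
```

Swapping in a different key function (monthly, yearly, or a custom range) would change the aggregation without touching the grouping code.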

Filter by year using aggregation range

This one is going to be trickier. If we want yearly aggregations that don't track the calendar year, we need a way to filter data points by whether the range they belong to fits the criteria, not by whether the data point itself does.

For instance, assume we accumulate ranges by the year in which they start rather than the year in which they end. If a user requests data for 2045-2055 using an aggregation range spanning 6/1 to 5/31, we would want to include 4/17/2056, because it falls in the range starting in 2055, but not 1/1/2045, because it falls in the range starting in 2044, which is outside the year filter.

To do this, we would need to:

  1. Detect if a range spans the new year
  2. If it does, split it into two ranges, one stopping at 12/31, the other starting on 1/1
  3. Apply an offset to the year for points in the orphaned-year range (for instance, the post-1/1 range if we want to key aggregations by the year they begin)
  4. Filter data points by the calculated year
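
The four steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: ranges are given as `(month, day)` tuples, aggregations are keyed by the year in which the range begins, and all function names are hypothetical. A real implementation would presumably express the same logic in the database query.

```python
from datetime import date


def spans_new_year(start, end):
    """Step 1: a (month, day) range wraps the new year if it ends before it starts."""
    return end < start


def split_range(start, end):
    """Step 2: split a wrapping range at 12/31 and 1/1.

    Returns (start, end, year_offset) tuples. Step 3: the post-1/1 part
    gets an offset of -1 so its points are keyed by the year the range began.
    """
    if not spans_new_year(start, end):
        return [(start, end, 0)]
    return [(start, (12, 31), 0), ((1, 1), end, -1)]


def aggregation_year(d, start, end):
    """Return the starting year of the range containing d, or None if d is outside it."""
    month_day = (d.month, d.day)
    for lo, hi, offset in split_range(start, end):
        if lo <= month_day <= hi:
            return d.year + offset
    return None


def filter_by_year(dates, start, end, first_year, last_year):
    """Step 4: keep dates whose calculated year falls inside the requested years."""
    kept = []
    for d in dates:
        year = aggregation_year(d, start, end)
        if year is not None and first_year <= year <= last_year:
            kept.append(d)
    return kept


# The example from the text: years 2045-2055, range 6/1 to 5/31.
dates = [date(2056, 4, 17), date(2045, 1, 1), date(2045, 6, 1)]
kept = filter_by_year(dates, (6, 1), (5, 31), 2045, 2055)
```

Here 4/17/2056 is kept because its calculated year is 2055, while 1/1/2045 is dropped because its calculated year is 2044.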
@CloudNiner CloudNiner modified the milestone: Sprint Ending: 1/26/2017 Jan 13, 2017
@CloudNiner CloudNiner removed this from the Sprint Ending: 1/26/2017 milestone Feb 1, 2017
@rmartz rmartz added in progress and removed to-do labels Feb 6, 2017
@fungjj92 fungjj92 added in progress and removed to-do labels Feb 8, 2017
@rmartz rmartz mentioned this issue Feb 10, 2017
2 tasks
@rmartz rmartz added done and removed in review labels Feb 23, 2017
@sharph sharph closed this as completed Feb 23, 2017
@sharph sharph removed the done label Feb 23, 2017

4 participants