This repository has been archived by the owner on Jul 18, 2023. It is now read-only.

Implement Aggregation of Increment and Time measurements #55

Open · wants to merge 2 commits into base: dev

Conversation

cypressious (Contributor)

Fixes #9

@nblumhardt (Contributor) left a comment:

Thanks for the PR! Just added a couple of thoughts.

// Bus Portal (busliniensuche.de)
// ==========================================================================
// All rights reserved.
// ==========================================================================
Contributor:

Contributions need to be made under the same license as the rest of the codebase

Contributor Author:

Done

.WriteTo.Emitter(pts => list.AddRange(pts))
.CreateCollector();

collector.Emit(new[]
Contributor:

Does this design require memory space for all of the points in an interval until they're aggregated at emit time? Just clarifying my quick initial reading of the code ... If so it might need a re-think; one of the goals of aggregation, at the client side, would be to maintain a low bound on memory usage in the presence of vast numbers of points. Apologies if I'm misunderstanding, here!

Contributor Author:

You're right. Currently, the aggregation happens at emit time, i.e. if you're using the batcher, it happens when the batcher emits. My goal with this implementation was to reduce the amount of sent data. But your point is totally valid. Reducing the memory footprint would be a nice goal, too.

What is the scenario you're thinking about, i.e. what would be the batch and aggregation timespans?

Contributor Author:

One aspect to keep in mind: we shouldn't accidentally delay the sending of the measurements twice by buffering for the aggregation and then buffering again for the sending.

Contributor:

Thanks for the follow-up!

It just seems like a high-allocation metrics implementation is not ideal - apps that collect a lot of metrics generally need to be wary of this kind of caveat. It's not unusual to collect metrics at multiple thousands of hits/second, so if we're going to optimize for this case, aggregating at the point of collection seems pretty appealing (cf. other metrics implementations).
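A hedged sketch of the two designs under discussion (illustrative Python rather than the project's C#; all names are hypothetical): buffering every point until emit time grows memory with the hit rate, while aggregating at the point of collection keeps one running value per metric key.

```python
from collections import defaultdict

class BufferedCounter:
    """Buffer-then-aggregate: retains every point until emit time."""
    def __init__(self):
        self.points = []  # grows with every single hit

    def increment(self, name, value=1):
        self.points.append((name, value))

    def emit(self):
        # Aggregation happens only now, over the whole buffer.
        totals = defaultdict(int)
        for name, value in self.points:
            totals[name] += value
        self.points.clear()
        return dict(totals)

class AggregatingCounter:
    """Aggregate-at-collection: memory bounded by distinct metric keys."""
    def __init__(self):
        self.totals = defaultdict(int)

    def increment(self, name, value=1):
        self.totals[name] += value  # aggregated in place, no buffering

    def emit(self):
        snapshot = dict(self.totals)
        self.totals.clear()
        return snapshot
```

Both emit identical totals; only the memory profile between emits differs, which is the crux of the concern above.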

@nblumhardt (Contributor):

@cypressious thanks again for the time spent on this. It's a nicely-implemented PR, but I'm not keen to move forward with it because I don't think we'll ultimately be able to build on a buffer-then-aggregate design without an eventual reset to avoid the allocations. Let me know if you're keen to dig into it further 👍 Thanks!

@cypressious (Contributor Author):

@nblumhardt I will rework the PR. I'll implement a similar mechanism to the current batching approach where data will be periodically aggregated and then flushed down the pipeline.

cypressious force-pushed the feature-aggregate branch 2 times, most recently from 199aaab to ab0b8f1 on May 8, 2018 11:29

cypressious commented May 8, 2018

@nblumhardt I've finished the rework.

I've reimplemented the aggregation to run periodically, just like the batching. Both now derive from the same base class.

Because the aggregation now runs periodically, it needs to run before the batching in the pipeline.

I have one concern with the current implementation. Because points are allowed to have arbitrary timestamps, the points buffered during each tick aren't guaranteed to be from the same interval. That's why I still need to group them into time buckets. However, the buckets aren't necessarily aligned with the times at which the aggregation runs, so points from one timer interval will most likely fall into two buckets, even though the interval and the bucket length are the same.

In practice, this means we will (almost) always transmit twice the number of points.

I'll see if I can come up with a solution.
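A hedged sketch of that misalignment (illustrative Python rather than the project's C#; the 10-second interval and names are hypothetical): with clock-aligned buckets, the points flushed by a single timer tick usually straddle two buckets.

```python
INTERVAL = 10.0  # seconds; hypothetical aggregation interval

def bucket_of(timestamp, interval=INTERVAL):
    """Clock-aligned bucket: floor the timestamp to a multiple of the interval."""
    return int(timestamp // interval) * interval

# A timer tick at t=25 flushes the points collected during [15, 25] ...
points = [16.0, 19.9, 20.1, 24.5]
buckets = {bucket_of(t) for t in points}
# ... but those points land in two clock-aligned buckets (10.0 and 20.0),
# so one logical interval is emitted as two aggregated points per series.
```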


I've squashed the commits and rebased the PR on #59, so you should probably merge that one first.

@cypressious (Contributor Author):

@nblumhardt I think I've found a solution to the problem described above. Points within the timer interval, i.e. inside [now - interval, now], now always land in the same bucket. Remaining points are grouped into buckets as before.

I think this PR is ready for review.
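A hedged reconstruction of that fix (illustrative Python, not the PR's actual code; names are hypothetical): points inside the current tick window share one bucket keyed to the tick, while stragglers fall back to clock-aligned buckets.

```python
def assign_bucket(timestamp, now, interval=10.0):
    """Points within the current tick window [now - interval, now] share
    a single bucket keyed to the tick; older or future points fall back
    to clock-aligned bucketing."""
    if now - interval <= timestamp <= now:
        return now  # one bucket per timer tick
    return int(timestamp // interval) * interval

# All points collected during the tick ending at t=25 share one bucket,
# avoiding the two-buckets-per-interval duplication described above.
now = 25.0
fresh = [16.0, 19.9, 20.1, 24.5]
assert {assign_bucket(t, now) for t in fresh} == {25.0}
```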

@cypressious (Contributor Author):

@nblumhardt did you have time to look at the PR?

@nblumhardt (Contributor):

Hi @cypressious - thanks for the follow-up and sorry about the slow response; my time for this project is currently low.

Just to clarify my last comment:

I'm not keen to move forward with it because I don't think we'll ultimately be able to build on a buffer-then-aggregate design without an eventual reset to avoid the allocations. Let me know if you're keen to dig into it further

By buffer-then-aggregate, I mean collecting up the PointData and then aggregating in memory with collection operations, rather than allocating a bucket per aggregation key and performing the aggregation incrementally in place. This can't always be done without overhead, but for counters/sums and gauges, the total memory required to hold them can potentially be just a value, not a collection.

By an eventual reset, I mean that if we don't accommodate this goal in the design from the start, we probably won't be able to add it in a future version without starting afresh.

I think to dig in further requires some discussion of the goals/design space etc., and probably needs another contributor or two to join the conversation to provide some perspective.

@cypressious (Contributor Author):

@nblumhardt Thanks for the reply. I misunderstood your last comment but I think I get your point now.

For sums, the bucket approach seems ideal.

Other kinds of aggregations can be done too. For averages, we could store the sum and the count to compute sum/count.

I'm still interested in implementing this feature. Do you have suggestions on which contributors to get involved?
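As a sketch of the sum/count idea for averages (illustrative Python, not the project's code): a running average needs only two numbers per metric key, never the full sample list.

```python
class RunningAverage:
    """Incremental mean: O(1) memory per metric key (a sum and a count)
    instead of retaining every sample until emit time."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    @property
    def mean(self):
        # sum / count, guarding against the empty case
        return self.total / self.count if self.count else 0.0

avg = RunningAverage()
for v in (2.0, 4.0, 6.0):
    avg.add(v)
```

The same pattern extends to min/max (keep one value) and, with a little more state, to variance, which is part of the appeal of per-key incremental aggregation.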

@nblumhardt (Contributor):

👍 I'll put the call out over on the original (#9) ticket. Cheers!
