This repository has been archived by the owner on Jul 18, 2023. It is now read-only.

Sampling support for counter metrics (i.e. aggregate values within a sampling interval) #9

Open
nblumhardt opened this issue Apr 19, 2016 · 6 comments

Comments

@nblumhardt
Contributor

If multiple Count() calls increment the same measurement within the sampling interval, they should be aggregated into a single value before sending.

@bnayae
Contributor

bnayae commented Mar 7, 2018

I'm considering contributing this functionality.
My thoughts are:

  • sampling is good for Time
  • aggregation is good for Increment
    Write and Measure may use non-numeric fields, so they are less likely to be aggregatable.

For both sampling and aggregation, only measurements that have the same tags should be grouped together.
Configuration can define IgnoreTag(string tag) in order to omit tags from the aggregation or sampling.
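
To make the grouping rule concrete, here is a minimal sketch of how increments might be summed per measurement and tag set. Everything in it is an assumption for illustration: `IncrementAggregator`, `Increment`, and `Drain` are hypothetical names, not part of the library, and `IgnoreTag` only mirrors the configuration idea above. The key building is deliberately naive (no escaping) and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: none of these types exist in the library.
public sealed class IncrementAggregator
{
    private readonly HashSet<string> _ignoredTags = new HashSet<string>();
    private readonly Dictionary<string, long> _totals = new Dictionary<string, long>();

    // Mirrors the proposed IgnoreTag(string tag) configuration call.
    public IncrementAggregator IgnoreTag(string tag)
    {
        _ignoredTags.Add(tag);
        return this;
    }

    // Adds value to the running total for this measurement and tag set.
    public void Increment(string measurement, long value, IReadOnlyDictionary<string, string> tags)
    {
        var key = BuildKey(measurement, tags);
        _totals.TryGetValue(key, out var current); // current is 0 when the key is new
        _totals[key] = current + value;
    }

    // Returns the totals collected so far and resets state for the next interval.
    public IReadOnlyDictionary<string, long> Drain()
    {
        var snapshot = new Dictionary<string, long>(_totals);
        _totals.Clear();
        return snapshot;
    }

    // Builds a grouping key from the measurement name and the non-ignored tags,
    // sorted by tag name so that tag order does not affect grouping.
    private string BuildKey(string measurement, IReadOnlyDictionary<string, string> tags)
    {
        var parts = tags
            .Where(t => !_ignoredTags.Contains(t.Key))
            .OrderBy(t => t.Key, StringComparer.Ordinal)
            .Select(t => t.Key + "=" + t.Value);
        return measurement + "," + string.Join(",", parts);
    }
}
```

With `request_id` ignored, two increments tagged `host=a` but carrying different `request_id` values would land in the same group, while an increment tagged `host=b` would be kept separate.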

SAMPLING:
can use different strategies:

  • Take a single event within a time frame: this can be fine for many short time frames, since the statistics of large numbers should make it meaningful.
  • Assign an aggregation function such as Min, Max, Mean, etc.; this can be more accurate, but less performant.
    The API can let the user define the strategy at configuration time.
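
The two strategies above could be expressed as interchangeable functions chosen at configuration time. The following is only a sketch under assumed names (`SamplingStrategies`, `IntervalSampler`, `Record`, and `Flush` are hypothetical). For simplicity it buffers every value even for the single-event strategy; a real single-event sampler would keep only one value, which is where its performance advantage would come from.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical strategies: each reduces the values seen in one interval to one value.
public static class SamplingStrategies
{
    public static readonly Func<IReadOnlyList<double>, double> First = values => values[0];
    public static readonly Func<IReadOnlyList<double>, double> Min = values => values.Min();
    public static readonly Func<IReadOnlyList<double>, double> Max = values => values.Max();
    public static readonly Func<IReadOnlyList<double>, double> Mean = values => values.Average();
}

// Hypothetical sampler for a single measurement and tag set. Not thread-safe.
public sealed class IntervalSampler
{
    private readonly Func<IReadOnlyList<double>, double> _strategy;
    private readonly List<double> _values = new List<double>();

    public IntervalSampler(Func<IReadOnlyList<double>, double> strategy)
    {
        _strategy = strategy ?? throw new ArgumentNullException(nameof(strategy));
    }

    // Records one observation (for example, an elapsed time in milliseconds).
    public void Record(double value) => _values.Add(value);

    // Ends the current interval: returns the sampled value, or null if nothing was recorded.
    public double? Flush()
    {
        if (_values.Count == 0) return null;
        var result = _strategy(_values);
        _values.Clear();
        return result;
    }
}
```

A caller would pick the strategy once, for example `new IntervalSampler(SamplingStrategies.Mean)`, and call `Flush()` at the end of each time frame.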

The aggregation and sampling client should be disposable in order to flush the data on shutdown.
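
A minimal sketch of the flush-on-dispose idea, assuming a hypothetical `BufferingClient` that hands pending points to a caller-supplied `send` delegate (neither is part of the library, and points are plain strings here only to keep the example short):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: buffers points and flushes whatever is left when disposed.
public sealed class BufferingClient : IDisposable
{
    private readonly Action<IReadOnlyList<string>> _send;
    private readonly List<string> _pending = new List<string>();
    private bool _disposed;

    public BufferingClient(Action<IReadOnlyList<string>> send)
    {
        _send = send ?? throw new ArgumentNullException(nameof(send));
    }

    public void Add(string point)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(BufferingClient));
        _pending.Add(point);
    }

    // Sends all pending points as one batch; does nothing when the buffer is empty.
    public void Flush()
    {
        if (_pending.Count == 0) return;
        _send(_pending.ToArray());
        _pending.Clear();
    }

    // Flushes remaining data exactly once, so nothing is lost on shutdown.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        Flush();
    }
}
```

Wrapping the client in a `using` block would then guarantee that the last partial interval is sent even if no timer fires before shutdown.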

In any case, this should be a separate assembly that extends the current functionality.
It should be consumed as a separate NuGet package.

@LordMike

This is related to #46, as I assume that if Increments were grouped together in the 100ns buckets that DateTime can provide, data wouldn't have been lost?

@bnayae
Contributor

bnayae commented Mar 19, 2018

Thanks, I will check the status of #46.

@cypressious
Contributor

Any updates on this feature? I'm transitioning to InfluxDB from a custom solution where we had something similar built in.

One note: the sampling duration should not be tied to the batching duration, because I might want to set the batching duration to a high value like 10 minutes while still having my increments summed into 1-minute buckets.
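
To illustrate keeping the two durations separate, here is a rough sketch in which increments are summed into buckets of the sampling interval, and a batching timer (not shown) drains all completed buckets whenever it fires. `TimeBuckets` and `BucketedCounter` are hypothetical names, not library types, and the sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public static class TimeBuckets
{
    // Rounds a timestamp down to the start of the bucket that contains it.
    public static DateTime Floor(DateTime timestamp, TimeSpan bucketSize)
    {
        if (bucketSize <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(bucketSize));
        var ticks = timestamp.Ticks - (timestamp.Ticks % bucketSize.Ticks);
        return new DateTime(ticks, timestamp.Kind);
    }
}

// Hypothetical counter: sums increments per sampling bucket, independent of batching.
public sealed class BucketedCounter
{
    private readonly TimeSpan _samplingInterval;
    private readonly SortedDictionary<DateTime, long> _buckets = new SortedDictionary<DateTime, long>();

    public BucketedCounter(TimeSpan samplingInterval)
    {
        _samplingInterval = samplingInterval;
    }

    public void Increment(DateTime timestamp, long value = 1)
    {
        var bucket = TimeBuckets.Floor(timestamp, _samplingInterval);
        _buckets.TryGetValue(bucket, out var current); // 0 when the bucket is new
        _buckets[bucket] = current + value;
    }

    // Intended to be called once per batching interval (for example, every 10 minutes);
    // returns one summed point per sampling bucket, ordered by bucket start time.
    public IReadOnlyList<KeyValuePair<DateTime, long>> Drain()
    {
        var result = new List<KeyValuePair<DateTime, long>>(_buckets);
        _buckets.Clear();
        return result;
    }
}
```

With a 1-minute sampling interval and a 10-minute batching interval, each drain would yield up to ten points per measurement, one per minute, rather than a single 10-minute total.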

I'm willing to contribute if @bnayae isn't planning to do it himself.

@cypressious
Contributor

I went ahead and drafted the feature in #55. Please take a look.

cypressious added a commit to GreenParrotGmbH/influxdb-csharp that referenced this issue May 8, 2018
cypressious added a commit to GreenParrotGmbH/influxdb-csharp that referenced this issue May 9, 2018
@nblumhardt
Contributor Author

Hi @influxdata/c-sharp /all,

@cypressious has been exploring this feature and is keen to work on an implementation 🎉

I'm not a great collaborator on this right now due to other commitments, but it is a substantial and challenging feature, so having other perspectives and help with it seems important.

Is anyone with experience in this library keen to help, or to shepherd the feature through via discussion/feedback/reviews?
