Add sum stat to basicstats aggregator #3797

cpacey · 2018-02-16T16:46:16Z

Implements #3467. sum is not included by default to maintain backwards compatibility.

Required for all PRs:

Signed CLA.
Associated README.md updated.
Has appropriate unit tests.


cpacey · 2018-02-16T16:47:37Z

plugins/aggregators/basicstats/basicstats.go

@@ -146,6 +147,9 @@ func (m *BasicStats) Push(acc telegraf.Accumulator) {
 			if config.mean {
 				fields[k+"_mean"] = v.mean
 			}
+			if config.sum {
+				fields[k+"_sum"] = (v.mean * v.count)


This seemed the simplest way to implement it, and the cheapest when sum is left disabled for backwards compatibility.

cpacey · 2018-02-16T16:48:33Z

plugins/aggregators/basicstats/basicstats_test.go

+// stdev, and s2.  We purposely exclude sum for backwards compatibility,
+// otherwise users' working systems will suddenly (and surprisingly) start
+// capturing sum without their input.
+func TestBasicStatsWithDefaultStats(t *testing.T) {


This might be overkill, but I like the idea of verifying "default configuration" separately from verifying "aggregations are working".
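As an illustration of what "default configuration" means here, the following is a minimal sketch and not the plugin's actual code. The `statConfig` type and `configuredStats` function are hypothetical names, and the default set (count, min, max, mean, s2, stdev) is an assumption based on the stats named in this thread. The point is only that sum stays off unless it is requested by name.

```go
package main

import "fmt"

// statConfig is a hypothetical stand-in for the aggregator's per-stat switches.
type statConfig struct {
	count, min, max, mean, sum, s2, stdev bool
}

// configuredStats returns the assumed defaults when no stats are listed.
// sum is deliberately absent from the defaults and must be requested by name.
func configuredStats(names []string) statConfig {
	if names == nil {
		return statConfig{count: true, min: true, max: true, mean: true, s2: true, stdev: true}
	}
	var c statConfig
	for _, n := range names {
		switch n {
		case "count":
			c.count = true
		case "min":
			c.min = true
		case "max":
			c.max = true
		case "mean":
			c.mean = true
		case "sum":
			c.sum = true
		case "s2":
			c.s2 = true
		case "stdev":
			c.stdev = true
		}
		// Unknown names are silently ignored in this sketch.
	}
	return c
}

func main() {
	fmt.Printf("defaults: %+v\n", configuredStats(nil))
	fmt.Printf("explicit: %+v\n", configuredStats([]string{"mean", "sum"}))
}
```

Checking the defaults in their own test keeps a later change to the default set from hiding behind the aggregation tests.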

JeffAshton · 2018-02-21T21:41:07Z

Hey @danielnelson, I would love to use this feature to help simplify my continuous queries. Do you have time to take a look?

danielnelson · 2018-02-21T22:16:17Z

One thing I'm worried about is that we might see some weird floating point artifacts that wouldn't appear if we kept a running sum:

cpu value=1 1519251291871635057
cpu value=1 1519251291871803032
cpu value=1 1519251291871857634
cpu value=1 1519251291871884218
cpu value=2 1519251291871907024
cpu value=1 1519251291871931726
cpu value_sum=6.999999999999999 1519251294000000000
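To make the concern concrete, here is a minimal sketch (not the plugin's code) that feeds the six values above through an incrementally updated mean and through a running sum. The function names are hypothetical; the mean update uses the `mean + delta/n` form discussed in this thread. The product of mean and count may differ from 7 in the last bits, while a running sum of small integers is exact.

```go
package main

import "fmt"

// sumFromMean updates the mean incrementally (mean += delta/n) and then
// reconstructs the sum as mean * count, as the first revision of this PR did.
func sumFromMean(values []float64) float64 {
	var mean, count float64
	for _, v := range values {
		count++
		mean += (v - mean) / count
	}
	return mean * count
}

// runningSum accumulates the sum directly, with no divide involved.
func runningSum(values []float64) float64 {
	var sum float64
	for _, v := range values {
		sum += v
	}
	return sum
}

func main() {
	values := []float64{1, 1, 1, 1, 2, 1}
	fmt.Println("mean*count: ", sumFromMean(values))
	fmt.Println("running sum:", runningSum(values))
}
```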

cpacey · 2018-02-21T22:26:05Z

@danielnelson I'll switch to tracking sum as we go. Would you like to keep tracking mean as it's done now, or would you want to switch to calculating the mean based on sum and count? The former simplifies the code change and reduces the likelihood of changing previous behaviour; the latter simplifies the code and reduces calculation cost.

danielnelson · 2018-02-21T22:31:10Z

Let's switch to calculating the mean based on sum and count.

cpacey · 2018-02-22T16:52:15Z

@danielnelson I've switched to keeping a running sum (and added a test for the floating-point problem you were concerned about). However, when I tried to remove tracking the mean, the naive approach added another floating-point divide when calculating variance (example here). Do we care about a potential performance impact of this? I haven't measured anything yet - I wanted to get a sense of whether you'd be concerned before investing time.

If you are concerned, the options I see are to continue tracking the mean, or to rearrange the variance equations to avoid the extra floating-point divide at the cost of an extra floating-point multiplication (example here).

danielnelson · 2018-02-25T16:34:17Z

In the first example, doesn't it equal out since we can remove this divide?

mean = mean + delta/n

cpacey · 2018-02-26T14:22:21Z

Not without changing the stdev calculation, which I'm not excited to do.

The online variance algorithm produces its estimate by using both the current mean and the previous mean. In the current code, the previous mean is tracked, and the new mean is calculated with a single divide (mean + delta/n). If we stop tracking the mean, then we need to compute both the previous mean and the current mean, which (naively) requires 2 divides: either oldSum / oldCount and newSum / newCount; or oldSum / oldCount and oldMean + delta/n.

(Note that I'd be very happy to be wrong about this.)

cpacey · 2018-03-01T14:58:01Z

@danielnelson, could we keep tracking mean and merge as-is?

Add sum stat to basicstats aggregator

eb89ae5

Is not included by default to maintain backwards compatibility.


Keep a running sum

ae83e44

cpacey force-pushed the basicstats_sum branch from bbc2930 to ae83e44 February 22, 2018 16:37

danielnelson added this to the 1.6.0 milestone Mar 5, 2018

danielnelson added the feat label Mar 5, 2018

danielnelson merged commit 0a37386 into influxdata:master Mar 5, 2018

danielnelson mentioned this pull request Mar 5, 2018

Add "sum" to basicstats aggregator #3467

Closed

maxunt pushed a commit that referenced this pull request Jun 26, 2018

Add sum stat to basicstats aggregator (#3797)

54f4a5a
