[query] Add quantile_over_time #1367

Merged
merged 7 commits into from
Feb 12, 2019

Conversation

benraskin92
Collaborator

No description provided.

codecov bot commented Feb 11, 2019

Codecov Report

Merging #1367 into master will decrease coverage by 7.5%.
The diff coverage is 2%.

@@           Coverage Diff            @@
##           master   #1367     +/-   ##
========================================
- Coverage    70.6%     63%   -7.6%     
========================================
  Files         823     817      -6     
  Lines       70369   70004    -365     
========================================
- Hits        49692   44169   -5523     
- Misses      17455   22797   +5342     
+ Partials     3222    3038    -184
Flag Coverage Δ
#aggregator 69.2% <ø> (-13.1%) ⬇️
#cluster 67.7% <ø> (-18.2%) ⬇️
#collector 47.9% <ø> (-15.8%) ⬇️
#dbnode 79.9% <ø> (-1%) ⬇️
#m3em 66.7% <ø> (-6.5%) ⬇️
#m3ninx 70.8% <ø> (-3.5%) ⬇️
#m3nsch 28.4% <ø> (-22.8%) ⬇️
#metrics 17.6% <ø> (ø) ⬆️
#msg 75.2% <ø> (+0.1%) ⬆️
#query 45.7% <2%> (-18.8%) ⬇️
#x 71.5% <ø> (-4.6%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4f3fd6e...e0532c9.

codecov bot commented Feb 11, 2019

Codecov Report

Merging #1367 into master will increase coverage by <.1%.
The diff coverage is 78.1%.

@@           Coverage Diff            @@
##           master   #1367     +/-   ##
========================================
+ Coverage    70.6%   70.6%   +<.1%     
========================================
  Files         823     823             
  Lines       70369   70421     +52     
========================================
+ Hits        49692   49746     +54     
+ Misses      17455   17448      -7     
- Partials     3222    3227      +5
Flag Coverage Δ
#aggregator 82.3% <ø> (ø) ⬆️
#cluster 85.9% <ø> (ø) ⬆️
#collector 63.7% <ø> (ø) ⬆️
#dbnode 80.9% <ø> (ø) ⬆️
#m3em 73.2% <ø> (ø) ⬆️
#m3ninx 74.1% <ø> (-0.1%) ⬇️
#m3nsch 51.1% <ø> (ø) ⬆️
#metrics 17.6% <ø> (ø) ⬆️
#msg 75% <ø> (-0.1%) ⬇️
#query 64.5% <78.1%> (ø) ⬆️
#x 76% <ø> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4f3fd6e...75a1262.

benraskin92 changed the title from "[WIP][DON'T REVIEW] Add quantile_over_time" to "Add quantile_over_time" Feb 11, 2019
benraskin92 changed the title from "Add quantile_over_time" to "[query] Add quantile_over_time" Feb 11, 2019
@@ -85,28 +92,42 @@ func NewAggOp(args []interface{}, optype string) (transform.Params, error) {
aggFunc: aggregationFunc,
}

if optype == QuantileType {
if len(args) != 2 {
Collaborator

This is pretty messy; do a similar approach to what's there for holt winters instead

@@ -204,7 +204,7 @@ func NewFunctionExpr(

case temporal.AvgType, temporal.CountType, temporal.MinType,
temporal.MaxType, temporal.SumType, temporal.StdDevType,
- temporal.StdVarType:
+ temporal.StdVarType, temporal.QuantileType:
Collaborator

Add to parse test

op baseOp
controller *transform.Controller
aggFunc func([]float64, float64) float64
quantileScalar float64
Collaborator

Do we have to add this quantileScalar at the generic agg func layer? Doesn't seem scalable given how many different types of agg funcs we can have. Can we just do the args checking and parsing once we get to the functions that need them?

}

- func avgOverTime(values []float64) float64 {
+ func avgOverTime(values []float64, _ float64) float64 {
Collaborator

Not great that we modify all the func signatures for a var that isn't used in most of these functions. If it needs to be this way, can we wrap it in an options or something at least so the function signature doesn't have to ever change again? (But only if you have to pass this all the way down, see comment above).

Collaborator

Or consider putting this in some sort of context object that you propagate instead of a specific options for agg functions.

Contributor

+1. Would it make sense to pass in scalar as a constructor argument to the quantile aggFunc? That way we separate out the "configuration" of the aggFunc from its actual execution. A note on how that could work:

  1. We can switch aggFunc to be an interface (functions can implement interfaces, so this is actually less work than it seems), to allow for quantile to hold configuration (scalar)
  2. Pass scalar to quantile when we initialize the node instead (threading through NewAggOp and temporal.baseOp.Node)

@@ -192,3 +213,60 @@ func sumAndCount(values []float64) (float64, float64) {

return sum, count
}

func quantileOverTime(values []float64, scalar float64) float64 {
valuesSlice := make(valsSlice, 0, len(values))
Contributor

Add a comment about why copying the array is necessary?

return func(values []float64) float64 {
valuesSlice := make(valsSlice, 0, len(values))
for _, v := range values {
valuesSlice = append(values, v)
Collaborator

What is this loop trying to achieve? This seems to be overwriting valuesSlice on each iteration with the whole of values + the current value from values. You end up with just the values slice + the very last value in values in valuesSlice. Is that what you want?

Contributor

+1--looks like it should be valuesSlice? You could probably also use copy for this (assuming that's the goal)
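
To make the problem described in this thread concrete, here is a small standalone sketch. `buggyCopy` reproduces the loop under review using a plain `[]float64` instead of `valsSlice`, and `copyValues` is a hypothetical helper showing the `copy`-based alternative suggested above.

```go
package main

import "fmt"

// buggyCopy mirrors the loop under review: each iteration appends to
// values rather than dst, so dst is overwritten every time and ends up
// as all of values plus the last element once more.
func buggyCopy(values []float64) []float64 {
	dst := make([]float64, 0, len(values))
	for _, v := range values {
		dst = append(values, v)
	}
	return dst
}

// copyValues returns an independent copy of values using the built-in copy.
func copyValues(values []float64) []float64 {
	dst := make([]float64, len(values))
	copy(dst, values)
	return dst
}

func main() {
	in := []float64{3, 1, 2}
	fmt.Println(buggyCopy(in))  // [3 1 2 2]
	fmt.Println(copyValues(in)) // [3 1 2]
}
```

The copy matters because sorting happens in place; sorting the caller's slice directly would reorder the window that other code may still read.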

if operatorType != QuantileType {
durArg = args[0]
} else {
durArg = args[1]
Collaborator

I know you check outside of this function, but is it worth checking len(args) in this func just in case? There's some check above for len(args) != 1, but does that cover everything?

Collaborator Author

Eh I think it's fine (we do the same thing in the Holt-Winters function) + newBaseOp() is a private function

Contributor

Just expose a new function NewQuantileOp? Spreading this kind of casing across multiple functions confuses the code imo.

@@ -78,6 +82,30 @@ func (a aggProcessor) Init(op baseOp, controller *transform.Controller, opts tra
}
}

// NewQuantileOp create a new base temporal transform for quantile_over_time func
Collaborator

nit: . at end of comment

@@ -78,6 +82,30 @@ func (a aggProcessor) Init(op baseOp, controller *transform.Controller, opts tra
}
}

// NewQuantileOp create a new base temporal transform for quantile_over_time func
func NewQuantileOp(args []interface{}, optype string) (transform.Params, error) {
if optype == QuantileType {
Collaborator

Flip the conditional to reduce nesting


scalar, ok := args[0].(float64)
if !ok {
return emptyOp, fmt.Errorf("unable to cast to scalar argument: %v for %s", args[1], QuantileType)
Collaborator

This prints args[1] but you're checking args[0]; do we ever use args[1]? Does it need two args?

if len(values) == 0 {
return math.NaN()
}
if q < 0 {
Collaborator

nit: newlines

// If 'values' has zero elements, NaN is returned.
// If q<0, -Inf is returned.
// If q>1, +Inf is returned.
func quantile(q float64, values valsSlice) float64 {
Collaborator

Nit: keep same name for the q variable throughout rather than having it be scalar when you parse it

return emptyOp, fmt.Errorf("unable to cast to scalar argument: %v for %s", args[1], QuantileType)
}

aggregationFunc := makeQuantileOverTimeFn(scalar)
Collaborator

You can add an optimization here if scalar < 0 || scalar > 1 by instead having a function that looks like:

func makeInvalidQuantileFn(inf float64) aggFunc {
 return func(_ []float64) float64 {
  return inf
 }
}

And passing the correctly signed infinity in

@@ -49,15 +49,27 @@ type baseOp struct {
// skipping lint check for a single operator type since we will be adding more
// nolint : unparam
func newBaseOp(args []interface{}, operatorType string, processorFn MakeProcessor) (baseOp, error) {
- if operatorType != HoltWintersType && operatorType != PredictLinearType {
+ if operatorType != HoltWintersType && operatorType != PredictLinearType && operatorType != QuantileType {
Contributor

Seconding +arnikola's earlier comment here; why not handle this in the same way that we do HoltWinters and PredictLinear, and handle this casing in NewFunctionExpr? Handling the argument extraction in 2 places is unnecessarily confusing imo.

name: "quantile_over_time",
opType: QuantileType,
afterBlockOne: [][]float64{
{math.NaN(), math.NaN(), math.NaN(), math.NaN(), math.NaN()},
Contributor

Do we have some tests that don't include NaNs? Might be a useful simple case to cover.

Collaborator Author

Yeah we should add tests for all of the temporal functions with some NaNs - I'll do this in a separate PR.

@@ -78,6 +82,30 @@ func (a aggProcessor) Init(op baseOp, controller *transform.Controller, opts tra
}
}

// NewQuantileOp create a new base temporal transform for quantile_over_time func
func NewQuantileOp(args []interface{}, optype string) (transform.Params, error) {
if optype == QuantileType {
Contributor

If this is creating a QuantileOp, why does it need optype as input? Anything other than QuantileType is an error, so it doesn't seem worth having the argument (fwiw, HoltWinters doesn't take one).

@@ -192,3 +220,62 @@ func sumAndCount(values []float64) (float64, float64) {

return sum, count
}

func makeQuantileOverTimeFn(scalar float64) aggFunc {
Contributor

Thanks for this refactor, much cleaner this way.

if q > 1 {
return math.Inf(+1)
}
sort.Sort(values)
Collaborator

There can be a problem with NaNs actually; we don't want them to pollute the output since they don't exist in Prom's implementation. Instead of just doing a sort here, it would be better to put them through a sanitize function that will remove NaNs from the values, then sort.

On the bright side, this means we can use the built-in float64 slice sorter, and not have to worry about the valsSlice type

benraskin92 force-pushed the braskin/quantile_over_time branch from 714c745 to bad88d2 February 12, 2019 21:41

return newBaseOp(duration, QuantileType, a)
}

// NewAggOp creates a new base temporal transform with a specified node.
func NewAggOp(args []interface{}, optype string) (transform.Params, error) {
if aggregationFunc, ok := aggFuncs[optype]; ok {
Contributor

Nit: flip the conditional, since the if is a lot larger now.

return emptyOp, fmt.Errorf("unable to cast to scalar argument: %v for %s", args[0], operatorType)
}

func newBaseOp(duration time.Duration, operatorType string, processorFn MakeProcessor) (baseOp, error) {
Collaborator

👍


duration, ok := args[1].(time.Duration)
if !ok {
return emptyOp, fmt.Errorf("unable to cast to scalar argument: %v for %s", args[1], QuantileType)
Collaborator

Can we push all of these parsing functions up into promql/types.go, similar to how this does it?

A small gotcha is that any numeric expression is valid in Prom, i.e. quantile_over_time( (1*2)/3, up[5m] ) is a valid query but would fail here and in many of the other temporal functions.

Collaborator Author

Really don't wanna change this anymore. Also this query works in ours:

quantile_over_time((1*2)/3, service_writeTaggedBatchRaw_latency{instance=~"$instance",quantile="0.99"}[1m])

benraskin92 merged commit 7450997 into master Feb 12, 2019
justinjc deleted the braskin/quantile_over_time branch June 17, 2019 21:56
4 participants