Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sum values in unwrapped rate aggregation instead of treating them as counter #6361

Merged
merged 4 commits into from
Jun 16, 2022

Conversation

chaudum
Copy link
Contributor

@chaudum chaudum commented Jun 10, 2022

What this PR does / why we need it:

This PR implements the first part of the RFC described in #6351

It reverts rate() to its previous implementation prior to #5013 That means it calculates the per-second rate from the sum of all extracted values.

Which issue(s) this PR fixes:

#6351

Checklist

  • Documentation added
  • Tests updated
  • Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
-        distributor	-0.3%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
-              logql	-0.2%
+               loki	0%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

-           ingester	-0.1%
-        distributor	-0.3%
+            querier	0%
+ querier/queryrange	0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
-              logql	-0.2%
+               loki	0%

@chaudum chaudum marked this pull request as ready for review June 10, 2022 08:17
@chaudum chaudum requested review from KMiller-Grafana and a team as code owners June 10, 2022 08:17
@chaudum
Copy link
Contributor Author

chaudum commented Jun 10, 2022

@liguozhong Wanted to give you a heads up because it probably affects you.

@liguozhong
Copy link
Contributor

liguozhong commented Jun 10, 2022

@liguozhong Wanted to give you a heads up because it probably affects you.

Thanks , I got it

Copy link
Contributor

@DylanGuedes DylanGuedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits.

Btw, do you mind renaming the PR title to something like what you have in the CHANGELOG? In the changelog it is described as "Sum values in unwrapped rate aggregation instead of treating them as counter".

CHANGELOG.md Outdated Show resolved Hide resolved
docs/sources/upgrading/_index.md Outdated Show resolved Hide resolved
@chaudum
Copy link
Contributor Author

chaudum commented Jun 13, 2022

Few nits.

Btw, do you mind renaming the PR title to something like what you have in the CHANGELOG? In the changelog it is described as "Sum values in unwrapped rate aggregation instead of treating them as counter".

Let me make two separate pull requests. Then the PR title also matches the changes.

@chaudum chaudum changed the title Revert implementation of unwrapped rate aggregation (RFC #6351) Sum values in unwrapped rate aggregation instead of treating them as counter Jun 13, 2022
@chaudum chaudum requested a review from DylanGuedes June 13, 2022 06:23
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jun 13, 2022
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
-              logql	-0.2%
+               loki	0%

Copy link
Contributor

@DylanGuedes DylanGuedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! (the jsonnet failure isn't related to your change)

This PR reverts the implementation done in #5013 to the original
implementation that sums the extracted values from the log lines instead
of treating them like a Prometheus counter metric.

Signed-off-by: Christian Haudum <[email protected]>
Signed-off-by: Christian Haudum <[email protected]>
Signed-off-by: Christian Haudum <[email protected]>
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
-        distributor	-0.3%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
-              logql	-0.2%
+               loki	0%

Copy link
Collaborator

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, I don't fully understand this PR. for one, why are all the values we assert against in the tests changing? does this actually change computation behavior? was the previous implementation incorrect?

CHANGELOG.md Outdated Show resolved Hide resolved
Signed-off-by: Christian Haudum <[email protected]>
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
-              logql	-0.2%
+               loki	0%

Comment on lines +63 to +65
// SUM(n=47, 61, 1) = 15
// 15 / 30 = 0.5
promql.Vector{promql.Sample{Point: promql.Point{T: 60 * 1000, V: 0.5}, Metric: labels.Labels{labels.Label{Name: "app", Value: "foo"}}}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having some difficulty understanding these tests 😕
What does the SUM(n=47, 61, 1) mean? I assume this is the result of the newSeries method on the data property, but don't understand how these values were calculated 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the SUM(n=47, 61, 1): it just means the sum of the values from item 47 to item 61 where the value is a constant of 1, like https://www.wolframalpha.com/input?i2d=true&i=Sum%5B1%2C%7Bn%2C47%2C61%7D%5D

2022-06-14_21-50

Regarding, why 47, 61, and 1, we have to look at the output of the newSeries() function: It creates a stream {app="foo"} with 300 samples starting at timestamp 46e9 (46s) and ending at timestamp 345e9 (345s). The sample value is a constant 1.

So the query now looks at the time range from ts=30s to ts=60s, where the lower bound is not included. There are 15 items (item 47 to item 61) from the generated series that matches.

Hope this helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! 🙏 This helps a lot.
Would it be possible to add this information in the comments to make the tests easier to understand? e.g.

// create a stream {app="foo"} with 300 samples starting at 46s and ending at 345s with a constant value of 1
[][]logproto.Series{
  {newSeries(testSize, offset(46, constantValue(1)), `{app="foo"}`)},
},
// query between the time range from ts=30s and ts=60s where the lower bound is not included
[]SelectSampleParams{
  {&logproto.SampleQueryRequest{Start: time.Unix(30, 0), End: time.Unix(60, 0), Selector: `rate({app="foo"} | unwrap foo[30s])`}},
},
// SUM(n=47, 61, 1) = 15 - there are 15 samples (from 47 to 61) matched from the generated series
// 15 / 30 = 0.5
promql.Vector{promql.Sample{Point: promql.Point{T: 60 * 1000, V: 0.5}, Metric: labels.Labels{labels.Label{Name: "app", Value: "foo"}}}},

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@KMiller-Grafana KMiller-Grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs in this PR look good to me.

@owen-d owen-d merged commit b315ed0 into main Jun 16, 2022
@owen-d owen-d deleted the chaudum/rfc-6351 branch June 16, 2022 13:44
@ssncferreira ssncferreira added the backport release-2.6.x Tag a PR with this label to create a PR which cherry pics it into the release-2.6.x branch label Jun 30, 2022
grafanabot pushed a commit that referenced this pull request Jun 30, 2022
…counter (#6361)

* Revert unwrapped rate aggregation to previous implementation

This PR reverts the implementation done in #5013 to the original
implementation that sums the extracted values from the log lines instead
of treating them like a Prometheus counter metric.

Signed-off-by: Christian Haudum <[email protected]>

* Move changelog entry

Signed-off-by: Christian Haudum <[email protected]>

* Remove unused/dead code

Signed-off-by: Christian Haudum <[email protected]>

* Clean changelog

Signed-off-by: Christian Haudum <[email protected]>
(cherry picked from commit b315ed0)
ssncferreira pushed a commit that referenced this pull request Jun 30, 2022
…counter (#6361) (#6555)

* Revert unwrapped rate aggregation to previous implementation

This PR reverts the implementation done in #5013 to the original
implementation that sums the extracted values from the log lines instead
of treating them like a Prometheus counter metric.

Signed-off-by: Christian Haudum <[email protected]>

* Move changelog entry

Signed-off-by: Christian Haudum <[email protected]>

* Remove unused/dead code

Signed-off-by: Christian Haudum <[email protected]>

* Clean changelog

Signed-off-by: Christian Haudum <[email protected]>
(cherry picked from commit b315ed0)

Co-authored-by: Christian Haudum <[email protected]>
@osg-grafana osg-grafana added type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories and removed area/docs labels Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport release-2.6.x Tag a PR with this label to create a PR which cherry pics it into the release-2.6.x branch size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants