Improve metric queries by computing samples at the edges. #2293

cyriltovena · 2020-07-03T19:27:15Z

This PR pushes metric extraction and transformation to the edges (ingester and storage) instead of doing it in the engine. This lets us create samples without allocating a string from the line buffer while decompressing, which drastically reduces memory allocations and speeds up metric queries.
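
To illustrate the idea, here is a minimal sketch of an extractor that works directly on the decompressed byte buffer. The `Sample` type and `countExtractor` helper are hypothetical names made up for this example and are not the types used in the PR.

```go
package main

import (
	"bytes"
	"fmt"
)

// Sample is a hypothetical (timestamp, value) pair produced at the edge.
type Sample struct {
	Timestamp int64
	Value     float64
}

// countExtractor is a hypothetical extractor: it emits the value 1 for every
// line containing filter, reading the byte buffer in place.
func countExtractor(filter []byte) func(ts int64, line []byte) (Sample, bool) {
	return func(ts int64, line []byte) (Sample, bool) {
		// bytes.Contains works on []byte, so no string(line) conversion
		// (and no per-line allocation) is needed.
		if !bytes.Contains(line, filter) {
			return Sample{}, false
		}
		return Sample{Timestamp: ts, Value: 1}, true
	}
}

func main() {
	extract := countExtractor([]byte("error"))
	// buf stands in for a decompression buffer that is reused between lines.
	buf := []byte(`level=error msg="boom"`)
	if s, ok := extract(1, buf); ok {
		fmt.Println(s.Timestamp, s.Value)
	}
}
```

Because the extractor never keeps a reference to `line`, the caller is free to reuse the buffer for the next entry.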

I have observed a 2x improvement for all metric queries. Deduping of log lines now uses a hash of the line instead of its content. I'm using xxhash, which has shown very good performance and a low chance of collisions; see https://github.com/Cyan4973/xxHash.
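
A rough sketch of hash-based deduplication follows. To keep it dependency-free it uses FNV-1a from the Go standard library as a stand-in for xxhash; the `sample` type and the `hashLine` and `dedupe` helpers are hypothetical and only illustrate the approach.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashLine stands in for the xxhash call mentioned above. FNV-1a is used
// only so that this sketch compiles with the standard library alone.
func hashLine(line []byte) uint64 {
	h := fnv.New64a()
	h.Write(line) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

// sample is a hypothetical type that carries the line hash, not the line.
type sample struct {
	ts    int64
	value float64
	hash  uint64
}

// dedupe drops samples sharing both timestamp and line hash, keeping the
// first occurrence and the original order.
func dedupe(in []sample) []sample {
	type key struct {
		ts   int64
		hash uint64
	}
	seen := make(map[key]struct{}, len(in))
	out := make([]sample, 0, len(in))
	for _, s := range in {
		k := key{s.ts, s.hash}
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, s)
	}
	return out
}

func main() {
	// The same line received from two replicas hashes to the same value.
	a := sample{ts: 1, value: 1, hash: hashLine([]byte("foo"))}
	b := sample{ts: 1, value: 1, hash: hashLine([]byte("foo"))}
	c := sample{ts: 1, value: 1, hash: hashLine([]byte("bar"))}
	fmt.Println(len(dedupe([]sample{a, b, c})))
}
```

Comparing two 64-bit integers is cheaper than comparing two lines, and the sample no longer has to hold on to the line content.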

Another interesting change: the store now implements both chunk.Store and logql.Querier, which makes it easier to use with the LogQL engine.
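
As a sketch of what "one type satisfying both interfaces" looks like in Go, the example below uses two simplified, hypothetical interfaces (`ChunkStore` and `Querier`). They are stand-ins invented for this example and do not mirror the real method sets of chunk.Store or logql.Querier.

```go
package main

import "fmt"

// ChunkStore and Querier are simplified, hypothetical stand-ins.
type ChunkStore interface {
	Put(chunkID string, data []byte) error
}

type Querier interface {
	SelectSamples(query string) ([]float64, error)
}

// store satisfies both interfaces, so it can be handed to code that only
// knows about one of them.
type store struct {
	chunks map[string][]byte
}

func (s *store) Put(chunkID string, data []byte) error {
	s.chunks[chunkID] = data
	return nil
}

func (s *store) SelectSamples(query string) ([]float64, error) {
	// Placeholder logic: one sample per stored chunk, ignoring the query.
	out := make([]float64, 0, len(s.chunks))
	for range s.chunks {
		out = append(out, 1)
	}
	return out, nil
}

// Compile-time checks that *store implements both interfaces.
var (
	_ ChunkStore = (*store)(nil)
	_ Querier    = (*store)(nil)
)

// countSamples plays the role of an engine that depends on Querier only.
func countSamples(q Querier, query string) (int, error) {
	samples, err := q.SelectSamples(query)
	return len(samples), err
}

func main() {
	s := &store{chunks: map[string][]byte{}}
	_ = s.Put("chunk-1", []byte("data"))
	n, _ := countSamples(s, `rate({app="foo"}[1m])`)
	fmt.Println(n)
}
```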

The PR is big, as it splits the whole code base in two.

Next possible steps:

Introduce a Seek function in the SampleIterator, allowing the range vector iterator to skip through lazy chunks and blocks. This matters especially for queries where the range is smaller than the step, e.g. rate({app="foo"}[1m]) with a step of 5m.
Try to merge the duplicated logic. I intentionally did not push this too far, to avoid refactoring what already existed, but some complex logic has been reused, such as the batch iterator.
There are some places where we could reuse slices, especially in caches, but this PR is big enough already.
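
The Seek idea from the first step could look roughly like the sketch below. It runs over an in-memory, timestamp-sorted slice; `sliceIterator`, `Seek` and `countPerStep` are hypothetical names for this example, and a real implementation would instead have to skip lazy chunks and blocks.

```go
package main

import (
	"fmt"
	"sort"
)

// Sample is a hypothetical (timestamp, value) pair.
type Sample struct {
	Timestamp int64
	Value     float64
}

// sliceIterator walks samples sorted by ascending Timestamp.
type sliceIterator struct {
	samples []Sample
	idx     int // index of the current sample; -1 before the first call
}

func newSliceIterator(s []Sample) *sliceIterator {
	return &sliceIterator{samples: s, idx: -1}
}

func (it *sliceIterator) Next() bool {
	if it.idx+1 >= len(it.samples) {
		it.idx = len(it.samples)
		return false
	}
	it.idx++
	return true
}

// Sample must only be called after Next or Seek returned true.
func (it *sliceIterator) Sample() Sample { return it.samples[it.idx] }

// Seek moves to the first sample at or after the current position whose
// Timestamp is >= ts. It never moves backwards.
func (it *sliceIterator) Seek(ts int64) bool {
	from := it.idx
	if from < 0 {
		from = 0
	}
	rest := it.samples[from:]
	i := sort.Search(len(rest), func(i int) bool { return rest[i].Timestamp >= ts })
	it.idx = from + i
	return it.idx < len(it.samples)
}

// countPerStep counts the samples in each window (t-rng, t], jumping over
// the gap between windows with Seek instead of calling Next repeatedly.
func countPerStep(it *sliceIterator, start, end, step, rng int64) []int {
	var counts []int
	for t := start; t <= end; t += step {
		n := 0
		for ok := it.Seek(t - rng + 1); ok && it.Sample().Timestamp <= t; ok = it.Next() {
			n++
		}
		counts = append(counts, n)
	}
	return counts
}

func main() {
	// One sample every 10s for 10 minutes; a 1m range evaluated every 5m.
	var samples []Sample
	for ts := int64(0); ts <= 600; ts += 10 {
		samples = append(samples, Sample{Timestamp: ts, Value: 1})
	}
	fmt.Println(countPerStep(newSliceIterator(samples), 300, 600, 300, 60))
}
```

With a 1m range and a 5m step, four fifths of the samples fall outside every window, which is the portion Seek lets the iterator skip.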

When deploying this change, ingesters should be fully rolled out first, as it introduces a new gRPC service for requesting samples from ingesters.

I really wanted to get this change in before we introduce LogQL v2; I believe it is easier now. Again, I'm sorry for the big PR.


codecov-commenter · 2020-07-03T20:04:12Z

Codecov Report

Merging #2293 into master will increase coverage by 0.26%.
The diff coverage is 70.58%.

@@            Coverage Diff             @@
##           master    #2293      +/-   ##
==========================================
+ Coverage   62.29%   62.55%   +0.26%     
==========================================
  Files         158      159       +1     
  Lines       12766    13361     +595     
==========================================
+ Hits         7952     8358     +406     
- Misses       4201     4363     +162     
- Partials      613      640      +27

Impacted Files	Coverage Δ
pkg/chunkenc/dumb_chunk.go	`0.00% <0.00%> (ø)`
pkg/chunkenc/interface.go	`87.50% <ø> (ø)`
pkg/ingester/stream.go	`73.80% <0.00%> (-3.70%)`	⬇️
pkg/iter/entry_iterator.go	`67.97% <0.00%> (ø)`
pkg/logcli/query/query.go	`0.00% <0.00%> (ø)`
pkg/logproto/types.go	`46.89% <ø> (ø)`
pkg/ingester/instance.go	`53.95% <5.12%> (-8.29%)`	⬇️
pkg/ingester/ingester.go	`51.20% <20.45%> (-8.45%)`	⬇️
pkg/querier/querier.go	`63.26% <28.57%> (-8.07%)`	⬇️
pkg/logql/sharding.go	`59.48% <33.33%> (-0.31%)`	⬇️
... and 15 more

This PR removes the mostcommon and sort insert functions from the heap iterator. I discovered while working on grafana#2293 that they are not actually helping, since we dedupe those lines anyway. There were no tests checking that deduping worked correctly, so I added some. As a bonus, deduping will run faster and the code is less complex. The only side effect concerns the order of entries that share the same timestamp: before, the most common entry would appear first; now we keep the order in which they were stored, which I think is better. I also changed the label ordering: whether we iterate forward or backward, we should keep the same alphabetical label ordering. I'm not sure why direction was altering this before. Signed-off-by: Cyril Tovena <[email protected]>

* Improve entry deduplication. Signed-off-by: Cyril Tovena <[email protected]>
* Improve heap iterator backward test. Signed-off-by: Cyril Tovena <[email protected]>

slim-bean

More great work @cyriltovena !! Great work with tests!

cyriltovena added 30 commits June 23, 2020 17:44

First pass breaking the code apart.

76712e8

Wondering how we're going to achieve fast mutation of labels. Signed-off-by: Cyril Tovena <[email protected]>

Work in progress.

a023059

I realize I need a hash for deduping lines. Going to benchmark some. Signed-off-by: Cyril Tovena <[email protected]>

Tested some hash and decided which one to use.

04a3204

Signed-off-by: Cyril Tovena <[email protected]>

Wip

92aa76f

Signed-off-by: Cyril Tovena <[email protected]>

Starting working on ingester.

4786b30

Signed-off-by: Cyril Tovena <[email protected]>

Trying to find a better hash function.

1b5d1de

Signed-off-by: Cyril Tovena <[email protected]>

More hash testing we have a winner. xxhash it is.

a4d1ca7

Signed-off-by: Cyril Tovena <[email protected]>

Settle on xxhash

0db0d2a

Signed-off-by: Cyril Tovena <[email protected]>

Merge remote-tracking branch 'upstream/master' into logql-edge-sample

600baa7

Signed-off-by: Cyril Tovena <[email protected]>

Better params interfacing.

265ea0f

Signed-off-by: Cyril Tovena <[email protected]>

Add interface for queryparams for things that exist in both type of p…

e22e6d1

…arams. Signed-off-by: Cyril Tovena <[email protected]>

Add storage sample iterator implementations.

683fbb5

Signed-off-by: Cyril Tovena <[email protected]>

Fixing tests and verifying we don't get collisions for the hashing method.

aa43f9a

Signed-off-by: Cyril Tovena <[email protected]>

Fixing ingesters tests and refactoring utility function/tests.

57ab4f6

Signed-off-by: Cyril Tovena <[email protected]>

Fixing and testing that stats are still well computed.

31d0e55

Signed-off-by: Cyril Tovena <[email protected]>

Fixing more tests.

fa5e456

Signed-off-by: Cyril Tovena <[email protected]>

More engine tests finished.

0507d6c

Signed-off-by: Cyril Tovena <[email protected]>

Fixes sharding evaluator.

b7cede9

Signed-off-by: Cyril Tovena <[email protected]>

Fixes more engine tests.

d58b056

Signed-off-by: Cyril Tovena <[email protected]>

Fix error tests in the engine.

415f46c

Signed-off-by: Cyril Tovena <[email protected]>

Finish fixing all tests.

add9c8c

Signed-off-by: Cyril Tovena <[email protected]>

Fixes a bug where extractor was not passed in correctly.

18529a0

Signed-off-by: Cyril Tovena <[email protected]>

Add notes about upgrade.

e1d3aa6

Signed-off-by: Cyril Tovena <[email protected]>

Renamed and fix a bug.

421a3eb

Signed-off-by: Cyril Tovena <[email protected]>

Add memchunk tests and starting test for sampleIterator.

7999b20

Signed-off-by: Cyril Tovena <[email protected]>

Test heap sample iterator.

9c0b656

Signed-off-by: Cyril Tovena <[email protected]>

working on test.

28575e9

Signed-off-by: Cyril Tovena <[email protected]>

Finishing testing all new iterators.

7abfe3b

Signed-off-by: Cyril Tovena <[email protected]>

Making sure all store functions are tested.

a5895a8

Signed-off-by: Cyril Tovena <[email protected]>

Benchmark and verify everything is working well.

ca442ab

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena added 5 commits July 3, 2020 14:48

Make the linter happy.

f3f326d

Signed-off-by: Cyril Tovena <[email protected]>

Merge remote-tracking branch 'upstream/master' into logql-edge-sample

4fa7ea7

Signed-off-by: Cyril Tovena <[email protected]>

use xxhash v2.

6caf45f

Signed-off-by: Cyril Tovena <[email protected]>

Fix a flaky test because of map.

583de7e

Signed-off-by: Cyril Tovena <[email protected]>

go.mod.

99c21f6

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena mentioned this pull request Jul 6, 2020

Improve entry deduplication. #2302

Merged

slim-bean approved these changes Jul 11, 2020

slim-bean added 2 commits July 11, 2020 16:30

Merge branch 'master' into logql-edge-sample

5eaa32b

# Conflicts: # pkg/ingester/instance.go # pkg/logql/series_extractor_test.go

Merge branch 'master' into logql-edge-sample

a01f903

slim-bean merged commit 0be64fc into grafana:master Jul 11, 2020
