
Uses custom json-iter decoder for log entries. #3163

Merged · 2 commits into grafana:master on Jan 19, 2021

Conversation

cyriltovena
Contributor

Previously we used json.Unmarshal for each line. json-iter keeps a pool for each call, and I believe this can increase memory usage: for each line we would put the iterator back in a pool to reuse it, and once pooled, the last data it read is retained. Since we handle millions of lines, this can cause problems. With a custom extension we keep using a pool, but only for the root object, not for each line.

On top of that, we now process the JSON payload about 50% faster.

```
❯ benchcmp  before.txt after.txt2
benchmark                          old ns/op     new ns/op     delta
Benchmark_DecodePushRequest-16     13509236      6677037       -50.57%
benchmark                          old allocs     new allocs     delta
Benchmark_DecodePushRequest-16     106149         38719          -63.52%
benchmark                          old bytes     new bytes     delta
Benchmark_DecodePushRequest-16     10350362      5222989       -49.54%
```

Signed-off-by: Cyril Tovena [email protected]

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated

@codecov-io

Codecov Report

Merging #3163 (1e6544e) into master (c9b85b3) will decrease coverage by 0.15%.
The diff coverage is 8.92%.


```
@@            Coverage Diff             @@
##           master    #3163      +/-   ##
==========================================
- Coverage   63.05%   62.90%   -0.16%     
==========================================
  Files         188      189       +1     
  Lines       16218    16249      +31     
==========================================
- Hits        10226    10221       -5     
- Misses       5051     5090      +39     
+ Partials      941      938       -3     
```
| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| pkg/distributor/http.go | 0.00% <ø> | (ø) |
| pkg/loghttp/query.go | 47.77% <ø> | (+6.43%) ⬆️ |
| pkg/logql/marshal/marshal.go | 75.00% <ø> | (ø) |
| pkg/loghttp/entry.go | 1.96% <1.96%> | (ø) |
| pkg/logql/unmarshal/unmarshal.go | 83.33% <66.66%> | (-5.56%) ⬇️ |
| cmd/querytee/response_comparator.go | 87.87% <100.00%> | (ø) |
| pkg/canary/comparator/comparator.go | 76.36% <0.00%> | (-1.82%) ⬇️ |
| pkg/logql/evaluator.go | 90.23% <0.00%> | (+0.35%) ⬆️ |
| pkg/querier/queryrange/limits.go | 95.83% <0.00%> | (+4.16%) ⬆️ |

@cyriltovena
Contributor Author

@mizeng You raised an issue, I believe, around JSON and distributor memory; I think this might greatly help, if not fix it.

Member

@owen-d owen-d left a comment


LGTM. Do you have benchmark stats for the encoding, too?

Contributor

@sandeepsukhani sandeepsukhani left a comment


LGTM

@cyriltovena
Contributor Author

> LGTM. Do you have benchmark stats for the encoding, too?

I used to, but I don't anymore. Encoding is also faster and uses less memory, but it was less of a problem.

@cyriltovena cyriltovena merged commit 070246c into grafana:master Jan 19, 2021
cyriltovena added a commit to cyriltovena/loki that referenced this pull request Jan 21, 2021
A bug was introduced by removing the default marshalling (grafana#3163): the tail API was still using the default json package. This fixes it by forcing the usage of the jsoniter package.
Added a missing test so that it doesn't happen again.

Signed-off-by: Cyril Tovena <[email protected]>
owen-d pushed a commit that referenced this pull request Jan 22, 2021
A bug was introduced by removing the default marshalling (#3163): the tail API was still using the default json package. This fixes it by forcing the usage of the jsoniter package.
Added a missing test so that it doesn't happen again.

Signed-off-by: Cyril Tovena <[email protected]>
cyriltovena added a commit that referenced this pull request Nov 4, 2022
**What this PR does / why we need it**:

This allows reusing stream memory allocations between responses.
Related to #3163, but this time for the encoding.

```
❯ benchstat before.txt after.txt                                                                                                              
name        old time/op    new time/op    delta
_Encode-16    29.2µs ±12%    25.2µs ± 1%  -13.85%  (p=0.016 n=5+4)

name        old alloc/op   new alloc/op   delta
_Encode-16    24.9kB ± 6%    16.4kB ± 8%  -34.20%  (p=0.008 n=5+5)

name        old allocs/op  new allocs/op  delta
_Encode-16       145 ± 0%       129 ± 0%  -11.03%  (p=0.008 n=5+5)
```
lxwzy pushed a commit to lxwzy/loki that referenced this pull request Nov 7, 2022
changhyuni pushed a commit to changhyuni/loki that referenced this pull request Nov 8, 2022
Abuelodelanada pushed a commit to canonical/loki that referenced this pull request Dec 1, 2022
4 participants