Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize value storage in contexts #136581

Open
RaduBerinde opened this issue Dec 3, 2024 · 4 comments
Open

optimize value storage in contexts #136581

RaduBerinde opened this issue Dec 3, 2024 · 4 comments
Assignees
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. o-perf-efficiency Related to performance efficiency

Comments

@RaduBerinde
Copy link
Member

RaduBerinde commented Dec 3, 2024

As of mid November, context.Value and context.WithValue take about 1% of the CPU time in sysbench. This is a proposal to speed this up.

Most of the uses in the hot path are in our code and only involve a small number of keys (tracing, log tags, statement, plan gist, server ID).

Proposal: implement a new "fast value" context which contains all values for these keys in an array. The keys are globally registered at init time (e.g. var tracingContextKey = ctxutil.RegisterFastValue()).

The lower-most fastValueCtx contains all the current values for these keys.

To interoperate with other contexts, the fastValueCtx produces a Value() for a special fastValuesKey{}, containing the values in the most recent fastValueCtx in the chain.

To reduce the number of allocations, we can allocate fastValueCtxes in "magazines" and store the magazines that still have unused slots in a pool.

Jira issue: CRDB-45145

@RaduBerinde RaduBerinde added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. o-perf-efficiency Related to performance efficiency labels Dec 3, 2024
@RaduBerinde RaduBerinde self-assigned this Dec 3, 2024
Copy link

blathers-crl bot commented Dec 3, 2024

Hi @RaduBerinde, please add branch-* labels to identify which branch(es) this C-bug affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@RaduBerinde RaduBerinde added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-performance Perf of queries or internals. Solution not expected to change functional behavior. and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Dec 3, 2024
@tbg
Copy link
Member

tbg commented Dec 4, 2024

x-ref #135904 (which this issue may help alleviate)

also x-ref @dhartunian's comment #133307 (comment)
@dhartunian - let's coalesce these threads (i.e. ignore the overhead of context value chain search in #133307).

@tbg
Copy link
Member

tbg commented Dec 4, 2024

@RaduBerinde: all of these ideas are good IMO. Do you think you could prototype this?

Proposal: implement a new "fast value" context which contains all values for these keys in an array

and we're going to use atomics to access the slots, right?

To reduce the number of allocations, we can allocate fastValueCtxes in "magazines" and store the magazines that still have unused slots in a pool.

That's a good idea, we just need to be careful - some contexts are never released, and we don't want to leak entire magazines over time.

@RaduBerinde
Copy link
Member Author

Do you think you could prototype this?

Yes, already started.

and we're going to use atomics to access the slots, right?

No, the context would still be immutable.

That's a good idea, we just need to be careful - some contexts are never released, and we don't want to leak entire magazines over time.

In the worst case, we would have one live magazine for each live context (so if the magazine size is 8, that's 8x the memory footprint; but the contexts and associated objects are very small)

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Dec 12, 2024
This commit introduces a custom context type that can be used to more
efficiently associate values to contexts.

Instead of using arbitrary objects as keys, the keys are a small set
of integers. Any key must be globally registered first, and there is a
limited number that can be registered.

The `fastValuesCtx` stores *all* values for these keys, so traversing
the context hierarchy is not necessary. We also provide a way to add
multiple values at the same time, which saves on allocations.

I looked at before and after profiles in sysbench. Before:
 - CPU usage:
  context.value 0.6%
  context.WithValue 0.3%
  total 0.9%
 - Allocs:
  context.WithValue: 3.5%

After:
 - CPU usage:
  context.value + ctxutil.FastValue 0.5%
  context.WithValue 0.1%
  ctxutil.WithFastValue(s) 0.1%
  total 0.7%
 - Allocs:
  context.WithValue: 0.2%
  context.WithFastValue: 0.6%
  ctxutil.FastValuesBuilder.Finish: 1.4%
  total 2.2%

I will investigate improving on this, perhaps with providing a
pre-allocated context in codepaths where this is possible.

Informs: cockroachdb#136581
Release note: None
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Dec 13, 2024
This commit introduces a custom context type that can be used to more
efficiently associate values to contexts.

Instead of using arbitrary objects as keys, the keys are a small set
of integers. Any key must be globally registered first, and there is a
limited number that can be registered.

The `fastValuesCtx` stores *all* values for these keys, so traversing
the context hierarchy is not necessary. We also provide a way to add
multiple values at the same time, which saves on allocations.

I looked at before and after profiles in sysbench. Before:
 - CPU usage:
  context.value 0.6%
  context.WithValue 0.3%
  total 0.9%
 - Allocs:
  context.WithValue: 3.5%

After:
 - CPU usage:
  context.value + ctxutil.FastValue 0.5%
  context.WithValue 0.1%
  ctxutil.WithFastValue(s) 0.1%
  total 0.7%
 - Allocs:
  context.WithValue: 0.2%
  context.WithFastValue: 0.6%
  ctxutil.FastValuesBuilder.Finish: 1.4%
  total 2.2%

I will investigate improving on this, perhaps with providing a
pre-allocated context in codepaths where this is possible.

Informs: cockroachdb#136581
Release note: None
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Dec 13, 2024
This commit introduces a custom context type that can be used to more
efficiently associate values to contexts.

Instead of using arbitrary objects as keys, the keys are a small set
of integers. Any key must be globally registered first, and there is a
limited number that can be registered.

The `fastValuesCtx` stores *all* values for these keys, so traversing
the context hierarchy is not necessary. We also provide a way to add
multiple values at the same time, which saves on allocations.

I looked at before and after profiles in sysbench. Before:
 - CPU usage:
  context.value 0.6%
  context.WithValue 0.3%
  total 0.9%
 - Allocs:
  context.WithValue: 3.5%

After:
 - CPU usage:
  context.value + ctxutil.FastValue 0.5%
  context.WithValue 0.1%
  ctxutil.WithFastValue(s) 0.1%
  total 0.7%
 - Allocs:
  context.WithValue: 0.2%
  context.WithFastValue: 0.6%
  ctxutil.FastValuesBuilder.Finish: 1.4%
  total 2.2%

I will investigate improving on this, perhaps with providing a
pre-allocated context in codepaths where this is possible.

Informs: cockroachdb#136581
Release note: None
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Dec 13, 2024
This commit introduces a custom context type that can be used to more
efficiently associate values to contexts.

Instead of using arbitrary objects as keys, the keys are a small set
of integers. Any key must be globally registered first, and there is a
limited number that can be registered.

The `fastValuesCtx` stores *all* values for these keys, so traversing
the context hierarchy is not necessary. We also provide a way to add
multiple values at the same time, which saves on allocations.

I looked at before and after profiles in sysbench. Before:
 - CPU usage:
  context.value 0.6%
  context.WithValue 0.3%
  total 0.9%
 - Allocs:
  context.WithValue: 3.5%

After:
 - CPU usage:
  context.value + ctxutil.FastValue 0.5%
  context.WithValue 0.1%
  ctxutil.WithFastValue(s) 0.1%
  total 0.7%
 - Allocs:
  context.WithValue: 0.2%
  context.WithFastValue: 0.6%
  ctxutil.FastValuesBuilder.Finish: 1.4%
  total 2.2%

I will investigate improving on this, perhaps with providing a
pre-allocated context in codepaths where this is possible.

Informs: cockroachdb#136581
Release note: None
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Dec 13, 2024
This commit introduces a custom context type that can be used to more
efficiently associate values to contexts.

Instead of using arbitrary objects as keys, the keys are a small set
of integers. Any key must be globally registered first, and there is a
limited number that can be registered.

The `fastValuesCtx` stores *all* values for these keys, so traversing
the context hierarchy is not necessary. We also provide a way to add
multiple values at the same time, which saves on allocations.

I looked at before and after profiles in sysbench. Before:
 - CPU usage:
  context.value 0.6%
  context.WithValue 0.3%
  total 0.9%
 - Allocs:
  context.WithValue: 3.5%

After:
 - CPU usage:
  context.value + ctxutil.FastValue 0.5%
  context.WithValue 0.1%
  ctxutil.WithFastValue(s) 0.1%
  total 0.7%
 - Allocs:
  context.WithValue: 0.2%
  context.WithFastValue: 0.6%
  ctxutil.FastValuesBuilder.Finish: 1.4%
  total 2.2%

I will investigate improving on this, perhaps with providing a
pre-allocated context in codepaths where this is possible.

Informs: cockroachdb#136581
Release note: None
craig bot pushed a commit that referenced this issue Dec 15, 2024
137344: ctxutil: introduce fast values r=RaduBerinde a=RaduBerinde

#### go.mod: update logtag dependency

See cockroachdb/logtags#4

Epic: none
Release note: None

#### serverident: unexport ServerIdentificationContextKey

Epic: none
Release note: None

#### ctxutil: introduce fast values

This commit introduces a custom context type that can be used to more
efficiently associate values to contexts.

Instead of using arbitrary objects as keys, the keys are a small set
of integers. Any key must be globally registered first, and there is a
limited number that can be registered.

The `fastValuesCtx` stores *all* values for these keys, so traversing
the context hierarchy is not necessary. We also provide a way to add
multiple values at the same time, which saves on allocations.

I looked at before and after profiles in sysbench. Before:
 - CPU usage:
  context.value 0.6%
  context.WithValue 0.3%
  total 0.9%
 - Allocs:
  context.WithValue: 3.5%

After:
 - CPU usage:
  context.value + ctxutil.FastValue 0.5%
  context.WithValue 0.1%
  ctxutil.WithFastValue(s) 0.1%
  total 0.7%
 - Allocs:
  context.WithValue: 0.2%
  context.WithFastValue: 0.6%
  ctxutil.FastValuesBuilder.Finish: 1.4%
  total 2.2%

I will investigate improving on this, perhaps with providing a
pre-allocated context in codepaths where this is possible.

Informs: #136581
Informs: #135904
Release note: None

Co-authored-by: Radu Berinde <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. o-perf-efficiency Related to performance efficiency
Projects
None yet
Development

No branches or pull requests

2 participants