sql/catalog/descs: optimize immutable access to the system db #71936
This one doesn't quite work. I really want to change catalogkv to hand back a builder instead of a descriptor.
19bfa37 to 1a06e43
Okay, I have a new approach here that I'm happy with.
This is The Way, at least for the time being.
Is there any value in adding checks to ensure that a mutable system database is never returned? Although unlikely, it was previously no big deal if that happened, because the object was transient, but now it's a global var...
We tend to access the system database a lot and we never cache it in the lease manager. Before this patch, we'd always copy and re-allocate a pair of them. There's no need to do that.

```
name                                           old time/op    new time/op    delta
FlowSetup/vectorize=true/distribute=true-16      141µs ± 3%     132µs ± 4%   -6.35%  (p=0.000 n=19+18)
FlowSetup/vectorize=true/distribute=false-16     138µs ± 4%     129µs ± 3%   -6.80%  (p=0.000 n=19+18)
FlowSetup/vectorize=false/distribute=true-16     134µs ± 2%     124µs ± 4%   -7.55%  (p=0.000 n=20+17)
FlowSetup/vectorize=false/distribute=false-16    129µs ± 3%     120µs ± 3%   -6.98%  (p=0.000 n=20+18)

name                                           old alloc/op   new alloc/op   delta
FlowSetup/vectorize=true/distribute=true-16     38.1kB ± 2%    36.8kB ± 3%   -3.53%  (p=0.000 n=18+19)
FlowSetup/vectorize=true/distribute=false-16    36.2kB ± 0%    34.8kB ± 0%   -3.93%  (p=0.000 n=17+17)
FlowSetup/vectorize=false/distribute=true-16    42.6kB ± 0%    41.2kB ± 0%   -3.30%  (p=0.000 n=18+16)
FlowSetup/vectorize=false/distribute=false-16   41.0kB ± 0%    39.6kB ± 0%   -3.44%  (p=0.000 n=16+18)

name                                           old allocs/op  new allocs/op  delta
FlowSetup/vectorize=true/distribute=true-16        368 ± 0%       345 ± 0%   -6.25%  (p=0.000 n=16+16)
FlowSetup/vectorize=true/distribute=false-16       354 ± 0%       331 ± 0%   -6.50%  (p=0.000 n=18+18)
FlowSetup/vectorize=false/distribute=true-16       337 ± 0%       315 ± 1%   -6.69%  (p=0.000 n=19+19)
FlowSetup/vectorize=false/distribute=false-16      325 ± 0%       302 ± 0%   -7.08%  (p=0.000 n=17+18)
```

Release note: None
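To make the idea in the commit message concrete, here is a minimal sketch of the difference between re-allocating a descriptor on every access and handing back one shared, read-only instance. It is not the code from this PR: `Descriptor`, `immutableDB`, `copyingLookup`, and `sharedLookup` are hypothetical names, and the ID and name values are illustrative only.

```go
package main

import "fmt"

// Descriptor is a hypothetical read-only view of a database descriptor.
// It deliberately exposes no setters.
type Descriptor interface {
	GetID() uint32
	GetName() string
}

// immutableDB is a hypothetical concrete descriptor with unexported fields.
type immutableDB struct {
	id   uint32
	name string
}

func (d *immutableDB) GetID() uint32   { return d.id }
func (d *immutableDB) GetName() string { return d.name }

// systemDB is built once and shared by every caller (illustrative values).
var systemDB Descriptor = &immutableDB{id: 1, name: "system"}

// copyingLookup mimics the old behavior described above:
// every access allocates a fresh descriptor.
func copyingLookup() Descriptor {
	return &immutableDB{id: 1, name: "system"}
}

// sharedLookup returns the shared instance without allocating.
func sharedLookup() Descriptor {
	return systemDB
}

func main() {
	// Interface values holding pointers compare by pointer identity.
	fmt.Println("shared lookups return the same object:", sharedLookup() == sharedLookup())
	fmt.Println("copying lookups return the same object:", copyingLookup() == copyingLookup())
}
```

Sharing is only safe here because the interface is read-only; that is the concern raised in the review comment above about a mutable system database escaping.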
1a06e43 to bba05aa
@postamar I revamped this to generate a mutable descriptor on the fly if one is needed. This side-steps any concerns about the singleton being modified.
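A rough sketch of that approach, under stated assumptions: the shared singleton stays immutable, and any caller that needs to mutate gets a freshly built copy. The type and function names (`ImmutableDB`, `MutableDB`, `GetImmutableSystemDB`, `GetMutableSystemDB`) are hypothetical and do not reflect the actual catalog API.

```go
package main

import "fmt"

// dbFields holds the descriptor state. It contains only value-typed fields,
// so a plain struct copy is a full copy in this sketch.
type dbFields struct {
	id      uint32
	name    string
	version uint64
}

// ImmutableDB is a hypothetical read-only descriptor.
type ImmutableDB struct{ f dbFields }

func (d *ImmutableDB) Name() string { return d.f.name }

// MutableDB is a hypothetical descriptor that callers may modify.
type MutableDB struct{ f dbFields }

func (m *MutableDB) Name() string     { return m.f.name }
func (m *MutableDB) SetName(n string) { m.f.name = n }

// systemDB is the shared singleton; nothing hands out a writable reference to it.
var systemDB = &ImmutableDB{f: dbFields{id: 1, name: "system", version: 1}}

// GetImmutableSystemDB returns the shared instance without allocating.
func GetImmutableSystemDB() *ImmutableDB { return systemDB }

// GetMutableSystemDB builds a new mutable descriptor on each call,
// so edits can never reach the singleton.
func GetMutableSystemDB() *MutableDB { return &MutableDB{f: systemDB.f} }

func main() {
	m := GetMutableSystemDB()
	m.SetName("scratch")
	fmt.Println("mutable copy:", m.Name())
	fmt.Println("singleton:", GetImmutableSystemDB().Name())
}
```

The trade-off is that the rare mutable path pays for an allocation, while the common immutable path pays nothing.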
bors r+
Build succeeded:
72669: bazel: properly generate `.eg.go` code in `pkg/sql/colconv` via bazel r=rail a=rickystewart

Release note: None

72714: sql/catalog/lease: permit gaps in descriptor history r=ajwerner a=ajwerner

In #71239, we added a new mechanism to look up historical descriptors. I erroneously informed @jameswsj10 that we would never have gaps in the descriptor history, and, thus, when looking up historical descriptors, we could always use the earliest descriptor's modification time as the bounds for the relevant query.

This turns out to not be true. Consider the case where version 3 is a historical version and then version 4 pops up and gets leased. Version 3 will get removed if it is not referenced. In the meantime, version 3 existed when we went to go find version 2. At that point, we'll inject version 2 and have version 4 leased. We need to make sure we can handle the case where we need to go fetch version 3.

In the meantime, this change also removes some logic added to support the eventual resurrection of #59606 whereby we'll use the export request to fetch descriptor history to power historical queries even in the face of descriptors having been deleted.

Fixes #72706.

Release note: None

72740: sql/catalog/descs: fix perf regression r=ajwerner a=ajwerner

This commit in #71936 had the unfortunate side-effect of allocating and forcing reads on the `uncommittedDescriptors` set even when we aren't looking for the system database. This has an outsized impact on the performance of the single-node, high-core-count KV runs. Instead of always initializing the system database, just do it when we access it.

```
name             old ops/s    new ops/s    delta
KV95-throughput  88.6k ± 0%   94.8k ± 1%    +7.00%  (p=0.008 n=5+5)

name             old ms/s     new ms/s     delta
KV95-P50          1.60 ± 0%    1.40 ± 0%   -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%    0.50 ± 0%   -16.67%  (p=0.008 n=5+5)
```

The second commit is more speculative and came from looking at a profile where 1.6% of the allocated garbage was due to that `NameInfo` even though we'll never, ever hit it.

<img width="2345" alt="Screen Shot 2021-11-15 at 12 57 31 AM" src="https://user-images.githubusercontent.com/1839234/141729924-d00eebab-b35c-42bd-8d0b-ee39f3ac7d46.png">

Fixes #72499

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
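The fix described in 72740 amounts to lazy initialization: only touch the expensive state when the system database is actually requested. The sketch below illustrates that pattern in isolation. The `collection` type, its fields, and the `inits` counter are hypothetical stand-ins, not the real `descs.Collection`, and the sketch assumes single-goroutine use.

```go
package main

import "fmt"

// systemDatabaseID is an illustrative constant for this sketch.
const systemDatabaseID uint32 = 1

// uncommittedSet is a hypothetical stand-in for per-transaction descriptor state.
type uncommittedSet struct {
	entries map[uint32]string
}

// collection lazily creates its uncommitted set.
type collection struct {
	uncommitted *uncommittedSet // nil until first needed
	inits       int             // counts initializations, for demonstration only
}

// getUncommitted allocates and seeds the set on first use.
func (c *collection) getUncommitted() *uncommittedSet {
	if c.uncommitted == nil {
		c.uncommitted = &uncommittedSet{
			entries: map[uint32]string{systemDatabaseID: "system"},
		}
		c.inits++
	}
	return c.uncommitted
}

// lookupSystemDB only touches the set when the system database is requested;
// every other ID takes the fast path and triggers no allocation here.
func (c *collection) lookupSystemDB(id uint32) (string, bool) {
	if id != systemDatabaseID {
		return "", false
	}
	name, ok := c.getUncommitted().entries[id]
	return name, ok
}

func main() {
	c := &collection{}
	c.lookupSystemDB(52)
	fmt.Println("initializations after unrelated lookup:", c.inits)
	c.lookupSystemDB(systemDatabaseID)
	c.lookupSystemDB(systemDatabaseID)
	fmt.Println("initializations after system lookups:", c.inits)
}
```

Workloads that never look up the system database, such as the KV runs mentioned above, would skip the initialization cost entirely under this pattern.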