roachperf: high core-count regression around February 16th #76738
Comments
Perhaps this was my advice on how many shards to use.
Hmm, is there a wiki on how to run the bisection?
Hmm, the new …
Can we treat this with some urgency? A 20% regression has ripple effects across the org; for example, @nvanbenschoten was trying to validate a customer workload yesterday (on large machines) and this regression significantly shifts the baseline.
Hi @tbg, this issue is high on my todo list. Meanwhile, you can disable the offending component via its cluster setting.
Spent the day investigating this. Interesting observation: on high-core-count machines, the writers produce writes much faster than the single background goroutine can consume them. Once the channel fills up and the goroutine falls behind, it creates back pressure, slowing down the writers. I did a few runs with different configurations; here are the results I observed. (Note: 16-128-168 here means a 16-shard writer, a channel of size 128, and a batch size of 168.)
I haven't tested different goroutine-count and buffer-size configurations, so this might be complete overkill, but it seems to have eliminated the perf drop seen on 32-core machines. @ajwerner, thoughts on pursuing this path to solve the perf issue here? Raw benchmark data.
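The sharding idea above can be sketched as follows. This is a minimal Go illustration of the structure being discussed, with hypothetical names (`shardedSink`, `write`, etc.) — it is not the actual txnidcache code. Each shard owns a buffered channel drained by its own goroutine, so a single slow consumer no longer back-pressures every writer.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// shardedSink fans writes out over several consumer goroutines.
// Names are illustrative, not the real txnidcache API.
type shardedSink struct {
	shards []chan int
	wg     sync.WaitGroup
	total  atomic.Int64 // total messages consumed, for demonstration
}

// newShardedSink starts one consumer goroutine per shard; each drains its
// own buffered channel, flushing up to batchSize items at a time.
func newShardedSink(numShards, chanSize, batchSize int) *shardedSink {
	s := &shardedSink{shards: make([]chan int, numShards)}
	for i := range s.shards {
		ch := make(chan int, chanSize)
		s.shards[i] = ch
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			batch := make([]int, 0, batchSize)
			flush := func() {
				// A real sink would hand the batch to the cache here.
				s.total.Add(int64(len(batch)))
				batch = batch[:0]
			}
			for v := range ch {
				batch = append(batch, v)
				if len(batch) == batchSize {
					flush()
				}
			}
			flush() // drain whatever remains on close
		}()
	}
	return s
}

// write routes the key to a shard; a writer blocks only when that one
// shard's buffer is full, rather than whenever the single global
// goroutine falls behind.
func (s *shardedSink) write(key int) { s.shards[key%len(s.shards)] <- key }

// close stops the consumers and waits for all pending batches to flush.
func (s *shardedSink) close() {
	for _, ch := range s.shards {
		close(ch)
	}
	s.wg.Wait()
}

func main() {
	// 16 shards, channel size 128, batch size 168 — the 16-128-168
	// configuration from the benchmark runs above.
	s := newShardedSink(16, 128, 168)
	for i := 0; i < 10000; i++ {
		s.write(i)
	}
	s.close()
	fmt.Println(s.total.Load()) // all 10000 writes consumed
}
```

The trade-off is the usual one for fan-out: more goroutines and buffers relieve back pressure at the cost of memory and a weaker global ordering across shards.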
Should we consider disabling this setting by default until we get to the bottom of it?
@Azhng Thoughts on disabling this? It can mask other performance issues, and as we're moving into stability we'll need to start addressing them across the board.
See cockroachdb#76738. Release note: None
76973: txnidcache: disable cache by default r=erikgrinaker a=tbg See #76738. Release note: None Co-authored-by: Tobias Grieger <[email protected]>
The regression should now be "fixed" by defaulting the cluster setting to zero (#76973 (comment)). We'll need to add an annotation to roachperf.
The heap profile here points the finger at the eviction list in the FIFO store; somehow that's creating a lot of objects on the heap. Hmm, I was under the impression that using the … This is with the capacity limit set to 64 MB. Hmm, though this doesn't quite explain how running 3 copies of txnIDCache in 3 different goroutines was able to improve the situation. 🤔 Profile: pprof.cockroach.alloc_objects.alloc_space.inuse_objects.inuse_space.004.pb.gz
It seems to me like you're not accounting for the memory properly here. Namely, you need to account for the blocks themselves, which are in use.
This change does two things to the txnidcache: 1) It accounts for the space used by the FIFO eviction list. Previously we'd use more than double the intended space. We should probably also subtract out the size of the buffers we're currently filling and the channel we use to communicate them, but I'll leave that for later. 2) It stops trying to compact the blocks. Compacting the blocks ends up being a good deal of overhead because we have to copy across every single message. Instead we can just append the block directly to the list. This does have the hazard of wasting a lot of space when the blocks are sparse. However, if the blocks are sparse, we know that the throughput is low, so it's fine. This is DNM because the tests need to change. Touches cockroachdb#76738 Release justification: bug fixes and low-risk updates to new functionality Release note: None
This change does two things to the txnidcache: 1) It accounts for the space used by the FIFO eviction list. Previously we'd use more than double the intended space. We should probably also subtract out the size of the buffers we're currently filling and the channel we use to communicate them, but I'll leave that for later. 2) It stops trying to compact the blocks. Compacting the blocks ends up being a good deal of overhead because we have to copy across every single message. Instead we can just append the block directly to the list. This does have the hazard of wasting a lot of space when the blocks are sparse. However, if the blocks are sparse, we know that the throughput is low, so it's fine. Resolves cockroachdb#76738 Release justification: bug fixes and low-risk updates to new functionality Release note: None
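Point (2) trades space for throughput: instead of copying every message into a compacted tail block, the writer's block is linked into the eviction list wholesale. A minimal Go sketch of the contrast, using hypothetical types rather than the actual txnidcache code:

```go
package main

import "fmt"

// message is a hypothetical cache record.
type message struct{ txnID, fingerprintID uint64 }

// block may arrive only partially filled; valid says how many entries are set.
type block struct {
	data  [168]message
	valid int
}

// fifoList is the eviction list; a slice stands in for the linked list here.
type fifoList struct {
	blocks []*block
}

// appendBlock links the incoming block into the list as-is: O(1) per block,
// no per-message copying, at the cost of retaining the unused tail of a
// sparse block. If blocks are sparse, throughput is low, so the waste is
// bounded in practice.
func (l *fifoList) appendBlock(b *block) {
	l.blocks = append(l.blocks, b)
}

// compactInto sketches the old behavior for contrast: O(valid) work per
// incoming block, touching every single message.
func (l *fifoList) compactInto(b *block) {
	for i := 0; i < b.valid; i++ {
		// ...copy b.data[i] into the current tail block, allocating a new
		// tail when it fills up; elided, since the point is the per-message
		// copy cost, not the bookkeeping.
		_ = b.data[i]
	}
}

func main() {
	l := &fifoList{}
	l.appendBlock(&block{valid: 3}) // sparse block linked whole, no copying
	fmt.Println(len(l.blocks))      // 1
}
```

With block ownership handed over wholesale, the accounting fix in point (1) also becomes simpler: each list element charges one whole block, exactly as allocated.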
77208: sql: update test that was fooling itself r=ajwerner a=ajwerner I have no clue what is going on in #76843, but this test was fooling itself regarding the existence of separate connections. Release justification: non-production code changes Release note: None

77220: sql/contention/txnidcache: reuse blocks in list, account for space r=maryliag,ajwerner a=ajwerner This change does two things to the txnidcache: 1) It accounts for the space used by the FIFO eviction list. Previously we'd use more than double the intended space. We should probably also subtract out the size of the buffers we're currently filling and the channel we use to communicate them, but I'll leave that for later. 2) It stops trying to compact the blocks. Compacting the blocks ends up being a good deal of overhead because we have to copy across every single message. Instead we can just append the block directly to the list. This does have the hazard of wasting a lot of space when the blocks are sparse. However, if the blocks are sparse, we know that the throughput is low, so it's fine. Resolves #76738 Release justification: bug fixes and low-risk updates to new functionality Release note: None

77363: sql/delegate: avoid extra string->int parsing r=otan a=rafiss Release justification: low risk improvement Release note: None

77438: ui: Remove stray parenthesis in Jobs page r=jocrl a=jocrl Addresses #77440. This commit fixes the stray parenthesis at the end of the duration time for a succeeded job. The parenthesis was introduced in #76691 and the 21.2 backport #73624. Before: ![image](https://user-images.githubusercontent.com/91907326/157065776-456c8f7d-1958-4192-b38d-dcb40432cf9d.png) After: ![image](https://user-images.githubusercontent.com/91907326/157065785-e3f2db6a-67d1-4ae3-87cb-df71dccf0e5f.png) Release note (ui): Remove stray parenthesis at the end of the duration time for a succeeded job. It had been accidentally introduced to unreleased master and a 21.2 backport.

Release justification: Category 2, UI bug fix Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Josephine Lee <[email protected]>
Describe the problem
Please describe the issue you observed, and any steps we can take to reproduce it:
https://roachperf.crdb.dev/?filter=&view=kv0%2Fenc%3Dfalse%2Fnodes%3D3%2Fcpu%3D96&tab=aws
https://roachperf.crdb.dev/?filter=&view=kv95%2Fenc%3Dfalse%2Fnodes%3D3%2Fcpu%3D32%2Fseq&tab=aws
The same day we see marked improvements in the lower core-count workloads. Presumably this is all due to #76350, but we should bisect and profile to understand.
cc @Azhng 😓
Jira issue: CRDB-13255