sql: commitPrepStmtNamespace is allocation heavy and almost never needed #32427

nvanbenschoten · 2018-11-16T20:49:22Z

Above is a screenshot of a CPU profile from a kv95 run on a single-node m5d.12xlarge cluster. We can see that commitPrepStmtNamespace is responsible for 2.48% of CPU utilization. This is pretty insane considering there's nothing in the prepared stmt namespace that needs rolling back.

We should page this prepStmtNamespace stuff back in (including its interaction with the pgwire protocol) and then audit all uses of it. We should then make it cheaper to use, introduce fast-paths that avoid touching it at all when not necessary, or do some combination of both.

cc. @knz @andreimatei

Jira issue: CRDB-4747

Epic CRDB-42584

The text was updated successfully, but these errors were encountered:

knz · 2018-11-17T13:46:02Z

@jordanlewis @petermattis which team / project is this landing in?

petermattis · 2018-11-17T16:18:48Z

That code is in no-man's land. I think the SQL execution team might have more bandwidth for this, though. @jordanlewis feel free to push back if you disagree.

jordanlewis · 2018-11-17T18:57:05Z

We can take this on eventually, though it doesn't seem particularly high priority, and I wouldn't make a claim about having any extra bandwidth.

nvanbenschoten · 2018-11-20T19:14:07Z

There's a 6.4% speedup sitting here for sysbench's oltp_point_select workload (and presumably kv100). I'll see if I can knock this out later this week.

Fixes cockroachdb#32427. This change reworks the handling of `prepStmtsNamespace` and `prepStmtsNamespaceAtTxnRewindPos` to avoid cloning the former indiscriminately after every statement. This clone was very expensive because it resulted in a series of map allocations and memory copies. The new approach avoids the cloning in almost all cases by performing the copy only when it is needed. In the past, this would translate to cases where `prepStmtsNamespace` and `prepStmtsNamespaceAtTxnRewindPos` actually differed. It does so by introducing a new `checkpointedPrepStmtNamespace` type which abstracts away the details of taking snapshots of the namespace and rewinding when necessary. The change also adds a fast-path that I've wanted for a while which allows us to avoid map accesses entirely for the unnamed prepared statement/portal. A large number of drivers only use the unnamed portal by default, so this should provide a nice speedup. Release note: None

nvanbenschoten · 2018-11-25T23:38:33Z

I took a shot at this in this branch: nvanbenschoten/prepStmtPerf. The approach was to only copy these prepStmtNamespaces when they were actually changed. What I didn't take into consideration is that their portals are almost always mutated because clients repeatedly bind with prepared statements, so the change didn't have the desired effect.

@andreimatei this got me thinking about whether we have the wrong structure here. We need to be able to roll back prepared statements during a txn retry, but do we need to be able to roll back portals? Will clients ever rely on the result of a previously prepared statement binding instead of just rebinding every time they want to use a prepared statement? If not then we could constrain the namespace checkpointing to just prepared statements and fix this easily. If we do need to be able to roll back portals then we'll need to rework the connExecutor's interaction with the prepStmtNamespaces checkpointing. We'll need to push down information about when it's ok to modify the checkpoint directly (e.g. when we're running an implicit txn and can't possibly roll it back).

andreimatei · 2018-11-26T03:03:51Z

We need to be able to roll back prepared statements during a txn retry, but do we need to be able to roll back portals?

I don't know; I don't know what the lifetimes of both portals or prepared statements is, and the docs unfortunately are not very clear.

nvanbenschoten · 2018-11-26T12:23:53Z

From https://www.postgresql.org/docs/10/protocol-flow.html:

If successfully created, a named portal object lasts till the end of the current transaction, unless explicitly destroyed. An unnamed portal is destroyed at the end of the transaction, or as soon as the next Bind statement specifying the unnamed portal as destination is issued. (Note that a simple Query message also destroys the unnamed portal.) Named portals must be explicitly closed before they can be redefined by another Bind message, but this is not required for the unnamed portal. Named portals can also be created and accessed at the SQL command level, using DECLARE CURSOR and FETCH.

So it looks like we're already in violation of this. It hasn't seemed to cause any issues though, but I'm sure it would if clients were expecting portals to be cleaned up after txns, which could lead to an unbounded number of portals.

This gives us some flexibility though, as it implies that portals don't need to persist through txn retries.

github-actions · 2021-06-05T02:15:44Z

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
5 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

knz · 2021-06-05T07:28:34Z

This is still relevant it seems.

133971: raft: enable store liveness in raft unit tests phase 2 r=iskettaneh a=iskettaneh This commit modifies the following tests to allow them to run in both store liveness enabled/disabled configurations: 1) TestPreVoteMigrationCanCompleteElection 2) TestPreVoteMigrationWithFreeStuckPreCandidate 3) TestAddNodeCheckQuorum 4) TestCandidateResetTermMsgHeartbeat 5) TestCandidateResetTermMsgApp Note that these tests create their own Raft instances instead of relying on the network te create ones. References: #132241 Release note: None 134183: sql: don't copy maps in `prepStmtNamespace.resetTo` if identical r=nvanbenschoten a=nvanbenschoten This commit optimizes prepStmtNamespace.resetTo for the common case under commitPrepStmtNamespace where the prepStmtNamespace maps are identical. In these cases, deleting and re-inserting into the hash maps can cause hash map resizing and incur heap allocations. Longer term, we should reconsider the relationship between `prepStmtsNamespace` and `prepStmtsNamespaceAtTxnRewindPos`, ideally using a scheme that allows for a constant-time "reset" operation in the common case. This could be accomplished using a copy-on-write scheme or by maintaining some extra state about how the two have diverged since the last reset. For now, we get 90% of the way there by testing for equality before copying maps. ``` name old time/op new time/op delta Sysbench/SQL/oltp_read_only 36.2ms ± 3% 36.7ms ± 3% ~ (p=0.095 n=10+9) Sysbench/SQL/oltp_write_only 19.0ms ± 2% 19.1ms ± 1% ~ (p=0.297 n=9+9) Sysbench/SQL/oltp_read_write 60.1ms ± 6% 58.2ms ± 4% -3.15% (p=0.004 n=9+10) Sysbench/SQL/oltp_point_select 1.61ms ± 2% 1.57ms ± 2% -2.24% (p=0.001 n=9+9) Sysbench/SQL/oltp_begin_commit 587µs ± 2% 543µs ± 2% -7.58% (p=0.000 n=9+9) name old alloc/op new alloc/op delta Sysbench/SQL/oltp_read_only 1.05MB ± 0% 1.05MB ± 0% ~ (p=0.730 n=9+9) Sysbench/SQL/oltp_write_only 403kB ± 1% 398kB ± 1% -1.17% (p=0.001 n=10+9) Sysbench/SQL/oltp_read_write 1.51MB ± 1% 1.49MB ± 0% -1.20% (p=0.000 n=10+8) Sysbench/SQL/oltp_point_select 35.0kB ± 0% 31.7kB ± 0% -9.28% (p=0.000 n=10+10) Sysbench/SQL/oltp_begin_commit 18.8kB ± 0% 14.8kB ± 0% -21.29% (p=0.000 n=9+9) name old allocs/op new allocs/op delta Sysbench/SQL/oltp_read_only 6.94k ± 0% 6.95k ± 1% ~ (p=0.348 n=9+9) Sysbench/SQL/oltp_write_only 3.62k ± 1% 3.61k ± 0% ~ (p=0.067 n=10+9) Sysbench/SQL/oltp_read_write 11.0k ± 1% 10.9k ± 1% -0.91% (p=0.000 n=9+9) Sysbench/SQL/oltp_point_select 317 ± 0% 309 ± 0% -2.37% (p=0.000 n=10+10) Sysbench/SQL/oltp_begin_commit 139 ± 0% 130 ± 0% -6.74% (p=0.000 n=10+10) ``` Informs: #32427. Epic: None Release note: None Co-authored-by: Ibrahim Kettaneh <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]>

rafiss · 2024-12-03T20:58:19Z

Closing since this is an old issue, and it seems that #134183 addressed the current known improvement. Since performance optimization is an active area of effort, this will resurface through future testing if needed.

nvanbenschoten added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-executor SQL txn logic labels Nov 16, 2018

petermattis assigned jordanlewis Nov 17, 2018

nvanbenschoten mentioned this issue Nov 20, 2018

sql/kv: resetForNewSQLTxn is inefficient and allocation heavy #32508

Open

github-actions bot added the no-issue-activity label Jun 5, 2021

knz unassigned jordanlewis Jun 5, 2021

knz removed the no-issue-activity label Jun 5, 2021

RaduBerinde added the E-quick-win Likely to be a quick win for someone experienced. label Jun 8, 2021

jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021

blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label May 25, 2023

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to New Backlog in SQL Queries Jul 24, 2023

exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Oct 10, 2024

yuzefovich mentioned this issue Nov 4, 2024

sql: don't copy maps in prepStmtNamespace.resetTo if identical #134183

Merged

rafiss added the o-perf-efficiency Related to performance efficiency label Dec 2, 2024

rafiss closed this as completed Dec 3, 2024

github-project-automation bot moved this from Backlog to Done in SQL Queries Dec 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: commitPrepStmtNamespace is allocation heavy and almost never needed #32427

sql: commitPrepStmtNamespace is allocation heavy and almost never needed #32427

nvanbenschoten commented Nov 16, 2018 •

edited by exalate-issue-sync bot

Loading

knz commented Nov 17, 2018

petermattis commented Nov 17, 2018

jordanlewis commented Nov 17, 2018

nvanbenschoten commented Nov 20, 2018

nvanbenschoten commented Nov 25, 2018

andreimatei commented Nov 26, 2018

nvanbenschoten commented Nov 26, 2018

github-actions bot commented Jun 5, 2021

knz commented Jun 5, 2021

rafiss commented Dec 3, 2024

sql: commitPrepStmtNamespace is allocation heavy and almost never needed #32427

sql: commitPrepStmtNamespace is allocation heavy and almost never needed #32427

Comments

nvanbenschoten commented Nov 16, 2018 • edited by exalate-issue-sync bot Loading

knz commented Nov 17, 2018

petermattis commented Nov 17, 2018

jordanlewis commented Nov 17, 2018

nvanbenschoten commented Nov 20, 2018

nvanbenschoten commented Nov 25, 2018

andreimatei commented Nov 26, 2018

nvanbenschoten commented Nov 26, 2018

github-actions bot commented Jun 5, 2021

knz commented Jun 5, 2021

rafiss commented Dec 3, 2024

nvanbenschoten commented Nov 16, 2018 •

edited by exalate-issue-sync bot

Loading