Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: commitPrepStmtNamespace is allocation heavy and almost never needed #32427

Closed
nvanbenschoten opened this issue Nov 16, 2018 · 10 comments
Closed
Labels
A-sql-executor SQL txn logic C-performance Perf of queries or internals. Solution not expected to change functional behavior. E-quick-win Likely to be a quick win for someone experienced. o-perf-efficiency Related to performance efficiency T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Nov 16, 2018

screen shot 2018-11-16 at 3 42 42 pm

Above is a screenshot of a CPU profile from a kv95 run on a single-node m5d.12xlarge cluster. We can see that commitPrepStmtNamespace is responsible for 2.48% of CPU utilization. This is pretty insane considering there's nothing in the prepared stmt namespace that needs rolling back.

We should page this prepStmtNamespace stuff back in (including its interaction with the pgwire protocol) and then audit all uses of it. We should then make it cheaper to use, introduce fast-paths that avoid touching it at all when not necessary, or do some combination of both.

cc. @knz @andreimatei

Jira issue: CRDB-4747

Epic CRDB-42584

@nvanbenschoten nvanbenschoten added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-executor SQL txn logic labels Nov 16, 2018
@knz
Copy link
Contributor

knz commented Nov 17, 2018

@jordanlewis @petermattis which team / project is this landing in?

@petermattis
Copy link
Collaborator

That code is in no-man's land. I think the SQL execution team might have more bandwidth for this, though. @jordanlewis feel free to push back if you disagree.

@jordanlewis
Copy link
Member

We can take this on eventually, though it doesn't seem particularly high priority, and I wouldn't make a claim about having any extra bandwidth.

@nvanbenschoten
Copy link
Member Author

There's a 6.4% speedup sitting here for sysbench's oltp_point_select workload (and presumably kv100). I'll see if I can knock this out later this week.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Nov 23, 2018
Fixes cockroachdb#32427.

This change reworks the handling of `prepStmtsNamespace` and
`prepStmtsNamespaceAtTxnRewindPos` to avoid cloning the former
indiscriminately after every statement. This clone was very
expensive because it resulted in a series of map allocations
and memory copies.

The new approach avoids the cloning in almost all cases by
performing the copy only when it is needed. In the past, this
would translate to cases where `prepStmtsNamespace` and
`prepStmtsNamespaceAtTxnRewindPos` actually differed. It
does so by introducing a new `checkpointedPrepStmtNamespace`
type which abstracts away the details of taking snapshots of
the namespace and rewinding when necessary.

The change also adds a fast-path that I've wanted for a while
which allows us to avoid map accesses entirely for the unnamed
prepared statement/portal. A large number of drivers only use
the unnamed portal by default, so this should provide a nice
speedup.

Release note: None
@nvanbenschoten
Copy link
Member Author

I took a shot at this in this branch: nvanbenschoten/prepStmtPerf. The approach was to only copy these prepStmtNamespaces when they were actually changed. What I didn't take into consideration is that their portals are almost always mutated because clients repeatedly bind with prepared statements, so the change didn't have the desired effect.

@andreimatei this got me thinking about whether we have the wrong structure here. We need to be able to roll back prepared statements during a txn retry, but do we need to be able to roll back portals? Will clients ever rely on the result of a previously prepared statement binding instead of just rebinding every time they want to use a prepared statement? If not then we could constrain the namespace checkpointing to just prepared statements and fix this easily. If we do need to be able to roll back portals then we'll need to rework the connExecutor's interaction with the prepStmtNamespaces checkpointing. We'll need to push down information about when it's ok to modify the checkpoint directly (e.g. when we're running an implicit txn and can't possibly roll it back).

@andreimatei
Copy link
Contributor

We need to be able to roll back prepared statements during a txn retry, but do we need to be able to roll back portals?

I don't know; I don't know what the lifetimes of both portals or prepared statements is, and the docs unfortunately are not very clear.

@nvanbenschoten
Copy link
Member Author

From https://www.postgresql.org/docs/10/protocol-flow.html:

If successfully created, a named portal object lasts till the end of the current transaction, unless explicitly destroyed. An unnamed portal is destroyed at the end of the transaction, or as soon as the next Bind statement specifying the unnamed portal as destination is issued. (Note that a simple Query message also destroys the unnamed portal.) Named portals must be explicitly closed before they can be redefined by another Bind message, but this is not required for the unnamed portal. Named portals can also be created and accessed at the SQL command level, using DECLARE CURSOR and FETCH.

So it looks like we're already in violation of this. It hasn't seemed to cause any issues though, but I'm sure it would if clients were expecting portals to be cleaned up after txns, which could lead to an unbounded number of portals.

This gives us some flexibility though, as it implies that portals don't need to persist through txn retries.

@github-actions
Copy link

github-actions bot commented Jun 5, 2021

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
5 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@knz
Copy link
Contributor

knz commented Jun 5, 2021

This is still relevant it seems.

@RaduBerinde RaduBerinde added the E-quick-win Likely to be a quick win for someone experienced. label Jun 8, 2021
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label May 25, 2023
@mgartner mgartner moved this to New Backlog in SQL Queries Jul 24, 2023
@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Oct 10, 2024
craig bot pushed a commit that referenced this issue Nov 4, 2024
133971: raft: enable store liveness in raft unit tests phase 2 r=iskettaneh a=iskettaneh

This commit modifies the following tests to allow them to run in both
store liveness enabled/disabled configurations:

1) TestPreVoteMigrationCanCompleteElection
2) TestPreVoteMigrationWithFreeStuckPreCandidate
3) TestAddNodeCheckQuorum
4) TestCandidateResetTermMsgHeartbeat
5) TestCandidateResetTermMsgApp

Note that these tests create their own Raft instances instead of relying
on the network te create ones.

References: #132241

Release note: None

134183: sql: don't copy maps in `prepStmtNamespace.resetTo` if identical r=nvanbenschoten a=nvanbenschoten

This commit optimizes prepStmtNamespace.resetTo for the common case under commitPrepStmtNamespace where the prepStmtNamespace maps are identical. In these cases, deleting and re-inserting into the hash maps can cause hash map resizing and incur heap allocations.

Longer term, we should reconsider the relationship between `prepStmtsNamespace` and `prepStmtsNamespaceAtTxnRewindPos`, ideally using a scheme that allows for a constant-time "reset" operation in the common case. This could be accomplished using a copy-on-write scheme or by maintaining some extra state about how the two have diverged since the last reset. For now, we get 90% of the way there by testing for equality before copying maps.

```
name                            old time/op    new time/op    delta
Sysbench/SQL/oltp_read_only       36.2ms ± 3%    36.7ms ± 3%     ~     (p=0.095 n=10+9)
Sysbench/SQL/oltp_write_only      19.0ms ± 2%    19.1ms ± 1%     ~     (p=0.297 n=9+9)
Sysbench/SQL/oltp_read_write      60.1ms ± 6%    58.2ms ± 4%   -3.15%  (p=0.004 n=9+10)
Sysbench/SQL/oltp_point_select    1.61ms ± 2%    1.57ms ± 2%   -2.24%  (p=0.001 n=9+9)
Sysbench/SQL/oltp_begin_commit     587µs ± 2%     543µs ± 2%   -7.58%  (p=0.000 n=9+9)

name                            old alloc/op   new alloc/op   delta
Sysbench/SQL/oltp_read_only       1.05MB ± 0%    1.05MB ± 0%     ~     (p=0.730 n=9+9)
Sysbench/SQL/oltp_write_only       403kB ± 1%     398kB ± 1%   -1.17%  (p=0.001 n=10+9)
Sysbench/SQL/oltp_read_write      1.51MB ± 1%    1.49MB ± 0%   -1.20%  (p=0.000 n=10+8)
Sysbench/SQL/oltp_point_select    35.0kB ± 0%    31.7kB ± 0%   -9.28%  (p=0.000 n=10+10)
Sysbench/SQL/oltp_begin_commit    18.8kB ± 0%    14.8kB ± 0%  -21.29%  (p=0.000 n=9+9)

name                            old allocs/op  new allocs/op  delta
Sysbench/SQL/oltp_read_only        6.94k ± 0%     6.95k ± 1%     ~     (p=0.348 n=9+9)
Sysbench/SQL/oltp_write_only       3.62k ± 1%     3.61k ± 0%     ~     (p=0.067 n=10+9)
Sysbench/SQL/oltp_read_write       11.0k ± 1%     10.9k ± 1%   -0.91%  (p=0.000 n=9+9)
Sysbench/SQL/oltp_point_select       317 ± 0%       309 ± 0%   -2.37%  (p=0.000 n=10+10)
Sysbench/SQL/oltp_begin_commit       139 ± 0%       130 ± 0%   -6.74%  (p=0.000 n=10+10)
```

Informs: #32427.
Epic: None
Release note: None

Co-authored-by: Ibrahim Kettaneh <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@rafiss rafiss added the o-perf-efficiency Related to performance efficiency label Dec 2, 2024
@rafiss
Copy link
Collaborator

rafiss commented Dec 3, 2024

Closing since this is an old issue, and it seems that #134183 addressed the current known improvement. Since performance optimization is an active area of effort, this will resurface through future testing if needed.

@rafiss rafiss closed this as completed Dec 3, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Done in SQL Queries Dec 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-executor SQL txn logic C-performance Perf of queries or internals. Solution not expected to change functional behavior. E-quick-win Likely to be a quick win for someone experienced. o-perf-efficiency Related to performance efficiency T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
Status: Done
Development

No branches or pull requests

8 participants