-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add partial support for pgwire row count limits #39085
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woo. Can you add a note on what vendored deps had to get upgraded?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @jordanlewis, and @mjibson)
pkg/sql/distsql_running.go, line 715 at r1 (raw file):
// LimitedResultErr is an error produced by pgwire indicating an unsupported // feature of row count limits was attempted. var LimitedResultErr = unimplemented.NewWithIssue(4035, "execute row count limits not supported")
I think we should make this error more specific now that we have partial support for the feature.
pkg/sql/pgwire/command_result.go, line 345 at r1 (raw file):
// forward. The work included is things like auditing all of the defers and // post-execution stuff (like stats collection) to have it only execute once // per statement instead of once per portal.
This is great. It would be nice to make sure we don't lose some other parts of the context we gathered during this exercise as well. Did you have more context in that Google doc you were working on? If so, could you move the Google doc that you wrote with the different pros and cons into Confluence and link that from here? You can move it into the public Confluence space (CRDB) so that this link is followable by non-employees.
pkg/sql/pgwire/command_result.go, line 421 at r1 (raw file):
default: // We got some other message, but we only support executing to completion. return errors.WithDetail(sql.LimitedResultErr, "portals must be executed to completion")
What message is sent when a client explicitly closes their result set early? I would like to make sure we support that use case too, if it's possible, because I think it is likely commonly used.
pkg/sql/pgwire/testdata/pgtest/portals_crbugs, line 110 at r1 (raw file):
ReadyForQuery ErrorResponse ReadyForQuery
Is there any way we can assert the error message here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @jordanlewis)
pkg/sql/distsql_running.go, line 715 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
I think we should make this error more specific now that we have partial support for the feature.
Done.
pkg/sql/pgwire/command_result.go, line 345 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
This is great. It would be nice to make sure we don't lose some other parts of the context we gathered during this exercise as well. Did you have more context in that Google doc you were working on? If so, could you move the Google doc that you wrote with the different pros and cons into Confluence and link that from here? You can move it into the public Confluence space (CRDB) so that this link is followable by non-employees.
I intended the commit message here to be that. Do you think the doc I wrote has some more useful color?
pkg/sql/pgwire/command_result.go, line 421 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
What message is sent when a client explicitly closes their result set early? I would like to make sure we support that use case too, if it's possible, because I think it is likely commonly used.
Portal suspension followed by an explicit close is what happens when a result set is closed early (as verified by wireshark with postgres). We don't support this. That's because the close message needs to do some other stuff that the sql state machine handles. We could support this maybe, but it'd make the current hack even worse. I'd prefer to only do this if we really really have to.
pkg/sql/pgwire/testdata/pgtest/portals_crbugs, line 110 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Is there any way we can assert the error message here as well?
Done.
RFAL. Added support for implicit transactions and added some tests for suspension then close, which we don't support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's get this in and start testing it more!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @mjibson)
pkg/sql/pgwire/command_result.go, line 345 at r1 (raw file):
Previously, mjibson (Matt Jibson) wrote…
I intended the commit message here to be that. Do you think the doc I wrote has some more useful color?
Ah, I didn't notice the commit message. Looks good.
pkg/sql/pgwire/command_result.go, line 421 at r1 (raw file):
Previously, mjibson (Matt Jibson) wrote…
Portal suspension followed by an explicit close is what happens when a result set is closed early (as verified by wireshark with postgres). We don't support this. That's because the close message needs to do some other stuff that the sql state machine handles. We could support this maybe, but it'd make the current hack even worse. I'd prefer to only do this if we really really have to.
Gotcha. I think we probably need to do this, but we can double check with some acceptance testing after this PR merges.
bors r+ |
Merge conflict (retrying...) |
Merge conflict |
Previously we supported row count limits as long as the limit was higher than the number of returned rows. Here we add a feature that supports this part of the spec in a partial way. This should unblock simple JDBC usage of this feature. Supported use cases: - implicit transactions (which auto close the portal after suspension) - explicit transactions executed to completion Unsupported use cases (with explicit transactions): - interleaved execution - explicitly closing a portal after partial execution Many options were evaluated during implementation. The one here is based on work where the pgwire package itself takes over the state machine processing during AddRow and looks for further ExecPortal messages. This has a number of problems: there are now two state machines, we can only support a part of the spec. However it also has a number of benefits like it is a simple implementation that is easy to understand. Two other solutions were evaluated. First, teaching distsql how to pause and resume execution (a proof-of-concept branch for this was produced). I did not pursue this route because of my own unfamiliarity with distsql, and I thought that attempting to reach a high level of confidence that all of the new pause, resume, and error logic flows were correct would be very difficult (I may be wrong about this). Also, this approach needed to address how to handle post-distsql execution cleanup and stats gathering. That is, after the call into distsql was paused and returned control back to the sql executor, there's a lot of code that gets run to cleanup stuff and report stats in various places, including some defers. These would need to be audited and only execute once per statement, not once per portal execution. Second, start distsql execution in a new go routine and send exec portal requests to it over a channel. This would avoid teaching distsql how to pause and resume itself, but instead move that complexity into the channel and go routine handling, another area ripe for race conditions and deadlocks. This also needed to deal with the post-distsql execution cleanup and defers handling discussed above. For now, we decided that this limited implementation is good enough for what we need today. In order to one day either support the full spec or pay down some of our technical debt, we can probably do a number of preliminary refactors that will make invoking much of the distsql path multpile times easier. pgx got a version bump to get support for PortalSuspended. The current pgx version had a few new dependencies too. See #4035 Release note (sql change): add partial support for row limits during portal execution in pgwire.
bors r+ |
39085: sql: add partial support for pgwire row count limits r=mjibson a=mjibson Previously we supported row count limits as long as the limit was higher than the number of returned rows. Here we add a feature that supports this part of the spec in a partial way. This should unblock simple JDBC usage of this feature. Supported use cases: - implicit transactions (which auto close the portal after suspension) - explicit transactions executed to completion Unsupported use cases (with explicit transactions): - interleaved execution - explicitly closing a portal after partial execution Many options were evaluated during implementation. The one here is based on work where the pgwire package itself takes over the state machine processing during AddRow and looks for further ExecPortal messages. This has a number of problems: there are now two state machines, we can only support a part of the spec. However it also has a number of benefits like it is a simple implementation that is easy to understand. Two other solutions were evaluated. First, teaching distsql how to pause and resume execution (a proof-of-concept branch for this was produced). I did not pursue this route because of my own unfamiliarity with distsql, and I thought that attempting to reach a high level of confidence that all of the new pause, resume, and error logic flows were correct would be very difficult (I may be wrong about this). Also, this approach needed to address how to handle post-distsql execution cleanup and stats gathering. That is, after the call into distsql was paused and returned control back to the sql executor, there's a lot of code that gets run to cleanup stuff and report stats in various places, including some defers. These would need to be audited and only execute once per statement, not once per portal execution. Second, start distsql execution in a new go routine and send exec portal requests to it over a channel. This would avoid teaching distsql how to pause and resume itself, but instead move that complexity into the channel and go routine handling, another area ripe for race conditions and deadlocks. This also needed to deal with the post-distsql execution cleanup and defers handling discussed above. For now, we decided that this limited implementation is good enough for what we need today. In order to one day either support the full spec or pay down some of our technical debt, we can probably do a number of preliminary refactors that will make invoking much of the distsql path multpile times easier. pgx got a version bump to get support for PortalSuspended. The current pgx version had a few new dependencies too. See #4035 Release note (sql change): add partial support for row limits during portal execution in pgwire. Co-authored-by: Matt Jibson <[email protected]>
Build succeeded |
Is the commercial version only supported? The community version is not |
This feature has no commercial requirement, it should work in either. What cockroach version are you using? It's only supported in 19.2. |
Previously we supported row count limits as long as the limit was higher
than the number of returned rows. Here we add a feature that supports
this part of the spec in a partial way. This should unblock simple JDBC
usage of this feature.
Supported use cases:
Unsupported use cases (with explicit transactions):
Many options were evaluated during implementation. The one here is based
on work where the pgwire package itself takes over the state machine
processing during AddRow and looks for further ExecPortal messages. This
has a number of problems: there are now two state machines, we can only
support a part of the spec. However it also has a number of benefits
like it is a simple implementation that is easy to understand.
Two other solutions were evaluated.
First, teaching distsql how to pause and resume execution (a
proof-of-concept branch for this was produced). I did not pursue this
route because of my own unfamiliarity with distsql, and I thought that
attempting to reach a high level of confidence that all of the new pause,
resume, and error logic flows were correct would be very difficult
(I may be wrong about this). Also, this approach needed to address how
to handle post-distsql execution cleanup and stats gathering. That is,
after the call into distsql was paused and returned control back to
the sql executor, there's a lot of code that gets run to cleanup stuff
and report stats in various places, including some defers. These would
need to be audited and only execute once per statement, not once per
portal execution.
Second, start distsql execution in a new go routine and send exec portal
requests to it over a channel. This would avoid teaching distsql how
to pause and resume itself, but instead move that complexity into the
channel and go routine handling, another area ripe for race conditions
and deadlocks. This also needed to deal with the post-distsql execution
cleanup and defers handling discussed above.
For now, we decided that this limited implementation is good enough for
what we need today. In order to one day either support the full spec
or pay down some of our technical debt, we can probably do a number of
preliminary refactors that will make invoking much of the distsql path
multpile times easier.
pgx got a version bump to get support for PortalSuspended. The current
pgx version had a few new dependencies too.
See #4035
Release note (sql change): add partial support for row limits during
portal execution in pgwire.