sql: do not evaluate AOST timestamp in session migrations #108503
NB: I'm only backporting this to v23.1 and not 22.2 since we should prioritize fixing this for Serverless clusters, which are all on 23.1 now.
Can we also log the error here instead of swallowing it during session migration?
cockroach/pkg/sql/conn_executor_prepare.go
Lines 291 to 293 in 7fe93d7
    if err := prepare(ctx, ex.state.mu.txn); err != nil && origin != PreparedStatementOriginSessionMigration {
        return nil, err
    }
As an aside, @jeffswenson thinks that we should have errored out if we fail to prepare any statements during deserialization instead of having it error when the user uses them.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
Release note (bug fix): Fixed a bug where a session migration performed by SHOW TRANSFER STATE would not handle prepared statements that used the AS OF SYSTEM TIME clause. Users who encountered this bug would see errors such as `expected 1 or 0 for number of format codes, got N`. This bug was present since v22.2.0.
Force-pushed from 2fad53a to 5befaa7.
Added the warning log.
I disagree with this - even without session migrations, it is possible for a prepared statement to become invalid. One example is if a schema change occurs on a table referenced by a prepared statement. That was the motivation for this commit: dda2fa9. If we error out eagerly, then that would be contrary to the goal of making session migrations transparent to the user. IMO the missing piece here is that we weren't logging the reason why a statement couldn't be prepared.
The current behavior only makes sense to me if and only if we are confident that all of the errors are the result of schema changes. If there is a bug in the prepare logic that impacts transferred sessions, it would be much better for the transfer to fail than to have the session limp along with corrupted state. The warning log is an improvement, but I'm still worried this will hide hard-to-track-down bugs. Thinking through how applications are developed, I'm skeptical that prepared statements broken by schema changes are actually a common scenario. Usually schema changes are backwards compatible with the running application.
I have confidence. This bug being fixed seems to be the first issue we've heard about in over a year of this feature being used, so I believe we can get to a place where unexpected errors during session transfers do not happen unless something is majorly wrong. Given how commonly people ask about invalidated prepared statements in bug trackers, I think it is probably more common than you predict, and failing eagerly would be more disruptive. (The user's experience would be that their connection was dropped for no clear reason.)
I'm open to keeping the current implementation as long as we periodically check for this log warning. That said, if there is a broken plan, closing the connection is likely the least disruptive thing we can do. We can't migrate sessions in the middle of a transaction, and as long as they are using a connection pool, the connection will be reopened for them.
for sure. i will add it to our SLI dashboard
Reviewed 1 of 2 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @rafiss)
tftr! bors r+
Build failed (retrying...):
This PR was included in a batch that was canceled, it will be automatically retried
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 5befaa7 to blathers/backport-release-23.1-108503: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 23.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
fixes https://github.com/cockroachlabs/support/issues/2510
refs #108305