sql: do not close stmt buffer of internal executor in errCallback #80070

Merged
merged 1 commit into from
Apr 26, 2022

Conversation

yuzefovich
Member

@yuzefovich yuzefovich commented Apr 16, 2022

Previously, we would close the stmt buffer of the internal executor in
errCallback, "just to be safe" since it was assumed that the buffer is
already closed when the callback is executed. The callback runs whenever
the run() loop of connExecutor exits with an error.

However, it is possible for the following sequence of events to happen:

  • The new goroutine is spun up for the internal executor before any
    commands are pushed into the stmt buffer.
  • The context is canceled before the new goroutine blocks waiting for
    the command to execute, i.e. the run() loop exits before any commands
    are executed.
  • The errCallback with the context cancellation error is evaluated.
    This closes the stmt buffer. The goroutine exits.
  • The main goroutine tries to push some commands into the buffer only to
    find that it was already closed. An assertion error is returned, and
    a sentry event is created.

I think we should just not close the stmt buffer in the errCallback
since this was never necessary and can lead to the scenario described
above, in which a sentry event is emitted when none should be.

Fixes: #79746.

Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@yuzefovich yuzefovich marked this pull request as ready for review April 16, 2022 18:49
@yuzefovich yuzefovich requested review from cucaroach and a team April 16, 2022 18:50
Contributor

@cucaroach cucaroach left a comment

This seems fine but I don't feel qualified to sign off on it, can you get another reviewer more familiar with this code?

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)

@yuzefovich yuzefovich requested a review from jordanlewis April 19, 2022 15:12
@yuzefovich
Member Author

Unfortunately, I don't think anyone on our team has any familiarity with this code. @jordanlewis was the main reviewer of #59330 when the code in question was introduced, so maybe you can take a quick look?

@yuzefovich
Member Author

@mgartner I think you mentioned that you need to get familiar with connExecutor for UDFs, so maybe you could take a look at this too?

@jordanlewis
Member

@yuzefovich, is there a way we can test this scenario you describe? It sounds like an interesting edge case, yet clearly it's not tested at all right now

@RichardJCai, aren't you working on internal executor soon? Could you take a look at this (and maybe talk more with Yahor about the change)?

@yuzefovich yuzefovich added the do-not-merge label Apr 26, 2022
Member Author

@yuzefovich yuzefovich left a comment

Alright, I added a WIP commit with the test that follows exactly the scenario I described in the commit message. Everything behaves exactly as I described. However, without introducing several knobs I don't know how to make such a test non-flaky, and adding the knobs seems not worth it just for this edge case, so I'm inclined to not merge the test.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @jordanlewis)

Collaborator

@mgartner mgartner left a comment

nit: fix "the new goroutine is span..." in the PR description

:lgtm: It's unfortunate that a test requires these complex knobs or a significant restructuring of the code. I'm fine with leaving out the test.

Reviewed 1 of 1 files at r1, 3 of 3 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)


-- commits line 17 at r2:
nit: formatting typo here?

Previously, we would close the stmt buffer of the internal executor in
`errCallback`, "just to be safe" since it was assumed that the buffer is
already closed when the callback is executed. The callback runs whenever
`run()` loop of connExecutor exits with an error.

However, it is possible for the following sequence of events to happen:
- The new goroutine is spun up for the internal executor before any
commands are pushed into the stmt buffer.
- The context is canceled before the new goroutine blocks waiting for
the command to execute, i.e. `run()` loop is exited before any commands
are executed.
- The `errCallback` with the context cancellation error is evaluated.
This closes the stmt buffer. The goroutine exits.
- The main goroutine tries to push some commands into the buffer only to
find that it was already closed. An assertion error is returned, and
a sentry event is created.

I think we should just not close the stmt buffer in the `errCallback`
since this was never necessary and can lead to the scenario described
above where no sentry event should be emitted.

Release note: None

@yuzefovich yuzefovich added backport-22.1.x and removed do-not-merge labels Apr 26, 2022
Member Author

@yuzefovich yuzefovich left a comment

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis and @mgartner)

@craig
Contributor

craig bot commented Apr 26, 2022

Build succeeded:

Development

Successfully merging this pull request may close these issues.

sql: v21.1.6: attempting to push into closed stmt buffer in the internal executor
5 participants