
execinfra: uncertainty error can be incorrectly swallowed during a distributed join stage #51458

Closed
asubiotto opened this issue Jul 15, 2020 · 2 comments · Fixed by #51518
Labels: A-sql-execution (Relating to SQL execution.) · C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.)

Comments

asubiotto (Contributor) commented:

Read uncertainty errors are sometimes swallowed when a limit is hit. This lets the client avoid a retry when the results it already has satisfy the query (e.g. the limit). This issue is about that swallowing behavior occurring when it shouldn't, i.e. returning incorrect results to the client.

A customer ran into an issue that I was able to deterministically reproduce in #51375. I think I understand what's going on. Here is the interesting part of the plan for reference:
[Plan diagram omitted; the relevant chain is described below.]

TableReader/6 is the one that emits the uncertainty error, which is then propagated downstream as expected. Note that the hash router sends the error to only one output. In this case the chain looks like TableReader/6 -> MergeJoiner/7 -> HashJoiner/9 -> HashJoiner/13. Everything works fine until we get to HashJoiner/13, which short-circuits because its right side, TableReader/12, is empty, and then swallows the uncertainty error on the next iteration. This is technically valid because we're doing an inner join.

However, the interesting part is that HashJoiner/13 belongs to a join stage that also includes HashJoiner/14 and HashJoiner/15. HashJoiner/15 short-circuits because its right side is empty, which, again, is valid. The crucial part of this bug is that HashJoiner/14 short-circuits because its left side is empty. Its left side includes HashJoiner/9, which is the last place the uncertainty error was correctly propagated. This is invalid: HashJoiner/9 correctly invalidated the results pushed to HashJoiner/13, but, because of how the hash router propagates an error, it did not do the same for HashJoiner/14 or HashJoiner/15. I think there are several ways to fix this:

  1. Use @rohany's change (sql: wip fix hashjoiner error prop #51324). Instead of short-circuiting immediately, a HashJoiner drains the non-empty side and propagates any metadata without swallowing it before moving to draining. The disadvantages of this approach are that it is localized to short-circuiting behavior (though I don't know of this issue occurring anywhere else) and that it must fully drain the non-empty side, which might defeat the point of short-circuiting in the first place.
  2. When a stream splits (e.g. in a hashRouter), copy error metadata and send it to all streams, so that it is valid for processors to make the local decision of whether or not to swallow errors. The advantage is that this is a more general solution; the disadvantage is that I don't know whether sending multiple errors has side effects. We could also make this behavior specific to uncertainty errors.
  3. Move the decision to swallow uncertainty errors out of ProcessorBase and make it a more "global"/specific decision. I'm not sure exactly what this would look like, because we can't move this logic to a receiver (it doesn't know when a limit has been hit), so it might mean moving it to specific processors (e.g. limit; anything else?).
@blathers-crl blathers-crl bot added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 15, 2020
@asubiotto asubiotto added the A-sql-execution Relating to SQL execution. label Jul 15, 2020
andreimatei (Contributor) commented:

Good stuff!
It seems to me / IIUC that the problem is that HashJoiner/9 somehow tells HashJoiner/14 that it doesn't have any rows for it, and also no error, thinking that it's sufficient to propagate the error to HashJoiner/13. I think that, even if HJ/13 wouldn't swallow the error, this would still be broken. Making processors downstream of HJ/9 think that HJ/9 had no rows to produce can cause incorrect results to be produced. These incorrect results would race with the error and, if they win and are returned to the client, that'd be bad even if the error is returned afterwards. Right?

So it seems to me that that's the thing to address here: as you say in 2), I think we'd do well to have routers propagate errors to all consumers (at least all of them that are not already draining), not just to one. Fundamentally, there's a difference between a processor exhausting its inputs and the same processor starting to drain early because of an error.

cc @RaduBerinde for old times' sake

yuzefovich (Member) commented:

Nice investigation! I agree with Andrei that it makes sense for us to propagate errors to all outputs of the hash router, and I also think that we should propagate all errors in this fashion, not just the RWUI error.
