execinfra: uncertainty error can be incorrectly swallowed during a distributed join stage #51458

asubiotto · 2020-07-15T13:57:50Z

Read uncertainty errors are sometimes swallowed if a limit is hit. This makes it so that a client does not have to retry if the results already satisfy the query (e.g. a limit). This issue is about this swallowing behavior occurring when it shouldn't be (i.e. returning incorrect results to the client).

A customer ran into an issue that I was able to deterministically reproduce in #51375. I think I understand what's going on. Here is the interesting part of the plan for reference:

TableReader/6 is the one that emits the uncertainty error. This is then propagated downstream as expected. Note that the hash router sends the error to only one output. In this case the chain looks like TableReader/6 -> MergeJoiner/7 -> HashJoiner/9 -> HashJoiner/13. Everything works fine until we get to HashJoiner/13 which short circuits because its right side, TableReader/12, is empty and then swallows the uncertainty error on the next iteration. This is technically valid because we're doing an inner join.

However, the interesting part is that HashJoiner/13 belongs to a join stage that includes HashJoiner/15 and HashJoiner/14. HashJoiner/15 short circuits because its right side is empty, which again, is valid. The crucial part of this bug is that HashJoiner/14 short-circuits because its left side is empty. Its left side includes HashJoiner/9, which is the last time the uncertainty error was correctly propagated. This is invalid because HashJoiner/9 correctly invalidated the results pushed to HashJoiner/13, but does not do the same with HashJoiner/14 or HashJoiner/15 because of how an error is propagated with the hash router. I think there are several ways to fix this:

Using @rohany's change (sql: wip fix hashjoiner error prop #51324 ). Instead of short-circuiting immediately, a HashJoiner drains the non-empty side and propagates any metadata without swallowing it before moving to draining. The disadvantage of this approach is that it is localized to short-circuiting behavior (not that I know of this issue anywhere else) and having to fully drain the non-empty side, which might defeat the point of short-circuiting in the first place.
When a stream splits (e.g. in a hashRouter), error metadata must be copied and sent to all streams so that it is valid for processors to make the local decision of whether or not to swallow errors. The advantage is that this is a more general solution, the disadvantage is that I don't know if sending multiple errors has side effects. We could also make this behavior specific to uncertainty erros.
Move the decision to swallow uncertainty errors out of ProcessorBase, and make it a more "global"/specific decision. I'm not sure exactly what this would look like because we can't move this logic to a receiver (it doesn't know when a limit has been hit), and might mean that we need to move it to specific processors (e.g. limit, anything else?).

The text was updated successfully, but these errors were encountered:

andreimatei · 2020-07-15T15:17:44Z

Good stuff!
It seems to me / IIUC that the problem is that HashJoiner/9 somehow tells HashJoiner/14 that it doesn't have any rows for it, and also no error, thinking that it's sufficient to propagate the error to HashJoiner/13. I think that, even if HJ/13 wouldn't swallow the error, this would still be broken. Making processors downstream of HJ/9 think that HJ/9 had no rows to produce can cause incorrect results to be produced. These incorrect results would race with the error and, if they win and are returned to the client, that'd be bad even if the error is returned afterwards. Right?

So, it seems to me, that that's the thing to address here: like you say in 2) I think we'd do well to have routers propagate errors to all consumers (at least all of them that are not already draining), not just to one. Because, fundamentally, there's a difference between a processor exhausting its inputs and the same processor starting to drain early because of an error.

cc @RaduBerinde for old times sake

yuzefovich · 2020-07-16T17:37:04Z

Nice investigation! I agree with Andrei that it makes sense that we should be propagating errors to all outputs of the hash router, and I also think that we should be propagating all errors in such fashion, not just RWUI error.

blathers-crl bot added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 15, 2020

asubiotto added the A-sql-execution Relating to SQL execution. label Jul 15, 2020

cockroachdb deleted a comment from blathers-crl bot Jul 15, 2020

yuzefovich self-assigned this Jul 16, 2020

yuzefovich mentioned this issue Jul 16, 2020

rowflow,colexec: make routers propagate errors to all non-closed outputs #51518

Merged

craig bot closed this as completed in 4de665c Jul 28, 2020

This was referenced Jul 29, 2020

release-20.1: rowflow: make routers propagate errors to all non-closed outputs #52045

Merged

release-19.2: rowflow: make routers propagate errors to all non-closed outputs #52252

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

execinfra: uncertainty error can be incorrectly swallowed during a distributed join stage #51458

execinfra: uncertainty error can be incorrectly swallowed during a distributed join stage #51458

asubiotto commented Jul 15, 2020

andreimatei commented Jul 15, 2020

yuzefovich commented Jul 16, 2020

execinfra: uncertainty error can be incorrectly swallowed during a distributed join stage #51458

execinfra: uncertainty error can be incorrectly swallowed during a distributed join stage #51458

Comments

asubiotto commented Jul 15, 2020

andreimatei commented Jul 15, 2020

yuzefovich commented Jul 16, 2020