[C++][Python] Sporadic asof join test failure #40675
Comments
cc @JerAguilon
pitrou
changed the title
[Python] Sporadic asof join test failure
[C++][Python] Sporadic asof join test failure
Mar 19, 2024
This also happened in one nightly build in the last run: https://github.com/ursacomputing/crossbow/actions/runs/8351957186/job/22861244165
Reproduced in C++; it fails roughly once in every few dozen runs.
This is causing wrong results, so I'm adding the critical label.
pitrou
pushed a commit
that referenced
this issue
May 14, 2024
### Rationale for this change

Sporadic asof join test failures have been frequently and annoyingly observed in pyarrow CI, as recorded in #40675 and #41149. Turns out the root causes are the same - a logical race (as opposed to physical race which can be detected by sanitizers). By injecting special delay in various places in asof join, as shown in zanmato1984@ea3b24c, the issue can be reproduced almost 100%. And I have put some descriptions in that commit to explain how the race happens.

### What changes are included in this PR?

Eliminate the logical race of emptiness by combining multiple call-sites of `Empty()`.

### Are these changes tested?

Include the UT to reproduce the issue.

### Are there any user-facing changes?

None.

**This PR contains a "Critical Fix".**

In #40675 and #41149, incorrect results are produced.

* GitHub Issue: #41149
* Also closes #40675

Authored-by: Ruoxi Sun <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
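For readers unfamiliar with the term, a "logical race" of this kind can be sketched in a few lines of C++. The following is only an illustration under stated assumptions, not Arrow's actual asof join code: `BatchQueue`, `Decision`, `RacyStep`, `SnapshotStep`, and the `between` hook are all hypothetical names. Every access is mutex-protected, so a sanitizer has no data race to report, yet two separate `Empty()` calls can still observe different states if a producer pushes between them. The `between` callback plays the role of the injected delay mentioned above, making the interleaving deterministic.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>

// Hypothetical stand-in for a per-input batch queue (not Arrow's real class).
// Each method locks, so there is no physical data race.
class BatchQueue {
 public:
  void Push(int v) {
    std::lock_guard<std::mutex> lock(mu_);
    items_.push_back(v);
  }
  bool Empty() const {
    std::lock_guard<std::mutex> lock(mu_);
    return items_.empty();
  }

 private:
  mutable std::mutex mu_;
  std::deque<int> items_;
};

// Two decisions that are both meant to depend on the same emptiness state.
struct Decision {
  bool advanced;
  bool emitted;
};

// Racy variant: Empty() is queried at two call-sites. `between` simulates a
// producer thread that happens to run between the two checks.
Decision RacyStep(BatchQueue& q, const std::function<void()>& between) {
  Decision d{false, false};
  if (!q.Empty()) d.advanced = true;  // first check
  between();                          // producer may push here
  if (!q.Empty()) d.emitted = true;   // second check can disagree with the first
  return d;
}

// Combined variant: Empty() is queried once and the snapshot is reused, so
// both decisions are consistent regardless of what happens in between.
Decision SnapshotStep(BatchQueue& q, const std::function<void()>& between) {
  const bool empty = q.Empty();  // single call-site
  Decision d{!empty, false};
  between();
  d.emitted = !empty;
  return d;
}
```

Calling `RacyStep` on an empty queue with a `between` hook that pushes one item yields `advanced == false` but `emitted == true`, an inconsistent pair; `SnapshotStep` with the same hook keeps the two flags equal. Combining the call-sites trades freshness for consistency: the snapshot may be slightly stale, but every decision within one step agrees on it.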
vibhatha
pushed a commit
to vibhatha/arrow
that referenced
this issue
May 25, 2024
JerAguilon
pushed a commit
to JerAguilon/arrow
that referenced
this issue
May 29, 2024
Describe the bug, including details regarding any error messages, version, and platform.
I sporadically get this failure when running the PyArrow tests locally:
Component(s)
C++, Python