trino-hive -P test-fault-tolerant-execution times out #10773
Comments
I will take a look.
I spent a couple of hours yesterday trying to reproduce it locally. I tried throttling the CPU and playing with OS scheduling parameters, and it still passed more than 100,000 times. Reproducing concurrency problems is a real challenge. I found this research from Microsoft, which explores a scheduling algorithm designed to maximize the chance of a concurrency bug being discovered: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/asplos277-pct.pdf Ideally, we would have tooling to help us discover concurrency-related problems. I will try to read it over the weekend and see whether it is applicable here.
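For anyone who wants the gist of the linked paper without reading it, here is a rough sketch of the randomized priority scheduling idea as I understand it. It is a toy simulation, not tooling for Trino: `pct_schedule` and its parameters are hypothetical names, threads are modeled as plain step counters that never block, and the change points are sampled as distinct steps for simplicity.

```python
import random


def pct_schedule(step_counts, depth, rng):
    """Return one PCT-style schedule as a list of thread indices.

    step_counts[i] is how many steps (hypothetical) thread i executes.
    depth is the targeted bug depth d; d - 1 priority change points are used.
    """
    n = len(step_counts)
    total = sum(step_counts)

    # Initial priorities d .. d+n-1 in random order. They always outrank the
    # lowered priorities 1 .. d-1 that are handed out at change points.
    priorities = list(range(depth, depth + n))
    rng.shuffle(priorities)

    # Pick d-1 change points among steps 1..total (distinct, a simplification).
    k = min(depth - 1, total)
    change_points = rng.sample(range(1, total + 1), k)
    lowered = {step: i + 1 for i, step in enumerate(change_points)}

    remaining = list(step_counts)
    schedule = []
    for step in range(1, total + 1):
        # Always run the highest-priority thread that still has work to do.
        runnable = [t for t in range(n) if remaining[t] > 0]
        current = max(runnable, key=lambda t: priorities[t])
        schedule.append(current)
        remaining[current] -= 1
        # At a change point, demote the thread that just executed this step.
        if step in lowered:
            priorities[current] = lowered[step]
    return schedule


if __name__ == "__main__":
    # Three toy threads with 3, 2 and 4 steps, targeting bugs of depth 2.
    print(pct_schedule([3, 2, 4], depth=2, rng=random.Random()))
```

With `depth=1` there are no change points, so each thread simply runs to completion in priority order. Larger depths inject a few forced preemptions at random steps, which is what makes rare interleavings more likely to show up across repeated runs.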
Full thread dump: https://github.com/trinodb/trino/runs/5033675081?check_suite_focus=true
Hmm, it looks like something could be wrong specifically when these two tests are running in parallel
The other build got stuck in exactly the same way.
I think I got to the bottom of it
Nice! So this is really a pre-existing bug, then. Can it happen in non-fault-tolerant execution?
@sopel39 It can, but it is significantly less likely, because in normal execution there are usually multiple task updates being sent.
https://github.com/trinodb/trino/runs/4928406693?check_suite_focus=true