[Fix](merge-on-write) Fix FE may use the staled response to wrongly commit txn #39018
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
clang-tidy review says "All clean, LGTM! 👍"
c60d7c4 to a6110ad
run buildall
clang-tidy review says "All clean, LGTM! 👍"
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
add regression cases
clang-tidy review says "All clean, LGTM! 👍"
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 39793 ms
…tinel mark in _ms_base_compaction_cnt
...ion-test/suites/fault_injection_p0/cloud/test_cloud_mow_stale_resp_load_load_conflict.groovy
run buildall
LGTM
PR approved by at least one committer and no changes requested.
TPC-H: Total hot run time: 39598 ms
TPC-DS: Total hot run time: 205221 ms
ClickBench: Total hot run time: 30.79 s
run buildall
LGTM
PR approved by at least one committer and no changes requested.
run performance
TPC-H: Total hot run time: 39424 ms
TPC-DS: Total hot run time: 200924 ms
ClickBench: Total hot run time: 31.01 s
…ommit txn (#39018)

## Problem

Consider the following scenarios for a merge-on-write table in cloud mode.

### Scenario 1: Load-Load Conflict

1. load txn1 tries to commit version n and gets the delete bitmap update lock.
2. load txn1 begins to calculate the delete bitmap on BEs. This calculation is heavy and takes a long time.
3. load txn1's delete bitmap update lock expires.
4. load txn2 tries to commit version n and gets the delete bitmap update lock because load txn1's lock has expired.
5. load txn2 commits successfully with version n and releases the delete bitmap update lock.
6. load txn1 fails to commit because the delete bitmap calculation times out.
7. load txn1 retries the commit process with version n+1, gets the delete bitmap update lock, and sends a delete bitmap calculation task to BEs.
8. BE fails to register this new calculation task because a task with the same signature (txn_id) is already running in the task_worker_pool.
9. BE finishes the original delete bitmap calculation and reports success to FE.
10. load txn1 commits successfully with version n+1.

In the end, load txn1 commits without having calculated the delete bitmap for version n from load txn2.

### Scenario 2: Load-Compaction Conflict

1. load txn tries to commit and gets the delete bitmap update lock.
2. load txn collects rowset_ids and submits a delete bitmap calculation task for the diff rowsets to the thread pool. The thread pool is full, so the task is queued.
3. load txn's delete bitmap update lock expires, and a compaction job on the same tablet finishes successfully.
4. load txn fails to commit because the delete bitmap calculation times out.
5. load txn retries the commit process, gets the delete bitmap update lock, and sends a delete bitmap calculation task to BEs.
6. BE fails to register this new calculation task because a task with the same signature (txn_id) is already running in the task_worker_pool.
7. BE finishes the original delete bitmap calculation and reports success to FE.
8. load txn commits successfully.

In the end, load txn commits without having calculated the delete bitmap for the rowset produced by the compaction.

## Solution

The root cause of the above failures is that when the commit process is retried many times, FE may act on a previous, stale success response from BEs and commit the txn. One solution is for FE to attach a unique ID to each delete bitmap calculation task sent to BE, and for BE to echo it in the response so that FE can check whether the response belongs to the latest task. However, if the delete bitmap calculation always takes longer than its timeout, FE will retry the commit process indefinitely, which causes a livelock.

This PR lets BE's response carry the compaction stats (to avoid load-compaction conflicts) and versions (to avoid load-load conflicts) from the task request, and lets FE compare them with those of the current task to determine whether any compaction or load has finished, because of lock expiration, in the period since the current load acquired the delete bitmap lock. If so, the current txn should retry or abort. If not, the current txn can commit successfully.
Problem
Consider the following scenarios for a merge-on-write table in cloud mode.
Scenario 1: Load-Load Conflict
In the end, load txn1 commits without having calculated the delete bitmap for version n from load txn2.
Scenario 2: Load-Compaction Conflict
In the end, load txn commits without having calculated the delete bitmap for the rowset produced by the compaction.
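Both scenarios hinge on the same step: BE rejects the retried task because a task with the same signature (txn_id) is still running, and the success that eventually reaches FE belongs to the older task. The following is a minimal Python sketch of that behavior, not the BE implementation. `CalcTask`, `WorkerPool`, `register`, and `finish` are hypothetical names chosen for illustration, and the version numbers are made up.

```python
from dataclasses import dataclass, field


@dataclass
class CalcTask:
    """One delete bitmap calculation request (hypothetical shape)."""
    txn_id: int   # used as the task signature
    version: int  # version the load is trying to commit


@dataclass
class WorkerPool:
    """Toy stand-in for task_worker_pool: one running task per signature."""
    running: dict = field(default_factory=dict)

    def register(self, task: CalcTask) -> bool:
        # A task whose signature is already registered is rejected,
        # so the earlier task keeps running with its original inputs.
        if task.txn_id in self.running:
            return False
        self.running[task.txn_id] = task
        return True

    def finish(self, txn_id: int) -> CalcTask:
        # The success report describes whichever task actually ran.
        return self.running.pop(txn_id)


pool = WorkerPool()
first = CalcTask(txn_id=1, version=5)  # txn1 targets version n = 5
retry = CalcTask(txn_id=1, version=6)  # the retry targets version n + 1 = 6

accepted_first = pool.register(first)
accepted_retry = pool.register(retry)  # same signature, so the retry is dropped

done = pool.finish(1)
print(accepted_first, accepted_retry, done.version)  # True False 5
```

In this sketch the only task that completes is the one issued for version 5, yet nothing in a bare success status tells FE that it does not correspond to the retry for version 6.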
Solution
The root cause of the above failures is that when the commit process is retried many times, FE may act on a previous, stale success response from BEs and commit the txn. One solution is for FE to attach a unique ID to each delete bitmap calculation task sent to BE, and for BE to echo it in the response so that FE can check whether the response belongs to the latest task. However, `calculate_delete_bitmap_task_timeout_seconds` currently cannot adapt to the actual computation time, so if the delete bitmap calculation always takes longer than the timeout, FE will retry the commit process indefinitely, which causes a livelock.
This PR lets BE's response carry the compaction stats (to avoid load-compaction conflicts) and versions (to avoid load-load conflicts) from the task request, and lets FE compare them with those of the current task to determine whether any compaction or load has finished, because of lock expiration, in the period since the current load acquired the delete bitmap lock. If so, the current txn should retry or abort. If not, the current txn can commit successfully.
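The comparison described above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the FE code from this PR. `TabletSnapshot`, `check_response`, the per-tablet layout, and the field names are hypothetical, and the real request may carry different or additional compaction stats.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TabletSnapshot:
    """State a calculation task was issued against (hypothetical fields)."""
    version: int              # version the load is trying to commit
    base_compaction_cnt: int  # number of finished base compactions
    cumu_compaction_cnt: int  # number of finished cumulative compactions


def check_response(echoed: Dict[int, TabletSnapshot],
                   current: Dict[int, TabletSnapshot]) -> str:
    """Compare what BE echoed back with what the current task sent."""
    for tablet_id, want in current.items():
        got = echoed.get(tablet_id)
        if got is None or got.version != want.version:
            return "retry: load conflict"
        if (got.base_compaction_cnt, got.cumu_compaction_cnt) != (
                want.base_compaction_cnt, want.cumu_compaction_cnt):
            return "retry: compaction conflict"
    return "commit"


# Keys are tablet ids; all values are made up for the example.
current = {10: TabletSnapshot(version=6, base_compaction_cnt=1, cumu_compaction_cnt=3)}
issued_before_load = {10: TabletSnapshot(5, 1, 3)}        # another load committed since
issued_before_compaction = {10: TabletSnapshot(6, 1, 2)}  # a compaction finished since
issued_unchanged = {10: TabletSnapshot(6, 1, 3)}          # slow task, nothing changed

results = [
    check_response(issued_before_load, current),
    check_response(issued_before_compaction, current),
    check_response(issued_unchanged, current),
]
print(results)
# ['retry: load conflict', 'retry: compaction conflict', 'commit']
```

The third case is the one that distinguishes this approach from a unique task ID in the sketch: a response from an earlier, slow task is still accepted when the versions and compaction stats it was computed against match the current task's, so a calculation that always outlasts the timeout does not force endless retries.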