[Fix](merge-on-write) Fix FE may use the staled response to wrongly commit txn #39018

Merged

@bobhan1 bobhan1 commented Aug 7, 2024

Problem

Consider the following scenarios for a merge-on-write table in cloud mode.

Scenario 1: Load-Load Conflict

  1. load txn1 tries to commit version n and gets the delete bitmap update lock
  2. load txn1 begins to calculate the delete bitmap on BEs; this is a heavy calculation and takes a long time
  3. load txn2 tries to commit version n and requests the delete bitmap update lock
  4. load txn1's delete bitmap update lock expires, so load txn2 gets the delete bitmap update lock
  5. load txn2 commits successfully with version n and releases the delete bitmap update lock
  6. load txn1 fails to commit because the delete bitmap calculation times out
  7. load txn1 retries the commit process with version n+1, gets the delete bitmap update lock, and sends a delete bitmap calculation task to BEs
  8. BE fails to register this new calculation task because a task with the same signature (txn_id) is still running in the task_worker_pool
  9. BE finishes the earlier delete bitmap calculation and reports success to FE
  10. load txn1 commits successfully with version n+1

As a result, load txn1 commits without having calculated the delete bitmap against version n, which load txn2 produced.
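
The following is a minimal sketch of steps 7 to 10, not the actual Doris code. `TaskWorkerPool`, `CalcTask`, and the shape of the response are hypothetical stand-ins; the only behavior taken from the description is that tasks are deduplicated by signature (txn_id) and that FE cannot tell which attempt a success report belongs to.

```python
from dataclasses import dataclass, field


@dataclass
class CalcTask:
    txn_id: int   # used as the task signature
    version: int  # version the load is trying to commit


@dataclass
class TaskWorkerPool:
    """Toy stand-in for the BE task pool: one running task per signature."""
    running: dict = field(default_factory=dict)

    def register(self, task: CalcTask) -> bool:
        # A task with the same signature (txn_id) is already running,
        # so the new one is rejected and the old one keeps going.
        if task.txn_id in self.running:
            return False
        self.running[task.txn_id] = task
        return True

    def finish(self, txn_id: int) -> dict:
        # The success report identifies the txn only, not the attempt.
        self.running.pop(txn_id)
        return {"txn_id": txn_id, "status": "OK"}


pool = TaskWorkerPool()
first_attempt = CalcTask(txn_id=1, version=5)  # txn1 targets version n = 5
retry_attempt = CalcTask(txn_id=1, version=6)  # the retry targets n + 1 = 6

accepted_first = pool.register(first_attempt)
accepted_retry = pool.register(retry_attempt)  # rejected: same signature
calculated_version = pool.running[1].version   # still the first attempt
response = pool.finish(1)

# FE is waiting for the retry, but the report matches on txn_id alone.
fe_accepts = (response["txn_id"] == retry_attempt.txn_id
              and response["status"] == "OK")
```

In this toy model `fe_accepts` ends up true even though the only calculation that ran was the one for the first attempt.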

Scenario 2: Load-Compaction Conflict

  1. load txn tries to commit and gets the delete bitmap update lock
  2. load txn collects rowset_ids and submits a delete bitmap calculation task to the thread pool for the diff rowsets, but the thread pool is full, so the task is queued
  3. load txn's delete bitmap update lock expires, and a compaction job on the same tablet finishes successfully
  4. load txn fails to commit because the delete bitmap calculation times out
  5. load txn retries the commit process, gets the delete bitmap update lock, and sends a delete bitmap calculation task to BEs
  6. BE fails to register this new calculation task because a task with the same signature (txn_id) is still running in the task_worker_pool
  7. BE finishes the earlier delete bitmap calculation and reports success to FE
  8. load txn commits successfully

As a result, load txn commits without having calculated the delete bitmap for the rowset produced by the compaction.
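
To make the gap concrete, here is a small sketch using plain sets. The rowset names and the `compact` helper are hypothetical and only illustrate why a task that captured its rowset_ids before the compaction cannot cover the compaction's output.

```python
# Rowsets visible on the tablet when the load first took the lock
# (hypothetical names).
rowsets_at_first_attempt = {"rs_1", "rs_2", "rs_3"}

# The queued task captured this snapshot before the lock expired.
queued_task_rowsets = set(rowsets_at_first_attempt)


def compact(rowsets, inputs, output):
    """Replace the input rowsets with a single output rowset."""
    return (rowsets - inputs) | {output}


# Compaction finishes while the task is still waiting in the thread pool.
rowsets_after_compaction = compact(
    rowsets_at_first_attempt, {"rs_1", "rs_2"}, "rs_4"
)

# Rowsets the retried commit needs covered but the stale task never saw.
missed = rowsets_after_compaction - queued_task_rowsets
```

Here `missed` contains only the compaction output `rs_4`, which is exactly the rowset the stale success response says nothing about.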

Solution

The root cause of both failures is that, when the commit process is retried, FE may take a stale success response that BEs sent for an earlier attempt and use it to commit the txn. One possible solution is for FE to attach a unique id to each delete bitmap calculation task and for BE to echo it in the response, so that FE can check whether the response belongs to the latest task. However, calculate_delete_bitmap_task_timeout_seconds currently cannot adapt to the actual computation time, so if the delete bitmap calculation always takes longer than that timeout, FE would retry the commit process indefinitely, which causes a livelock.
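
The livelock concern can be shown with a deliberately simplified model. The function name and the attempt cap are assumptions made for illustration; the model only encodes the rule that, with a per-attempt unique id, a response arriving after the timeout belongs to an old id and is discarded.

```python
def commit_with_unique_id(calc_seconds, timeout_seconds, max_attempts):
    """Return the attempt number that commits, or None if none does."""
    for attempt in range(1, max_attempts + 1):
        # Each attempt sends a fresh id; only a response with that id counts.
        finished_in_time = calc_seconds <= timeout_seconds
        if finished_in_time:
            return attempt
        # Timed out: the late response carries an old id and is discarded,
        # so the next attempt starts from scratch.
    return None


fast = commit_with_unique_id(calc_seconds=30, timeout_seconds=60, max_attempts=5)
slow = commit_with_unique_id(calc_seconds=90, timeout_seconds=60, max_attempts=5)
```

With a calculation that is always slower than the timeout, raising `max_attempts` does not help: every attempt fails for the same reason.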

This PR lets the BE's response carry the compaction stats (to detect load-compaction conflicts) and versions (to detect load-load conflicts) from the task request. FE compares them with those of the current task to determine whether, because of lock expiration, any compaction or load finished between the time the task was issued and the time the current load got the delete bitmap lock. If so, the current txn should retry or abort. If not, the current txn can commit successfully.
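
A rough sketch of the FE-side comparison, under stated assumptions: `TabletSnapshot`, its field names, and `response_is_usable` are hypothetical and stand in for the "versions" and "compaction stats" mentioned above. The real fields in the PR may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TabletSnapshot:
    # Hypothetical fields standing in for the versions and compaction
    # stats carried in the task request and echoed in the response.
    version: int
    compaction_cnt: int


def response_is_usable(echoed: TabletSnapshot, current: TabletSnapshot) -> bool:
    """Accept a BE success only if nothing changed since the task was built."""
    no_load_committed = echoed.version == current.version
    no_compaction_finished = echoed.compaction_cnt == current.compaction_cnt
    return no_load_committed and no_compaction_finished


sent_with_first_attempt = TabletSnapshot(version=5, compaction_cnt=3)

# Scenario 1: another load committed in between, so the version moved on.
after_other_load = TabletSnapshot(version=6, compaction_cnt=3)
# Scenario 2: a compaction finished in between.
after_compaction = TabletSnapshot(version=5, compaction_cnt=4)
# Nothing changed: a late response is still valid and can be used.
unchanged = TabletSnapshot(version=5, compaction_cnt=3)

load_conflict_ok = response_is_usable(sent_with_first_attempt, after_other_load)
compaction_conflict_ok = response_is_usable(sent_with_first_attempt, after_compaction)
unchanged_ok = response_is_usable(sent_with_first_attempt, unchanged)
```

Unlike a per-attempt unique id, this check accepts a response from an earlier attempt when no load or compaction intervened, so a slow calculation can still lead to a successful commit.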

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 7, 2024

run buildall

Copy link
Contributor

github-actions bot commented Aug 7, 2024

clang-tidy review says "All clean, LGTM! 👍"

@bobhan1 bobhan1 force-pushed the fix-commit-txn-with-wrong-calc-result branch from c60d7c4 to a6110ad Compare August 7, 2024 06:29
@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 7, 2024

run buildall

Copy link
Contributor

github-actions bot commented Aug 7, 2024

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

@zhannngchen zhannngchen left a comment

LGTM

Copy link
Contributor

github-actions bot commented Aug 7, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Aug 7, 2024
Copy link
Contributor

github-actions bot commented Aug 7, 2024

PR approved by anyone and no changes requested.

Copy link
Contributor

@zhannngchen zhannngchen left a comment

add regression cases

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Aug 7, 2024
Copy link
Contributor

github-actions bot commented Aug 7, 2024

clang-tidy review says "All clean, LGTM! 👍"

@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 7, 2024

run buildall

Copy link
Contributor

github-actions bot commented Aug 7, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot
Copy link

TPC-H: Total hot run time: 39793 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 675fd6188d93c973136a2f93d311fffff3dde4da, data reload: false

------ Round 1 ----------------------------------
q1	17796	4436	4320	4320
q2	2034	180	174	174
q3	10552	1238	1109	1109
q4	10156	725	673	673
q5	7527	2533	2553	2533
q6	225	138	138	138
q7	998	596	586	586
q8	9231	1972	1969	1969
q9	8921	6647	6618	6618
q10	7063	2176	2253	2176
q11	468	238	243	238
q12	390	221	218	218
q13	18124	2992	2993	2992
q14	285	246	235	235
q15	548	486	483	483
q16	510	387	384	384
q17	1002	701	721	701
q18	8231	7546	7429	7429
q19	5543	1011	1025	1011
q20	712	340	342	340
q21	5628	4419	4554	4419
q22	1110	1051	1047	1047
Total cold run time: 117054 ms
Total hot run time: 39793 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4451	4280	4259	4259
q2	390	266	269	266
q3	2877	2686	2658	2658
q4	1891	1631	1563	1563
q5	5287	5328	5282	5282
q6	219	130	131	130
q7	2043	1675	1671	1671
q8	3177	3322	3348	3322
q9	8449	8371	8430	8371
q10	3385	3187	3159	3159
q11	602	493	504	493
q12	799	613	606	606
q13	17513	2988	2965	2965
q14	306	281	271	271
q15	530	491	471	471
q16	473	405	433	405
q17	1790	1473	1474	1473
q18	7784	7522	7463	7463
q19	1737	1537	1583	1537
q20	1987	1799	1786	1786
q21	5283	5137	5151	5137
q22	1109	1008	1020	1008
Total cold run time: 72082 ms
Total hot run time: 54296 ms

@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 8, 2024

run buildall

zhannngchen
zhannngchen previously approved these changes Aug 8, 2024
Copy link
Contributor

@zhannngchen zhannngchen left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 8, 2024
Copy link
Contributor

github-actions bot commented Aug 8, 2024

PR approved by at least one committer and no changes requested.

@doris-robot
Copy link

TPC-H: Total hot run time: 39598 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 29b17d96155ae3f8de4770a26737b5c56f10720f, data reload: false

------ Round 1 ----------------------------------
q1	18320	4601	4384	4384
q2	2027	176	172	172
q3	10561	1121	1005	1005
q4	10162	745	728	728
q5	7495	2511	2491	2491
q6	225	138	136	136
q7	983	596	586	586
q8	9213	1911	1940	1911
q9	8878	6571	6581	6571
q10	7063	2269	2176	2176
q11	434	245	239	239
q12	382	220	216	216
q13	17880	2961	2979	2961
q14	271	233	243	233
q15	526	485	487	485
q16	504	391	383	383
q17	981	682	669	669
q18	8139	7490	7396	7396
q19	5186	1024	1063	1024
q20	658	338	325	325
q21	5434	4521	4514	4514
q22	1122	993	1011	993
Total cold run time: 116444 ms
Total hot run time: 39598 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4465	4298	4293	4293
q2	402	262	269	262
q3	2895	2622	2777	2622
q4	1990	1709	1713	1709
q5	5648	5469	5490	5469
q6	227	130	131	130
q7	2157	1831	1789	1789
q8	3275	3508	3495	3495
q9	8831	8790	8792	8790
q10	3548	3395	3181	3181
q11	589	493	480	480
q12	814	643	639	639
q13	16025	3181	3112	3112
q14	304	299	284	284
q15	530	488	493	488
q16	492	432	434	432
q17	1818	1515	1511	1511
q18	8130	7877	7855	7855
q19	1835	1648	1672	1648
q20	2118	1920	1897	1897
q21	5548	5499	5212	5212
q22	1156	1044	1028	1028
Total cold run time: 72797 ms
Total hot run time: 56326 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 205221 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 29b17d96155ae3f8de4770a26737b5c56f10720f, data reload: false

query1	952	412	412	412
query2	6465	1999	2007	1999
query3	6641	222	227	222
query4	31255	23238	23053	23053
query5	3628	506	489	489
query6	263	183	186	183
query7	4572	292	294	292
query8	245	208	213	208
query9	8606	2438	2405	2405
query10	922	872	875	872
query11	17704	15019	15068	15019
query12	132	98	97	97
query13	1632	391	376	376
query14	10341	7966	8030	7966
query15	349	329	337	329
query16	7727	496	479	479
query17	1731	587	547	547
query18	2076	423	421	421
query19	270	245	230	230
query20	124	115	125	115
query21	204	108	108	108
query22	4493	4524	4360	4360
query23	34301	33722	33693	33693
query24	10887	3076	2916	2916
query25	599	399	389	389
query26	704	161	156	156
query27	2114	286	281	281
query28	5842	2053	2035	2035
query29	853	411	408	408
query30	254	148	149	148
query31	955	763	761	761
query32	95	54	55	54
query33	623	290	274	274
query34	879	464	471	464
query35	951	891	837	837
query36	1115	931	911	911
query37	136	79	79	79
query38	4291	4157	4135	4135
query39	1411	1388	1382	1382
query40	206	117	113	113
query41	47	44	44	44
query42	115	94	96	94
query43	517	489	483	483
query44	1080	729	728	728
query45	435	368	403	368
query46	1115	804	821	804
query47	1842	1748	1752	1748
query48	362	290	297	290
query49	833	400	424	400
query50	802	409	400	400
query51	6811	6755	6673	6673
query52	106	91	96	91
query53	253	181	176	176
query54	919	455	447	447
query55	77	76	76	76
query56	259	241	234	234
query57	1117	1057	1098	1057
query58	231	228	226	226
query59	3191	2990	2888	2888
query60	306	265	262	262
query61	98	91	100	91
query62	804	642	644	642
query63	211	175	177	175
query64	9334	2468	1934	1934
query65	3203	3163	3141	3141
query66	741	336	332	332
query67	15637	14687	14700	14687
query68	6215	554	560	554
query69	453	415	402	402
query70	1203	1168	1119	1119
query71	532	283	271	271
query72	20127	17682	17597	17597
query73	817	325	326	325
query74	9214	8799	8863	8799
query75	4582	2727	2700	2700
query76	4383	967	988	967
query77	733	320	306	306
query78	9719	9162	8876	8876
query79	7796	536	530	530
query80	1106	509	497	497
query81	593	229	233	229
query82	808	142	138	138
query83	290	160	148	148
query84	322	80	80	80
query85	1315	298	300	298
query86	431	296	295	295
query87	4838	4552	4534	4534
query88	4991	2394	2403	2394
query89	436	285	282	282
query90	2027	195	197	195
query91	148	123	115	115
query92	68	50	51	50
query93	6320	549	541	541
query94	971	292	283	283
query95	352	266	266	266
query96	617	268	271	268
query97	3229	3067	3047	3047
query98	271	205	191	191
query99	1533	1246	1215	1215
Total cold run time: 323434 ms
Total hot run time: 205221 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.79 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 29b17d96155ae3f8de4770a26737b5c56f10720f, data reload: false

query1	0.05	0.04	0.04
query2	0.09	0.04	0.04
query3	0.23	0.06	0.06
query4	1.66	0.09	0.08
query5	0.52	0.49	0.50
query6	1.13	0.74	0.73
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.55	0.50	0.48
query10	0.56	0.55	0.56
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.61	0.58	0.58
query14	0.77	0.79	0.78
query15	0.85	0.84	0.82
query16	0.38	0.37	0.38
query17	0.98	1.02	1.03
query18	0.23	0.21	0.21
query19	1.82	1.71	1.69
query20	0.01	0.02	0.01
query21	15.40	0.74	0.64
query22	4.12	7.55	1.89
query23	18.30	1.34	1.40
query24	2.14	0.22	0.21
query25	0.16	0.09	0.08
query26	0.29	0.22	0.21
query27	0.46	0.22	0.22
query28	13.33	1.03	1.01
query29	12.64	3.36	3.37
query30	0.25	0.06	0.06
query31	2.87	0.39	0.39
query32	3.30	0.48	0.46
query33	2.90	2.96	2.94
query34	17.08	4.32	4.31
query35	4.37	4.37	4.36
query36	0.66	0.51	0.49
query37	0.19	0.16	0.15
query38	0.16	0.15	0.15
query39	0.05	0.03	0.03
query40	0.15	0.12	0.12
query41	0.10	0.04	0.05
query42	0.06	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 109.85 s
Total hot run time: 30.79 s

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Aug 9, 2024
@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 9, 2024

run buildall

Copy link
Contributor

@zhannngchen zhannngchen left a comment

LGTM

Copy link
Contributor

github-actions bot commented Aug 9, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 9, 2024
@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 9, 2024

run performance

@doris-robot
Copy link

TPC-H: Total hot run time: 39424 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5a849fa620d4da31a50972df1c8dd9c7a9f3cec5, data reload: false

------ Round 1 ----------------------------------
q1	17835	6401	4291	4291
q2	2008	170	172	170
q3	10562	1131	1086	1086
q4	10231	710	768	710
q5	7504	2522	2506	2506
q6	226	137	137	137
q7	966	588	593	588
q8	9223	1898	1903	1898
q9	8685	6581	6553	6553
q10	7028	2239	2214	2214
q11	449	235	234	234
q12	396	218	216	216
q13	19053	2960	2983	2960
q14	279	227	241	227
q15	526	476	489	476
q16	509	388	397	388
q17	970	677	726	677
q18	7943	7481	7383	7383
q19	3888	1018	1047	1018
q20	665	334	336	334
q21	5272	4529	4355	4355
q22	1095	1019	1003	1003
Total cold run time: 115313 ms
Total hot run time: 39424 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4467	4258	4259	4258
q2	379	268	267	267
q3	2856	2640	2746	2640
q4	1990	1802	1698	1698
q5	5630	5488	5469	5469
q6	220	131	133	131
q7	2103	1783	1765	1765
q8	3291	3421	3448	3421
q9	8793	8805	8820	8805
q10	3532	3356	3154	3154
q11	578	494	503	494
q12	803	655	644	644
q13	17150	3139	3181	3139
q14	321	284	289	284
q15	531	492	479	479
q16	504	422	433	422
q17	1842	1500	1515	1500
q18	8029	8045	7780	7780
q19	1691	1561	1712	1561
q20	2128	1899	1861	1861
q21	7767	5403	5200	5200
q22	1172	1092	1013	1013
Total cold run time: 75777 ms
Total hot run time: 55985 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 200924 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5a849fa620d4da31a50972df1c8dd9c7a9f3cec5, data reload: false

query1	903	377	348	348
query2	6447	1990	1978	1978
query3	6640	205	218	205
query4	34113	23409	23244	23244
query5	3619	488	510	488
query6	291	168	164	164
query7	4580	282	285	282
query8	257	195	199	195
query9	8543	2400	2390	2390
query10	541	490	448	448
query11	15841	14947	14947	14947
query12	127	96	95	95
query13	1628	367	354	354
query14	10157	7526	7477	7477
query15	267	221	220	220
query16	7444	508	457	457
query17	1694	552	535	535
query18	1723	276	306	276
query19	193	141	139	139
query20	117	120	109	109
query21	202	111	103	103
query22	4620	4600	4405	4405
query23	34310	33497	33377	33377
query24	10906	2640	2684	2640
query25	606	364	365	364
query26	1157	151	156	151
query27	2285	280	291	280
query28	6386	2001	1989	1989
query29	798	400	402	400
query30	251	144	144	144
query31	980	766	731	731
query32	95	52	53	52
query33	737	283	291	283
query34	881	456	461	456
query35	990	833	803	803
query36	1081	926	911	911
query37	139	80	75	75
query38	4306	4110	4129	4110
query39	1425	1426	1370	1370
query40	192	115	111	111
query41	45	45	42	42
query42	115	91	96	91
query43	522	482	470	470
query44	1277	726	726	726
query45	238	205	205	205
query46	1075	739	705	705
query47	1895	1795	1802	1795
query48	366	299	290	290
query49	863	412	415	412
query50	792	401	401	401
query51	6707	6778	6629	6629
query52	99	85	91	85
query53	252	180	173	173
query54	907	436	430	430
query55	75	73	73	73
query56	253	248	239	239
query57	1161	1039	1050	1039
query58	221	226	228	226
query59	3037	2903	2687	2687
query60	276	268	265	265
query61	113	114	116	114
query62	810	649	630	630
query63	212	179	178	178
query64	9455	2330	1831	1831
query65	3206	3112	3102	3102
query66	753	336	341	336
query67	15342	15019	14910	14910
query68	5029	538	557	538
query69	457	377	414	377
query70	1173	1128	1118	1118
query71	557	298	278	278
query72	19616	16790	16138	16138
query73	793	318	321	318
query74	9133	8732	8596	8596
query75	4440	2682	2684	2682
query76	4049	1060	945	945
query77	710	300	291	291
query78	9708	8977	9428	8977
query79	8127	503	536	503
query80	1320	480	490	480
query81	595	222	219	219
query82	1327	139	133	133
query83	384	183	142	142
query84	277	72	74	72
query85	1531	267	265	265
query86	442	293	281	281
query87	4710	4539	4503	4503
query88	4709	2457	2476	2457
query89	418	284	284	284
query90	2028	194	195	194
query91	118	91	93	91
query92	60	49	47	47
query93	5499	532	525	525
query94	1013	278	291	278
query95	353	253	254	253
query96	612	271	272	271
query97	3181	3038	3006	3006
query98	214	199	200	199
query99	1555	1273	1304	1273
Total cold run time: 321677 ms
Total hot run time: 200924 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5a849fa620d4da31a50972df1c8dd9c7a9f3cec5, data reload: false

query1	0.05	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.04
query4	1.68	0.07	0.07
query5	0.48	0.50	0.48
query6	1.13	0.73	0.72
query7	0.02	0.01	0.01
query8	0.05	0.04	0.05
query9	0.56	0.49	0.50
query10	0.55	0.54	0.53
query11	0.15	0.12	0.11
query12	0.15	0.12	0.13
query13	0.59	0.59	0.59
query14	0.75	0.80	0.80
query15	0.84	0.82	0.81
query16	0.35	0.36	0.38
query17	0.99	0.96	1.01
query18	0.23	0.22	0.22
query19	1.92	1.85	1.81
query20	0.01	0.02	0.01
query21	15.41	0.77	0.65
query22	3.95	7.06	2.37
query23	18.29	1.38	1.16
query24	2.20	0.22	0.22
query25	0.17	0.07	0.08
query26	0.30	0.21	0.20
query27	0.45	0.22	0.23
query28	13.20	1.02	0.99
query29	12.63	3.34	3.30
query30	0.24	0.05	0.05
query31	2.88	0.40	0.38
query32	3.28	0.46	0.48
query33	2.92	2.90	2.95
query34	17.04	4.35	4.40
query35	4.42	4.41	4.58
query36	0.66	0.46	0.46
query37	0.18	0.16	0.15
query38	0.14	0.15	0.14
query39	0.05	0.04	0.04
query40	0.16	0.12	0.12
query41	0.09	0.04	0.05
query42	0.05	0.04	0.04
query43	0.05	0.05	0.04
Total cold run time: 109.57 s
Total hot run time: 31.01 s

@dataroaring dataroaring merged commit f9c7c03 into apache:master Aug 10, 2024
28 of 29 checks passed
dataroaring pushed a commit that referenced this pull request Aug 10, 2024
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Aug 14, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.1-merged doing meta-change p0_w reviewed

5 participants