[fix](partial update) partial update should not read old fields from rows with delete sign #36210

Merged
merged 6 commits into from
Jun 14, 2024

Conversation

zhannngchen
Contributor

@zhannngchen zhannngchen commented Jun 12, 2024

Proposed changes

Issue Number: close #34296

  1. When a partial update fills in the missing fields, it reads the old values from the existing row with the same key. If a previous load job wrote that row with a delete sign, the value of the delete sign column is read out along with the other columns, so the newly written data also becomes invisible.
  2. This problem was fixed in [Fix](merge-on-write) Correct the alignment process when the existing rows with same key has marked delete sign #24877, but was reintroduced in [Revert](merge-on-write) Don't use delete bitmap to mark delete for rows with delete sign when sequence column doesn't exist  #26721. It went unnoticed because #26721 also changed the expected output of the test case to the wrong result.
  3. The fix in #24877 did not account for the handling of concurrent conflicts in the publish phase. This PR adds that handling, together with a corresponding case.
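
The behavior described in item 1 can be illustrated with a small model. This is a sketch only, not the Doris implementation: the column name `delete_sign`, the function `fill_missing_fields`, and the choice to fall back to column defaults when the old row is marked deleted are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional

Row = Dict[str, Any]

# Hypothetical name for the hidden delete-sign column (assumption of this sketch).
DELETE_SIGN = "delete_sign"


def fill_missing_fields(partial: Row, old: Optional[Row], defaults: Row) -> Row:
    """Complete a partially specified row.

    Columns missing from `partial` are taken from `old` only when `old`
    exists and is not marked deleted. Otherwise they come from `defaults`,
    so a deleted row never leaks its delete sign into the new row.
    """
    old_is_live = old is not None and old.get(DELETE_SIGN, 0) == 0
    source = old if old_is_live else defaults
    full = {col: source[col] for col in defaults}
    full.update(partial)  # explicitly written columns always win
    return full


defaults = {"k": None, "a": 0, "b": 0, DELETE_SIGN: 0}
deleted_old = {"k": 1, "a": 10, "b": 20, DELETE_SIGN: 1}
live_old = {"k": 1, "a": 10, "b": 20, DELETE_SIGN: 0}

# Old row was deleted: "b" falls back to its default and the row stays visible.
after_delete = fill_missing_fields({"k": 1, "a": 99}, deleted_old, defaults)

# Old row is live: "b" is carried over from the old row.
after_live = fill_missing_fields({"k": 1, "a": 99}, live_old, defaults)
```

In the buggy behavior the description refers to, the equivalent of `source` would be the old row even when it is marked deleted, so the new row would inherit a delete sign of 1.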

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@zhannngchen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39911 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b7e57db3d1498eb42c9691e1e240cd629114f436, data reload: false

------ Round 1 ----------------------------------
q1	18078	5620	4415	4415
q2	2671	197	194	194
q3	11937	1089	1135	1089
q4	10592	771	852	771
q5	7627	2763	2673	2673
q6	238	146	139	139
q7	971	641	599	599
q8	9490	2074	2072	2072
q9	8868	6501	6503	6501
q10	8997	3784	3696	3696
q11	459	236	250	236
q12	459	241	227	227
q13	17987	2976	2963	2963
q14	274	216	226	216
q15	523	487	484	484
q16	533	382	373	373
q17	990	680	728	680
q18	8107	7524	7323	7323
q19	6326	1410	1458	1410
q20	652	323	333	323
q21	4897	3914	3197	3197
q22	386	338	330	330
Total cold run time: 121062 ms
Total hot run time: 39911 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4383	4228	4234	4228
q2	381	272	270	270
q3	2973	2781	2711	2711
q4	1852	1611	1589	1589
q5	5276	5294	5278	5278
q6	225	130	132	130
q7	2077	1766	1786	1766
q8	3210	3337	3308	3308
q9	8362	8325	8323	8323
q10	3949	3696	3648	3648
q11	591	501	483	483
q12	779	578	602	578
q13	17580	2981	2941	2941
q14	296	266	270	266
q15	530	495	479	479
q16	481	424	413	413
q17	1784	1483	1487	1483
q18	7608	7691	7436	7436
q19	1698	1555	1636	1555
q20	1997	1780	1777	1777
q21	4974	4751	4832	4751
q22	618	534	561	534
Total cold run time: 71624 ms
Total hot run time: 53947 ms
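
The bot output does not label its columns. Reading each row as one cold run, two hot runs, and the faster of the two hot runs is an inference from the numbers, not something the bot states. The Round 1 totals above are consistent with that reading: the cold total is the sum of the first column and the hot total is the sum of the last. A minimal sketch of that arithmetic, using three rows copied from Round 1 (the function name `summarize` is hypothetical):

```python
# Three rows copied from the Round 1 TPC-H table (times in ms).
ROWS = """\
q1 18078 5620 4415 4415
q2 2671 197 194 194
q3 11937 1089 1135 1089
"""


def summarize(table: str) -> tuple:
    """Return (cold_total, hot_total) for a whitespace-separated result table.

    Assumed column layout: name, cold run, hot run 1, hot run 2, best hot run.
    """
    cold_total = 0
    hot_total = 0
    for line in table.splitlines():
        _name, cold, hot1, hot2, best = line.split()
        # Check the assumed layout: the last column is the faster hot run.
        if int(best) != min(int(hot1), int(hot2)):
            raise ValueError(f"unexpected row layout: {line!r}")
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


cold_total, hot_total = summarize(ROWS)
```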

@doris-robot

TPC-DS: Total hot run time: 172400 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b7e57db3d1498eb42c9691e1e240cd629114f436, data reload: false

query1	943	385	375	375
query2	6464	2438	2374	2374
query3	6646	207	206	206
query4	19170	17205	17388	17205
query5	4132	461	474	461
query6	247	161	157	157
query7	4581	297	302	297
query8	324	295	281	281
query9	8508	2393	2365	2365
query10	615	304	289	289
query11	10572	10020	10095	10020
query12	143	90	83	83
query13	1646	362	356	356
query14	10158	7278	7639	7278
query15	261	186	191	186
query16	7780	264	261	261
query17	1920	531	525	525
query18	1870	265	264	264
query19	196	154	152	152
query20	93	87	83	83
query21	216	141	123	123
query22	4315	4032	4195	4032
query23	33886	33038	33155	33038
query24	11963	2915	2820	2820
query25	699	352	354	352
query26	1794	148	148	148
query27	2952	314	308	308
query28	7517	2029	2032	2029
query29	1086	628	607	607
query30	286	154	148	148
query31	944	736	758	736
query32	105	55	57	55
query33	779	298	297	297
query34	940	469	462	462
query35	755	654	651	651
query36	1078	930	937	930
query37	187	77	73	73
query38	2897	2733	2728	2728
query39	847	828	794	794
query40	282	127	124	124
query41	55	50	53	50
query42	120	101	99	99
query43	588	537	534	534
query44	1291	729	733	729
query45	198	165	173	165
query46	1101	728	697	697
query47	1845	1770	1779	1770
query48	375	304	285	285
query49	1169	408	405	405
query50	767	373	368	368
query51	6846	6698	6619	6619
query52	97	87	99	87
query53	362	287	290	287
query54	941	435	427	427
query55	76	74	73	73
query56	284	282	254	254
query57	1158	1000	1064	1000
query58	244	249	246	246
query59	3503	3116	3142	3116
query60	287	263	277	263
query61	88	94	88	88
query62	641	447	453	447
query63	320	284	291	284
query64	9869	2265	1719	1719
query65	3200	3163	3109	3109
query66	1351	340	350	340
query67	15411	15047	14946	14946
query68	5426	538	528	528
query69	596	454	376	376
query70	1230	1100	1174	1100
query71	434	262	269	262
query72	7430	5098	5232	5098
query73	762	321	317	317
query74	5889	5554	5474	5474
query75	3808	2652	2695	2652
query76	3472	942	900	900
query77	703	299	300	299
query78	10644	9880	9736	9736
query79	2335	510	512	510
query80	1309	471	468	468
query81	564	219	221	219
query82	700	104	101	101
query83	196	184	168	168
query84	272	90	83	83
query85	1420	296	272	272
query86	425	336	328	328
query87	3278	3075	3060	3060
query88	3936	2355	2329	2329
query89	480	401	382	382
query90	1880	196	188	188
query91	128	103	104	103
query92	66	50	49	49
query93	3766	517	501	501
query94	1182	255	185	185
query95	402	312	304	304
query96	601	269	260	260
query97	3226	3050	3077	3050
query98	219	198	197	197
query99	1238	838	821	821
Total cold run time: 280001 ms
Total hot run time: 172400 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.24% (8945/24684)
Line Coverage: 27.74% (73011/263202)
Region Coverage: 27.22% (37927/139327)
Branch Coverage: 23.90% (19283/80684)
Coverage Report: http://coverage.selectdb-in.cc/coverage/b7e57db3d1498eb42c9691e1e240cd629114f436_b7e57db3d1498eb42c9691e1e240cd629114f436/report/index.html

@doris-robot

ClickBench: Total hot run time: 30.53 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b7e57db3d1498eb42c9691e1e240cd629114f436, data reload: false

query1	0.04	0.03	0.04
query2	0.08	0.04	0.04
query3	0.22	0.05	0.05
query4	1.69	0.06	0.06
query5	0.50	0.50	0.48
query6	1.13	0.73	0.73
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.54	0.48	0.50
query10	0.56	0.56	0.53
query11	0.15	0.11	0.11
query12	0.15	0.11	0.12
query13	0.60	0.58	0.59
query14	0.76	0.77	0.80
query15	0.82	0.81	0.80
query16	0.37	0.36	0.36
query17	1.00	1.03	0.95
query18	0.20	0.28	0.22
query19	1.80	1.67	1.70
query20	0.01	0.01	0.01
query21	15.42	0.67	0.66
query22	4.34	7.35	2.07
query23	18.31	1.46	1.26
query24	2.09	0.23	0.22
query25	0.15	0.08	0.08
query26	0.26	0.18	0.18
query27	0.09	0.09	0.09
query28	13.19	1.01	0.99
query29	12.61	3.30	3.28
query30	0.26	0.06	0.06
query31	2.85	0.38	0.39
query32	3.28	0.47	0.47
query33	2.87	2.89	2.89
query34	17.05	4.39	4.44
query35	4.49	4.48	4.43
query36	0.66	0.46	0.46
query37	0.18	0.15	0.15
query38	0.14	0.15	0.14
query39	0.05	0.04	0.03
query40	0.17	0.17	0.14
query41	0.10	0.05	0.05
query42	0.07	0.05	0.06
query43	0.05	0.04	0.03
Total cold run time: 109.37 s
Total hot run time: 30.53 s

Collaborator

@Yukang-Lian Yukang-Lian left a comment

LGTM

Contributor

PR approved by anyone and no changes requested.

dataroaring
dataroaring previously approved these changes Jun 13, 2024
Contributor

@dataroaring dataroaring left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 13, 2024
@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Jun 13, 2024
@zhannngchen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39768 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a6e39567c919e24a427f00d6fb9e3cf1d3f47045, data reload: false

------ Round 1 ----------------------------------
q1	18634	4578	4333	4333
q2	2030	193	192	192
q3	10485	1118	1152	1118
q4	10193	733	832	733
q5	7479	2723	2643	2643
q6	220	134	132	132
q7	953	616	591	591
q8	9221	2089	2076	2076
q9	8884	6489	6495	6489
q10	8947	3695	3757	3695
q11	472	231	242	231
q12	398	225	221	221
q13	18618	3013	2988	2988
q14	264	228	214	214
q15	527	478	475	475
q16	528	394	375	375
q17	985	741	719	719
q18	8206	7544	7409	7409
q19	8061	1336	1530	1336
q20	695	324	331	324
q21	4980	3143	3174	3143
q22	388	335	331	331
Total cold run time: 121168 ms
Total hot run time: 39768 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4354	4260	4186	4186
q2	356	257	267	257
q3	2971	2759	2745	2745
q4	1876	1601	1637	1601
q5	5300	5293	5298	5293
q6	220	124	128	124
q7	2118	1737	1726	1726
q8	3220	3383	3332	3332
q9	8412	8371	8379	8371
q10	3909	3705	3742	3705
q11	586	492	481	481
q12	764	585	604	585
q13	17517	2991	2993	2991
q14	294	251	252	251
q15	522	465	467	465
q16	463	410	417	410
q17	1779	1479	1464	1464
q18	7720	7623	7242	7242
q19	1695	1598	1549	1549
q20	1983	1791	1752	1752
q21	4882	4771	4650	4650
q22	627	524	584	524
Total cold run time: 71568 ms
Total hot run time: 53704 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.45% (8989/24661)
Line Coverage: 28.02% (73678/262916)
Region Coverage: 27.50% (38274/139180)
Branch Coverage: 24.20% (19509/80632)
Coverage Report: http://coverage.selectdb-in.cc/coverage/a6e39567c919e24a427f00d6fb9e3cf1d3f47045_a6e39567c919e24a427f00d6fb9e3cf1d3f47045/report/index.html

@doris-robot

TPC-DS: Total hot run time: 171564 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a6e39567c919e24a427f00d6fb9e3cf1d3f47045, data reload: false

query1	933	386	382	382
query2	6490	2433	2427	2427
query3	6650	208	208	208
query4	19133	17230	17283	17230
query5	4156	507	492	492
query6	276	154	164	154
query7	4590	296	291	291
query8	333	289	298	289
query9	8607	2372	2359	2359
query10	621	308	273	273
query11	10438	10063	9916	9916
query12	132	90	87	87
query13	1643	378	363	363
query14	9313	6301	7458	6301
query15	233	196	189	189
query16	8026	274	258	258
query17	1900	535	538	535
query18	2061	268	271	268
query19	195	153	149	149
query20	92	87	92	87
query21	219	138	124	124
query22	4386	3991	3951	3951
query23	33587	32960	32945	32945
query24	12129	2842	2801	2801
query25	650	354	361	354
query26	1780	155	150	150
query27	3017	313	320	313
query28	7787	2039	2028	2028
query29	1045	638	597	597
query30	268	146	148	146
query31	927	758	738	738
query32	98	55	57	55
query33	769	290	272	272
query34	991	466	470	466
query35	721	613	618	613
query36	1086	953	941	941
query37	294	69	73	69
query38	2891	2749	2750	2749
query39	846	779	790	779
query40	290	135	136	135
query41	61	56	55	55
query42	122	97	107	97
query43	576	538	550	538
query44	1291	722	727	722
query45	190	166	168	166
query46	1080	720	690	690
query47	1858	1808	1769	1769
query48	380	291	288	288
query49	1208	400	400	400
query50	763	393	385	385
query51	6806	6771	6612	6612
query52	101	96	92	92
query53	359	287	293	287
query54	980	450	444	444
query55	79	72	75	72
query56	282	266	260	260
query57	1152	1069	1043	1043
query58	258	274	260	260
query59	3602	3271	3075	3075
query60	296	282	302	282
query61	114	113	109	109
query62	652	448	459	448
query63	315	288	289	288
query64	9970	2397	1737	1737
query65	3211	3167	3086	3086
query66	1347	342	337	337
query67	15358	15029	14928	14928
query68	4657	533	546	533
query69	488	307	302	302
query70	1172	1143	1149	1143
query71	399	271	272	271
query72	7631	5830	5370	5370
query73	750	324	326	324
query74	5890	5459	5526	5459
query75	3574	2652	2647	2647
query76	2889	949	970	949
query77	501	310	302	302
query78	10413	9925	9793	9793
query79	2328	497	510	497
query80	939	457	445	445
query81	541	219	218	218
query82	1034	103	105	103
query83	279	174	176	174
query84	239	85	85	85
query85	1568	289	285	285
query86	479	295	314	295
query87	3237	3094	3085	3085
query88	3791	2334	2336	2334
query89	474	371	383	371
query90	1824	191	189	189
query91	130	100	98	98
query92	58	47	49	47
query93	2350	529	504	504
query94	1269	187	184	184
query95	477	326	316	316
query96	597	266	263	263
query97	3264	3067	2993	2993
query98	217	205	197	197
query99	1161	834	875	834
Total cold run time: 276634 ms
Total hot run time: 171564 ms

@doris-robot

ClickBench: Total hot run time: 30.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a6e39567c919e24a427f00d6fb9e3cf1d3f47045, data reload: false

query1	0.05	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.67	0.08	0.08
query5	0.52	0.50	0.48
query6	1.13	0.72	0.72
query7	0.02	0.02	0.02
query8	0.05	0.04	0.04
query9	0.56	0.49	0.50
query10	0.55	0.55	0.55
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.59	0.58	0.60
query14	0.76	0.81	0.77
query15	0.82	0.81	0.82
query16	0.37	0.37	0.39
query17	0.95	0.96	1.05
query18	0.23	0.25	0.25
query19	1.76	1.71	1.76
query20	0.01	0.01	0.01
query21	15.50	0.64	0.64
query22	4.11	7.18	2.14
query23	18.30	1.36	1.31
query24	2.14	0.22	0.23
query25	0.15	0.09	0.08
query26	0.26	0.17	0.18
query27	0.08	0.09	0.08
query28	13.22	1.01	1.01
query29	12.59	3.27	3.22
query30	0.26	0.06	0.05
query31	2.87	0.39	0.38
query32	3.26	0.47	0.47
query33	2.86	2.87	2.87
query34	17.25	4.37	4.38
query35	4.47	4.47	4.47
query36	0.65	0.47	0.47
query37	0.18	0.16	0.15
query38	0.16	0.15	0.14
query39	0.05	0.04	0.04
query40	0.18	0.13	0.14
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.05	0.04
Total cold run time: 109.4 s
Total hot run time: 30.74 s

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 14, 2024
Contributor

PR approved by at least one committer and no changes requested.

@dataroaring dataroaring merged commit 70c53d0 into apache:master Jun 14, 2024
24 of 27 checks passed
dataroaring pushed a commit that referenced this pull request Jun 17, 2024
…rows with delete sign (#36210)

zhannngchen added a commit to zhannngchen/incubator-doris that referenced this pull request Jun 24, 2024
…rows with delete sign (apache#36210)

zhannngchen added a commit to zhannngchen/incubator-doris that referenced this pull request Jun 24, 2024
…rows with delete sign (apache#36210)

zhannngchen added a commit that referenced this pull request Jun 24, 2024
dataroaring pushed a commit that referenced this pull request Jun 25, 2024
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
Development

Successfully merging this pull request may close these issues.

[Bug] insert after delete where like ‘%%’ on partial update mode will get no data
4 participants