[improvement](statistics)Do not collect min max for agg table value columns while doing sample analyze #29483

Merged
merged 1 commit into from
Jan 6, 2024

Conversation

Jibing-Li
Contributor

When running a sample analyze on an aggregate (agg) table, skip collecting min/max values for its value columns. These columns cannot use the zonemap, so collecting min/max for them would mean scanning the whole table, which is exactly what sampling is meant to avoid.
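The decision described above can be sketched roughly as follows. This is only an illustration of the rule, not the code in this PR: the `Column` class, `should_collect_min_max`, and `min_max_exprs` are hypothetical names invented for the example, and the actual change in the Doris FE may be structured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column descriptor (not a Doris class)."""
    name: str
    is_key: bool  # in an aggregate table, non-key columns are value columns


def should_collect_min_max(column: Column, is_agg_table: bool, is_sample: bool) -> bool:
    """Return False only for agg-table value columns during a sample analyze."""
    if is_sample and is_agg_table and not column.is_key:
        # Value columns cannot use the zonemap, so min/max would need a full scan.
        return False
    return True


def min_max_exprs(column: Column, is_agg_table: bool, is_sample: bool) -> tuple:
    """Build illustrative SQL fragments for the min and max statistics."""
    if should_collect_min_max(column, is_agg_table, is_sample):
        return (f"MIN(`{column.name}`)", f"MAX(`{column.name}`)")
    # Assumption for this sketch: skipped statistics are reported as NULL.
    return ("NULL", "NULL")


key_col = Column("k1", is_key=True)
value_col = Column("v1", is_key=False)
print(min_max_exprs(key_col, is_agg_table=True, is_sample=True))
print(min_max_exprs(value_col, is_agg_table=True, is_sample=True))
print(min_max_exprs(value_col, is_agg_table=True, is_sample=False))
```

In this sketch, key columns and full (non-sample) analyze runs keep their min/max statistics; only the combination of sample analyze, aggregate table, and value column is skipped.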

@Jibing-Li Jibing-Li marked this pull request as ready for review January 3, 2024 12:29
@Jibing-Li
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit d0753e55ddbcef78c2dd9b46ba11ceca81d5bf19, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17647	5138	5136	5136
q2	2007	153	141	141
q3	10535	1115	1185	1115
q4	10179	781	821	781
q5	7799	2888	2904	2888
q6	207	138	132	132
q7	913	517	531	517
q8	9299	1980	2013	1980
q9	6832	6366	6384	6366
q10	8235	3070	3015	3015
q11	422	204	221	204
q12	386	233	234	233
q13	17999	3649	3606	3606
q14	235	218	210	210
q15	578	536	547	536
q16	444	402	417	402
q17	951	462	512	462
q18	7248	6733	6577	6577
q19	1568	1421	1430	1421
q20	694	364	336	336
q21	2784	2325	2378	2325
q22	358	310	326	310
Total cold run time: 107320 ms
Total hot run time: 38693 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5174	5142	5134	5134
q2	347	235	258	235
q3	3274	3290	3223	3223
q4	2118	1986	1976	1976
q5	5750	5729	5715	5715
q6	211	132	124	124
q7	2305	1863	1879	1863
q8	3333	3440	3452	3440
q9	8770	8710	8688	8688
q10	3768	3815	3816	3815
q11	580	494	479	479
q12	791	638	659	638
q13	8550	3199	3195	3195
q14	297	272	270	270
q15	596	520	527	520
q16	552	498	489	489
q17	1935	1806	1786	1786
q18	8624	8291	8271	8271
q19	1614	1597	1573	1573
q20	2214	1970	1954	1954
q21	5601	5246	5232	5232
q22	553	479	469	469
Total cold run time: 66957 ms
Total hot run time: 59089 ms
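The two totals reported for each round appear to follow a simple rule: the cold total is the sum of the first timing column, and the hot total is the sum of the smaller of the two hot runs for each query. This is inferred from the numbers in the tables, not from the tpch-tools scripts, so treat the sketch below as an assumption. The `totals` helper is a hypothetical name, and the sample uses only the first three Round 2 rows.

```python
# First three rows of the Round 2 table: query, cold, hot run 1, hot run 2 (ms).
ROWS = """\
q1 5174 5142 5134
q2 347 235 258
q3 3274 3290 3223
"""


def totals(text: str) -> tuple:
    """Return (total cold ms, total best-hot ms) for whitespace-separated rows."""
    cold_total = 0
    hot_total = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        cold, hot1, hot2 = (int(value) for value in fields[1:4])
        cold_total += cold
        hot_total += min(hot1, hot2)  # best of the two hot runs
    return cold_total, hot_total


print(totals(ROWS))  # (8795, 8592)
```

Applied to all 22 rows of Round 2, the same rule gives 66957 ms cold and 59089 ms hot, matching the reported totals.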

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit d0753e55ddbcef78c2dd9b46ba11ceca81d5bf19, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5455	5118	5197	5118
q2	392	162	159	159
q3	1437	1131	1140	1131
q4	1092	833	801	801
q5	3132	3129	3054	3054
q6	220	140	127	127
q7	975	535	528	528
q8	2152	2276	2244	2244
q9	6694	6704	6640	6640
q10	3156	3165	3132	3132
q11	333	219	204	204
q12	380	235	234	234
q13	4383	3642	3620	3620
q14	255	210	213	210
q15	613	556	565	556
q16	476	417	389	389
q17	1046	560	532	532
q18	7066	6734	6763	6734
q19	1652	1467	1569	1467
q20	559	365	336	336
q21	2884	2457	2523	2457
q22	395	317	341	317
Total cold run time: 44747 ms
Total hot run time: 39990 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5145	5130	5120	5120
q2	337	241	242	241
q3	3358	3307	3279	3279
q4	2139	2037	2034	2034
q5	5948	5903	5934	5903
q6	228	125	123	123
q7	2386	1884	1872	1872
q8	3574	3625	3670	3625
q9	9008	8916	8943	8916
q10	3875	3909	3902	3902
q11	597	486	487	486
q12	790	622	662	622
q13	3873	3212	3171	3171
q14	305	281	275	275
q15	623	548	541	541
q16	539	511	505	505
q17	1994	1801	1801	1801
q18	8760	8317	8423	8317
q19	1750	1690	1711	1690
q20	2278	1998	2010	1998
q21	5720	5370	5362	5362
q22	558	463	512	463
Total cold run time: 63785 ms
Total hot run time: 60246 ms

@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit d0753e55ddbcef78c2dd9b46ba11ceca81d5bf19, data reload: false

run tpcds-sf100 query with default conf and session variables
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	921	343	337	337
query2	6445	1861	1973	1861
query3	6643	207	196	196
query4	26196	22504	22414	22414
query5	5116	574	533	533
query6	266	186	185	185
query7	4582	271	262	262
query8	233	208	201	201
query9	8256	2770	2610	2610
query10	458	271	239	239
query11	16054	15556	15573	15556
query12	137	81	79	79
query13	1662	342	334	334
query14	12471	7097	7169	7097
query15	241	188	189	188
query16	6473	285	275	275
query17	1811	501	495	495
query18	1925	284	270	270
query19	280	144	141	141
query20	86	76	82	76
query21	185	96	96	96
query22	4918	4727	4599	4599
query23	32021	31226	31238	31226
query24	11549	2804	2774	2774
query25	579	358	357	357
query26	1739	151	150	150
query27	2830	272	285	272
query28	7104	1955	1947	1947
query29	2077	395	404	395
query30	287	141	150	141
query31	956	767	788	767
query32	91	65	64	64
query33	728	287	271	271
query34	849	447	456	447
query35	923	770	768	768
query36	1348	1125	1168	1125
query37	107	75	77	75
query38	3420	3289	3273	3273
query39	1342	1281	1277	1277
query40	305	99	97	97
query41	38	37	35	35
query42	105	88	95	88
query43	536	510	486	486
query44	1073	713	723	713
query45	191	192	183	183
query46	1066	637	641	637
query47	1734	1628	1568	1568
query48	347	258	272	258
query49	1216	337	332	332
query50	719	355	343	343
query51	5410	5282	5348	5282
query52	97	87	88	87
query53	216	155	152	152
query54	1299	559	578	559
query55	103	90	87	87
query56	212	200	203	200
query57	1027	972	972	972
query58	227	210	211	210
query59	2766	2578	2636	2578
query60	267	243	236	236
query61	91	85	88	85
query62	655	452	490	452
query63	172	155	152	152
query64	5842	1740	1779	1740
query65	3340	3271	3260	3260
query66	1322	334	336	334
query67	15597	14884	15258	14884
query68	13103	536	518	518
query69	535	255	256	255
query70	1809	1532	1470	1470
query71	496	226	230	226
query72	5586	3574	3562	3562
query73	3071	314	318	314
query74	7018	6427	6495	6427
query75	5356	2305	2254	2254
query76	6371	1176	1117	1117
query77	660	263	290	263
query78	9058	8726	8709	8709
query79	1814	499	509	499
query80	553	360	370	360
query81	448	206	206	206
query82	206	105	104	104
query83	186	137	137	137
query84	254	54	55	54
query85	964	291	301	291
query86	379	363	395	363
query87	3569	3379	3371	3371
query88	2899	2283	2294	2283
query89	344	270	258	258
query90	1912	210	226	210
query91	128	91	96	91
query92	61	52	55	52
query93	1559	492	495	492
query94	879	191	190	190
query95	477	428	417	417
query96	617	321	315	315
query97	4280	4111	4157	4111
query98	212	204	188	188
query99	1107	854	879	854
Total cold run time: 295193 ms
Total hot run time: 179344 ms

@doris-robot

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 47.53 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17187806424 Bytes
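The "MB/s" figures in the stream load lines are consistent with dividing bytes by seconds and by 1024 × 1024, then rounding down. That reading is an inference from the reported numbers, not something the pipeline states, and `throughput_mib_per_s` is a hypothetical helper written only to check it.

```python
MIB = 1024 * 1024  # assumption: "MB" in the report means 2**20 bytes


def throughput_mib_per_s(total_bytes: int, seconds: int) -> int:
    """Whole MiB per second, rounded down."""
    return total_bytes // (seconds * MIB)


# (format, bytes loaded, seconds) taken from the report above.
loads = [
    ("tsv", 74807831229, 577),
    ("json", 2358488459, 19),
    ("orc", 1101869774, 66),
    ("parquet", 861443392, 32),
]
for name, size, secs in loads:
    print(name, throughput_mib_per_s(size, secs))
```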

@github-actions github-actions bot added the approved label (indicates a PR has been approved by one committer) Jan 5, 2024
Contributor

github-actions bot commented Jan 5, 2024

PR approved by at least one committer and no changes requested.

Contributor

github-actions bot commented Jan 5, 2024

PR approved by anyone and no changes requested.

Member

@zy-kkk zy-kkk left a comment

LGTM

Contributor

@wsjz wsjz left a comment

LGTM

@morrySnow
Contributor

run pipelinex_p0

Collaborator

@wm1581066 wm1581066 left a comment

LGTM

@yiguolei yiguolei merged commit 612e063 into apache:master Jan 6, 2024
32 of 34 checks passed
@Jibing-Li Jibing-Li deleted the agg branch January 6, 2024 13:48
Jibing-Li added a commit to Jibing-Li/incubator-doris that referenced this pull request Jan 6, 2024
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
wsjz pushed a commit to wsjz/incubator-doris that referenced this pull request Feb 4, 2024
Labels
approved (indicates a PR has been approved by one committer), dev/2.0.4-merged, dev/3.0.0-merged, p0_b, reviewed
8 participants