[opt](file_cache) Add config to enable base compaction output to file cache #44497


gavinchou
Contributor

The previous implementation did not allow the output of base compaction to be written into the file cache, which may incur a performance penalty.

This commit adds a config to make that policy configurable: `enable_file_cache_keep_base_compaction_output` in be.conf, which is false by default.
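
The policy described above can be pictured with a minimal sketch. This is not the code from this PR: `CompactionKind` and `should_cache_output` are hypothetical names used only for illustration, and the sketch assumes that the output of non-base compactions always goes to the file cache, which this description does not state explicitly.

```cpp
// Hypothetical sketch; the names below are illustrative and are not taken
// from the Doris source tree.
enum class CompactionKind { kCumulative, kBase };

// Decides whether a compaction's output should be written into the file cache.
// `keep_base_output` stands in for the be.conf option
// enable_file_cache_keep_base_compaction_output (false by default).
// Assumption: output of non-base compactions is always cached.
constexpr bool should_cache_output(CompactionKind kind, bool keep_base_output) {
    return kind == CompactionKind::kBase ? keep_base_output : true;
}

// Default setting: base compaction output bypasses the file cache.
static_assert(!should_cache_output(CompactionKind::kBase, false),
              "base compaction output is not cached by default");
// Option enabled: base compaction output is kept in the file cache.
static_assert(should_cache_output(CompactionKind::kBase, true),
              "base compaction output is cached when the option is on");
```

The point of the sketch is only that the new option gates a single branch; with the default value, behavior matches what it was before this change.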

If your file cache is ample enough to accommodate all the data in your database, enable this option; otherwise, it is recommended to leave it disabled.
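
For a deployment whose file cache can hold the whole data set, enabling the option might look like the fragment below. The `key = value` layout follows the usual be.conf convention; whether the BE must be restarted for the change to take effect is not stated in this PR.

```properties
# be.conf
# Keep base compaction output in the file cache (default: false).
# Only recommended when the file cache can accommodate all data.
enable_file_cache_keep_base_compaction_output = true
```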

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@gavinchou
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 39983 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 25b1d102abb63ea00912a892771e0524fb27751c, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17788	7585	7332	7332
q2	2056	176	169	169
q3	10843	1089	1181	1089
q4	10369	729	733	729
q5	7616	2749	2691	2691
q6	243	154	150	150
q7	986	624	605	605
q8	9219	1875	1972	1875
q9	6593	6395	6477	6395
q10	7006	2317	2301	2301
q11	463	260	263	260
q12	426	221	223	221
q13	17791	3074	3017	3017
q14	242	212	213	212
q15	573	530	512	512
q16	653	579	564	564
q17	1024	563	490	490
q18	7503	6678	6721	6678
q19	1348	1028	946	946
q20	495	181	184	181
q21	4023	3265	3251	3251
q22	377	315	322	315
Total cold run time: 107637 ms
Total hot run time: 39983 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	7354	7253	7365	7253
q2	324	236	236	236
q3	2932	2860	2902	2860
q4	2030	1784	1817	1784
q5	5967	5663	5645	5645
q6	226	142	144	142
q7	2235	1807	1847	1807
q8	3506	3607	3516	3516
q9	8846	8935	8967	8935
q10	3609	3552	3566	3552
q11	624	529	526	526
q12	819	595	633	595
q13	11004	3248	3250	3248
q14	302	285	287	285
q15	576	520	515	515
q16	689	653	667	653
q17	1867	1661	1628	1628
q18	8445	7788	7728	7728
q19	1730	1653	1631	1631
q20	2106	1899	1892	1892
q21	5665	5294	5360	5294
q22	634	589	606	589
Total cold run time: 71490 ms
Total hot run time: 60314 ms

@doris-robot

TPC-DS: Total hot run time: 197082 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 25b1d102abb63ea00912a892771e0524fb27751c, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1255	926	919	919
query2	6243	2082	2035	2035
query3	10783	3915	3944	3915
query4	67668	29672	23787	23787
query5	4932	477	454	454
query6	320	184	192	184
query7	4897	307	298	298
query8	302	238	239	238
query9	6640	2702	2692	2692
query10	416	272	249	249
query11	15449	15358	15775	15358
query12	156	109	106	106
query13	1033	463	446	446
query14	14041	7325	7538	7325
query15	227	185	186	185
query16	7056	492	504	492
query17	1095	614	643	614
query18	1359	293	296	293
query19	196	152	153	152
query20	124	117	112	112
query21	206	104	103	103
query22	4609	4348	4393	4348
query23	34918	34278	34295	34278
query24	5637	2456	2526	2456
query25	483	397	375	375
query26	663	153	149	149
query27	1873	286	300	286
query28	4600	2517	2483	2483
query29	713	418	480	418
query30	217	151	160	151
query31	1031	830	830	830
query32	70	56	59	56
query33	452	297	302	297
query34	953	562	531	531
query35	892	752	722	722
query36	1078	949	990	949
query37	125	75	87	75
query38	4476	4356	4453	4356
query39	1546	1510	1486	1486
query40	210	111	106	106
query41	46	43	47	43
query42	110	99	105	99
query43	556	501	473	473
query44	1253	840	832	832
query45	190	171	165	165
query46	1157	720	762	720
query47	2060	1927	1956	1927
query48	437	343	346	343
query49	704	407	408	407
query50	836	397	427	397
query51	7480	7290	7144	7144
query52	99	87	89	87
query53	265	181	180	180
query54	530	405	406	405
query55	80	77	84	77
query56	269	240	241	240
query57	1290	1175	1148	1148
query58	226	236	232	232
query59	3150	3002	2917	2917
query60	272	255	262	255
query61	111	107	108	107
query62	782	669	676	669
query63	206	186	199	186
query64	1366	699	619	619
query65	3242	3218	3212	3212
query66	631	309	306	306
query67	15955	15651	15575	15575
query68	3798	592	575	575
query69	479	261	262	261
query70	1226	1149	1139	1139
query71	437	250	244	244
query72	6705	4032	4093	4032
query73	799	356	354	354
query74	10190	8959	9195	8959
query75	3402	2688	2692	2688
query76	2580	1036	1125	1036
query77	585	284	280	280
query78	10667	9438	9338	9338
query79	1830	605	604	604
query80	1030	440	444	440
query81	543	232	225	225
query82	293	124	119	119
query83	188	154	153	153
query84	281	72	79	72
query85	1019	308	303	303
query86	412	298	320	298
query87	4729	4540	4621	4540
query88	3916	2227	2187	2187
query89	428	295	304	295
query90	2150	202	192	192
query91	137	106	106	106
query92	69	50	53	50
query93	2724	563	559	559
query94	756	262	294	262
query95	344	247	249	247
query96	644	287	282	282
query97	2896	2722	2665	2665
query98	219	204	190	190
query99	1631	1314	1313	1313
Total cold run time: 319354 ms
Total hot run time: 197082 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.31% (9981/26050)
Line Coverage: 29.42% (83519/283837)
Region Coverage: 28.59% (42982/150365)
Branch Coverage: 25.17% (21837/86752)
Coverage Report: http://coverage.selectdb-in.cc/coverage/25b1d102abb63ea00912a892771e0524fb27751c_25b1d102abb63ea00912a892771e0524fb27751c/report/index.html

@doris-robot

ClickBench: Total hot run time: 32.58 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 25b1d102abb63ea00912a892771e0524fb27751c, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.02
query2	0.07	0.03	0.03
query3	0.24	0.08	0.07
query4	1.66	0.10	0.11
query5	0.43	0.41	0.41
query6	1.14	0.65	0.65
query7	0.01	0.02	0.02
query8	0.04	0.03	0.03
query9	0.55	0.51	0.49
query10	0.54	0.56	0.56
query11	0.14	0.10	0.10
query12	0.13	0.11	0.12
query13	0.61	0.61	0.61
query14	2.84	2.74	2.86
query15	0.89	0.82	0.82
query16	0.36	0.38	0.39
query17	1.00	1.01	1.01
query18	0.22	0.21	0.21
query19	1.94	1.73	1.93
query20	0.02	0.01	0.02
query21	15.37	0.56	0.58
query22	2.82	2.70	1.19
query23	17.02	0.97	0.85
query24	2.68	1.70	2.01
query25	0.13	0.13	0.13
query26	0.62	0.14	0.15
query27	0.05	0.04	0.05
query28	9.77	1.09	1.07
query29	12.55	3.21	3.18
query30	0.27	0.06	0.06
query31	2.87	0.37	0.38
query32	3.28	0.45	0.47
query33	3.09	3.13	3.05
query34	16.80	4.51	4.47
query35	4.49	4.42	4.49
query36	0.67	0.48	0.50
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.17	0.13	0.12
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.84 s
Total hot run time: 32.58 s

dataroaring pushed a commit that referenced this pull request Nov 24, 2024
… cache (#44497) (#44496)

master branch PR #44497
Contributor

@dataroaring dataroaring left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 24, 2024
Contributor

PR approved by anyone and no changes requested.

Member

@airborne12 airborne12 left a comment

lgtm

Contributor

@freemandealer freemandealer left a comment

LGTM

@dataroaring dataroaring merged commit 9867ba3 into apache:master Nov 25, 2024
32 of 35 checks passed
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.3-merged reviewed
6 participants