[improvement](jdbc catalog)Optimize JDBC Catalog refresh to reduce frequent client creation. #40261

Merged (2 commits) on Sep 2, 2024

Conversation

zy-kkk
Member

@zy-kkk zy-kkk commented Sep 2, 2024

Previously, every JDBC Catalog refresh closed and recreated the JdbcClient. We observed that some JDBC drivers create classloader-level shared threads each time a JdbcClient is instantiated, and these threads cannot be garbage collected. Because the JdbcClient provides the context class loader for the JDBC driver, frequent recreation causes these non-recyclable shared threads to accumulate in the JVM.

This PR changes refresh to update only the cache instead of recreating the client. Recreating the client is unnecessary when the Catalog configuration has not changed.
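
To illustrate the change described above, here is a minimal sketch of a catalog whose refresh clears cached metadata while reusing the existing client, and which rebuilds the client only when the configuration differs. The names `SketchClient`, `SketchCatalog`, `refresh`, and `updateProperties` are hypothetical and are not the Doris classes or methods touched by this PR; the creation counter stands in for the driver-side threads that would otherwise accumulate.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a JDBC client; NOT the Doris JdbcClient API.
final class SketchClient implements AutoCloseable {
    // Counts instantiations. With some real drivers, each instantiation
    // may start shared threads that are never reclaimed.
    static final AtomicInteger CREATED = new AtomicInteger();

    SketchClient(Map<String, String> props) {
        CREATED.incrementAndGet();
    }

    @Override
    public void close() {
        // A real client would release its connection pool here.
    }
}

// Hypothetical catalog that separates "refresh metadata" from "rebuild client".
final class SketchCatalog {
    private final Map<String, String> metaCache = new ConcurrentHashMap<>();
    private Map<String, String> props;
    private SketchClient client;

    SketchCatalog(Map<String, String> props) {
        this.props = Collections.unmodifiableMap(new HashMap<>(props));
        this.client = new SketchClient(this.props);
    }

    // Placeholder metadata lookup that populates the cache on first access.
    String tableSchema(String table) {
        return metaCache.computeIfAbsent(table, t -> "schema-of-" + t);
    }

    int cachedEntries() {
        return metaCache.size();
    }

    // Refresh: drop cached metadata only; the existing client is reused.
    synchronized void refresh() {
        metaCache.clear();
    }

    // Rebuild the client only when the configuration actually changed.
    synchronized void updateProperties(Map<String, String> newProps) {
        if (props.equals(newProps)) {
            return;
        }
        client.close();
        props = Collections.unmodifiableMap(new HashMap<>(newProps));
        client = new SketchClient(props);
        metaCache.clear();
    }
}
```

In this sketch, calling `refresh()` any number of times leaves `SketchClient.CREATED` at 1, whereas a close-and-recreate refresh would increment it on every call.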

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@zy-kkk
Member Author

zy-kkk commented Sep 2, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 49349 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 47704941845e7c5b83fc85b41c3541226919cc88, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17608	4432	4363	4363
q2	2081	155	146	146
q3	10268	1864	1887	1864
q4	10112	1262	1341	1262
q5	8508	3948	3913	3913
q6	231	124	122	122
q7	2060	1632	1623	1623
q8	9478	2744	2709	2709
q9	13832	9955	10009	9955
q10	8667	3574	3536	3536
q11	415	249	251	249
q12	467	301	310	301
q13	18331	3935	4048	3935
q14	347	325	322	322
q15	502	454	458	454
q16	536	465	449	449
q17	1139	960	961	960
q18	7319	6976	6873	6873
q19	1682	1581	1528	1528
q20	540	303	296	296
q21	4416	4141	4095	4095
q22	482	400	394	394
Total cold run time: 119021 ms
Total hot run time: 49349 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4309	4315	4290	4290
q2	328	224	226	224
q3	4162	4171	4109	4109
q4	2752	2742	2733	2733
q5	7254	7095	7098	7095
q6	235	120	123	120
q7	3243	2781	2886	2781
q8	4348	4462	4474	4462
q9	14354	13895	14023	13895
q10	4234	4251	4244	4244
q11	774	679	665	665
q12	1030	838	837	837
q13	7007	3737	3715	3715
q14	449	419	424	419
q15	514	468	454	454
q16	621	583	591	583
q17	3830	3888	3840	3840
q18	8756	8609	8789	8609
q19	1692	1686	1668	1668
q20	2363	2137	2090	2090
q21	8492	8427	8360	8360
q22	1059	942	940	940
Total cold run time: 81806 ms
Total hot run time: 76133 ms
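
The report above does not label its columns. The arithmetic is consistent with the first numeric column being the cold run, the next two being hot runs, and the last being the smaller of the two hot runs: summing the first column of Round 1 gives the reported cold total, and summing the per-query minimum of the hot runs gives the reported hot total. A minimal sketch of that check follows; the class `BenchTotals` and its `totals` method are hypothetical helpers, not part of the Doris tooling.

```java
// Hypothetical helper; the column meaning is an inference from the totals.
final class BenchTotals {
    // Returns {coldTotal, hotTotal} for rows of the form
    // "<query> <cold> <hot1> <hot2> [<best hot>]" separated by whitespace.
    static long[] totals(String report) {
        long cold = 0;
        long hot = 0;
        for (String line : report.split("\n")) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 4) {
                continue; // skip blank or malformed lines
            }
            cold += Long.parseLong(f[1]);
            // The hot total uses the faster of the two hot runs per query.
            hot += Math.min(Long.parseLong(f[2]), Long.parseLong(f[3]));
        }
        return new long[] {cold, hot};
    }

    public static void main(String[] args) {
        // First three rows of Round 1, copied from the report above.
        String rows = "q1 17608 4432 4363 4363\n"
                + "q2 2081 155 146 146\n"
                + "q3 10268 1864 1887 1864\n";
        long[] t = totals(rows);
        System.out.println("cold=" + t[0] + " ms, hot=" + t[1] + " ms");
    }
}
```

Feeding all 22 Round 1 rows to `totals` reproduces the reported 119021 ms cold and 49349 ms hot totals.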

@doris-robot

TPC-DS: Total hot run time: 212666 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 47704941845e7c5b83fc85b41c3541226919cc88, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	937	396	434	396
query2	6557	2116	2155	2116
query3	6922	211	215	211
query4	23077	21521	21476	21476
query5	19767	6527	6493	6493
query6	281	218	228	218
query7	4334	297	303	297
query8	251	265	232	232
query9	3073	2665	2590	2590
query10	471	312	315	312
query11	15631	14928	14895	14895
query12	128	80	75	75
query13	1034	441	436	436
query14	17706	13353	13469	13353
query15	360	216	231	216
query16	6496	288	265	265
query17	1733	909	915	909
query18	903	313	323	313
query19	204	148	158	148
query20	76	80	81	80
query21	198	103	95	95
query22	5233	5155	5086	5086
query23	34269	33557	33534	33534
query24	7714	6315	6346	6315
query25	525	438	430	430
query26	1267	159	164	159
query27	2392	300	297	297
query28	6136	2240	2205	2205
query29	2862	2729	2691	2691
query30	243	173	167	167
query31	965	750	727	727
query32	73	66	54	54
query33	452	248	256	248
query34	859	465	476	465
query35	1155	959	929	929
query36	1229	1062	1097	1062
query37	174	64	62	62
query38	3057	2904	2943	2904
query39	1387	1321	1334	1321
query40	317	96	95	95
query41	39	37	37	37
query42	82	83	81	81
query43	626	526	529	526
query44	1151	707	720	707
query45	245	232	227	227
query46	1235	959	949	949
query47	1817	1686	1729	1686
query48	518	417	414	414
query49	644	371	369	369
query50	851	631	608	608
query51	4769	4620	4661	4620
query52	93	74	79	74
query53	228	193	184	184
query54	2635	2483	2491	2483
query55	84	92	79	79
query56	216	237	187	187
query57	1228	1138	1065	1065
query58	231	209	216	209
query59	3490	3244	3333	3244
query60	218	203	192	192
query61	102	101	108	101
query62	832	493	511	493
query63	210	179	176	176
query64	3422	1555	1431	1431
query65	3633	3557	3551	3551
query66	794	395	440	395
query67	15741	15287	16483	15287
query68	7101	644	644	644
query69	481	271	251	251
query70	1545	1566	1435	1435
query71	406	314	311	311
query72	6774	4922	4778	4778
query73	743	329	325	325
query74	6217	5893	5925	5893
query75	4583	3691	3712	3691
query76	4230	1162	1196	1162
query77	528	258	252	252
query78	12444	11954	11817	11817
query79	6725	655	640	640
query80	2904	383	392	383
query81	525	242	240	240
query82	1387	105	98	98
query83	189	135	138	135
query84	260	71	71	71
query85	1416	326	336	326
query86	354	288	293	288
query87	3234	3006	3016	3006
query88	4683	2332	2316	2316
query89	417	280	292	280
query90	1747	218	215	215
query91	159	137	129	129
query92	62	53	53	53
query93	4798	580	579	579
query94	869	212	214	212
query95	2066	2051	2043	2043
query96	645	333	330	330
query97	6482	6555	6446	6446
query98	218	196	207	196
query99	2863	869	894	869
Total cold run time: 312637 ms
Total hot run time: 212666 ms

@doris-robot

ClickBench: Total hot run time: 30.7 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 47704941845e7c5b83fc85b41c3541226919cc88, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.02	0.03	0.02
query2	0.07	0.03	0.03
query3	0.24	0.05	0.05
query4	1.80	0.07	0.08
query5	0.55	0.53	0.53
query6	1.24	0.61	0.60
query7	0.02	0.01	0.02
query8	0.04	0.02	0.02
query9	0.52	0.49	0.49
query10	0.54	0.53	0.52
query11	0.12	0.08	0.09
query12	0.12	0.09	0.09
query13	0.64	0.61	0.60
query14	0.79	0.78	0.77
query15	0.78	0.75	0.76
query16	0.36	0.37	0.36
query17	1.01	1.01	0.99
query18	0.23	0.21	0.29
query19	1.92	1.82	1.82
query20	0.01	0.02	0.01
query21	15.46	0.56	0.53
query22	2.01	2.30	1.46
query23	16.52	1.11	1.08
query24	5.52	1.00	0.97
query25	0.35	0.13	0.05
query26	0.55	0.16	0.17
query27	0.04	0.03	0.04
query28	7.88	0.73	0.71
query29	12.66	2.36	2.31
query30	0.56	0.55	0.54
query31	2.82	0.39	0.37
query32	3.39	0.49	0.50
query33	3.11	3.06	3.07
query34	15.24	4.82	4.79
query35	4.89	4.87	4.86
query36	1.04	1.01	1.02
query37	0.06	0.04	0.05
query38	0.04	0.03	0.02
query39	0.02	0.02	0.01
query40	0.16	0.15	0.15
query41	0.07	0.02	0.01
query42	0.02	0.01	0.01
query43	0.02	0.02	0.02
Total cold run time: 103.45 s
Total hot run time: 30.7 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 47704941845e7c5b83fc85b41c3541226919cc88 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       20.9 seconds inserted 10000000 Rows, about 478K ops/s

@zy-kkk
Member Author

zy-kkk commented Sep 2, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 48837 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9a284692e0469d423274e7d819f235bfe944e536, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17979	4419	4346	4346
q2	2073	151	148	148
q3	10460	1928	1912	1912
q4	10302	1269	1311	1269
q5	8495	3889	3929	3889
q6	236	126	123	123
q7	2041	1618	1594	1594
q8	9288	2734	2706	2706
q9	10256	9648	9732	9648
q10	8648	3503	3494	3494
q11	413	247	253	247
q12	472	300	302	300
q13	18353	3945	4007	3945
q14	368	317	326	317
q15	524	458	461	458
q16	529	462	457	457
q17	1130	967	953	953
q18	7221	6829	6921	6829
q19	1686	1492	1448	1448
q20	541	315	305	305
q21	4440	4120	4058	4058
q22	480	391	397	391
Total cold run time: 115935 ms
Total hot run time: 48837 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4325	4317	4299	4299
q2	317	226	222	222
q3	4164	4138	4140	4138
q4	2759	2733	2739	2733
q5	7104	7096	7058	7058
q6	239	119	123	119
q7	3249	2876	2845	2845
q8	4381	4480	4468	4468
q9	13725	13670	13527	13527
q10	4254	4247	4261	4247
q11	769	676	680	676
q12	1016	841	863	841
q13	7232	3740	3742	3740
q14	453	420	429	420
q15	507	467	458	458
q16	631	592	569	569
q17	3856	3856	3846	3846
q18	8836	8728	8740	8728
q19	1693	1659	1617	1617
q20	2374	2182	2098	2098
q21	8412	8463	8396	8396
q22	1058	926	996	926
Total cold run time: 81354 ms
Total hot run time: 75971 ms

@doris-robot

TPC-DS: Total hot run time: 212065 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9a284692e0469d423274e7d819f235bfe944e536, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	938	398	412	398
query2	6528	2247	1952	1952
query3	6921	208	200	200
query4	23796	22219	21987	21987
query5	19721	6520	6505	6505
query6	272	221	243	221
query7	4156	303	306	303
query8	264	257	255	255
query9	3095	2675	2606	2606
query10	414	294	300	294
query11	15561	15272	14809	14809
query12	120	77	76	76
query13	1011	457	441	441
query14	17557	13369	13408	13369
query15	372	224	238	224
query16	6489	285	268	268
query17	1926	927	916	916
query18	903	322	327	322
query19	211	151	158	151
query20	84	75	79	75
query21	187	96	95	95
query22	5152	4903	4888	4888
query23	34238	33502	33468	33468
query24	6967	6362	6346	6346
query25	525	443	438	438
query26	1020	163	162	162
query27	2389	303	304	303
query28	6140	2256	2251	2251
query29	2845	2623	2779	2623
query30	246	170	168	168
query31	923	777	756	756
query32	68	64	60	60
query33	436	247	254	247
query34	869	469	482	469
query35	1105	923	907	907
query36	1217	1313	1264	1264
query37	90	60	59	59
query38	3063	2922	2909	2909
query39	1363	1322	1314	1314
query40	236	95	97	95
query41	39	37	38	37
query42	89	85	93	85
query43	665	627	725	627
query44	1220	718	730	718
query45	246	234	226	226
query46	1222	967	976	967
query47	1791	1652	1693	1652
query48	508	411	412	411
query49	624	372	374	372
query50	869	634	631	631
query51	4706	4628	4647	4628
query52	92	83	75	75
query53	236	189	193	189
query54	2679	2453	2475	2453
query55	89	88	86	86
query56	217	216	213	213
query57	1199	1101	1086	1086
query58	215	197	207	197
query59	3463	3447	3243	3243
query60	217	220	196	196
query61	98	95	114	95
query62	810	448	469	448
query63	201	180	177	177
query64	3307	1567	1499	1499
query65	3598	3563	3561	3561
query66	780	414	450	414
query67	16104	15832	15070	15070
query68	9182	666	656	656
query69	484	257	274	257
query70	1690	1408	1339	1339
query71	394	306	306	306
query72	6892	4817	4716	4716
query73	767	321	322	321
query74	6281	5802	5849	5802
query75	4544	3694	3660	3660
query76	4764	1154	1207	1154
query77	646	258	254	254
query78	12492	11446	11717	11446
query79	7052	658	659	658
query80	1691	398	386	386
query81	507	239	234	234
query82	1736	101	107	101
query83	172	131	131	131
query84	265	71	69	69
query85	983	319	319	319
query86	338	301	294	294
query87	3269	3017	3095	3017
query88	4812	2294	2300	2294
query89	424	318	310	310
query90	1901	218	224	218
query91	163	130	137	130
query92	62	59	55	55
query93	5839	573	593	573
query94	790	205	212	205
query95	2053	1912	1957	1912
query96	651	326	328	326
query97	6457	6369	6344	6344
query98	218	217	199	199
query99	2861	829	1013	829
Total cold run time: 315082 ms
Total hot run time: 212065 ms

@doris-robot

ClickBench: Total hot run time: 31.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9a284692e0469d423274e7d819f235bfe944e536, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.02	0.02	0.02
query2	0.06	0.03	0.02
query3	0.25	0.05	0.04
query4	1.80	0.07	0.06
query5	0.54	0.52	0.52
query6	1.25	0.61	0.60
query7	0.02	0.00	0.00
query8	0.03	0.03	0.02
query9	0.53	0.49	0.47
query10	0.54	0.53	0.54
query11	0.13	0.08	0.08
query12	0.12	0.09	0.09
query13	0.62	0.61	0.62
query14	0.78	0.78	0.79
query15	0.78	0.76	0.78
query16	0.37	0.39	0.36
query17	1.03	1.01	1.01
query18	0.24	0.26	0.26
query19	1.93	1.82	1.82
query20	0.02	0.01	0.01
query21	15.43	0.55	0.54
query22	1.98	2.37	2.10
query23	17.19	1.02	0.97
query24	4.23	1.15	1.76
query25	0.38	0.10	0.05
query26	0.59	0.17	0.17
query27	0.04	0.04	0.04
query28	7.94	0.72	0.78
query29	12.66	2.26	2.32
query30	0.55	0.51	0.54
query31	2.81	0.39	0.38
query32	3.40	0.49	0.49
query33	3.05	3.06	3.11
query34	15.25	4.81	4.79
query35	4.89	4.88	4.85
query36	1.07	1.02	1.01
query37	0.06	0.05	0.05
query38	0.02	0.02	0.02
query39	0.02	0.02	0.01
query40	0.16	0.14	0.15
query41	0.07	0.02	0.01
query42	0.02	0.02	0.02
query43	0.02	0.01	0.02
Total cold run time: 102.89 s
Total hot run time: 31.41 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 9a284692e0469d423274e7d819f235bfe944e536 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       22.1 seconds inserted 10000000 Rows, about 452K ops/s

Contributor

@morningman morningman left a comment

LGTM

@morningman morningman merged commit c46f1d3 into apache:branch-2.0 Sep 2, 2024
22 of 25 checks passed
@zy-kkk zy-kkk deleted the improvement_refresh_jdbc branch September 6, 2024 09:37