[opt](Nereids) support create table with variant type #32953

Merged
merged 1 commit into from
Mar 28, 2024

Conversation

morrySnow
Contributor

Proposed changes

Issue Number: close #xxx


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@morrySnow
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37790 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e99c7dc118af5670b110ffde28461baa5870383b, data reload: false

------ Round 1 ----------------------------------
q1	17627	4283	4092	4092
q2	2110	155	147	147
q3	10591	1122	1210	1122
q4	10229	729	821	729
q5	7469	3018	2975	2975
q6	207	127	121	121
q7	1022	591	572	572
q8	9345	2010	1991	1991
q9	7587	6563	6547	6547
q10	8421	3496	3599	3496
q11	443	218	215	215
q12	416	198	200	198
q13	17797	2832	2841	2832
q14	229	198	202	198
q15	507	478	466	466
q16	507	375	370	370
q17	944	530	610	530
q18	7061	6379	6388	6379
q19	2043	1422	1478	1422
q20	533	283	243	243
q21	3544	2947	2856	2856
q22	332	295	289	289
Total cold run time: 108964 ms
Total hot run time: 37790 ms
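
The two totals above can be reproduced from the per-query rows. The following is a minimal sketch, assuming the columns are: query name, cold run, first hot run, second hot run, and the faster of the two hot runs (all in ms). That column meaning is inferred from the numbers, not stated by the report; `summarize` and `ROUND_1` are illustrative names.

```python
# Rows copied from the Round 1 report: query, cold, hot 1, hot 2, best hot (ms).
ROUND_1 = """\
q1 17627 4283 4092 4092
q2 2110 155 147 147
q3 10591 1122 1210 1122
q4 10229 729 821 729
q5 7469 3018 2975 2975
q6 207 127 121 121
q7 1022 591 572 572
q8 9345 2010 1991 1991
q9 7587 6563 6547 6547
q10 8421 3496 3599 3496
q11 443 218 215 215
q12 416 198 200 198
q13 17797 2832 2841 2832
q14 229 198 202 198
q15 507 478 466 466
q16 507 375 370 370
q17 944 530 610 530
q18 7061 6379 6388 6379
q19 2043 1422 1478 1422
q20 533 283 243 243
q21 3544 2947 2856 2856
q22 332 295 289 289
"""


def summarize(report):
    """Return (total cold time, total best-hot time) in milliseconds."""
    cold_total = 0
    hot_total = 0
    for line in report.splitlines():
        name, cold, hot1, hot2, best = line.split()
        # Assumed column meaning: the last column is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2)), name
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


print(summarize(ROUND_1))  # (108964, 37790)
```

Under that reading, "Total cold run time" is the sum of the first numeric column and "Total hot run time" is the sum of the last one.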

----- Round 2, with runtime_filter_mode=off -----
q1	4149	4119	4080	4080
q2	322	230	231	230
q3	2967	2811	2829	2811
q4	1817	1544	1507	1507
q5	5260	5299	5296	5296
q6	190	115	114	114
q7	2226	1834	1851	1834
q8	3153	3302	3261	3261
q9	8706	8669	8637	8637
q10	3824	3794	3788	3788
q11	550	446	443	443
q12	729	564	519	519
q13	16928	2880	2866	2866
q14	274	256	256	256
q15	503	462	461	461
q16	473	418	438	418
q17	1698	1496	1467	1467
q18	7322	6998	7087	6998
q19	1590	1480	1536	1480
q20	1907	1721	1695	1695
q21	4747	4489	4635	4489
q22	511	465	451	451
Total cold run time: 69846 ms
Total hot run time: 53101 ms

@doris-robot

TPC-DS: Total hot run time: 183087 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e99c7dc118af5670b110ffde28461baa5870383b, data reload: false

query1	939	379	358	358
query2	6563	1913	1878	1878
query3	6700	212	210	210
query4	31849	21406	21405	21405
query5	4331	436	429	429
query6	271	187	177	177
query7	4633	306	314	306
query8	236	177	178	177
query9	9334	2372	2333	2333
query10	574	248	265	248
query11	17331	14361	14312	14312
query12	148	90	89	89
query13	1626	409	418	409
query14	10455	7657	7767	7657
query15	243	207	203	203
query16	8151	260	260	260
query17	1950	603	574	574
query18	2095	297	290	290
query19	317	159	161	159
query20	100	92	88	88
query21	200	135	132	132
query22	5048	4788	4755	4755
query23	33465	32831	32652	32652
query24	10828	3009	3035	3009
query25	644	411	410	410
query26	1205	164	163	163
query27	2764	349	362	349
query28	7432	1934	1887	1887
query29	940	651	639	639
query30	304	154	154	154
query31	994	758	739	739
query32	95	58	60	58
query33	776	260	258	258
query34	1086	509	523	509
query35	848	653	622	622
query36	1057	876	888	876
query37	131	68	67	67
query38	3549	3421	3435	3421
query39	1468	1476	1439	1439
query40	220	122	120	120
query41	52	50	51	50
query42	106	100	101	100
query43	513	466	465	465
query44	1201	739	729	729
query45	287	269	268	268
query46	1212	749	740	740
query47	1948	1822	1904	1822
query48	462	354	375	354
query49	1132	353	365	353
query50	827	383	376	376
query51	6838	6696	6600	6600
query52	111	92	91	91
query53	376	285	285	285
query54	312	246	250	246
query55	85	82	84	82
query56	260	237	236	236
query57	1239	1134	1160	1134
query58	243	222	219	219
query59	2908	2648	2604	2604
query60	285	255	265	255
query61	120	112	124	112
query62	687	462	460	460
query63	316	285	282	282
query64	5944	4263	4273	4263
query65	3138	3058	3090	3058
query66	988	382	356	356
query67	15513	14849	14852	14849
query68	7372	516	526	516
query69	644	384	379	379
query70	1253	1148	1122	1122
query71	523	268	266	266
query72	6360	2790	2594	2594
query73	774	318	313	313
query74	7972	6508	6381	6381
query75	3633	2334	2357	2334
query76	5035	933	967	933
query77	669	263	266	263
query78	11108	10204	10200	10200
query79	10148	553	551	551
query80	1594	395	387	387
query81	543	226	226	226
query82	891	90	87	87
query83	212	148	150	148
query84	284	77	82	77
query85	1503	336	323	323
query86	476	289	307	289
query87	3809	3613	3547	3547
query88	5105	2302	2303	2302
query89	527	391	386	386
query90	1998	176	183	176
query91	182	134	142	134
query92	59	51	52	51
query93	7080	505	496	496
query94	1244	183	183	183
query95	427	338	333	333
query96	674	277	268	268
query97	2657	2476	2485	2476
query98	238	234	222	222
query99	1282	917	914	914
Total cold run time: 314279 ms
Total hot run time: 183087 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit e99c7dc118af5670b110ffde28461baa5870383b with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       13.8 seconds inserted 10000000 Rows, about 724K ops/s
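
The throughput figures above follow from the byte counts and durations. This is a small sketch, assuming "MB" means 2^20 bytes and that the results are rounded down; both assumptions are inferred from the reported numbers (with 10^6 bytes per MB the json figure would be 117, not 112). The function names are illustrative.

```python
MIB = 1024 * 1024  # assumption: the report's "MB" is 2**20 bytes


def mib_per_second(num_bytes, seconds):
    """Throughput in whole MiB/s, rounded down."""
    return int(num_bytes / seconds / MIB)


def kilo_rows_per_second(rows, seconds):
    """Insert rate in thousands of rows per second, rounded down."""
    return int(rows / seconds / 1000)


print(mib_per_second(2358488459, 20))        # json:    112
print(mib_per_second(1101869774, 58))        # orc:     18
print(mib_per_second(861443392, 31))         # parquet: 26
print(kilo_rows_per_second(10000000, 13.8))  # insert:  724
```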

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 28, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@morrySnow morrySnow merged commit 56fa2f7 into apache:master Mar 28, 2024
27 of 31 checks passed
@morrySnow morrySnow deleted the fix_variant_create branch March 28, 2024 07:14
Jibing-Li added a commit that referenced this pull request Mar 29, 2024
* [fix](merge cloud) Fix cloud be set be tag map (#32864)

* [chore] Add gavinchou to collaborators (#32881)

* [chore](show) support statement to show views from table (#32358)

MySQL [test]> show views;
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
| t2_view        |
+----------------+
2 rows in set (0.00 sec)

MySQL [test]> show views like '%t1%';
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
+----------------+
1 row in set (0.01 sec)

MySQL [test]> show views where create_time > '2024-03-18';
+----------------+
| Tables_in_test |
+----------------+
| t2_view        |
+----------------+
1 row in set (0.02 sec)

* [Enhancement](ranger) Disable some permission operations when Ranger or LDAP are enabled (#32538)

Disable some permission operations when Ranger or LDAP are enabled.

* [chore](ci) exclude unstable trino_connector case (#32892)

Co-authored-by: stephen <[email protected]>

* [fix](Nereids) NPE when create table with implicit index type (#32893)

* [improvement](mtmv) Support more join types for query rewriting by materialized view (#32685)

This rewriting pattern is supported for multi-table joins. The supported join types are as follows:

INNER JOIN
LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
LEFT SEMI JOIN
RIGHT SEMI JOIN
LEFT ANTI JOIN
RIGHT ANTI JOIN

* [Serde](Variant) support arrow serialization for varint type (#32780)

* [fix](multicatalog) fix no data error when read hive table on cosn (#32815)

Currently, when reading a hive table on cosn, doris returns an empty result even though the table has data.
iceberg on cosn is ok.
The reason is misuse of cosn's file system. According to cosn's doc, its fs.cosn.impl should be org.apache.hadoop.fs.CosFileSystem

* [fix](nereids)EliminateGroupByConstant should replace agg's output after removing constant group by keys (#32878)

* [Fix](executor)Fix regression test for test_active_queries/test_backend_active_tasks #32899

* [fix](iceberg) fix iceberg catalog bug and p2 test cases (#32898)

1. Fix iceberg catalog bug

    PR #30198 changed the logic of `IcebergHMSExternalCatalog.java`
    to get locationUrl by calling hive metastore's `getCatalog()` method.
    But this method only exists in hive 3+, so it fails when using hive 2.x.

    I have temporarily removed this logic, because it is only used for iceberg table writing,
    which is still under development. We will rethink this logic later.

2. Fix test cases

    Some P2 test cases were missing `order_qt`. And because the output format of the floating point
    type has changed, some results in the `out` files need to be regenerated.

* [revert](jni) revert part of #32455 (#32904)

* [fix](spill) Avoid releasing resources while spill tasks are executing (#32783)

* [chore](log) print query id before logging profile in be.INFO (#32922)

* [fix](grace-exit) Stop incorrectly of reportwork cause heap use after free #32929

* [improvement](decommission be) decommission check replica num (#32748)

* [fix](arrow-flight) Fix reach limit of connections error (#32911)

Fix the "Reach limit of connections" error.
In fe.conf, arrow_flight_token_cache_size must be less than qe_max_connection/2. Arrow Flight SQL is a stateless protocol and connections are usually not actively disconnected, so evicting a bearer token from the cache unregisters its ConnectContext.

Fix ConnectContext.command not being reset to COM_SLEEP in time, which resulted in connections being killed frequently after query timeout.

Fix the bearer token eviction log and exception.

TODO: use arrow flight session: https://mail.google.com/mail/u/0/#inbox/FMfcgzGxRdxBLQLTcvvtRpqsvmhrHpdH
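
The constraint between the two fe.conf settings can be checked mechanically. This is a minimal sketch, not Doris code: `parse_conf` and `token_cache_size_ok` are hypothetical helpers, and the values in the sample config are illustrative rather than Doris defaults. Only the two key names and the "less than half" rule come from the description above.

```python
def parse_conf(text):
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def token_cache_size_ok(conf):
    """True when arrow_flight_token_cache_size < qe_max_connection / 2."""
    cache_size = int(conf["arrow_flight_token_cache_size"])
    max_connections = int(conf["qe_max_connection"])
    return cache_size < max_connections / 2


# Illustrative values only (not taken from the PR or from Doris defaults).
sample = """
# fe.conf excerpt
qe_max_connection = 1024
arrow_flight_token_cache_size = 500
"""
print(token_cache_size_ok(parse_conf(sample)))
```

With these sample values the check passes; a cache size of 512 would fail, because the comparison is strict.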

* [bugfix](cloud) few variable not initialized (#32868)

../../cloud/src/recycler/meta_checker.cpp
can cause uninitialised memory read.

* [fix](arrow-flight) Fix arrow flight sql compatible with JDK 17 and upgrade arrow 15.0.2 (#32796)

--add-opens=java.base/java.nio=ALL-UNNAMED, see: https://arrow.apache.org/docs/java/install.html#java-compatibility
When groovy uses a flight sql connection to execute the query SUM(MAX(c1) OVER (PARTITION BY)), it reports the error: AGGREGATE clause must not contain analytic expressions, but there is no problem when Java executes it with jdbc::arrow-flight-sql.
groovy does not support printing the arrow array type and throws IndexOutOfBoundsException.
"arrow_flight_sql" does not support two phase read
./run-regression-test.sh --run --clean -g arrow_flight_sql

* [fix](spill) SpillStream's writer maybe may not have been finalized (#32931)

* [improvement](spill) Disable DistinctStreamingAgg when spill is enabled (#32932)

* [Improve](inverted_index) update clucene and improve array inverted index writer  (#32436)

* [Performance](exec) replace SipHash in function by XXHash (#32919)

* [feature](agg) add aggregate function sum0 (#32541)

* [improvement](mtmv) Support to get tables in materialized view when collecting table in plan (#32797)

Support getting the tables in a materialized view when collecting the tables in a plan.

The table schema is as follows:

create materialized view mv1
BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL
DISTRIBUTED BY RANDOM BUCKETS 1 
PROPERTIES ('replication_num' = '1')
 as 
select 
  t1.c1, 
  t3.c2 
from 
  table1 t1 
  inner join table3 t3 on t1.c1 = t3.c2

If we collect the tables from the following plan, we get [table1, table3, table2]; mv1 is expanded to obtain its base tables:

SELECT 
  mv1.*, 
  uuid() 
FROM 
  mv1 LEFT SEMI 
  JOIN table2 ON mv1.c1 = table2.c1 
WHERE 
  mv1.c1 IN (
    SELECT 
      c1 
    FROM 
      table2
  ) 
  OR mv1.c1 < 10
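
The expansion described above can be sketched as a small recursive walk. This is an illustration only, not the Nereids implementation: `MV_DEFINITIONS` and `collect_base_tables` are hypothetical names, and the sketch assumes tables are reported in first-seen order with duplicates dropped.

```python
# Hypothetical catalog: materialized view name -> tables its definition reads.
MV_DEFINITIONS = {"mv1": ["table1", "table3"]}


def collect_base_tables(plan_tables, mv_definitions):
    """Expand materialized views into base tables, keeping first-seen order."""
    result = []
    seen = set()

    def visit(name):
        if name in mv_definitions:
            # A materialized view contributes the tables of its definition.
            for inner in mv_definitions[name]:
                visit(inner)
        elif name not in seen:
            seen.add(name)
            result.append(name)

    for name in plan_tables:
        visit(name)
    return result


# Tables in the order the query references them:
# mv1, table2 (semi join), table2 (IN subquery).
print(collect_base_tables(["mv1", "table2", "table2"], MV_DEFINITIONS))
# ['table1', 'table3', 'table2']
```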

* [enhance](mtmv)support olap table partition column is null (#32698)

* [enhancement](cloud) add table version to cloud (#32738)

Add table version to cloud.

In Fe:
Get: If Fe is cloud mode, get table version from meta service.
Update: Op drop/replace temp partition, commit transaction.

In meta service:
Add: create Index. init value is 1.
Remove: by recycler.
Update: commit/drop partition rpc, commit txn rpc. Atomic++.
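
The version lifecycle listed above (initial value 1, atomic increment on update, removal by the recycler) can be modelled as follows. This is an in-memory stand-in under stated assumptions, not the meta service code: the class `TableVersionStore` and its method names are hypothetical, and a lock takes the place of whatever atomic primitive the meta service actually uses.

```python
import threading


class TableVersionStore:
    """In-memory model of a per-table version counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = {}

    def create_index(self, table_id):
        # Add: the version starts at 1 when the index is created.
        with self._lock:
            self._versions.setdefault(table_id, 1)

    def bump(self, table_id):
        # Update: commit/drop partition or commit txn increments atomically.
        with self._lock:
            self._versions[table_id] += 1
            return self._versions[table_id]

    def get(self, table_id):
        # Get: the FE reads the current version.
        with self._lock:
            return self._versions[table_id]

    def recycle(self, table_id):
        # Remove: done by the recycler.
        with self._lock:
            self._versions.pop(table_id, None)


store = TableVersionStore()
store.create_index(1001)
workers = [
    threading.Thread(target=lambda: [store.bump(1001) for _ in range(100)])
    for _ in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(store.get(1001))  # 801: the initial 1 plus 8 * 100 increments
```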

* [fix](cloud) schema change from not null to null (#32913)

1. Use equals instead of == when comparing types.
2. The null bitmap is resized to the size of the ref column.

* [feature](Nereids): add ColumnPruningPostProcessor. (#32800)

* [case](rowpolicy)fix row policy has been exist (#32880)

* [fix](pipeline) fix use error row desc when origin block clear (#32803)

* [fix](Nereids) support variant column with index when create table (#32948)

* [opt](Nereids) support create table with variant type (#32953)

* [test](insert-overwrite) Add insert overwrite auto detect concurrency cases (#32935)

* [fix](compile) fe cannot compile in idea (#32955)

* [enhancement](plsql) Support select * from routines (#32866)

Support showing plsql procedures using select * from routines.

* [fix](trino-connector) fix `NoClassDefFoundError` of hudi `Utils` class (#32846)

Due to the change in PR #32455, the `trino-connector-scanner` package can no longer access the `hudi_scanner` package, so a `NoClassDefFoundError` exception is thrown.

We need to write a separate Utils class.

* [exec](column) change some complex column move to noexcept (#32954)

* [Enhancement](data skew) extends show data skew (#32732)

* [chore](test) let suite compatible with Nereids (#32964)

* Support identical column name in different index. (#32792)

* Limit the max string length to 1024 while collecting column stats to control BE memory usage. (#32470)

* [fix](merge-iterator) fix NOT_IMPLEMENTED_ERROR when read next block view (#32961)

* [improvement](executor)Add tag property for workload group #32874

* [fix](auth)unified workload and resource permission logic (#32907)

- `grant resource` can no longer grant global `usage_priv`
- Use `grant resource %` instead of `grant resource *`

before change:
```
grant usage_priv on resource * to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: Usage_priv 
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: NULL
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 
```
after change:
```
grant usage_priv on resource '%' to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: NULL
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: %: Usage_priv 
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 

```

---------

Co-authored-by: yujun <[email protected]>
Co-authored-by: Gavin Chou <[email protected]>
Co-authored-by: xy720 <[email protected]>
Co-authored-by: yongjinhou <[email protected]>
Co-authored-by: Dongyang Li <[email protected]>
Co-authored-by: stephen <[email protected]>
Co-authored-by: morrySnow <[email protected]>
Co-authored-by: seawinde <[email protected]>
Co-authored-by: lihangyu <[email protected]>
Co-authored-by: Yulei-Yang <[email protected]>
Co-authored-by: starocean999 <[email protected]>
Co-authored-by: wangbo <[email protected]>
Co-authored-by: Mingyu Chen <[email protected]>
Co-authored-by: Jerry Hu <[email protected]>
Co-authored-by: zhiqiang <[email protected]>
Co-authored-by: Xinyi Zou <[email protected]>
Co-authored-by: Vallish Pai <[email protected]>
Co-authored-by: amory <[email protected]>
Co-authored-by: HappenLee <[email protected]>
Co-authored-by: Jensen <[email protected]>
Co-authored-by: zhangdong <[email protected]>
Co-authored-by: Yongqiang YANG <[email protected]>
Co-authored-by: jakevin <[email protected]>
Co-authored-by: Mryange <[email protected]>
Co-authored-by: zclllyybb <[email protected]>
Co-authored-by: Tiewei Fang <[email protected]>
Co-authored-by: Xin Liao <[email protected]>
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Apr 1, 2024
eldenmoon added a commit that referenced this pull request Apr 1, 2024
)

* [fix](Nereids) support variant column with index when create table (#32948)

* [opt](Nereids) support create table with variant type (#32953)

---------

Co-authored-by: morrySnow <[email protected]>
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.1-merged reviewed

5 participants