Improve performance of ClickHouse connector #9637

Closed
yunfeng79 opened this issue Oct 14, 2021 · 8 comments

yunfeng79 commented Oct 14, 2021

ClickHouse 21.3.14.1 SQL:

 select name from a where user_id = '1';
Time: 65 ms

Trino SQL:

 select name from a where user_id = '1';
 dx_event_name  
----------------
 device_init    
 device_init    
 login          
 ta_app_install 
 reg            
(5 rows)

Query 20211014_090917_05877_4kn34, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
4.59 [12.3M rows, 0B] [2.68M rows/s, 0B/s]

Trino 360
1 master, 2 nodes

query.max-memory=30GB
query.max-memory-per-node=2GB
query.max-total-memory-per-node=4GB

Why is the Trino query slow?

ebyhr (Member) commented Oct 14, 2021

Could you share the CREATE TABLE statement in ClickHouse (not Trino) and the number of rows in the table?

ebyhr changed the title from "trino sql Slow query" to "Improve performance of ClickHouse connector" on Oct 14, 2021

yunfeng79 (Author) commented

@ebyhr Yes. The CREATE TABLE statement is from ClickHouse (not Trino). The table has 12,320,035 rows, and five of them match user_id = '1'. The catalog uses the ClickHouse connector.

yunfeng79 (Author) commented Oct 14, 2021

@ebyhr Table:
CREATE TABLE a
(
    `dx_event_name` String,
    `dx_user_id` String,
    `dx_account_id` String,
    `dx_distinct_id` String,
    `dx_event_time` UInt64,
    `dx_part_date` String,
    `dx_send_time` UInt64,
    `dx_receive_time` UInt64,
    `dx_update_time` UInt64,
    `dx_ip` String,
    `dx_country` String,
    `dx_province` String,
    `dx_city` String,
    `dx_carrier` String,
    `dx_uuid` String,
    `dx_error_message` String,
    `dx_error_column` String,
    `op_version` UInt64,
    `dx_lib_version` String,
    `dx_zone_offset` Int32,
    `servectype` Int64,
    ......
    updatetime String
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{layer}-{shard}/a', '{replica}', op_version)
PARTITION BY dx_event_name
ORDER BY (dx_event_name, dx_event_time, dx_user_id, dx_uuid)
SETTINGS index_granularity = 8192

ebyhr (Member) commented Oct 14, 2021

The table doesn't have a user_id column. What's the column type?

yunfeng79 (Author) commented

@ebyhr user_id is dx_user_id.

ebyhr (Member) commented Oct 14, 2021

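Predicate pushdown for the String type is currently disabled in the ClickHouse connector, so the filter on dx_user_id is applied in Trino rather than in ClickHouse: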
https://github.com/trinodb/trino/blob/master/plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java#L376-L385
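
One way to confirm this on a given cluster is to look at the query plan. The sketch below is only an illustration: the catalog name clickhouse and the schema name default are assumptions, not taken from this thread, and the real column name dx_user_id is used in place of user_id. If the predicate is not pushed down, the plan would be expected to show a filter step in Trino (for example a node carrying a filterPredicate on dx_user_id) above the table scan, which is consistent with the 12.3M rows read in the output above.

```sql
-- Assumption: the ClickHouse catalog is called "clickhouse" and table "a"
-- lives in the "default" schema; adjust both to match your setup.
EXPLAIN
SELECT dx_event_name
FROM clickhouse.default.a
WHERE dx_user_id = '1';
```

If the filter appears only inside the table scan's constraint and no separate filter step remains, the predicate was pushed down to ClickHouse.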

Let me close this issue as a duplicate of #7100.

By the way, I would recommend joining the community Slack: https://trino.io/slack.html

ebyhr closed this as completed on Oct 14, 2021

yunfeng79 (Author) commented

@ebyhr Thank you!

yunfeng79 (Author) commented

@ebyhr When will predicate pushdown be supported in the ClickHouse connector?
