
Slow inserts into Array(lowcardinality(...)) #6935

Closed
filimonov opened this issue Sep 13, 2019 · 16 comments
Labels: comp-lowcardinality (LowCardinality data type), performance

@filimonov
Contributor

filimonov commented Sep 13, 2019

Direct insert into a table with a lot of Array(LowCardinality(String)) columns:

cat data_arrays.tsv | time -p clickhouse-client -q 'INSERT INTO test_arrays_lc FORMAT TSV' --log_queries=1 --query_id=insert_into_test_arrays_lc --max_insert_block_size=50000 
real 32.04
user 30.25
sys 0.61

Insert into a table with Array(String) columns, then copy the data into the table with a lot of Array(LowCardinality(String)) columns:

cat data_arrays.tsv | time -p clickhouse-client -q 'INSERT INTO test_arrays FORMAT TSV' --log_queries=1 --query_id=insert_into_test_arrays --max_insert_block_size=50000 
real 13.91
user 8.64
sys 1.15

time -p clickhouse-client -q 'INSERT INTO test_arrays_lc SELECT * FROM test_arrays' --log_queries=1 
real 8.89
user 0.01
sys 0.02

8.89 + 13.91 = 22.8 seconds in total, which is much better than 32.04.

Details of the test: https://gist.github.com/filimonov/8f47f575bee742fe8997b5a9366ff541
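For reference, a minimal sketch of the kind of schema being compared (the actual DDL is in the gist above; the engine, column names, and column count here are illustrative, following the naming used later in this thread):

CREATE TABLE test_arrays (
    id UInt64,
    column001 Array(String),
    column002 Array(String)
    -- ... many more Array(String) columns
) ENGINE = MergeTree ORDER BY id;

CREATE TABLE test_arrays_lc (
    id UInt64,
    column001 Array(LowCardinality(String)),
    column002 Array(LowCardinality(String))
    -- ... many more Array(LowCardinality(String)) columns
) ENGINE = MergeTree ORDER BY id;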

@filimonov added the performance and comp-lowcardinality (LowCardinality data type) labels on Sep 13, 2019
@filimonov
Contributor Author

filimonov commented Sep 13, 2019

I've tried disabling low_cardinality_allow_in_native_format; it's even worse:

cat data_arrays.tsv | time -p clickhouse-client -q 'INSERT INTO test_arrays_lc FORMAT TSV' --log_queries=1 --query_id=insert_into_test_arrays_lc --max_insert_block_size=50000 --low_cardinality_allow_in_native_format=0

real 78.98
user 68.38
sys 1.43

@YoannBuch

YoannBuch commented May 7, 2020

Hi @filimonov, what is the status of this ticket?
We changed our schema to use Array(LowCardinality(String)) and LowCardinality(String) for many of our columns, and now inserts are a lot slower.
(We are inserting data through a Distributed table.)
Are there any workarounds?

@HeenaBansal2009
Contributor

HeenaBansal2009 commented May 19, 2022

Hi @alexey-milovidov, I'm working on this issue. I captured a flame graph and visualized it in speedscope.
From the initial analysis, what I observed is that mutateThread is the bottleneck.
In speedscope, I narrowed it down to five functions that do not show up in the Array(String) insertion.
[flame graph screenshot]

@alexey-milovidov
Member

@HeenaBansal2009 you are probably using a self-built ClickHouse version with less inlining, or one built in debug mode.
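One way to check how a binary was built is to query system.build_options (a sketch; the exact option names can differ between versions):

SELECT name, value
FROM system.build_options
WHERE name IN ('BUILD_TYPE', 'CXX_FLAGS');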

@alexey-milovidov
Member

And te_next_event_compute is something unknown.
I cannot find this function either in our repository or at https://www.google.ru/search?q=te_next_event_compute

@alexey-milovidov
Member

@filimonov why did you change the max_insert_block_size setting?
What changes if you keep the default?

@HeenaBansal2009
Contributor

HeenaBansal2009 commented May 20, 2022

@HeenaBansal2009 you are probably using a self-built ClickHouse version with less inlining, or one built in debug mode.

Yes @alexey-milovidov, I captured that graph with the ClickHouse profiling settings turned on and with ClickHouse built in debug mode.
I also tried ClickHouse binaries built with the release tag (updating users.xml and config.xml), but then I didn't see any ClickHouse function calls in the flame graph or in speedscope. Am I missing anything?
The workflow was roughly as follows (a sketch of the pipeline is below):
1. run perf against the clickhouse-server PID;
2. run the use case with a Python script;
3. convert the perf events into a flame graph.
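For reference, that pipeline would look roughly like this (a sketch assuming Brendan Gregg's FlameGraph scripts are checked out locally; paths and the sampling duration are illustrative):

# sample on-CPU stacks of the running server for 60 seconds
perf record -F 99 -g -p "$(pidof clickhouse-server)" -- sleep 60
# fold the recorded stacks and render them as an SVG flame graph
perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > clickhouse.svg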

@HeenaBansal2009
Contributor

And te_next_event_compute is something unknown.
I cannot find this function either in our repository or at https://www.google.ru/search?q=te_next_event_compute

te_next_event_compute is part of the jemalloc submodule: contrib/jemalloc/src/thread_event.c

@mlex
Contributor

mlex commented May 20, 2022

We had a similar problem at Instana. We never profiled it, but our assumption (it's just an assumption; we never verified it) was that:

a) Array(LowCardinality(String)) actually behaves like LowCardinality(Array(String)), so the lookup table is built per array, not per array element. When we sorted our arrays, ingestion performance (and disk usage) became much better (we were not relying on the ordering within the arrays, so it didn't matter). This is good to know when estimating the cardinality of a column: for array columns it is not the cardinality of the elements that matters, but the cardinality of their combinations.

b) During insert the hash table is rebuilt with every new row, and this is what makes it slow. Our workaround was to create a table with Engine=Null and Array(String) columns, together with a materialized view from this table into the final table with Array(LowCardinality(String)).

For the example above, this could look like this:

CREATE TABLE test_arrays_null (
    id UInt64,
    column001 Array(String),
    ....
) Engine=Null;

CREATE MATERIALIZED VIEW mv TO test_arrays_lc AS SELECT * FROM test_arrays_null;
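Inserts then target the Null-engine table, and the materialized view does the Array(String) to Array(LowCardinality(String)) conversion on the server side. By analogy with the commands above, the client-side insert would be:

cat data_arrays.tsv | time -p clickhouse-client -q 'INSERT INTO test_arrays_null FORMAT TSV'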

Running the different INSERTs gives these results on my MacBook:

INSERT INTO test_arrays
real 9.88
user 8.67
sys 1.40

INSERT INTO test_arrays_lc
real 11.01
user 89.98
sys 0.98

INSERT INTO test_arrays_null:
real 7.29
user 8.40
sys 1.11

@alexey-milovidov
Member

a) Array(LowCardinality(String)) actually behaves like LowCardinality(Array(String)).

This assumption is incorrect. It behaves as expected (array on top of LowCardinality column).

b) During insert the hash table is rebuilt with every new row.

This assumption is also incorrect. The hash table is built for every inserted block.
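Because the hash table is built per inserted block, larger (and therefore fewer) blocks mean fewer hash table builds per insert. A sketch reusing the setting from the original report (the value here is illustrative):

cat data_arrays.tsv | clickhouse-client -q 'INSERT INTO test_arrays_lc FORMAT TSV' --max_insert_block_size=1048576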

@alexey-milovidov
Member

@filimonov Your Perl script is missing #!/usr/bin/perl at the beginning.

@alexey-milovidov
Member

The issue does not reproduce:

milovidov@milovidov-nix:~/work/test_insert_lowcardinality$ cat data_arrays.tsv | time -p clickhouse-client -q 'INSERT INTO test_arrays FORMAT TSV'
real 27.87
user 15.16
sys 2.33
milovidov@milovidov-nix:~/work/test_insert_lowcardinality$ cat data_arrays.tsv | time -p clickhouse-client -q 'INSERT INTO test_arrays_lc FORMAT TSV'
real 40.89
user 125.59
sys 1.01
milovidov@milovidov-nix:~/work/test_insert_lowcardinality$ clickhouse-client 
ClickHouse client version 22.6.1.1.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 22.6.1 revision 54455.

milovidov-nix :) TRUNCATE test_arrays_lc

TRUNCATE TABLE test_arrays_lc

Query id: d50af9b8-2018-44fc-9c94-2d9e888bdc0d

Ok.

0 rows in set. Elapsed: 0.098 sec. 

milovidov-nix :) INSERT INTO test_arrays_lc SELECT * FROM test_arrays

INSERT INTO test_arrays_lc SELECT *
FROM test_arrays

Query id: f859ead8-ae9c-4c31-97ab-4757eb3b5437

Ok.

0 rows in set. Elapsed: 17.870 sec. Processed 150.00 thousand rows, 2.39 GB (8.39 thousand rows/s., 133.98 MB/s.)

28 + 17 = 45 > 41, so the two-step insert is not faster here.

@alexey-milovidov
Member

Also checked on c6g.16xlarge Graviton2 machine just in case:
https://pastila.nl/?0076dba6/a1d22232b9e5567ed77459030cd844e8

There, everything is just fine as well, and the INSERT times almost match (12 seconds via clickhouse-client and 12 seconds for INSERT SELECT).

@HeenaBansal2009
Contributor

HeenaBansal2009 commented May 30, 2022

Hi @alexey-milovidov,
I have a few follow-up questions about the behaviour above.

  • Was the above test executed on a single-node cluster? We are seeing the issue on a multi-node cluster setup.
  • The user time is almost 8 times higher for Array(LowCardinality(String)) insertion than for Array(String) insertion.

@alexey-milovidov
Member

The query log only shows the time spent on the server side. It does not show the time spent by the client parsing the data.
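Since the original commands set --log_queries=1 and an explicit --query_id, the server-side time can be read back from system.query_log, e.g. (a sketch using the query_id from the first comment):

SELECT query_id, query_duration_ms
FROM system.query_log
WHERE query_id = 'insert_into_test_arrays_lc' AND type = 'QueryFinish';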

@HeenaBansal2009
Contributor

HeenaBansal2009 commented Jun 8, 2022

Hi @alexey-milovidov,
We observed that the insertion slowdown only occurs with row-based formats like JSONEachRow, RowBinary, and Values.
However, insertion with the Native (column-based) format is better for Array(LowCardinality) than for Array(String).

fyre@belleville-ottawa:~$ cat /home/fyre/Instana_load.bin | time -p clickhouse-client -u default --password=snHuf0ZM -q "INSERT INTO arr_lc_200 FORMAT RowBinary"
real 1.27
user 1.18
sys 0.02
fyre@belleville-ottawa:~$ cat /home/fyre/Instana_load.bin | time -p clickhouse-client -u default --password=snHuf0ZM -q "INSERT INTO arr_str_200 FORMAT RowBinary"
real 0.59
user 0.17
sys 0.06

Since the MergeTree settings are the same within a single client session in both scenarios, is this expected?
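For comparison, the Native path can be exercised with the same data by round-tripping it through the client, e.g. (a sketch; the intermediate file name is illustrative):

clickhouse-client -u default --password=snHuf0ZM -q 'SELECT * FROM arr_str_200 FORMAT Native' > /home/fyre/Instana_load.native
cat /home/fyre/Instana_load.native | time -p clickhouse-client -u default --password=snHuf0ZM -q 'INSERT INTO arr_lc_200 FORMAT Native'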
