-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Padding for IO buffers. #2980
Padding for IO buffers. #2980
Conversation
dbms/src/Common/PODArray.h
Outdated
insert_assume_reserved(from_begin, from_end); | ||
} | ||
|
||
/// Works under assumption, that it's possible to read up to 15 excessive bytes after `from_end` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if the PODArray is padded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
dbms/src/IO/ReadHelpers.cpp
Outdated
@@ -177,7 +189,10 @@ void readStringInto(Vector & s, ReadBuffer & buf) | |||
{ | |||
char * next_pos = find_first_symbols<'\t', '\n'>(buf.position(), buf.buffer().end()); | |||
|
|||
appendToStringOrVector(s, buf.position(), next_pos); | |||
if (buf.isPadded()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can put it inside one another inline function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is indeed a chance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, recent revisions to FunctionComparisor make g++ 8.2 mad. It takes 4 minutes to compile the translation unit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, something is over-complicated in implementation of DECIMAL data type. We will address it.
insert_assume_reserved(from_begin, from_end); | ||
} | ||
|
||
/// Works under assumption, that it's possible to read up to 15 excessive bytes after `from_end` and this PODArray is padded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can enable this method only if the PODArray has enough padding.
By static_assert
, for example.
PS. We can name this method insertSmallAllowReadWriteOverflow15
instead of insertSmall
to be more precise.
Testing data ``` select 'aaaaaaaa','bbbbbbbb','cccccccc','dddddddd','eeeeeeee','ffffffff','gggg','hhh' from numbers(3000000) into outfile '/tmp/test.tsv' ``` Testing command ``` echo "select count() from file('/tmp/test.tsv', CSV, 'a String, b String, c String, d String, e String, f String, g String, h String') where not ignore(e)" | clickhouse-benchmark ``` TSV parser has less overhead than CSV, using it would better unveil the benefits of memcpySmall. Before ``` QPS: 1.662, RPS: 4985463.906, MiB/s: 603.823, result RPS: 1.662, result MiB/s: 0.000. 0.000% 0.559 sec. 10.000% 0.564 sec. 20.000% 0.568 sec. 30.000% 0.572 sec. 40.000% 0.575 sec. 50.000% 0.581 sec. 60.000% 0.592 sec. 70.000% 0.624 sec. 80.000% 0.639 sec. 90.000% 0.664 sec. 95.000% 0.686 sec. 99.000% 0.711 sec. 99.900% 0.715 sec. 99.990% 0.716 sec. ``` After ``` QPS: 1.861, RPS: 5582303.107, MiB/s: 676.110, result RPS: 1.861, result MiB/s: 0.000. 0.000% 0.510 sec. 10.000% 0.514 sec. 20.000% 0.517 sec. 30.000% 0.521 sec. 40.000% 0.523 sec. 50.000% 0.527 sec. 60.000% 0.530 sec. 70.000% 0.539 sec. 80.000% 0.558 sec. 90.000% 0.584 sec. 95.000% 0.589 sec. 99.000% 0.608 sec. 99.900% 0.655 sec. 99.990% 0.663 sec. ```
Ok. But the code in |
Testing data
Testing command
TSV parser has less overhead than CSV, using it would better unveil the benefits of memcpySmall.
Before
After
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en