Additional functions for SQL compatibility #3826

Merged: 26 commits, Dec 20, 2018

blinkov
Contributor

@blinkov blinkov commented Dec 13, 2018

  • left/right
  • trim/ltrim/rtrim
  • timestampadd/timestampsub
  • other interval-related improvements
  • additional case insensitive functions

#3712 #3714 #3704 #3705

func_node->name = "substring";
func_node->arguments = std::move(expr_list_args);
func_node->children.push_back(func_node->arguments);

Member

A separate function should be both simpler and more efficient.

Contributor Author

It turned out that substring with a negative second argument does exactly what right is supposed to do, so I simplified this to an alias.
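
As a rough illustration of the aliasing described here, the C++ sketch below shows why a substring taken from a negative offset yields the rightmost part of a string. The helpers `substringFromEnd` and `rightBytes` are hypothetical names, they count bytes rather than characters, and they assume a count of at least 1; this is not the parser code from the PR.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: mimics substring(s, -n), which starts n bytes
// before the end of the string and runs to the end.
std::string substringFromEnd(const std::string & s, std::size_t n)
{
    if (n >= s.size())
        return s;                   // offset before the start: whole string
    return s.substr(s.size() - n);  // last n bytes
}

// Under that assumption, "right" needs no logic of its own.
std::string rightBytes(const std::string & s, std::size_t n)
{
    return substringFromEnd(s, n);
}
```

For example, `rightBytes("clickhouse", 5)` returns the last five bytes of the input.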

Member

I mean that we need separate functions trim, trimLeft, and trimRight with fairly efficient implementations.
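
To make the request concrete, here is a minimal C++ sketch of dedicated trim functions that scan for the first and last non-space byte instead of going through a regular expression. The `*Sketch` names are hypothetical, only the ASCII space is treated as whitespace, and the code works on a single `std::string` rather than on a column of values, so it should be read as an illustration of the idea and not as the implementation that was merged.

```cpp
#include <cstddef>
#include <string>

// Drop leading spaces: one forward scan, one copy.
std::string trimLeftSketch(const std::string & s)
{
    const std::size_t first = s.find_first_not_of(' ');
    return first == std::string::npos ? std::string() : s.substr(first);
}

// Drop trailing spaces: one backward scan, one copy.
std::string trimRightSketch(const std::string & s)
{
    const std::size_t last = s.find_last_not_of(' ');
    return last == std::string::npos ? std::string() : s.substr(0, last + 1);
}

// Drop spaces on both sides by composing the two helpers.
std::string trimBothSketch(const std::string & s)
{
    return trimRightSketch(trimLeftSketch(s));
}
```

Each helper touches only the bytes it has to skip, which is the main reason a dedicated function can be expected to beat a general-purpose regex replacement.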

Contributor Author

See b7566a8. The timings look roughly like this for 16 GB of text with a lot of whitespace:

    "runs": [
        {
            "min_time": 1.224000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(value)"
        },
        {
            "min_time": 2.019000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(trimLeft(value))"
        },
        {
            "min_time": 1.979000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(trimRight(value))"
        },
        {
            "min_time": 1.991000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(trimBoth(value))"
        },
        {
            "min_time": 5.121000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(replaceRegexpOne(value, '^ *', ''))"
        },
        {
            "min_time": 8.320000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(replaceRegexpOne(value, ' *$', ''))"
        },
        {
            "min_time": 12.282000,
            "query": "SELECT count() FROM whitespaces WHERE NOT ignore(replaceRegexpAll(value, '^ *| *$', ''))"
        }
    ]
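
For readers who want to see what the three regex-based queries compute, the following C++ sketch applies the same patterns with `std::regex`. It is only meant to show that the patterns are equivalent to left, right, and two-sided trimming of spaces; the function names are hypothetical, `std::regex` is a different engine from the one the server uses, and nothing here reproduces the timings listed above.

```cpp
#include <regex>
#include <string>

// Equivalent of replaceRegexpOne(value, '^ *', ''): first match only.
std::string trimLeftRegex(const std::string & s)
{
    static const std::regex pattern("^ *");
    return std::regex_replace(s, pattern, "", std::regex_constants::format_first_only);
}

// Equivalent of replaceRegexpOne(value, ' *$', ''): first match only.
std::string trimRightRegex(const std::string & s)
{
    static const std::regex pattern(" *$");
    return std::regex_replace(s, pattern, "", std::regex_constants::format_first_only);
}

// Equivalent of replaceRegexpAll(value, '^ *| *$', ''): every match.
std::string trimBothRegex(const std::string & s)
{
    static const std::regex pattern("^ *| *$");
    return std::regex_replace(s, pattern, "");
}
```

The trailing-space pattern has to be retried at many starting positions before it can anchor at the end of the string, which may be part of why that query is slower than the leading-space one in the numbers above.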

@blinkov changed the title from "Introduce LEFT/RIGHT functions and extended syntax for TRIM function" to "Additional functions for SQL compatibility" on Dec 19, 2018
@alexey-milovidov merged commit f85857d into master on Dec 20, 2018
@alexey-milovidov
Member

No test for regexpQuoteMeta?

@blinkov
Contributor Author

blinkov commented Dec 20, 2018

@alexey-milovidov there's a test for its implicit call from one of the trim variants
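
As background for why a trim variant would call a quoting function at all: when the characters to trim are supplied by the user and spliced into a regular expression, any regex metacharacters among them have to be escaped first. The C++ sketch below shows that idea under stated assumptions. `quoteMetaSketch` is a hypothetical name, the set of escaped characters is chosen for illustration, and this is not the `regexpQuoteMeta` implementation from the PR.

```cpp
#include <string>

// Hypothetical helper: put a backslash in front of every byte that
// belongs to an assumed set of regex metacharacters.
std::string quoteMetaSketch(const std::string & input)
{
    // Assumption: this set is illustrative, not taken from the PR.
    static const std::string special = "\\|()^$.[]?*+{:-";

    std::string out;
    out.reserve(input.size() * 2);  // worst case: every byte is escaped
    for (char c : input)
    {
        if (special.find(c) != std::string::npos)
            out.push_back('\\');
        out.push_back(c);
    }
    return out;
}
```

Because each input byte produces at most two output bytes, the result is never more than twice the size of the input, so a direct test could compare sizes as well as contents.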

@alexey-milovidov
Member

mtlog-perftest03j.yandex.ru :) SELECT URL FROM hits_1000m_transformed WHERE URL != regexpQuoteMeta(URL) LIMIT 100

SELECT URL
FROM hits_1000m_transformed 
WHERE URL != regexpQuoteMeta(URL)
LIMIT 100

Received exception from server (version 18.14.17):
Code: 241. DB::Exception: Received from localhost:9000, ::1. DB::Exception: Memory limit (for query) exceeded: would use 16.03 GiB (attempt to allocate chunk of 8589934592 bytes), maximum: 9.31 GiB.
