Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli/sql: put a limit on history size #88173

Merged
merged 1 commit into from
Sep 20, 2022
Merged

Conversation

knz
Copy link
Contributor

@knz knz commented Sep 19, 2022

Fixes #54679.

Previously, there was no limit. Some users managed to make their history run into megabyte-size, despite de-duplication, which was causing slowness.

This patch fixes it by adding a limit of 1000 entries. Sufficiently large to not be inconvenient, but sufficiently small that it prevents the history file from growing abnormally large.

Release note (cli change): The interactive SQL shell now retains a maximum of 1000 entries. There was no limit previously.

@knz knz requested review from DrewKimball and a team September 19, 2022 20:54
@knz knz requested a review from a team as a code owner September 19, 2022 20:54
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jordanlewis
Copy link
Member

Thank you. Can you add a bytes limit too please if it's a bytes limit issues?

@knz
Copy link
Contributor Author

knz commented Sep 19, 2022

Can you add a bytes limit too please if it's a bytes limit issues?

No. There's no such thing inside libedit.

No interest in extending this now we have #86457 in the pipeline.

@knz
Copy link
Contributor Author

knz commented Sep 19, 2022

FWIW shells (bash, zsh etc) also use a line count, not a byte count.

Previously, there was no limit. Some users managed to make their
history run into megabyte-size, despite de-duplication, which was
causing slowness.

This patch fixes it by adding a limit of 1000 entries. Sufficiently
large to not be inconvenient, but sufficiently small that it prevents
the history file from growing abnormally large.

Release note (cli change): The interactive SQL shell now retains a
maximum of 1000 entries. There was no limit previously.
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the de-duplication is entirely working as expected - I'm able to grow the log up to 100KB by just running tpch Q2 many times (and could probably grow further if I kept going). Running the same query with different limits grows the log even faster - I got up to ~10MB that way - and running completely different queries would probably grow it faster still. This change truncates the later case down to <2MB, so I think it's a good improvement even if it doesn't totally fix the problem.
:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@knz
Copy link
Contributor Author

knz commented Sep 20, 2022

I'm able to grow the log up to 100KB by just running tpch Q2 many times (and could probably grow further if I kept going)

That does not make sense to me. When you run a SQL input file with the shell, no entry should be added to the history!

Can you show us what is the precise command you run?

@DrewKimball
Copy link
Collaborator

That does not make sense to me. When you run a SQL input file with the shell, no entry should be added to the history!

I was copy-pasting into the interactive session rather than using -f or -e; is that what you mean? I agree that it shouldn't be possible to grow the log otherwise.

Can you show us what is the precise command you run?

Now that I'm trying again, it doesn't always happen - running the same query repeatedly and waiting for each previous one to finish before pasting the next and hitting enter does not cause the log to grow, as expected. However, I found two cases that do cause it to grow: pasting and hitting enter before the previous run returns, and alternating between two queries.

For the first case it was easier to do tpch Q5 than Q2 because it takes longer to return (running against a tpch db obviously):

SELECT
    n_name,
    sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND l_suppkey = s_suppkey
    AND c_nationkey = s_nationkey
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
    AND r_name = 'ASIA'
    AND o_orderDATE >= DATE '1994-01-01'
    AND o_orderDATE < DATE '1994-01-01' + INTERVAL '1' YEAR
GROUP BY
    n_name
ORDER BY
    revenue DESC;

For the second I just alternated between SELECT 1; and SELECT 2;.

Both cases cause a consistent increase in log size on each iteration. The second case seems more serious than the first, since it seems to indicate that the de-duplication isn't working over the whole history as expected. Maybe de-duplication over the entire history only happens occasionally, and the trigger condition isn't being met? I did notice during some of the tests that the log size would shrink significantly before growing again, but it didn't seem consistent or predictable...

@DrewKimball
Copy link
Collaborator

For that first case (pasting and running before previous one completes) I was able to reproduce with this statement:

SELECT
pg_sleep(3);

but not this one:

SELECT pg_sleep(3);

Going back through the history with the up arrow shows that the successful reproduction executes the second time looking like this:

SELECT^Mpg_sleep(3);

So I think the first case is just a special case of the second since it causes the query to alternate between a newline and ^.

@knz
Copy link
Contributor Author

knz commented Sep 20, 2022

thanks for explaining!

indeed, the de-dup as implemented only de-dups the last entries. So this explains it.

bors r=DrewKimball

@craig
Copy link
Contributor

craig bot commented Sep 20, 2022

Build failed:

@knz
Copy link
Contributor Author

knz commented Sep 20, 2022

unrelated flake #88249

bors r=DrewKimball

@craig
Copy link
Contributor

craig bot commented Sep 20, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cockroachdb cli client is slower than.. itself
4 participants