Faster delete options #78680
Hello, I am Blathers. I am here to help you get the issue triaged. I was unable to automatically find someone to ping. If we have not gotten back to your issue within a few business days, you can try the following:
🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is otan.
If you use hash-sharded indexes, you can spread time-ordered inserts across the cluster instead of concentrating them on a single range.
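For context, a hash-sharded index spreads sequential keys such as timestamps across multiple ranges instead of one hot range. A minimal sketch, assuming a hypothetical `events` table (the exact bucket-count syntax varies by CockroachDB version):

```sql
-- Hypothetical schema: hash-sharding the timestamp index spreads
-- time-ordered inserts across buckets rather than a single hot range.
CREATE TABLE events (
    id      UUID        DEFAULT gen_random_uuid() PRIMARY KEY,
    ts      TIMESTAMPTZ NOT NULL,
    payload JSONB
);

CREATE INDEX events_ts_idx ON events (ts) USING HASH WITH (bucket_count = 8);
```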
Well, I guess that won't help with already existing data? When you say time-ordered inserts, I would assume it should be possible to delete those quite fast, since the database can efficiently find everything before a certain time?
And the biggest problem is probably the locking and retrying of transactions. Issuing fast delete queries by selecting their IDs first got us deleting only around 20,000 records per ~6 seconds. As far as I know there is also no way to execute big operations, right? By big I mean: delete 1 million records at once and accept blocking the whole table for ~10-40 seconds.
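For reference, the batching approach described here (select a batch of IDs, then delete them) looks roughly like the sketch below; the table, column, cutoff date, and batch size are assumptions:

```sql
-- Minimal sketch of batched deletes against a hypothetical `events` table
-- with an indexed `ts` column. Re-run until it reports 0 rows deleted.
DELETE FROM events
WHERE id IN (
    SELECT id
    FROM events
    WHERE ts < '2020-04-01'
    LIMIT 10000
);
```

CockroachDB also accepts a LIMIT directly on DELETE, which avoids the subquery, but either way each batch runs as its own transaction and still contends with concurrent writers.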
So we ended up renaming the old table, recreating a new one, and inserting the rows we didn't want to delete. Take this thought, for example: in the end, deletion shouldn't be slower than inserting. I can insert at well over 100k rows per second, but can't even delete 10k in one second.
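A rough sketch of that rename-and-copy workaround, assuming the same hypothetical `events` table keyed by a `ts` timestamp; it glosses over secondary indexes, constraints, and writes arriving during the copy:

```sql
-- Rebuild the table, keeping only rows newer than the cutoff.
ALTER TABLE events RENAME TO events_old;
CREATE TABLE events (LIKE events_old INCLUDING ALL);   -- or re-run the original DDL
INSERT INTO events SELECT * FROM events_old WHERE ts >= '2020-04-01';
DROP TABLE events_old;
```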
@wzrdtales thanks for the feedback. Sorry you ran into this; glad you were able to figure out a workaround. We do have some plans to speed up deletes, tracked in a few other open issues.
Regarding your last point: unfortunately, I think it is unlikely that a SQL DELETE will ever be as fast as an INSERT. We have an open issue about supporting DROP PARTITION.
There is a slight difference from the databases you named: they can finish deletes earlier than CockroachDB because, when in doubt, they are allowed to lock the table in their favour, while CockroachDB will just stubbornly restart the transaction, even without much traffic on the table, simply because it takes too long. I will have a look at DROP PARTITION and let you know my opinion.
The locking you're describing is tracked in #50181
@ajwerner not quite; that issue is about row locking and has unfortunately been open for a long time.
@michae2 |
@vy-ton and @kevin-v-ngo tagging you to note that this is another painful deletes issue. DROP PARTITION might have helped in this case. @wzrdtales I'm going to close this issue in favor of tracking DROP PARTITION support in #51295 and improved delete performance in the other issues linked above.
And the locking behavior is at fault as well, not just a missing feature like DROP PARTITION.
Can you help unpack what you meant by the locking behavior being at fault?
Are you suggesting that a full-table lock would be better than row locking of the relevant rows? Is the idea that row locking doesn't capture gaps, and those gaps may contain new writes and thus may invalidate the original scan?
Other DBs allow you to explicitly lock the whole table (a sacrifice the user makes intentionally). CockroachDB does not allow even moderately long-running delete queries: because of locking it forces them to retry, and that is the main reason I see for why it takes so long to delete. CockroachDB is certainly capable of finding the rows to delete faster, and would certainly be faster at getting them deleted if it could just act alone on that table in the meantime, without needing to care about locking at all. DROP PARTITION is more elegant of course, since it would avoid that completely for some scenarios. But we already have one data structure where building proper partitions would be quite a strange setup: once they are older than two years, we delete all the hourly records of a day except the last one (and that day might still be updated multiple weeks later and get a new entry).
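To make that retention rule concrete, here is an illustrative sketch under an assumed `hourly_stats` schema with one row per hour: delete hourly rows older than two years, but keep each day's latest row because the day may still receive a newer entry later.

```sql
-- Illustrative only; the table and column names are assumptions.
-- Deletes hourly rows older than two years while keeping the last row of each
-- day, since that day may still get a newer entry weeks later.
DELETE FROM hourly_stats AS h
WHERE h.ts < now() - INTERVAL '2 years'
  AND h.ts < (
        SELECT max(ts)
        FROM hourly_stats
        WHERE ts::DATE = h.ts::DATE
      );
```

A purely time-based partitioning scheme cannot express "all but the newest row of each day", which is why DROP PARTITION alone would not cover this case.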
Are you aware of the experimental TTL feature we're launching in 22.1? Would that be of interest to you?
No, not aware yet.
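For reference, the row-level TTL feature mentioned above (experimental in 22.1) is configured through a storage parameter. A minimal sketch with a hypothetical table name and interval:

```sql
-- Rows become eligible for background deletion roughly two years after they
-- were inserted or last updated.
ALTER TABLE events SET (ttl_expire_after = '2 years');
```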
Is your feature request related to a problem? Please describe.
Right now, inserting rows is much faster than deleting them. We have a table with 12 billion+ rows and we want to start deleting everything before date XYZ. However, doing this would take forever; it would probably be faster to create a new table with only the wanted entries than to actually delete the rest.
Describe the solution you'd like
Support for large delete statements, if necessary accepting a complete block of the whole table for the duration.
Describe alternatives you've considered
Looping as the docs suggest, but it takes ages.
Jira issue: CRDB-14214