7.3 database migration is slow unless I turn off fsync/fdatasync #4441
Comments
Should it do more batches per database transaction, committing e.g. every 20 seconds instead of every N entries? |
Maybe addressing bloat before migration;
|
@Dmole we're hard at work to downsize and speed up. Starting from Furthermore, if you don't want to store any logs, you can edit your The Lastly, |
Thanks, I noticed the 7.3 improvement; that is why I have the "# or this on V7.3.0+ ..." line. dlcheckpoints and old blocks do not appear to be required in V7.3.0B (bandwidth tokens remain, Tribler works); removing the old blocks dropped the file size from 212MB to 12MB. |
What we store on the blockchain is the change and the running total tokens, something like:
Right now we assume the last delivered block is correct and simply use the running total, so everything works. However, if/once we actually require a consistent chain and you do not have the previous entries, your tokens cannot be verified. In the last example, if you only present the last block your token count would be Note that you can filter your own blocks in the sql query, by not deleting SQL entries which have a |
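To make the difference between trusting the last block and verifying the whole chain concrete, here is a minimal sketch. The `Block` layout, the function names, and the sample numbers are all hypothetical; the real Tribler block format is not shown in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    change: int  # hypothetical field: tokens gained (+) or spent (-) in this block
    total: int   # hypothetical field: running total claimed after the change


def trusted_total(chain: List[Block]) -> int:
    """Behaviour described above: take the last block's running total as-is."""
    return chain[-1].total


def verify_chain(chain: List[Block], start: int = 0) -> bool:
    """Stricter check: each running total must follow from the previous one."""
    balance = start
    for block in chain:
        balance += block.change
        if block.total != balance:
            return False
    return True


# Made-up example data, not taken from a real chain.
full_chain = [Block(+10, 10), Block(-3, 7), Block(+5, 12)]
pruned_chain = full_chain[-1:]  # what remains after deleting the old blocks
```

With this toy data, `trusted_total` gives the same answer for both chains, but `verify_chain` only succeeds for `full_chain`: the pruned chain has no earlier entries to justify its running total.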
When the blockchain is finally enforced in some future version all bandwidth tokens will be reset (as they were in the past) to prevent exploitation (because they will be linked to real $) (right?). |
Indeed, short term it's a non-issue. I'd hate to see anyone in the future blindly copy-pasting that command and voiding their currency though. |
Our goal is indeed to link our bandwidth currency with other currencies, but we are not ready for this yet. |
So short term Tribler can be light on storage, but long term, is there a plan to make the chain scale? Maybe 1. agree on a checkpoint transaction once a year. That scheme should reliably reduce the chain size from O(time*clients) to O(clients), with the limitation that if you don't use the software for a year your history is lost. |
We actually have a similar checkpoint-based system available to us: #2457 The nasty part is that we then need some form of consensus. |
Yeah there are a bunch of related complications that should be worked out before the chain is actually used for anything, but I was just pointing out that getting the chain size down to O(1) should be one of the goals. |
@Dmole , if you remove |
@ichorid that's no loss for me because until Tribler gets HTML channel pages (or a meta search), Tribler channels are useless to me. AKA tor->tpb/rut/etc->tribler is more practical for the masses. |
Most of the discussion here is off-topic anyway. What about SQLite's (or another database's) synchronisation and transactions? I expect it to be a rather easy fix, something like |
@vi , unfortunately, if we disable synchronization, the DB will become corrupted in case of a sudden power outage. Fixing this will require us to add recovery procedures, which can add to the code complexity. |
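For readers unfamiliar with the knob being discussed: in SQLite, disk synchronization is controlled per connection by `PRAGMA synchronous`. The sketch below only illustrates that pragma through Python's standard `sqlite3` module; it is not necessarily the change proposed above, and `set_sync` is a hypothetical helper name.

```python
import sqlite3


def set_sync(conn: sqlite3.Connection, mode: str) -> int:
    """Set PRAGMA synchronous and return the numeric level SQLite reports.

    SQLite reports 0 for OFF, 1 for NORMAL and 2 for FULL.
    The pragma must be issued outside of an open transaction.
    """
    if mode not in ("OFF", "NORMAL", "FULL"):
        raise ValueError(f"unsupported mode: {mode}")
    # The mode comes from the fixed allow-list above, so formatting it in is safe.
    conn.execute(f"PRAGMA synchronous = {mode}")
    return conn.execute("PRAGMA synchronous").fetchone()[0]
```

With `OFF`, SQLite hands writes to the operating system without waiting for them to reach the disk, which is where both the speed-up and the power-outage risk come from.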
Adding
might be a good idea anyway to mitigate regressions (would have helped historically). Manual restore is better than nothing. @vi, how big is/was your .Tribler folder? (unbounded space is as bad as unbounded time) |
@ichorid , Just increase batch size (or do multiple batches per database transaction), so it takes around 30 seconds. It would preserve syncs, but make them rarer. Currently on HDD sync time dominates actual calculations. |
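The "keep syncs, but make them rarer" idea can be sketched as a time-based commit: rows are inserted continuously, but the transaction is only committed once a given interval has passed. This is an illustration under assumptions, not Tribler's migration code; the table name `items`, the function name, and the `clock` parameter are hypothetical.

```python
import sqlite3
import time


def insert_with_timed_commits(conn, rows, commit_interval=20.0, clock=time.monotonic):
    """Insert (id, payload) rows, committing at most once per `commit_interval` seconds.

    Returns the number of commits performed, including the final one.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    commits = 0
    last_commit = clock()
    for row in rows:
        conn.execute("INSERT INTO items (id, payload) VALUES (?, ?)", row)
        if clock() - last_commit >= commit_interval:
            conn.commit()  # this is the point where SQLite syncs to disk
            commits += 1
            last_commit = clock()
    conn.commit()  # flush whatever is left in the last partial interval
    return commits + 1
```

The trade-off is that a crash loses at most the last `commit_interval` seconds of work, while the number of disk syncs no longer grows with the number of batches.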
After migration:
Heavy-weight directories are symlinked to separate xfs filesystem. |
@vi , the batch size is determined dynamically, so processing of each batch will never take more than 0.5 seconds. When we increase the batch size so that a batch lasts more than 0.5 seconds, the Twisted reactor becomes unresponsive, network packets get dropped, etc. Migrating a 700 MB Tribler database on a fast SSD takes about 2 hours in background mode (with <0.5 sec batches), and only 10 minutes if everything is processed offline in a single batch. So, it is a tough choice: do we force our users to wait for 10-∞ minutes staring at the "Spinning Gears" screen, or allow them to use the new version of Tribler instantly, but with an annoying "Converting" message for 2-∞ hours? |
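One simple way to keep each batch under a time budget is to rescale the batch size by the ratio of the target duration to the measured duration. The sketch below shows that idea only; the actual adjustment algorithm used by Tribler is not given in this thread, and the function name and bounds are assumptions.

```python
def next_batch_size(current, elapsed, target=0.5, min_size=1, max_size=100_000):
    """Pick the next batch size so that a batch takes roughly `target` seconds.

    `current` is the size of the batch just processed and `elapsed` is how
    long it took in seconds. The result is clamped to [min_size, max_size].
    """
    if elapsed <= 0:
        # Too fast to measure: grow, but stay within the upper bound.
        return min(current * 2, max_size)
    scaled = int(current * target / elapsed)
    return max(min_size, min(scaled, max_size))
```

For instance, a batch of 100 entries that took 1.0 second would be followed by a batch of 50, while one that took 0.25 seconds would be followed by a batch of 200. On a slow disk where every commit is dominated by sync time, this kind of controller shrinks the batch toward `min_size`, which matches the slowdown reported in this issue.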
There are other choices (backup + no-sync, or trim bloat first). I thought collected_torrents was dead #3960 |
Those 2 hours can actually span multiple days. And it is bad if Tribler can't migrate the database overnight. One way is to measure sync time and adjust the batch size, so that sync time does not dominate.
Can't it be done in background? Or enlarge batches if there are no user interactions for a while.
Bad idea. Without a progress bar it would look as if Tribler simply does not work. |
What about using two SQLite databases for the migration? Until the migration is done, the new database is in unsynced, unsafe mode. In case of corruption due to a surprise shutdown, the migration can simply be restarted from the old database. Upon completion, the new database replaces the old one and syncing is enabled on it. |
It indeed looks unused. Is it safe to delete, or may its content be used for tests, etc.? |
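The two-database proposal can be sketched as follows: write into a temporary file with synchronization off, discard that file if a previous run was interrupted, and only move it into place after the data has been flushed. This is a sketch under assumptions, not Tribler's implementation; `migrate_unsynced`, the `.migrating` suffix, and the `convert` callback are hypothetical names.

```python
import os
import pathlib
import sqlite3


def migrate_unsynced(old_path, new_path, convert):
    """Convert old_path into new_path via an unsynced temporary database.

    `convert(old_conn, new_conn)` is a caller-supplied (hypothetical) function
    that reads from the old database and writes to the new one.
    """
    tmp_path = new_path + ".migrating"
    if os.path.exists(tmp_path):
        # Leftover from an interrupted run: it may be corrupt, so start over.
        os.remove(tmp_path)

    # Open the old database read-only so it can never be damaged.
    old_uri = pathlib.Path(old_path).resolve().as_uri() + "?mode=ro"
    old = sqlite3.connect(old_uri, uri=True)
    new = sqlite3.connect(tmp_path)
    try:
        new.execute("PRAGMA synchronous = OFF")  # fast, but unsafe on power loss
        convert(old, new)
        new.commit()
    finally:
        old.close()
        new.close()

    # Force the finished file to disk once, then put it in place.
    fd = os.open(tmp_path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    os.replace(tmp_path, new_path)
```

The next connection opened on `new_path` uses SQLite's default synchronization level again, since the pragma is a per-connection setting.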
@Dmole , collected_torrents are dead, indeed. We only convert the bare minimum metadata from
It is already done in a background thread. The problem is, writing to the database locks it for the main thread.
As I have written above, we already dynamically adjust the batch size so it gets to the maximum size that does not affect other stuff running on the reactor. Targeting for efficiency instead is useless, since if the efficiency threshold is higher than the interactivity threshold, Tribler core becomes unresponsive, and if it is less than the interactivity threshold - the interactivity threshold will still offer more efficiency.
This is exactly how it works now: we open the old database read-only, and just create the corresponding entries in the new one. Unfortunately, the process is dominated by disk synchronization. If we turn off the synchronization, we face the possibility of corruption of new DB on power outage. In addition, we'll have to force Tribler restart at the end of the conversion process to turn it on again, which we would rather avoid. As I told you, the trade-offs are complex, and we have not decided yet how to do this. Basically, there are two ways of doing it:
|
Until the old read-only database is deleted, the new database is not critical: corruption can be handled by restarting the migration. Anyway, it would be nice to have a "turbo mode" button on the migration banner for turning off synchronisation (after accepting a confirmation that warns about power outages). This is how I personally handled it (using external means to turn off sync). |
Maybe we should really opt for "offline migration + progress bar + skip button" thing. I would like to hear other team members' opinion on that. @qstokkink, @xoriole , @devos50 ? |
Online is better. ichorid, vi and I used 2 different solutions that are not on your "straw man" alternatives list; others will likely encounter this issue when 7.3 leaves beta, so if it's too much trouble to improve the code, it may be helpful to list the alternatives in the release notes. |
I'm not sure how easy it is to opt out of syncing externally on Windows and Mac. |
@Dmole , you are definitely right that we should put a comment about the alternative options in the release notes. However, we would like to come to a solution that will be acceptable for the majority of our users, who are non-programmers. We cannot rely on them reading (and understanding!) the documentation. What looks like a "straw man" to a technical person can look quite different to an ordinary user. |
By the way, would it be a good idea to ask for confirmation before migrating the database to the new version when starting a non-release (devel or beta) Tribler? Declining could simply quit. |
@vi , we want our beta to be as close as possible to a real end-user experience. One of the more controversial questions was just that: the unavoidable migration procedure. And, in this topic, we're getting the necessary feedback. |
Tribler version/branch+revision:
release-7.3.0-beta1 branch, 65d974c
Operating system and version:
Linux
Steps to reproduce the behavior:
Use 7.2 and earlier for a while, then run 7.3.
Expected behavior:
Database migration is fast enough to complete overnight, even on HDD.
Actual behavior:
It converts only 0.7 batches per second and seems to fully synchronize to disk after every batch.
Disabling filesystem sync globally or using
eatmydata
bumps the speed up to 60 batches per second.