Better database for Tribler/ Prevent Trustchain exit nodes wiping out #4471

grimadas · 2019-04-25T13:22:29Z

Current problems for scalability

The current database does not scale well when handling large numbers of Trustchain records. Beyond a certain number of records, we see performance degrade on exit nodes.
One solution is to wipe the database and start over, but that results in loss of identity and reputation.
Alternatively, we can improve the underlying Trustchain database. I think a good start is to look at the workload and see how other databases handle it.

Abstract Database module

Currently we have at least three different entry points to the database, with hardcoded SQL queries in the codebase: Upgrade, IPv8/Database.py, and Metastore, which uses pony-orm.
To simplify migration from one database to another, and so improve scalability and latency, we need to abstract database access (we might want to look into graph or key/value databases in the future).

A good starting point would be a database adapter and an sqlite implementation of it.
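
A minimal sketch of what such an adapter could look like, with an sqlite implementation behind it. The interface, the class names (`BlockStore`, `SqliteBlockStore`) and the table layout are assumptions made for illustration; they are not the existing IPv8 schema or API.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class BlockStore(ABC):
    """Hypothetical adapter interface: callers never see SQL."""

    @abstractmethod
    def put(self, public_key: bytes, sequence_number: int, payload: bytes) -> None:
        ...

    @abstractmethod
    def get(self, public_key: bytes, sequence_number: int) -> Optional[bytes]:
        ...

    @abstractmethod
    def latest_sequence_number(self, public_key: bytes) -> Optional[int]:
        ...


class SqliteBlockStore(BlockStore):
    """sqlite-backed implementation; the schema here is illustrative only."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS blocks ("
            " public_key BLOB NOT NULL,"
            " sequence_number INTEGER NOT NULL,"
            " payload BLOB NOT NULL,"
            " PRIMARY KEY (public_key, sequence_number))"
        )

    def put(self, public_key, sequence_number, payload):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO blocks VALUES (?, ?, ?)",
                (public_key, sequence_number, payload),
            )

    def get(self, public_key, sequence_number):
        row = self._conn.execute(
            "SELECT payload FROM blocks"
            " WHERE public_key = ? AND sequence_number = ?",
            (public_key, sequence_number),
        ).fetchone()
        return row[0] if row else None

    def latest_sequence_number(self, public_key):
        # MAX() over zero rows yields a single row containing NULL (None).
        row = self._conn.execute(
            "SELECT MAX(sequence_number) FROM blocks WHERE public_key = ?",
            (public_key,),
        ).fetchone()
        return row[0]
```

With this shape, a key/value or graph backend would only need to implement the same three methods, and the calling code would stay unchanged.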

ichorid · 2019-04-25T16:14:55Z

I doubt it will be possible to abstract out Metadata Store access in the foreseeable future. PonyORM already provides a good level of abstraction. However, Trustchain can benefit greatly from switching to some NoSQL-based store (e.g. LevelDB). Building an abstraction level for storing blocks could become a nice step in that direction.

@qstokkink , @devos50, do any components of Tribler beside Trustchain use IPv8/Database.py?

devos50 · 2019-04-25T16:37:48Z

@ichorid No, I don't think so.

qstokkink · 2019-04-26T07:34:47Z

The attestation community has its own Database + database schema.

ichorid · 2019-04-26T08:33:57Z

@qstokkink , would you consider moving the attestation community to use Pony?

qstokkink · 2019-04-26T09:07:17Z

@ichorid I just went through the code with @grimadas, doesn't seem like a good idea right now.

grimadas · 2019-04-26T09:57:01Z

I'll first do a quick and dirty check of whether it is worth migrating to a database other than sqlite, and proceed from there.

synctext · 2019-04-30T12:25:48Z

We require DB embedding! As just discussed: our choice of database technology is significantly limited by our requirement that everything is bundled within our installer. We can't rely on big servers with dedicated database services; our DB runs locally and competes for resources.

grimadas · 2019-04-30T14:18:24Z

These are benchmarking results for the database behind the Tribler explorer.

Figure 1 shows the average execution time per query. One query, get_block_creation, takes more than 40 seconds.

Figure 2 shows the number of times each query was executed.

Figure 3 is the total time for the experiment.
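
For reference, a minimal sketch of how per-query numbers of this kind (average time, call count, total time) could be collected around sqlite calls. `QueryStats` and its methods are hypothetical names; this is not the instrumentation that produced the figures.

```python
import sqlite3
import time
from collections import defaultdict


class QueryStats:
    """Hypothetical helper that records call count and total time per query name."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def timed(self, name, conn, sql, params=()):
        """Run the query, fetch all rows, and record the elapsed wall-clock time."""
        start = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        self.total[name] += time.perf_counter() - start
        self.count[name] += 1
        return rows

    def average(self, name):
        """Average seconds per call, or 0.0 if the query was never run."""
        calls = self.count.get(name, 0)
        return self.total.get(name, 0.0) / calls if calls else 0.0
```

Usage would be along the lines of `stats.timed("get_block_creation", conn, sql)` at each call site, followed by reading `stats.count`, `stats.total` and `stats.average(...)` at the end of the run.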

ichorid · 2019-04-30T14:55:09Z

Block explorer is a special case. Could you try to capture a typical client Tribler workload session and build the same charts for it?

qstokkink · 2019-04-30T15:01:40Z

Maybe it would make sense then for the block explorer to use a different database type. As this is not used by our users anyway, we can change this without any major repercussions (we'll just have to sit through the database conversion).

I also agree with @ichorid, this tells us nothing about the Tribler user load.

grimadas · 2019-04-30T15:40:07Z

I'll run local experiments now, imitating the typical workload of a Tribler user: idle, downloading, etc.
Also, an exit node is another type of user, with a heavy workload.

grimadas · 2019-05-09T08:19:15Z

Some results from the exit node. It seems that database queries are not the bottleneck.

grimadas · 2019-05-09T09:55:55Z

devos50 · 2019-09-08T10:04:05Z

I made some modifications to the TrustChain crawler. First, I increased the statistics maintenance interval to one hour, which means that the block creation statistics (the graph) are rebuilt every hour instead of every five minutes. Aggregation of statistics is a major bottleneck as it executes a resource-intensive SQL query.

Second, I started to explore whether a key-value store can be used to improve the performance of the TrustChain crawler. Last week I was told that there was some initial work on rewriting the TrustChain persistence layer to use a key-value store (we have also received various questions about why we are not using one already). @grimadas do you have some initial code/results for this?

grimadas · 2019-09-08T18:54:42Z

There are several options with good Python bindings, but I guess one of the best matches for us is lmdb. It is pretty easy to use and supports async read/write with multiple threads.
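
A minimal sketch of a key layout that would suit an ordered key-value store such as lmdb, which keeps keys sorted in byte order by default. To stay self-contained it uses an in-memory sorted list as a stand-in rather than lmdb itself. The names `block_key`, `SortedKV` and `chain` are hypothetical, and fixed-length public keys are an assumption (with variable-length keys, one key could be a prefix of another).

```python
import bisect
import struct


def block_key(public_key: bytes, sequence_number: int) -> bytes:
    """Key = public key + 8-byte big-endian sequence number.

    Big-endian encoding makes byte order match numeric order, so one
    peer's blocks sit next to each other in chain order.
    """
    return public_key + struct.pack(">Q", sequence_number)


class SortedKV:
    """In-memory stand-in for an ordered key-value store (illustration only)."""

    def __init__(self):
        self._keys = []    # kept sorted in byte order
        self._values = {}

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix: bytes):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            yield key, self._values[key]
            i += 1


def chain(store: SortedKV, public_key: bytes) -> list:
    """All block payloads of one peer, ordered by sequence number."""
    return [value for _, value in store.scan_prefix(public_key)]
```

With this layout, fetching a peer's whole chain is a single range scan instead of an SQL query with a sort.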

synctext · 2019-09-08T19:59:51Z

Is multi-core crawling possible in the near future?

ichorid · 2019-09-09T16:38:46Z

This is a good target for the 7.5 release, not for the technical 7.4.

qstokkink · 2024-08-29T10:02:17Z

Looking at this issue, I think it is no longer applicable for the current state of affairs: exit nodes no longer run any database.

devos50 added the type: enhancement label Apr 25, 2019

devos50 added this to the V7.4: Libtorrent wrapper refactor milestone Apr 25, 2019

ichorid added the needs discussion label Apr 26, 2019

grimadas changed the title ~~Database abstraction for Tribler~~ Better database for Tribler/ Prevent Trustchain exit nodes whipping out Apr 26, 2019

grimadas changed the title ~~Better database for Tribler/ Prevent Trustchain exit nodes whipping out~~ Better database for Tribler/ Prevent Trustchain exit nodes wiping out Apr 26, 2019

synctext assigned grimadas Apr 30, 2019

synctext mentioned this issue Apr 30, 2019

Trustchain scalability and stess testing experiment #4140

Closed

ichorid modified the milestones: V7.4: Python 3, V7.5: core refactoring Sep 9, 2019

ichorid modified the milestones: V7.5: core refactoring, V7.6: Collective authoring Mar 2, 2020

ichorid modified the milestones: V7.6: Faster, Quicker and more Responsive, Next-next release Jun 11, 2020

drew2a added the was in next-next label Nov 4, 2020

drew2a modified the milestones: Next-next release, Backlog Nov 4, 2020

drew2a added component: token economy and removed was in next-next labels Jan 15, 2021

qstokkink removed this from the Backlog milestone Aug 23, 2024

qstokkink closed this as completed Aug 29, 2024
