Database performance is abysmal #405
Here is an analysis of the postgres logs using pgBadger. One thing I noticed is that every query selects a lot of columns and passes a lot of selection and sort parameters. http://nextcloud.nutomic.com/index.php/s/m3j8yKcG6zTR9mC

Edit: Query durations seem to be missing in the log analysis for some reason. The most frequent query (actually the second, after …
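Durations are usually missing from pgBadger output when Postgres isn't logging them; a minimal sketch of enabling duration logging (assuming superuser access, Postgres 9.4+, and that a 0 ms threshold is acceptable while profiling):

```sql
-- Log the duration of every statement so pgBadger can report query times.
ALTER SYSTEM SET log_min_duration_statement = 0;
-- Apply the setting without a restart.
SELECT pg_reload_conf();
```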
Okay, I've got this big db imported locally, and am getting some similarly bad results. I'm going to make a script that generates a bunch of EXPLAIN files for the most common queries and their output, and puts them in a folder, so they can be analyzed by http://tatiyants.com/pev/. This is the general command it needs:
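The command itself isn't preserved above; a plausible sketch of what pev expects, an execution plan in JSON form, where the SELECT is only a placeholder for whichever hot query is being analyzed:

```sql
-- Produce a JSON plan that can be pasted into pev.
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT * FROM post_view          -- placeholder query, not the real one
ORDER BY hot_rank DESC
LIMIT 40;
```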
Ideally we'd diff the JSON before and after creating the indexes.
Okay, yeah, this is pretty substantial. I'm getting like 10x better results just by adding some indexes on …
Okay, I've added some indexes with a commit I'll post here in a second, but here's what just adding a few indexes did. To test this, run …
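The exact indexes from that commit aren't shown here, but a hedged sketch of the kind of thing that helps is indexes on the foreign-key columns the views join on (table and column names below are assumptions, not copied from the commit):

```sql
-- Illustrative only; the real commit may use different names and columns.
CREATE INDEX idx_post_community_id ON post (community_id);
CREATE INDEX idx_post_creator_id   ON post (creator_id);
CREATE INDEX idx_comment_post_id   ON comment (post_id);
CREATE INDEX idx_post_like_post_id ON post_like (post_id);
```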
The main thing that needs to be optimized is the … Someone could take a look at the expensive cross joins I'm doing for the views here: https://github.com/dessalines/lemmy/blob/master/server/migrations/2019-12-29-164820_add_avatar/up.sql#L23 They're the only way I could come up with to get basically a single fetch, where you can optionally provide your …
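A hedged illustration of that cross-join pattern (not the actual migration; table names like `user_` and `post_like` are assumptions): every post is paired with every user, so per-user fields such as the viewer's own vote come back in one fetch, at the cost of a very expensive view.

```sql
-- Illustrative shape of the view, not the real definition.
CREATE VIEW post_view AS
SELECT p.*,
       u.id AS user_id,
       coalesce(pl.score, 0) AS my_vote
FROM post p
CROSS JOIN user_ u                      -- one row per (post, user) pair
LEFT JOIN post_like pl
       ON pl.post_id = p.id
      AND pl.user_id = u.id;
```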
So I tried to do a LOT of optimizing today, and could only get the post query down to … The other options would be to make post_view a materialized view, which would mean the whole caching mess and having to periodically refresh that view, or to move the hot sorting out of SQL, which would be my last resort.
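For reference, a rough sketch of what the materialized-view option would look like (names are illustrative; this is not what was merged):

```sql
-- Cache the expensive view and refresh it out of band.
CREATE MATERIALIZED VIEW post_mview AS
SELECT * FROM post_view;

-- CONCURRENTLY avoids blocking readers, but needs a unique index to work.
CREATE UNIQUE INDEX idx_post_mview ON post_mview (id, user_id);

-- This is the periodic refresh (the "caching mess") that would be required.
REFRESH MATERIALIZED VIEW CONCURRENTLY post_mview;
```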
Something weird is going on, maybe with the nginx configs on the servers or something. I'm testing out adding the connection pooling locally, and finding that locally, both before and after adding the proper r2d2 code, I'm getting the same speed of … When I push to dev.lemmy.ml and test there, I get the same before and afters, but it's only …
The only difference is possibly the nginx config.
Is postgres still using 90% of CPU or more while running the benchmark? If so, the queries are still too complex and we need to cache them or something.
I tried this on my dev server, and postgres only goes up to 10% usage, but it's still at only 3 reqs per sec. I'm running the exact same postgres docker image locally too.
I put most of the comments in #411. But we might be able to just close this ticket, since the DB performance is now hundreds of times better, and every major query (front page posts, user searching, comments, and communities) is now < …
I'll close now, but we can re-open if it becomes an issue. At least we have the performance tests in …
Re-opening this issue, as the DB is currently the CPU bottleneck. The best way to figure this out is to analyze the longest-running queries as they happen, and optimize them individually.
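One straightforward way to see those queries while the load is happening is the standard pg_stat_activity view (a generic sketch, nothing Lemmy-specific):

```sql
-- Show everything currently executing, longest-running first.
SELECT pid,
       now() - query_start AS running_for,
       state,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY running_for DESC;
```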
Better to make a new issue; this one is already quite long.
I ran a quick benchmark against the feeds endpoint, and Lemmy only manages to handle 1.19 requests/second (peertube.social does 34 r/s, and is written in TypeScript). I don't know if this is a problem with feeds in particular or with Lemmy in general, as the other APIs are only available over websocket and are much more complicated to test.
I ran the following using ab: …
I took this screenshot on the server at the same time. It is obvious that the database is doing too much work. Maybe we need to add more indexes in the database, or reduce the number of queries.
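Before adding indexes, it may be worth checking which existing ones actually get used; a sketch using the standard pg_stat_user_indexes view (no Lemmy-specific assumptions):

```sql
-- Indexes that are never scanned are candidates for rethinking;
-- hot tables with no frequently-scanned index likely need a new one.
SELECT relname      AS table_name,
       indexrelname AS index_name,
       idx_scan     AS scans
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
```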
Edit: Worth noting that I can't reproduce this problem locally, so it is probably related to the database size.
Edit 2: This also means that any Lemmy instance can be DDoSed with only one request per second.