-
-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
query optimisations for massive libraries #478
Comments
wow 3M tracks! can you try run with bba1a80 and see if you have any slow db queries? if a query takes longer than 5 seconds the http handler with timeout |
Thanks for your quick response! It looks I'm hitting that 5s timeout:
|
I did a quick test increasing timeout vars in cmd/gonic/gonic.go:
.. and so far looks good. Not sure if it's the correct way to fix it. |
ah looks like the artist info query is taking 10.94 seconds maybe i can optimise that query. would you mind emailing me a copy of your database? you can remove personal info from it like
then email me the backup |
Could it be a cache miss and then a delay from last.fm? I guess probably not if it happens to manifest for the guy with 3M tracks but I thought I'd throw it out there. |
@brian-doherty it seems to be the query used to find the next artist info that needs to be refreshed Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0 you can see the query is taking 10sec (10935.083355ms), in that time the database is locked so other requests will timeout waiting for the db to be unlocked |
does anyone know how we could remove the scan on artists? the query runs every N seconds to find artists with no or stale info so needs to be pretty snappy |
Scan or not, when you run this query you're potentially asking for a very large number of rows very frequently, even though you're keeping the info cached for 30 days. Would it work better if you asked it to only return the n oldest, and handled those, and left the rest for the next refresh cycle? |
that query should just return one row (there a LIMIT 1 on the end) and in the application it's a First() |
My bad. Maybe you need an index on updated_at? Then it can go straight to the record with the lowest value instead of searching for one that's below. |
that's what i was thinking too, there is an index there on line 7, still no luck |
i've also asked on the ##sqlite IRC channel on libera, they're usually quite helpful in there |
What if you order by updated_at, instead of id? Then it doesn't have to search at all? |
ooh it seems to be different. at least the scan is a bit lower. dunno if that's better https://www.db-fiddle.com/f/miGsKWi3oZBBdjkejeAdgM/3 i can test it though if @cascooscuro sends the DB |
You could also structure it as two queries -- do an empty one if its there,
otherwise do the oldest if it's old enough.
…On Tue, Mar 5, 2024, 5:14 PM Senan Kelly ***@***.***> wrote:
ooh it seems to be different. at least the scan is a bit lower. dunno if
that's better https://www.db-fiddle.com/f/miGsKWi3oZBBdjkejeAdgM/3
i can test it though if @cascooscuro <https://github.com/cascooscuro>
sends the DB
—
Reply to this email directly, view it on GitHub
<#478 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ASFD42KCWZIE65VIX7Z446LYWZGUDAVCNFSM6AAAAABEFAS6QCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZZG44TMOJSGM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
DB file shared. Thanks for your help! |
haha i just tested it, it's actually quite quick Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0 the log output is a bit silly, this is actually |
identified two more slow queries on @cascooscuro's database that are near the 5s timeout
|
Now those are queries I'd expect to be slow. :D Start caching the sums and counts in the DB? |
could help but seems like the order by is the biggest offender, by far this is after adding an index on created_at |
From my limited experience with Sqlite, it can be quite dumb when doing joins. With indices the query should be fast, I suspect it's not using them - it should help to rewrite it using subqueries (i.e. first select the album ids explicitly, then do the joins). |
@lomereiter investigations and/or pull requests very welcome! if you like I can provide a large database for testing |
I did a test on my minuscule DB, turns out that both indexes (on albums.created_at and tracks.album_id) can help if a subquery is used to help the query planner. I wonder what the difference is like for the large database, not sure if it's worth sharing the file, you could just run this one query locally and see if it helps.
Compare the query plans:
vs
|
The first time I run it:
and then:
and the "after" query:
The results are consistent in what I see with gonic. The first time I click it may timeout, but after that it "warms" and the response is much quicker. |
@cascooscuro interesting, I suppose what you call "warm-up" is reading from disk. If you don't mind please share the DB (send an email to [email protected]) |
the query plans:
|
Ah, I see, you should have added the indices first, as in the fiddle:
Anyway, I think a single index on tracks is good enough, as albums can be sorted by many other fields - see the PR. |
would this be closed by #508 ? |
yes, it has improved quite a lot. Thanks |
gonic version: v0.16.2
from source: 80a9aeb..88e58c0 master-> origin/master
running behind ngnix proxy
When browsing artists/album or when starting a new stream the response timeouts and have to click again, this second time the server answers instantly. There are no issues with the streams that are already playing.
This issue looks similar to #411 but I'm not using docker/jukebox.
The only thing I see in the logs are:
Mar 04 12:18:08 server gonic[3420077]: 2024/03/04 12:18:08 error writing subsonic response: write tcp 10.11.12.13:4747->10.11.12.13:35128: i/o timeout
Mar 04 12:18:08 server gonic[3420077]: 2024/03/04 12:18:08 response 200 GET /getMusicDirectory?c=Jamstash&callback=angular.callbacks._1c&f=jsonp&id=al-99067&p=enc:123456789&u=user&v=1.6.0
My library is somewhat extreme: 3M tracks, 300K folders.
Thanks for your help and for this great project.
The text was updated successfully, but these errors were encountered: