Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query optimisations for massive libraries #478

Open
cascooscuro opened this issue Mar 4, 2024 · 30 comments · May be fixed by #508
Open

query optimisations for massive libraries #478

cascooscuro opened this issue Mar 4, 2024 · 30 comments · May be fixed by #508

Comments

@cascooscuro
Copy link

gonic version: v0.16.2
from source: 80a9aeb..88e58c0 master-> origin/master

running behind ngnix proxy

When browsing artists/album or when starting a new stream the response timeouts and have to click again, this second time the server answers instantly. There are no issues with the streams that are already playing.

This issue looks similar to #411 but I'm not using docker/jukebox.

The only thing I see in the logs are:

Mar 04 12:18:08 server gonic[3420077]: 2024/03/04 12:18:08 error writing subsonic response: write tcp 10.11.12.13:4747->10.11.12.13:35128: i/o timeout
Mar 04 12:18:08 server gonic[3420077]: 2024/03/04 12:18:08 response 200 GET /getMusicDirectory?c=Jamstash&callback=angular.callbacks._1c&f=jsonp&id=al-99067&p=enc:123456789&u=user&v=1.6.0

My library is somewhat extreme: 3M tracks, 300K folders.

Thanks for your help and for this great project.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

wow 3M tracks! can you try run with bba1a80 and see if you have any slow db queries? if a query takes longer than 5 seconds the http handler with timeout

@cascooscuro
Copy link
Author

Thanks for your quick response!

It looks I'm hitting that 5s timeout:

Mar 05 19:10:09 server gonic[1522298]: 2024/03/05 19:10:09 response  0  OPTIONS /getMusicDirectory.view?c=Jamstash&callback=angular.callbacks._3z&f=jsonp&id=al-98576&p=enc:12345678901234567&u=user&v=1.6.0
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/db/db.go:1162.088643msSELECT * FROM "users"  WHERE (name=?) ORDER BY "users"."id" ASC LIMIT 1[user] 1
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:761.232106msSELECT * FROM "albums"  WHERE ("albums"."id" = 98576) ORDER BY "albums"."id" ASC LIMIT 1[] 1
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:76789.401µsSELECT * FROM "album_stars"  WHERE ("album_id" IN (?)) AND (user_id=?) ORDER BY "album_stars"."user_id" ASC[98576 2] 0
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:76925.156µsSELECT * FROM "album_ratings"  WHERE ("album_id" IN (?)) AND (user_id=?) ORDER BY "album_ratings"."user_id" ASC[98576 2] 0
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:844.659379491sSELECT * FROM "albums"  WHERE (parent_id=?) ORDER BY albums.right_path COLLATE NOCASE[98576] 6229
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:4493.387870662sSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:1092.421367091sSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:11.632248246 +0100 CET m=-2591903.976856681] 0
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:8436.93328msSELECT * FROM "album_stars"  WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,..<snip>... AND (user_id=?)[102807 103933 105783 106052 106544 98999...
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:846.882454msSELECT * FROM "album_ratings"  WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?..<snip>... AND (user_id=?)[100243 100270 101787 102513 100559 102037 103125 256878 102864 103074 103654 98968 99165 100271 102334 104915 104969 105638 105840 98921 100269 103921...
Mar 05 19:10:14 server gonic[1522298]:  106279 100658 100931 101633 1
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:986.643003428sSELECT * FROM "tracks"  WHERE (album_id=?) ORDER BY "filename"[98576] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:4495.090854123sSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:1091.152361334sSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:19.63362109 +0100 CET m=-2591895.975483814] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:9863.126197msSELECT * FROM "artists" INNER JOIN "album_artists" ON "album_artists"."artist_id" = "artists"."id" WHERE (1 <> 1)[] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:44962.965328msSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:98178.219µsSELECT * FROM "artists" INNER JOIN "track_artists" ON "track_artists"."artist_id" = "artists"."id" WHERE (1 <> 1)[] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_raw.go:261520.187µsSELECT * FROM "transcode_preferences"  WHERE (user_id=?) AND (client COLLATE NOCASE IN (?,?)) ORDER BY client DESC LIMIT 1[2 * Jamstash] 1
Mar 05 19:10:20 server gonic[1522298]: 2024/03/05 19:10:20 error writing subsonic response: write tcp 10.11.12.13:4747->10.11.12.13:45596: i/o timeout
Mar 05 19:10:20 server gonic[1522298]: 2024/03/05 19:10:20 response  200  GET /getMusicDirectory?c=Jamstash&callback=angular.callbacks._3z&f=jsonp&id=al-98576&p=enc:12345678901234567&u=user&v=1.6.0
Mar 05 19:10:25 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:449476.721µsSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0

@cascooscuro
Copy link
Author

I did a quick test increasing timeout vars in cmd/gonic/gonic.go:

        var (
                readTimeout  = 10 * time.Second
                writeTimeout = 10 * time.Second
                idleTimeout  = 10 * time.Second
        )

.. and so far looks good. Not sure if it's the correct way to fix it.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

ah looks like the artist info query is taking 10.94 seconds

maybe i can optimise that query. would you mind emailing me a copy of your database? you can remove personal info from it like

$ cp gonic.db backup.db
$ sqlite3 backup.db "delete from users"

then email me the backup

@brian-doherty
Copy link
Contributor

Could it be a cache miss and then a delay from last.fm? I guess probably not if it happens to manifest for the guy with 3M tracks but I thought I'd throw it out there.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

@brian-doherty it seems to be the query used to find the next artist info that needs to be refreshed

Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0

you can see the query is taking 10sec (10935.083355ms), in that time the database is locked so other requests will timeout waiting for the db to be unlocked

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

also seems like we're doing a scan
image

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

does anyone know how we could remove the scan on artists?
https://www.db-fiddle.com/f/miGsKWi3oZBBdjkejeAdgM/1

the query runs every N seconds to find artists with no or stale info so needs to be pretty snappy

@brian-doherty
Copy link
Contributor

Scan or not, when you run this query you're potentially asking for a very large number of rows very frequently, even though you're keeping the info cached for 30 days. Would it work better if you asked it to only return the n oldest, and handled those, and left the rest for the next refresh cycle?

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

that query should just return one row (there a LIMIT 1 on the end) and in the application it's a First()

@brian-doherty
Copy link
Contributor

My bad. Maybe you need an index on updated_at? Then it can go straight to the record with the lowest value instead of searching for one that's below.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

that's what i was thinking too, there is an index there on line 7, still no luck

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

i've also asked on the ##sqlite IRC channel on libera, they're usually quite helpful in there

@brian-doherty
Copy link
Contributor

What if you order by updated_at, instead of id? Then it doesn't have to search at all?

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

ooh it seems to be different. at least the scan is a bit lower. dunno if that's better https://www.db-fiddle.com/f/miGsKWi3oZBBdjkejeAdgM/3

i can test it though if @cascooscuro sends the DB

@brian-doherty
Copy link
Contributor

brian-doherty commented Mar 6, 2024 via email

@cascooscuro
Copy link
Author

DB file shared. Thanks for your help!

@sentriz
Copy link
Owner

sentriz commented Mar 6, 2024

haha i just tested it, it's actually quite quick

Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0

the log output is a bit silly, this is actually artistinfocache.go:10935.083355ms is actually 35ms on line 109 😁

@sentriz
Copy link
Owner

sentriz commented Mar 6, 2024

identified two more slow queries on @cascooscuro's database that are near the 5s timeout

4.12s   SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration FROM "albums" LEFT JOIN tracks ON tracks.album_id=albums.id JOIN album_artists ON album_artists.album_id=albums.id  GROUP BY albums.id ORDER BY created_at DESC LIMIT 50 OFFSET 0
4.13s   SELECT artists.*, count(albums.id) album_count FROM "artists" JOIN artist_stars ON artist_stars.artist_id=artists.id JOIN artist_appearances ON artist_appearances.artist_id=artists.id JOIN albums ON albums.id=artist_appearances.album_id WHERE (artist_stars.user_id=?) GROUP BY artists.id ORDER BY artist_stars.star_date DESC

@brian-doherty
Copy link
Contributor

Now those are queries I'd expect to be slow. :D Start caching the sums and counts in the DB?

@sentriz
Copy link
Owner

sentriz commented Mar 6, 2024

could help but seems like the order by is the biggest offender, by far

image

this is after adding an index on created_at

https://www.db-fiddle.com/f/vQKEw9wCovnzqCoZTfnjod/1

@sentriz sentriz changed the title streaming keeps working but Artist/Album browsing stops query optimisations for massive libraries Mar 8, 2024
@lomereiter
Copy link
Contributor

From my limited experience with Sqlite, it can be quite dumb when doing joins. With indices the query should be fast, I suspect it's not using them - it should help to rewrite it using subqueries (i.e. first select the album ids explicitly, then do the joins).

@sentriz
Copy link
Owner

sentriz commented May 15, 2024

@lomereiter investigations and/or pull requests very welcome! if you like I can provide a large database for testing

@lomereiter
Copy link
Contributor

lomereiter commented May 16, 2024

I did a test on my minuscule DB, turns out that both indexes (on albums.created_at and tracks.album_id) can help if a subquery is used to help the query planner.

I wonder what the difference is like for the large database, not sure if it's worth sharing the file, you could just run this one query locally and see if it helps.

$ time sqlite3 gonic.db 'SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration 
        FROM albums 
        LEFT JOIN tracks ON tracks.album_id=albums.id 
        JOIN album_artists ON album_artists.album_id=albums.id 
        GROUP BY albums.id 
        ORDER BY created_at DESC 
        LIMIT 50 OFFSET 0;' >before.txt

real    0m0.022s
user    0m0.022s
sys     0m0.000s
$ time sqlite3 gonic.db 'SELECT falbums.*, count(tracks.id) child_count, sum(tracks.length) duration 
        FROM (SELECT albums.* 
                FROM albums 
                JOIN album_artists ON album_artists.album_id=albums.id 
                ORDER BY created_at DESC 
                LIMIT 50 OFFSET 0) falbums 
        LEFT JOIN tracks ON tracks.album_id=falbums.id 
        GROUP BY falbums.id 
        ORDER BY created_at DESC;' >after.txt

real    0m0.004s
user    0m0.000s
sys     0m0.004s
$ diff after.txt before.txt

Compare the query plans:

QUERY PLAN
|--SCAN albums
|--SEARCH tracks USING INDEX idx_test (album_id=?) LEFT-JOIN
|--SEARCH album_artists USING COVERING INDEX idx_album_id_artist_id (album_id=?)
`--USE TEMP B-TREE FOR ORDER BY

vs

QUERY PLAN
|--CO-ROUTINE falbums
|  |--SCAN albums USING INDEX idx_test2
|  `--SEARCH album_artists USING COVERING INDEX idx_album_id_artist_id (album_id=?)
|--SCAN falbums
|--SEARCH tracks USING INDEX idx_test (album_id=?) LEFT-JOIN
|--USE TEMP B-TREE FOR GROUP BY
`--USE TEMP B-TREE FOR ORDER BY

@cascooscuro
Copy link
Author

The first time I run it:

time sqlite3 gonic.db 'SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration
        FROM albums
        LEFT JOIN tracks ON tracks.album_id=albums.id
        JOIN album_artists ON album_artists.album_id=albums.id
        GROUP BY albums.id
        ORDER BY created_at DESC
        LIMIT 50 OFFSET 0;' >before.txt

real    0m6.299s
user    0m2.552s
sys     0m0.960s

and then:

time sqlite3 gonic.db 'SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration
        FROM albums
        LEFT JOIN tracks ON tracks.album_id=albums.id
        JOIN album_artists ON album_artists.album_id=albums.id
        GROUP BY albums.id
        ORDER BY created_at DESC
        LIMIT 50 OFFSET 0;' >before.txt

real    0m1.992s
user    0m1.769s
sys     0m0.198s

and the "after" query:

time sqlite3 gonic.db 'SELECT falbums.*, count(tracks.id) child_count, sum(tracks.length) duration
        FROM (SELECT albums.*
                FROM albums
                JOIN album_artists ON album_artists.album_id=albums.id
                ORDER BY created_at DESC
                LIMIT 50 OFFSET 0) falbums
        LEFT JOIN tracks ON tracks.album_id=falbums.id
        GROUP BY falbums.id
        ORDER BY created_at DESC;' >after.txt

real    0m1.824s
user    0m1.469s
sys     0m0.333s

The results are consistent in what I see with gonic. The first time I click it may timeout, but after that it "warms" and the response is much quicker.

@lomereiter
Copy link
Contributor

@cascooscuro interesting, I suppose what you call "warm-up" is reading from disk. If you don't mind please share the DB (send an email to [email protected])

@lomereiter lomereiter linked a pull request May 25, 2024 that will close this issue
@cascooscuro
Copy link
Author

the query plans:

sqlite3 gonic.db 'EXPLAIN QUERY PLAN SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration
        FROM albums
        LEFT JOIN tracks ON tracks.album_id=albums.id
        JOIN album_artists ON album_artists.album_id=albums.id
        GROUP BY albums.id
        ORDER BY created_at DESC
        LIMIT 50 OFFSET 0;'
QUERY PLAN
|--SCAN albums
|--SEARCH album_artists USING COVERING INDEX idx_album_id_artist_id (album_id=?)
|--BLOOM FILTER ON tracks (album_id=?)
|--SEARCH tracks USING AUTOMATIC COVERING INDEX (album_id=?) LEFT-JOIN
`--USE TEMP B-TREE FOR ORDER BY
 sqlite3 gonic.db 'EXPLAIN QUERY PLAN SELECT falbums.*, count(tracks.id) child_count, sum(tracks.length) duration
        FROM (SELECT albums.*
                FROM albums
                JOIN album_artists ON album_artists.album_id=albums.id
                ORDER BY created_at DESC
                LIMIT 50 OFFSET 0) falbums
        LEFT JOIN tracks ON tracks.album_id=falbums.id
        GROUP BY falbums.id
        ORDER BY created_at DESC;'
QUERY PLAN
|--CO-ROUTINE falbums
|  |--SCAN album_artists
|  |--SEARCH albums USING INTEGER PRIMARY KEY (rowid=?)
|  `--USE TEMP B-TREE FOR ORDER BY
|--SCAN falbums
|--BLOOM FILTER ON tracks (album_id=?)
|--SEARCH tracks USING AUTOMATIC COVERING INDEX (album_id=?) LEFT-JOIN
|--USE TEMP B-TREE FOR GROUP BY
`--USE TEMP B-TREE FOR ORDER BY

@lomereiter
Copy link
Contributor

Ah, I see, you should have added the indices first, as in the fiddle:

CREATE INDEX idx_test on tracks (album_id);
CREATE INDEX idx_test_2 on albums (created_at);

Anyway, I think a single index on tracks is good enough, as albums can be sorted by many other fields - see the PR.

@sentriz
Copy link
Owner

sentriz commented Sep 18, 2024

would this be closed by #508 ?

@cascooscuro
Copy link
Author

yes, it has improved quite a lot. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@lomereiter @cascooscuro @sentriz @brian-doherty and others