
Improve Card browser scrolling performance #11889

Closed
oakkitten opened this issue Jul 19, 2022 · 11 comments · Fixed by #16620
Labels
Card Browser · Keep Open (avoids the stale bot)

Comments
@oakkitten
Contributor

oakkitten commented Jul 19, 2022

I think the Card browser's scrolling performance is suboptimal. The UI freezes a lot when you flick up or down, and when I drag the scrollbar I get about 3 FPS.

There seems to be a background task that pre-renders the table as you scroll; it only renders a little more than two screenfuls of rows. I tried making that task render all rows, with these results:

  • There are 19,619 cards in my collection
  • It took the renderer task 61 seconds to complete
  • During that time, GC freed 1501 MB, judging by Logcat messages.

Some of the problems with this are:

  • ListView
  • The Card browser does one SQL request for the initial search to get card ids, and then multiple requests per row. Both card and note objects are created, and both execute SQL in their constructors. (Maybe don't do SQL in constructors??!?!?)
  • Some fields, such as the search field, are not cached in these objects, so another SQL query runs each time a row comes into view.
  • The renderer task calls notifyDataSetChanged() after every card
  • I think it's also possible to have several rendering tasks running alongside each other; not sure what issues this might cause.

I think this can be drastically improved by taking these steps:

  • RecyclerView
  • Run a single query per search. Do not construct card and note objects; instead, request and store only what's needed for display. Use the Cursor from that one query to fetch the rows.
  • Render all cards in the background. I think for a collection like mine the IO shouldn't take more than, say, 300 ms. Of course, it's possible to have a super big collection for which this might not work well. Maybe do something like this:
    • Set up the renderer task so that it renders a specified number of cards from a specified position in a specified direction, e.g. “render up to 10 cards up from position 15”. When the user scrolls, the task renders from the scroll position in the scroll direction, and the main thread can wait() for those cards, or maybe borrow the Cursor for a bit.
    • Or only display up to N cards at a time, and replace the contents when the user reaches the bottom.
    • Or think of something else.
  • Profile and optimize post-IO processing and, if it's too slow, perhaps offload it to another thread.
  • Edit: just use browserRowForId.
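
The "only store what's needed for display, fetch on demand" idea can be sketched as a small, bounded row cache that an adapter could sit on top of. This is plain Kotlin for illustration only; RowCache and fetchRow are made-up names, not AnkiDroid API:

```kotlin
// Illustrative sketch, not AnkiDroid code: a bounded cache of display rows
// that fetches a row on demand (e.g. via a single-row backend lookup) the
// first time a position comes into view, evicting least-recently-used rows.
class RowCache<T>(
    private val maxSize: Int,
    private val fetchRow: (position: Int) -> T
) {
    // An access-ordered LinkedHashMap gives simple LRU eviction.
    private val cache = object : LinkedHashMap<Int, T>(16, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<Int, T>?): Boolean =
            size > maxSize
    }

    operator fun get(position: Int): T = cache.getOrPut(position) { fetchRow(position) }
}
```

An adapter's bind step would then just read rows[position]; only the first access per position pays the fetch cost, and memory stays bounded regardless of collection size.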

Also, the browser is currently reset every time it is opened. If one day it keeps its state, it'd be worth thinking about how it could be updated with new data. E.g. make a search in the browser, go to the reviewer, edit a card there, go back to the search: the card is updated with an animation. Normally this is done by calculating a diff; that probably wouldn't be feasible for large search results.
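
As a rough illustration of a cheaper alternative to a full diff, one could compare the old and new row-id lists positionally and report only the positions whose id changed. A sketch with made-up names, not a concrete proposal for the codebase:

```kotlin
// Sketch: positional comparison of two row-id lists. Unlike a real diff
// algorithm (e.g. Myers), this is O(n) and allocation-light, at the cost
// of treating an insertion as "every position below it changed".
data class ListDelta(
    val changedPositions: List<Int>, // positions whose id differs
    val oldSize: Int,
    val newSize: Int
)

fun positionalDelta(oldIds: LongArray, newIds: LongArray): ListDelta {
    val common = minOf(oldIds.size, newIds.size)
    val changed = (0 until common).filter { oldIds[it] != newIds[it] }
    return ListDelta(changed, oldIds.size, newIds.size)
}
```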

@viciousAegis
Member

In my initial plan I included switching to RecyclerView, and I believe I will be doing that, so we can cross that out 👍🏼

@dae
Contributor

dae commented Jul 20, 2022

Some of the performance problems can likely be solved with col.newBackend.backend.browserRowForId(). It's intended to be used on demand (e.g. with a RecyclerView or similar).

@oakkitten
Contributor Author

col.newBackend.backend.browserRowForId()

I am assuming that it ultimately calls this. I am not sure this will be faster at all. Briefly looking at the code, it seems to construct both the card and the note objects from scratch. Not only might some of these not be needed for the query, but they appear to do extra queries; besides, those queries are not optimized the way cursor IO is supposed to be. So (correct me if I'm wrong) the speed increase here would come from running Rust instead of Java, which would pale next to the IO penalty, which is mostly the same. (That is, if Rust is even faster here; JIT can do miracles!)

@dae
Contributor

dae commented Aug 9, 2022

Creating card/note structures in Rust is reasonably fast, and makes the code simpler and more maintainable than manually deserializing data from rows. I don't think your cursor solution will work: for one, if the user scrolls down very quickly, you'll pay the cost of walking the btree, for the same reason large OFFSETs are inadvisable. And it's a moot point anyway, since AnkiDroid delegates DB access to Rust, and AFAIK that implementation fetches the entire query upfront before paginating it back to the Java layer.

@oakkitten
Contributor Author

My general idea was that the actual IO, if using a cursor or something similar, can be very fast. For instance, this single SQL query is enough to show questions and answers for the search (which is what AnkiDroid shows by default), and it only takes 90 ms in Anki on my ancient laptop; not sure whether it should be faster than my phone:

SELECT cards.id, notes.mid, cards.did, cards.ord, notes.flds, notes.flags, notes.tags 
FROM cards, notes 
WHERE cards.nid = notes.id
AND notes.flds like "%a%"

And even if it takes a lot of time to compute stuff to be displayed, it should be less of an issue, since the data is already in memory and fit for random access, and you can easily distribute computation across the CPUs.
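
The "distribute computation across the CPUs" part can be sketched in plain Kotlin with a fixed thread pool; renderRow below is a stand-in for whatever per-row work is needed, not an AnkiDroid function:

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Sketch: once the raw rows are in memory, fan the CPU-bound per-row work
// out over all cores, preserving the original row order in the result.
fun <T, R> renderInParallel(rows: List<T>, renderRow: (T) -> R): List<R> {
    val pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors())
    try {
        // Submit one task per row; Future results come back in submission
        // order, so the output list matches the input ordering.
        val futures = rows.map { row -> pool.submit(Callable { renderRow(row) }) }
        return futures.map { it.get() }
    } finally {
        pool.shutdown()
    }
}
```

One task per row is wasteful for very cheap work; chunking the list first would amortize the submission overhead, but the ordering argument is the same.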


P.S. I tried running some silly benchmarks in AnkiDroid.

  • First of all, running the above query on my phone takes around 430 ms, which is 4.8 times slower than on my laptop. (My phone is a Xiaomi Mi A2; the laptop is an X220 with an i5 and an Intel 530 SSD.)
  • For the same search, rendering questions from the above query by calling col._renderQA() takes 25 and 30 seconds with the legacy and modern schema, respectively.
  • For the same search, fetching card IDs, then fetching a card and a note for each ID, takes around 8 seconds.
  • For the same search, rendering questions using card and note objects takes 39 and 20 seconds with the legacy and modern schema, respectively.

I would love to benchmark col.newBackend.backend.browserRowForId() but I'm not sure how to call it.

Output of the silly benchmarks:

// legacy_schema=true
#0: found 14682 questions in 8.269301397s
#1: found 14682 questions in 39.042556650s
#2: found 14682 questions in 25.410680396s
#3: found 14682 questions in 427.548116ms

// legacy_schema=false
#0: found 14682 questions in 9.970149379s
#1: found 14682 questions in 19.664074354s
#2: found 14682 questions in 30.226884574s
#3: found 14682 questions in 448.768482ms

P.P.S. In desktop Anki, this takes around 1.5s:

for (id,) in mw.col.db.all("""
    SELECT cards.id FROM cards, notes WHERE cards.nid = notes.id AND notes.flds like "%a%"
"""):
    Card(mw.col, id).note()

Not drawing any conclusions from this, but 1.5 s is around 17 times 90 ms, which is consistent with 8 s ÷ 430 ms ≈ 19.

@dae
Contributor

dae commented Aug 12, 2022

There are multiple problems with your benchmarks:

  • You're not measuring the time it takes for .query() to return. As I mentioned above, the entire query result is fetched in advance.
  • You're optimizing for throughput instead of latency. We don't want to show all matching rows to the user as fast as possible; we only want to show the visible rows to the user as fast as possible. Fetching multiple rows in a single query has higher throughput than fetching one row at a time, but the more rows that are returned, the longer it takes. This is true to some extent of "get sorted ids, then fetch a row at a time" as well, but it reduces the work done in the O(n) portion, as well as the amount of RAM required. Users with collections of 100,000 cards are not uncommon, and some even take it to crazy extents like 1M+.
  • You're not doing any sorting, which is not representative of actual use.

When I tweak the code to use browserRowForId() and measure the full time of the query, the backend wins across the board, especially when the question needs to be rendered:

2022-08-12 11:09:56.662 25122-25206/com.ichi2.anki E/WeaselKt: runQuery round 2
2022-08-12 11:09:56.911 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionDroid: found 20 rows in 239.160346ms
2022-08-12 11:09:57.026 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionBackend: found 20 rows in 92.920019ms
2022-08-12 11:09:57.804 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionDroid: found 40000 rows in 752.482667ms
2022-08-12 11:09:58.329 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionBackend: found 40000 rows in 510.395245ms
2022-08-12 11:09:58.630 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionDroid: found 20 rows in 292.183416ms
2022-08-12 11:09:58.733 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionBackend: found 20 rows in 92.161677ms
2022-08-12 11:10:01.691 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionDroid: found 1000 rows in 2.943828616s
2022-08-12 11:10:03.168 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionBackend: found 40000 rows in 1.451204762s

Code:

package com.ichi2.anki
import com.google.protobuf.kotlin.toByteStringUtf8
import com.ichi2.libanki.Card
import com.ichi2.libanki.Collection
import com.ichi2.libanki.SortOrder
import com.ichi2.libanki.Utils
import timber.log.Timber
import kotlin.time.ExperimentalTime
import kotlin.time.measureTime

@Suppress("UNUSED_VARIABLE")
fun noQuestionDroid(col: Collection, limit: Int): List<String> {
    val cursor = col.db.database.query(
        """
        SELECT cards.id, notes.mid, cards.did, cards.ord, notes.flds, notes.flags, notes.tags, cards.reps 
        FROM cards, notes 
        WHERE cards.nid = notes.id
        AND notes.flds like "%a%"
        order by reps
        """
    )
    
    var n = 0
    return sequence {
        while (cursor.moveToNext() && n < limit) {
            val cid = cursor.getLong(0)
            val mid = cursor.getLong(1)
            val did = cursor.getLong(2)
            val ord = cursor.getInt(3)
            val fields = cursor.getString(4)
            val flags = cursor.getInt(5)
            val tags = cursor.getString(6)
            val reps = cursor.getInt(7)
            n += 1

            yield(reps.toString())
        }
    }.toList()
}

@Suppress("UNUSED_VARIABLE")
fun questionDroid(col: Collection, limit: Int): List<String> {
    val cursor = col.db.database.query(
        """
        SELECT cards.id, notes.mid, cards.did, cards.ord, notes.flds, notes.flags, notes.tags 
        FROM cards, notes 
        WHERE cards.nid = notes.id
        AND notes.flds like "%a%"
        order by sfld
        """
    )

    var n = 0
    return sequence {
        while (cursor.moveToNext() && n < limit) {
            val cid = cursor.getLong(0)
            val mid = cursor.getLong(1)
            val did = cursor.getLong(2)
            val ord = cursor.getInt(3)
            val fields = cursor.getString(4)
            val flags = cursor.getInt(5)
            val tags = cursor.getString(6)
            n += 1

            val model = col.models.get(mid)!!
            val splitFields = Utils.splitFields(fields)
            yield(col._renderQA(cid, model, did, ord, tags, splitFields, flags)["q"]!!)
        }
    }.toList()
}

@Suppress("UNUSED_VARIABLE")
fun noQuestionBackend(col: Collection, limit: Int): List<String> {
    col.backend.setActiveBrowserColumns(listOf("cardReps"))
    val ids = col.findCards("a", SortOrder.AfterSqlOrderBy("reps"))
    return ids.asSequence().take(limit).map { col.backend.browserRowForId(it).cellsList[0].text }.toList()
}


@Suppress("UNUSED_VARIABLE")
fun questionBackend(col: Collection, limit: Int): List<String> {
    col.backend.setActiveBrowserColumns(listOf("question"))
    val ids = col.findCards("a", SortOrder.AfterSqlOrderBy("sfld"))
    return ids.asSequence().take(limit).map { col.backend.browserRowForId(it).cellsList[0].text }.toList()
}

@OptIn(ExperimentalTime::class)
fun runQuery(col: Collection, name: String, builder: (col: Collection) -> List<String>) {
    System.gc()
    System.runFinalization()

    var size: Int
    val totalTime = measureTime {
        size = builder(col).size
    }
    Timber.wtf("runQuery $name: found $size rows in $totalTime")
}

fun runQueries(col: Collection) {
    // run a few times to prime DB cache
    for (i in listOf(1, 2, 3)) {
        Timber.e("runQuery round $i")
        runQuery(col, "noQuestionDroid", { noQuestionDroid(it, 20) })
        runQuery(col, "noQuestionBackend", { noQuestionBackend(it, 20) })
        runQuery(col, "noQuestionDroid", { noQuestionDroid(it, 40000) })
        runQuery(col, "noQuestionBackend", { noQuestionBackend(it, 40000) })

        runQuery(col, "questionDroid", { questionDroid(it, 20) })
        runQuery(col, "questionBackend", { questionBackend(it, 20) })
        runQuery(col, "questionDroid", { questionDroid(it, 1000) })
        runQuery(col, "questionBackend", { questionBackend(it, 40000) })
    }
}

@oakkitten
Contributor Author

Ahh this is great! It turns out that browserRowForId is so fast we can probably forgo caching entirely.

I wanted to test for throughput because I thought the only way to bring screen rendering under the fabled 16 ms (or is it 8 ms now?) would be to prepare all, or a lot of, the rows in the background. And if you prepare them in the background, it's reasonable to trade latency for finishing the entire job faster.

Turns out that for the same query I was running above, except with ORDER BY notes.sfld and using browserRowForId:

  • Query + fetch all rows with cardReps takes ~1s
  • Query + fetch all rows with question takes ~1.4s
  • The query for the ids itself takes 140 ms, and fetching 20 (additional) questions, which is about a screenful, takes ✨ 2 to 2.5 ms ✨ on my Xiaomi! On my Moto G 1st gen from 2013, it's 800 ms and 8.5-9 ms, respectively.

This is so amazingly fast that it should give no frame drops even on old and slow devices.
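
For devices where even this is borderline, the earlier "render N cards from a position in the scroll direction" idea still applies; computing the window of positions to prefetch is tiny. An illustrative sketch, not AnkiDroid code:

```kotlin
// Sketch: which row positions to prefetch, given the current position,
// how many rows to prepare, and the scroll direction, clamped to bounds.
fun prefetchRange(position: Int, count: Int, scrollingDown: Boolean, total: Int): IntRange =
    if (scrollingDown) {
        position until minOf(position + count, total)
    } else {
        maxOf(0, position - count + 1)..position
    }
```

For example, prefetchRange(15, 10, scrollingDown = false, total = 100) yields positions 6..15, matching the "render up to 10 cards up from position 15" example from the first comment.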

dae added a commit to ankitects/Anki-Android that referenced this issue Aug 20, 2022
46c4f4d changed a single doProgress()
call to multiple ones. I suspect the speed improvements in that PR
came from checking cancellation at the top, and the other change was
not necessary.

Things got worse with the introduction of ankidroid#11849, as the progress listener
is called in a hot loop, and it invokes colIsOpen() each time.

This change puts performance back on par with the legacy path; further
improvements should be possible in the future by switching to a recycler
view and another backend method: ankidroid#11889
mikehardy pushed a commit that referenced this issue Aug 20, 2022
46c4f4d changed a single doProgress()
call to multiple ones. I suspect the speed improvements in that PR
came from checking cancellation at the top, and the other change was
not necessary.

Things got worse with the introduction of #11849, as the progress listener
is called in a hot loop, and it invokes colIsOpen() each time.

This change puts performance back on par with the legacy path; further
improvements should be possible in the future by switching to a recycler
view and another backend method: #11889
@github-actions
Contributor

Hello 👋, this issue has been opened for more than 2 months with no activity on it. If the issue is still here, please keep in mind that we need community support and help to fix it! Just comment something like still searching for solutions and if you found one, please open a pull request! You have 7 days until this gets closed automatically

@mikehardy
Member

Hey there 👋 - this issue came up as something that is pending for the 2.17 release, as it should fix #14353 in passing. It looks like there is functional code lying around for a couple of different approaches, but I don't see an actual PR open for this - is there one?

Note that with the CardBrowser refactoring going on (👀 @david-allison) it may be conflict-ridden, and/or it may already be in progress as part of the refactor (but just not linked?)

So with the context out of the way, to get to the point: is there a current status + work effort in progress here? Just curious

Cheers :-)

@mikehardy mikehardy added this to the 2.17 release milestone Dec 9, 2023
@david-allison
Member

david-allison commented Dec 9, 2023

To note: I've got a patch here, to be applied after the refactorings are completed, but there's quite a lot of work:

  • This moves from CardCache to CardOrNoteId as the row primitive
  • That means all operations in the card browser need to be able to transform the note id to the card id, or natively handle a NoteId
  • It also changes the "column" behaviour, as the set of Anki desktop columns doesn't match the current set of columns we provide

@mikehardy
Member

Okay, my priority in the area of "things I control" is then to be a rapid reviewer on the CardBrowser refactors. Off I go to do that.
