Introduce CollectionManager for preventing concurrent access #11849
@lukstbit in #10839 (comment) you talked about the need to inject dispatchers to make testing easier. If it is needed in this PR, could I trouble you to explain how it would help/suggest changes?
Sorry for the late response.
Can anyone with a Windows machine see if they can reproduce the Windows issue locally?
Running with If I run the class individually, the tests pass though 🤷♂️, so it's hard to diagnose the issue
Thanks Brayan, that at least implies it's an OS issue, and not machine-specific. I can try to reproduce this in a VM, but it might take me a while to get around to it.
I have a couple of Windows VMs I leave lying around for things like this, but I don't have a lot of time myself and still won't for a couple of weeks. For the next couple of days at least I have a lot of car transit time where I may have network, if there's a particular set of tests you want me to run / focus on to try to move this forward. Also, it appears to need conflict resolution at the moment.
Might be easier for me to dig into this directly, but it'll probably be a few weeks before I can return to this.
One of the test failures was my fault, which I've resolved. From what I can tell, the six remaining intermittently-failing tests seem to be triggering a bug in the coroutine/JDK code on Windows. I added a bunch of print statements to trace what was going on, and it seems that invoking a suspend fun sometimes hangs on Windows. The print statements reveal that the suspend fun is run, but the calling code never continues after the coroutine completes. I've worked around this for now by excluding the tests on Windows, but this is of course not ideal, as Windows-bound developers will need to rely on CI to test these routines. The desktop code suffers from a similar issue: the JavaScript unit tests don't run on Windows, so they only get checked via CI.
Fixes a startup crash when the current deck ID is set to an invalid value
This regressed in 9644f57 + Fix incorrect private find() implementation in CardContentProvider, and move into shared file. Closes ankidroid#11999
ankidroid#11849 (comment)

Android's onCreateOptionsMenu does not play well with coroutines, as it expects the menu to have been fully configured by the time the routine returns. This results in flicker, as the menu gets blanked out, and then configured a moment later when the coroutine runs. To work around this, the current state is stored in the deck picker, so that we can redraw the menu immediately, and then make any potential changes in the background.

Other changes:
- refactored onCreateOptionsMenu to make it simpler
- instead of the sdCardAvailable checks (which I assume is a proxy for "col is available"), the entire menu is wrapped in a group, and the visibility of the group is toggled depending on whether the col is available or not. This also fixes the error on a full sync.
- there are three sets of unit tests (one for search icon, one for sync icon, one for entire menu) that have been a pain since I originally introduced this PR, and I've sunk a number of hours into trying to get them to work properly at this point. The issue appears to be that when mixing coroutine calls and invalidateOptionsMenu(), onCreateOptionsMenu() is not getting called before trying to await the job, leading to a hang or stale data. I tried advancing robolectric, but it did not help. Maybe someone more experienced in this area can figure it out, but for now I've changed these routines to be more of a unit test and less of an integration test: rather than checking the menu itself, they directly invoke the function that updates the menu state, and check the state instead. This takes onCreateOptionsMenu() out of the loop, and avoids the problems (and probably allows these tests to be re-enabled on Windows as well). The sync tests I've removed, as the entire menu is hidden/shown now when the col is closed, so they are redundant.
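The stored-state approach described in this commit message can be sketched roughly as follows. This is a minimal illustration in plain Kotlin with no Android dependencies; `MenuState`, `MenuHolder`, `render`, and `update` are hypothetical names and are not taken from the AnkiDroid code.

```kotlin
// Hypothetical snapshot of what the options menu should show.
data class MenuState(val searchVisible: Boolean, val syncIcon: String)

// Hypothetical holder standing in for the state kept by the deck picker.
class MenuHolder {
    // Last known state; null until the first background refresh completes.
    var state: MenuState? = null
        private set

    // Meant to be called synchronously from the menu-creation callback:
    // it draws from the cached state only and never waits on background work.
    fun render(): String =
        state?.let { "search=${it.searchVisible} sync=${it.syncIcon}" } ?: "hidden"

    // Meant to be called once background work has produced a fresh state.
    // Returns true when the cached state changed and a redraw is needed.
    fun update(newState: MenuState): Boolean {
        if (newState == state) return false
        state = newState
        return true
    }
}

fun main() {
    val holder = MenuHolder()
    println(holder.render())                        // nothing cached yet
    holder.update(MenuState(searchVisible = true, syncIcon = "normal"))
    println(holder.render())                        // drawn from the cache
}
```

Under this shape, a test can call `update` and inspect `state` directly, which mirrors the commit's move from checking the menu itself to checking the stored state.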
Theoretically it should be unnecessary, as it's called in refreshState()
(Rebased to fix a conflict)
nice touch on the delayed popup commit add
I'm queueing up dependency updates and a fresh i18n sync, this has sat a couple more days, I'm going to merge this in for a new alpha shortly to get all these fixes in and maintain what velocity we have here
46c4f4d changed a single doProgress() call to multiple ones. I suspect the speed improvements in that PR came from checking cancellation at the top, and the other change was not necessary. Things got worse with the introduction of ankidroid#11849, as the progress listener is called in a hot loop, and it invokes colIsOpen() each time. This change puts performance back on par with the legacy path; further improvements should be possible in the future by switching to a recycler view and another backend method: ankidroid#11889
Hi there @dae! This is the OpenCollective Notice for PRs merged from 2022-08-01 through 2022-08-31. If you are interested in compensation for this work, the process with details is here: We only post one comment per person per month to avoid spamming you, regardless of the number of PRs merged, but this note applies to all PRs merged for this month. Please note that GSoC contributions are okay for this process. Our philosophy is that our users have donated to AnkiDroid for all contributions. The only PRs that will not go through the OpenCollective process are ones directly related to an accepted GSoC project from a selected participant, since those receive a stipend from GSoC itself. Please understand that our monthly budget is never guaranteed to cover all claims. The cap on payments-per-person may be lower, but we try to make our process as fair and transparent as possible; we just need your understanding. Thanks!
Currently there are many instances in the AnkiDroid codebase where
the collection is accessed a) on the UI thread, blocking the UI, and
b) in unsafe ways (eg by checking to see if the collection is open,
and then assuming it will remain open for the rest of a method call).
"Fix full download crashing in new schema case" (159a108) demonstrates
a few of these cases.
This PR is an attempt at addressing those issues. It introduces
a withCol function that is intended to be used instead of
CollectionHelper.getCol(). For example, code that previously looked
like:
Can now be used like this:
The block is run on a background thread, and other withCol calls made
in parallel will be queued up and executed sequentially. Because of
the exclusive access, routines can safely close and reopen the
collection inside the block without fear of affecting other callers.
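A minimal sketch of the queuing behaviour described above, using a single-threaded executor so that blocks submitted in parallel run one at a time on a background thread. This is only an illustration under stated assumptions: `FakeCollection`, `SketchCollectionManager`, and `withColBlocking` are hypothetical names, and the blocking call stands in for the coroutine-based withCol that this PR introduces.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for the real collection type.
class FakeCollection {
    var deckCount = 3
}

// Hypothetical manager: every block runs on one background thread,
// so blocks submitted from several threads execute sequentially.
object SketchCollectionManager {
    private val queue = Executors.newSingleThreadExecutor { r ->
        Thread(r, "col-queue").apply { isDaemon = true }
    }
    private val col = FakeCollection()

    // Blocking variant for illustration only (assumption): the caller waits
    // for its turn in the queue and then for the block's result.
    fun <T> withColBlocking(block: FakeCollection.() -> T): T =
        queue.submit(Callable { col.block() }).get()
}

fun main() {
    // The collection is only touched inside the block, never retained outside it.
    val count = SketchCollectionManager.withColBlocking { deckCount }
    println("decks: $count")
}
```

Note that in this sketch, calling `withColBlocking` from inside another block would deadlock, because the inner call queues behind the outer one on the same thread.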
It's not practical to update all the legacy code to use withCol
immediately - too much work is required, and coroutines are not
accessible from Java code. The intention here is that this new path is
gradually bought into. Legacy code can continue calling
CollectionHelper.getCol(), which internally delegates to CollectionManager.
Two caveats to be aware of:
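- Legacy callers wait for pending operations to finish
  before they receive the collection handle, but because they retain it,
  subsequent access is not guaranteed to be exclusive.
- When legacy calls are made on the main thread,
  they will block the UI if a background operation is already running.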
Logging has been added to help diagnose this, eg messages like:
E/CollectionManager: blocked main thread for 2626ms:
com.ichi2.anki.DeckPicker.onCreateOptionsMenu(DeckPicker.kt:624)
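One way such a message could be produced is to time the blocking call and report the caller when a threshold is exceeded. The sketch below is an assumption about the general technique, not the PR's implementation: `warnIfSlow`, `thresholdMs`, and `report` are hypothetical, and a real Android version would presumably also check that it is running on the main thread before logging.

```kotlin
// Hypothetical helper: run a blocking call, and report if it took too long.
fun <T> warnIfSlow(thresholdMs: Long, report: (String) -> Unit, block: () -> T): T {
    val start = System.nanoTime()
    try {
        return block()
    } finally {
        val elapsedMs = (System.nanoTime() - start) / 1_000_000
        if (elapsedMs > thresholdMs) {
            // Frame 0 is this helper; frame 1 is whoever called it.
            val caller = Throwable().stackTrace.getOrNull(1)
            report("blocked main thread for ${elapsedMs}ms:\n$caller")
        }
    }
}

fun main() {
    // Simulate a legacy call that waits behind a background operation.
    val value = warnIfSlow(thresholdMs = 10, report = { println(it) }) {
        Thread.sleep(50)
        42
    }
    println(value)
}
```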
Other changes/notes:
removed. I can not reproduce the issue it reports, and the code checks
for col in onCreateOptionsMenu().
of each test, or two tests are flaky when run with the new schema.
Closes #11734
Closes #11417
Closes #11981
Closes #11979
Closes #11999
Closes #12007
Closes #12026
Closes #12027