Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

usercache local storage was deleted and does not stick #750

Open
shankari opened this issue Jul 6, 2022 · 11 comments
Open

usercache local storage was deleted and does not stick #750

shankari opened this issue Jul 6, 2022 · 11 comments

Comments

@shankari
Copy link
Contributor

shankari commented Jul 6, 2022

Reported by a user:

Use 1 Use 2 Use 3
image image image

Note that, unlike e-mission/e-mission-phone@a09385b (part of e-mission/e-mission-phone#470), this time the local storage was not empty and the usercache was empty.

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

Looking at the user's logs, this has been going on since June 6th. It might have even occured earlier, but previous logs were truncated.

19,1654546646.61791,2022-06-06T13:17:26.617910-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

Every single time, we say that we will copy the local to the native, but the copy never seems to "stick"

18,1654546646.61722,2022-06-06T13:17:26.617220-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""}"
19,1654546646.61791,2022-06-06T13:17:26.617910-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

142,1654549916.2284,2022-06-06T14:11:56.228400-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""}"
143,1654549916.2290401,2022-06-06T14:11:56.229040-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

270,1654556446.13521,2022-06-06T16:00:46.135210-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""}"
271,1654556446.13664,2022-06-06T16:00:46.136640-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

1166,1654557878.03728,2022-06-06T16:24:38.037280-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""}"
1167,1654557878.03806,2022-06-06T16:24:38.038060-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

1715,1654559319.7890599,2022-06-06T16:48:39.789060-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""}"
1716,1654559319.7897198,2022-06-06T16:48:39.789720-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

...

287533,1657134282.70151,2022-07-06T12:04:42.701510-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""}"
287534,1657134282.70227,2022-07-06T12:04:42.702270-07:00,"uc_stored_val = null ls_stored_val = {""intro_done"":""2021-10-12T07:32:35-06:00""} copying local intro_done to native..."

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

My first thought was that this may be an issue with accessing the database from javascript.
But no, the native code in [ConfigManager isConsented]

    ConsentConfig* currConfig = (ConsentConfig*)[[BuiltinUserCache database] getDocument:CONSENT_CONFIG_KEY wrapperClass:[ConsentConfig class]];
        [LocalNotificationManager addNotification:[NSString stringWithFormat:@"reqConsent = %@, currConfig.approval_date = %@",
            reqConsent, currConfig.approval_date]];

    if ([reqConsent isEqualToString:currConfig.approval_date]) {
            [LocalNotificationManager addNotification:@"isConsented = YES"];
        return YES;
    } else {
            [LocalNotificationManager addNotification:@"isConsented = NO"];
        return NO;
    }

also fails

4681,1654643846.38509,2022-06-07T16:17:26.385090-07:00,"reqConsent = 2016-07-14, currConfig.approval_date = (null)"
4682,1654643846.38593,2022-06-07T16:17:26.385930-07:00,isConsented = NO

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

Further, the last trip recorded by the user is from 2022-06-03. So it looks like this issue with the database started back then. Unfortunately, the logs start from June 6, so we don't have logs of the time that the database queries broke.

0,1654546645.9126,2022-06-06T13:17:25.912600-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

Have asked the user to uninstall + reinstall, but to send me the userCacheDB before doing so.
Let's see if the DB is actually empty, or is merely corrupted.

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

Note also that the loggerDB was apparently not affected, so it is not true that iOS cleared out ALL the databases.
It appears to ONLY be true for the usercacheDB.
Not sure what (if anything) is different between the two databases.
Would be great to see if the usercacheDB is corrupted, and if that is the main reason.

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

the user sent the userCacheDB, and it is 258M so definitely not empty.

$ sqlite3 -csv /tmp/userCacheDB
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> .tables
userCache       userCacheError
sqlite> select count(*) from userCache;
Error: database disk image is malformed
sqlite> select * from userCache;
w1,,America/Denver,local-storage,CURR_GEOFENCE_LOCATION,,"{""type"":""Point"",""coordinates"":[...]}"
w2,,America/Denver,local-storage,CURR_GEOFENCE_LOCATION,,"{""type"":""Point"",""coordinates"":[...]}"
w3,,America/Denver,local-storage,CURR_GEOFENCE_LOCATION,,"{""type"":""Point"",""coordinates"":[...]}"
w4,,America/Denver,local-storage,CURR_GEOFENCE_LOCATION,,"{""type"":""Point"",""coordinates"":[...]}"
w5,,America/Denver,local-storage,CURR_GEOFENCE_LOCATION,,"{""type"":""Point"",""coordinates"":[...]}"
w6,,document,diary/trips-2022-05-27,,"[{""properties"":{""end_loc"":{""type"":""Point"",...
Error: database disk image is malformed

So the database was in fact corrupted.

@shankari
Copy link
Contributor Author

shankari commented Jul 7, 2022

Reasons why sqlite could be corrupted:
https://www.sqlite.org/howtocorrupt.html

I think we can rule out 4. Disk Drive and Flash Memory Failures since the related logger was not affected.
We can also rule 1.4. Mispairing database files and hot journals
We can also probably rule out 2. File locking problems, except 2.1

But a lot of the others relate to various file system operations that we have no control over and are opaque on iOS
For example, 1.2. Backup or restore while a transaction is active could interact with iOS backup/restore processes.
We do make sure that the DB file is marked as no backup

            NSURL *nosyncURL = [ NSURL fileURLWithPath: nosync];
            if (![nosyncURL setResourceValue: [NSNumber numberWithBool: YES] 
                                      forKey: NSURLIsExcludedFromBackupKey error: &err]) {  
                NSLog(@"IGNORED: error setting nobackup flag in LocalDatabase directory: %@", err);
            }

but not sure if the IGNORED error was triggered for this user, and whether iOS performs any local backups/file copies.

Similarly, 1.1. Continuing to use a file descriptor after it has been closed can interact with iOS background processes in unexpected ways.

the SQLite page also links to a similar error that facebook found and fixed
https://engineering.fb.com/2014/08/12/ios/debugging-file-corruption-on-ios/

I hope it doesn't come to that level of debugging - we don't have a ton of people capable of adding hooks into native code libraries, and we have a lot of priorities to handle - but this is not going to be an easy fix for sure 😦

@shankari
Copy link
Contributor Author

shankari commented Jul 8, 2022

It also looks like this is not unheard-of
https://stackoverflow.com/questions/9411689/offline-cache-on-ios-application-sqlite3-becomes-corrupted
and the way to check for it is to run

sqlite> PRAGMA integrity_check
   ...> ;
"*** in database main ***
Main freelist: size is 1 but should be 0
On tree page 15419 cell 1: 2nd reference to page 19657
On tree page 887 cell 53: 2nd reference to page 4355
On tree page 16948 cell 2: 2nd reference to page 17066
On tree page 9741 cell 0: 2nd reference to page 15805
Page 15700 is never used
...
Page 17338 is never used"
Error: database disk image is malformed

Some concrete fixes (https://sqlite.org/forum/forumpost/5a46109077):

  • close all connections before quitting
  • run PRAGMA integrity_check and prompt the user to reset the database if it is corrupted (we will copy over from local storage, so this can be transparent).

@shankari
Copy link
Contributor Author

shankari commented Jul 8, 2022

Found a similar issue in the more standard SQLite plugin, fixed using the EXTRA DURABLE build setting.
I used to use that plugin but ran into multiple issues with incorrect reads, so ended up rolling my own.
Maybe it is time to go back to it?

@shankari
Copy link
Contributor Author

shankari commented Jul 8, 2022

Worried about making a major change to storage at this point; tempted to just port over all their changes to my plugin instead. These appear to be:

Note also that we include the sqlite3 library directly

    <framework src="libsqlite3.0.tbd"/>

while they include the sqlite source directly, which allows them to control how it is compiled

        <header-file src="node_modules/cordova-sqlite-storage-dependencies/sqlite3.h" />
        <source-file src="node_modules/cordova-sqlite-storage-dependencies/sqlite3.c"
                     compiler-flags="-w -DSQLITE_THREADSAFE=1 -DSQLITE_DEFAULT_SYNCHRONOUS=3 -DSQLITE_DEFAULT_MEMSTATUS=0 -DSQLITE_OMIT_DECLTYPE -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_PROGRESS_CALLBACK -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_TEMP_STORE=2 -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4 -DSQLITE_ENABLE_RTREE -DSQLITE_DEFAULT_PAGE_SIZE=4096" />

which seems like a fairly major change to make outside of the migration cycle.

@shankari
Copy link
Contributor Author

shankari commented Jul 8, 2022

wrt

run PRAGMA integrity_check and prompt the user to reset the database if it is corrupted (we will copy over from local storage, so this can be transparent).

None of the standard deletion commands work. We would have to copy over the blank database.

sqlite> DELETE FROM userCache;
Error: database disk image is malformed
sqlite> DROP TABLE userCache;
Error: database disk image is malformed

The more I think about this, the more I think it is likely that the error is due to a power loss before the data was saved, which can be fixed by PRAGMA schema.synchronous = EXTRA

That is also the final conclusion in the SQLite store issue (number 34)

At this point I hope the "EXTRA DURABLE" build setting I made as discussed in storesafe/cordova-sqlite-storage#736 proves to be sufficiently robust against the common "consumer-grade" hardware issues discussed on the mailing list. I guess only time will tell here.

which also indicates that the multithreading fix is not a real fix.

INCORRECTLY resolved by compile-time DSQLITE_THREADSAFE=2 option (see storesafe/cordova-sqlite-storage#754)

@shankari shankari moved this to In Progress in OpenPATH Tasks Overview Jul 12, 2022
@shankari shankari moved this from Next week sprint to Current week sprint in OpenPATH Tasks Overview Jul 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

No branches or pull requests

1 participant