'Database is busy' #34

nl · 2013-04-09T05:22:16Z

It seems the underlying FMDB database is sometimes accessed from multiple threads at the 'same time'. My app isn't using any concurrency, dispatch queues or threads, yet I'm running into these issues when using live queries during a pull from a remote sync gateway. Not sure if it's relevant, but most of the documents being pulled contain attachments.

The log below show two different threads are trying to use the FMDatabase.

2013-04-09 07:12:13.414 Puller[58132:c07] -[FMDatabase executeUpdate:error:withArgumentsInArray:orVAList:]:740 Database busy (/Volumes/Home/Library/Application Support/iPhone Simulator/6.1/Applications/FDD4EE76-786A-4B29-8FF2-0ED674F2B847/Library/Application Support/CouchbaseLite/user-b56e5c11a783b8a0078d91d25b00588f.touchdb)
2013-04-09 07:12:13.415 Puller[58132:c07] Database busy
07:12:13.416| �WARNING*** : CouchbaseLite: Failed to rebuild view 'files/byIndex': 590
07:12:13.416| �WARNING*** : Failed to update view index: 590
2013-04-09 07:12:13.417 Puller[58132:4107] -[FMDatabase executeUpdate:error:withArgumentsInArray:orVAList:]:740 Database busy (/Volumes/Home/Library/Application Support/iPhone Simulator/6.1/Applications/FDD4EE76-786A-4B29-8FF2-0ED674F2B847/Library/Application Support/CouchbaseLite/user-b56e5c11a783b8a0078d91d25b00588f.touchdb)
2013-04-09 07:12:13.417 Puller[58132:4107] Database busy
07:12:13.418‖ �WARNING*** : CouchbaseLite: Failed to rebuild view 'files/byIndex': 590
07:12:13.418‖ �WARNING*** : Failed to update view index: 590
07:12:13.424| �WARNING*** : <CBLLiveQuery: 0x9379870>: Error updating rows: 590

The text was updated successfully, but these errors were encountered:

nl · 2013-04-12T15:28:10Z

It might be relevant that I'm using multiple pull CBLReplications to pull data from the same remote database, to the same CBLDatabase instance, with different query_params for the sync_gateway/channels filter.

rismay · 2013-04-12T17:54:52Z

I think you need to create multiple CBLManager instances and create pull replications from each of those instances with the different query_params.

snej · 2013-04-15T18:36:09Z

It seems the underlying FMDB database is sometimes accessed from multiple threads at the 'same time'.

That should never happen — FMDB doesn't support multithreaded access. But the warning for this is different, reading "The FMDatabase %@ is currently in use." It will also trigger an assertion failure in a debug build.

The error you're getting looks like the database file is being used by multiple threads (with one FMDB connection per thread) and one of the connections is leaving the database locked too long. This is possible, especially since looking at the retry logic in FMDB it'll give up after 200ms.

Try this: edit the following line in CBLDatabase+Internal.m:
_fmdb.busyRetryTimeout = 10;
You can change the value to 0, which will disable busy timeouts entirely. It leaves open the possibility that the thread will lock forever if the database never unlocks, but the only reason for the database to be stuck locked would be if another thread were already stuck in the middle of a transaction, in which case the app is in bad shape anyway.

Let me know if that fixes the problem, and if you see any unexpectedly long pauses.

nl · 2013-04-15T19:07:46Z

I changed the value to zero. In lieu of having the warning messages, I know have FMDB lock the entire app by sleeping the main thread indefinitely.

Here's a backtrace after pausing in lldb when the app locks up.


(lldb) bt
* thread #1: tid = 0x1c03, 0x908f1c72 libsystem_kernel.dylib`__semwait_signal + 10, stop reason = signal SIGSTOP
    frame #0: 0x908f1c72 libsystem_kernel.dylib`__semwait_signal + 10
    frame #1: 0x02343770 libsystem_sim_c.dylib`nanosleep + 196
    frame #2: 0x02343632 libsystem_sim_c.dylib`usleep + 60
    frame #3: 0x000c1ec6 SPApp`-[FMDatabase executeUpdate:error:withArgumentsInArray:orVAList:](self=0x07c84070, arrayArgs=0xbfff0100, args=0xbfffdff8) + 1271 at FMDatabase.m:737
    frame #4: 0x000c20f9 SPApp`-[FMDatabase executeUpdate:](self=0x07c84070, _cmd=0x000db150, sql=0x00127674) + 66 at FMDatabase.m:801
    frame #5: 0x00079c9b SPApp`-[CBLView(self=0x10b42290) updateIndex] + 768 at CBLView+Internal.m:145
    frame #6: 0x000b0704 SPApp`-[CBLDatabase(, _cmd=0x000dd00a, viewName=0x00125a14, options=(null)) queryViewNamed:options:lastSequence:status:] + 163 at CBLQuery.m:561
    frame #7: 0x000ae026 SPApp`-[CBLQuery rows](self=0x10b415f0, _cmd=0x012bafaa) + 233 at CBLQuery.m:142

snej · 2013-04-15T19:10:43Z

Could you get a dump of all threads? (I'm not sure what the LLDB command for that is...) There's probably some other thread that's locked up inside an FMDB call. Or if not it's really bad news, because that means something previously managed to return while leaving a transaction open, which should be impossible.

nl · 2013-04-15T19:19:07Z

There doesn't seem to be anything else happening on other threads. Here's the full bt.


(lldb) bt all
* thread #1: tid = 0x2403, 0x39c2f6a4 libsystem_kernel.dylib`__semwait_signal + 24, stop reason = signal SIGSTOP
    frame #0: 0x39c2f6a4 libsystem_kernel.dylib`__semwait_signal + 24
    frame #1: 0x39b8d3d2 libsystem_c.dylib`nanosleep + 142
    frame #2: 0x39b8d33e libsystem_c.dylib`usleep + 50
    frame #3: 0x0014d00c SPApp`-[FMDatabase executeUpdate:error:withArgumentsInArray:orVAList:](self=0x200ba940, _cmd=0x00165728, sql=0x00191f74, arrayArgs=0x00000000) + 1012 at FMDatabase.m:737
    frame #4: 0x0014d1d8 SPApp`-[FMDatabase executeUpdate:]() + 40 at FMDatabase.m:801
    frame #5: 0x00114424 SPApp`-[CBLView(Internal) updateIndex] + 656 at CBLView+Internal.m:145
    frame #6: 0x0013f2ec SPApp`-[CBLDatabase(, options=(null)) queryViewNamed:options:lastSequence:status:] + 176 at CBLQuery.m:561
    frame #7: 0x0013d46c SPApp`-[CBLQuery rows](self=0x2004bed0) + 240 at CBLQuery.m:142
    frame #8: 0x000db6fa SPApp`-[SPContentDatabase acceptedShares](self=0x1edf2700, _cmd=0x0015c979) + 134 at SPContentDatabase.m:314
    frame #9: 0x000da160 SPApp`-[SPContentDatabase currentChannels](self=0x1edf2700, _cmd=0x0015c073) + 92 at SPContentDatabase.m:241
    frame #10: 0x000d30ec SPApp`-[CSDatabase newSharedPull](self=0x1edf2700, _cmd=0x0015c198) + 64 at CSDatabase.m:301
    frame #11: 0x000d4266 SPApp`__30-[CSDatabase startReplicating]_block_invoke_2194(.block_descriptor=0x1ede5610) + 78 at CSDatabase.m:394
    frame #12: 0x000f026c SPApp`__39-[CSReplication noteFirstSyncCompleted]_block_invoke(.block_descriptor=0x001957c8, block=0x1ede5610, stop=0x2fd51e18) + 44 at CSReplication.m:228
    frame #13: 0x318a0496 CoreFoundation`__NSSetEnumerate + 682
    frame #14: 0x000f01e0 SPApp`-[CSReplication noteFirstSyncCompleted](self=0x1ed2db50, _cmd=0x0015dffd) + 100 at CSReplication.m:227
    frame #15: 0x000eff7c SPApp`__38-[CSReplication noteProgressDidChange]_block_invoke() + 44 at CSReplication.m:184
    frame #16: 0x39b4911e libdispatch.dylib`_dispatch_call_block_and_release + 10
    frame #17: 0x39b484b6 libdispatch.dylib`_dispatch_client_callout + 22
    frame #18: 0x39b4d1bc libdispatch.dylib`_dispatch_main_queue_callback_4CF$VARIANT$mp + 224
    frame #19: 0x3186ef3a CoreFoundation`__CFRunLoopRun + 1290
    frame #20: 0x317e1ebc CoreFoundation`CFRunLoopRunSpecific + 356
    frame #21: 0x317e1d48 CoreFoundation`CFRunLoopRunInMode + 104
    frame #22: 0x353962ea GraphicsServices`GSEventRunModal + 74
    frame #23: 0x336f7300 UIKit`UIApplicationMain + 1120
    frame #24: 0x000af664 SPApp`main(argc=1, argv=0x2fd52d10) + 116 at main.m:13
    frame #25: 0x39b68b20 libdyld.dylib`start + 4

  thread #2: tid = 0x2703, 0x39c1f648 libsystem_kernel.dylib`kevent64 + 24
    frame #0: 0x39c1f648 libsystem_kernel.dylib`kevent64 + 24
    frame #1: 0x39b4f978 libdispatch.dylib`_dispatch_mgr_invoke + 796
    frame #2: 0x39b4f658 libdispatch.dylib`_dispatch_mgr_thread$VARIANT$mp + 36

  thread #5: tid = 0x2a03, 0x39c1eeb4 libsystem_kernel.dylib`mach_msg_trap + 20
    frame #0: 0x39c1eeb4 libsystem_kernel.dylib`mach_msg_trap + 20
    frame #1: 0x39c1f04c libsystem_kernel.dylib`mach_msg + 40
    frame #2: 0x31870044 CoreFoundation`__CFRunLoopServiceMachPort + 128
    frame #3: 0x3186eda2 CoreFoundation`__CFRunLoopRun + 882
    frame #4: 0x317e1ebc CoreFoundation`CFRunLoopRunSpecific + 356
    frame #5: 0x317e1d48 CoreFoundation`CFRunLoopRunInMode + 104
    frame #6: 0x377d1504 WebCore`RunWebThread(void*) + 444
    frame #7: 0x39b88310 libsystem_c.dylib`_pthread_start + 308

  thread #6: tid = 0x231f, 0x39c1eeb4 libsystem_kernel.dylib`mach_msg_trap + 20
    frame #0: 0x39c1eeb4 libsystem_kernel.dylib`mach_msg_trap + 20
    frame #1: 0x39c1f04c libsystem_kernel.dylib`mach_msg + 40
    frame #2: 0x31870044 CoreFoundation`__CFRunLoopServiceMachPort + 128
    frame #3: 0x3186eda2 CoreFoundation`__CFRunLoopRun + 882
    frame #4: 0x317e1ebc CoreFoundation`CFRunLoopRunSpecific + 356
    frame #5: 0x317e1d48 CoreFoundation`CFRunLoopRunInMode + 104
    frame #6: 0x32104f96 Foundation`-[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 254
    frame #7: 0x0011837e SPApp`-[CBL_Server runServerThread](self=0x1ed80470) + 310 at CBL_Server.m:116
    frame #8: 0x321b1e84 Foundation`__NSThread__main__ + 972
    frame #9: 0x39b88310 libsystem_c.dylib`_pthread_start + 308

  thread #13: tid = 0x310b, 0x39c1eeb4 libsystem_kernel.dylib`mach_msg_trap + 20
    frame #0: 0x39c1eeb4 libsystem_kernel.dylib`mach_msg_trap + 20
    frame #1: 0x39c1f04c libsystem_kernel.dylib`mach_msg + 40
    frame #2: 0x31870044 CoreFoundation`__CFRunLoopServiceMachPort + 128
    frame #3: 0x3186eda2 CoreFoundation`__CFRunLoopRun + 882
    frame #4: 0x317e1ebc CoreFoundation`CFRunLoopRunSpecific + 356
    frame #5: 0x317e1d48 CoreFoundation`CFRunLoopRunInMode + 104
    frame #6: 0x3212e3d4 Foundation`+[NSURLConnection(Loader) _resourceLoadLoop:] + 308
    frame #7: 0x321b1e84 Foundation`__NSThread__main__ + 972
    frame #8: 0x39b88310 libsystem_c.dylib`_pthread_start + 308

  thread #14: tid = 0x3203, 0x39c2f594 libsystem_kernel.dylib`select$DARWIN_EXTSN + 20
    frame #0: 0x39c2f594 libsystem_kernel.dylib`select$DARWIN_EXTSN + 20
    frame #1: 0x318741f6 CoreFoundation`__CFSocketManager + 678
    frame #2: 0x39b88310 libsystem_c.dylib`_pthread_start + 308

  thread #15: tid = 0x3303, 0x39c2fd98 libsystem_kernel.dylib`__workq_kernreturn + 8
    frame #0: 0x39c2fd98 libsystem_kernel.dylib`__workq_kernreturn + 8
    frame #1: 0x39b7dcfa libsystem_c.dylib`_pthread_workq_return + 18
    frame #2: 0x39b7da16 libsystem_c.dylib`_pthread_wqthread + 366

snej · 2013-04-15T20:10:38Z

You've got a CBL_Server running on thread 6, so there must have been some background activity. Either you've started the HTTP listener, or you've called -asyncTellDatabaseNamed:to:. (If you don't believe you've done either, set a breakpoint at -[CBLManager backgroundServer] and see how it gets hit.)

nl · 2013-04-15T20:13:31Z

I don't believe I've done either :-)
Apparently, it gets hit in CBL(Live)Query – which i do use a lot.


* thread #1: tid = 0x1c03, 0x000b2087 SPApp`-[CBLManager backgroundServer](self=0x000b21b2, _cmd=0x000b2084) + 3 at CBLManager.m:196, stop reason = breakpoint 1.1
    frame #0: 0x000b2087 SPApp`-[CBLManager backgroundServer](self=0x000b21b2, _cmd=0x000b2084) + 3 at CBLManager.m:196
    frame #1: 0x000b21e3 SPApp`-[CBLManager asyncTellDatabaseNamed:to:](self=0x07c46950, _cmd=0x000dbf12, dbName=0x07c27b80, block=0xbfffe010) + 63 at CBLManager.m:209
    frame #2: 0x000ad4ef SPApp`-[CBLQuery runAsync:](, _cmd=0x000dbf40, onComplete=0xbfffe0b8) + 506 at CBLQuery.m:155
    frame #3: 0x000ae2e1 SPApp`-[CBLLiveQuery update](self=0x07c5f330, _cmd=0x00533c55) + 103 at CBLQuery.m:265
    frame #4: 0x000ae020 SPApp`-[CBLLiveQuery start](, _cmd=0x020828dd) + 197 at CBLQuery.m:215
    frame #5: 0x000ae19d SPApp`-[CBLLiveQuery addObserver:forKeyPath:options:context:](self=0x07c5f330, _cmd=0x0197d2a1, observer=0x07c5f4c0, keyPath=0x00124724, options=NSKeyValueObservingOptionInitial, context=0x00000000) + 63 at CBLQuery.m:245

snej · 2013-04-15T20:29:22Z

Oh right, sorry — I did that last month to optimize CBLLiveQuery. OK, so that narrows down what could be running on the other FMDB connection. Still can't figure out what could be wrong, though.

Trying to figure out #34.

snej · 2013-04-15T21:03:32Z

Could you retry with the latest commit? It adds some assertions and warnings about transactions. I think it was possible for the transaction level to get confused, but only in cases where sqlite itself fails to commit or back out a transaction, and I don't think either is likely to have happened.

nl · 2013-04-16T03:26:33Z

Ok, I tried again and I am getting the same messages as before.

snej · 2013-04-16T16:28:29Z

/scratching my head/

Could you please turn on CBLDatabase logging (and turn everything else off) and email or post the logs?

This looks like a nasty problem, and I'm glad you've run into it before I moved replication to the background thread; hopefully it'll be easier to figure out in this more limited scenario.

nl · 2013-04-16T18:21:59Z

Sure. I used the code below to enable logging and sent the logs to you by email.

    EnableLog(YES);
    EnableLogTo(CBLDatabase, YES);

snej · 2013-04-16T18:53:42Z

You can enable logs by going into the Arguments tab of the Run item of the scheme editor and adding items

-Log YES
-LogCBLDatabase YES

Then you can check and uncheck them. Also, Cmd-Opt-R will open that editor before running the target, making it super easy to update the arguments.

snej · 2013-04-16T19:05:30Z

Got the emailed logs, thanks. That's actually less detail than I wanted — I forgot to have you enable CBL_DatabaseVerbose too (that makes FMDB log the SQL queries.)

I think I know what's going on, though. This has to do with SQLite's "deferred transaction" feature. I'll have to go reread the docs, but IIRC, the thread may legitimately have to back out of its transaction and retry, if another thread is already holding a lock it needs. I ran into this once before; but I hadn't noticed that the "savepoint" mechanism TouchDB and CBL use is based on deferred transactions.

snej · 2013-04-16T19:16:13Z

nl · 2013-04-16T19:28:50Z

Good to know, thanks. It doesn't seem to have any devastating effect on data as far as I can tell.
Sometimes I'm really grateful CBL is abstracting all of this low level stuff away from us =)

snej · 2013-04-16T23:20:42Z

Whew — I just pushed a big commit that should cause CBL to handle db-busy errors correctly by backing off and retrying. There are a lot of code changes, but it passes all the unit tests. I don't have any code that reproduces your bug, though. Could you give this a try with your app?

If the SQLITE_BUSY error happens internally, you should see a log message like "Db busy, retrying transaction (#1)...". This will show up if any logging is enabled (-Log YES), you don't have to turn on CBLDatabase logging. There might be more than one of these if the other thread is taking a long time; but at some point it should succeed.

nl · 2013-04-17T03:50:28Z

On 'Database busy' I now see '05:46:53.307| CBLDatabase[<0x88accf0>user-e1b1c275376b67ba7ce277e6560207cc]: Db busy, retrying transaction (#1)...' like you said, but I no longer have any view update failure messages!
You fixed it, thanks!

snej · 2013-04-17T04:09:16Z

W00t! High-five!

snej added a commit that referenced this issue Apr 15, 2013

Added extra sanity checks around transactions.

cfef660

Trying to figure out #34.

ghost assigned snej Apr 16, 2013

snej mentioned this issue Apr 16, 2013

Replication should run in a background thread #18

Closed

snej closed this as completed in 99fcca3 Apr 16, 2013

PiranhaTodd mentioned this issue Oct 21, 2013

Hang on iOS7 when app goes to the background #164

Closed

laurentwz mentioned this issue Jan 15, 2014

CBL crash in -[CBLReplication bg_updateProgress:] #220

Closed

24601 mentioned this issue Oct 16, 2014

forestdb::RevTree:compact() causing crash #495

Closed

ajaykoppisetty mentioned this issue Dec 20, 2017

App needs trigger(going back to home screen and realunch) to sync/pull docs #1083

Closed

hrajput89 mentioned this issue Jan 23, 2018

2.0: Replication of 1M documents never completes and high CPU/memory usage #1970

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

'Database is busy' #34

'Database is busy' #34

nl commented Apr 9, 2013

nl commented Apr 12, 2013

rismay commented Apr 12, 2013

snej commented Apr 15, 2013

nl commented Apr 15, 2013

snej commented Apr 15, 2013

nl commented Apr 15, 2013

snej commented Apr 15, 2013

nl commented Apr 15, 2013

snej commented Apr 15, 2013

snej commented Apr 15, 2013

nl commented Apr 16, 2013

snej commented Apr 16, 2013

nl commented Apr 16, 2013

snej commented Apr 16, 2013

snej commented Apr 16, 2013

snej commented Apr 16, 2013

nl commented Apr 16, 2013

snej commented Apr 16, 2013

nl commented Apr 17, 2013

snej commented Apr 17, 2013

'Database is busy' #34

'Database is busy' #34

Comments

nl commented Apr 9, 2013

nl commented Apr 12, 2013

rismay commented Apr 12, 2013

snej commented Apr 15, 2013

nl commented Apr 15, 2013

snej commented Apr 15, 2013

nl commented Apr 15, 2013

snej commented Apr 15, 2013

nl commented Apr 15, 2013

snej commented Apr 15, 2013

snej commented Apr 15, 2013

nl commented Apr 16, 2013

snej commented Apr 16, 2013

nl commented Apr 16, 2013

snej commented Apr 16, 2013

snej commented Apr 16, 2013

snej commented Apr 16, 2013

nl commented Apr 16, 2013

snej commented Apr 16, 2013

nl commented Apr 17, 2013

snej commented Apr 17, 2013