Hanging NodeJS process #41
Comments
Hm, that's no good.
|
I fear that it could be hard to reproduce as is since we didn't experience anything like that on our staging (which is a copy of our production environment but with little load), nor during load testing. Maybe there's something I could do on my side to aid debugging? Like running a debug build of the library. |
Thanks for that. I'm not sure what else a debugging build would tell me in this case. The stack trace you provided above is pretty good - it looks like what's happening is:
Anyway, I've been able to reproduce it already (!!) by making the call_threadsafe_function queue 1 element long, then running this little stress tester. It stutters, and after a while just hangs.
const fdb = require('.')
fdb.setAPIVersion(600);
const db = fdb.openSync();
let time = 0
// Heartbeat: if this stops printing, the main thread is stuck.
setInterval(() => { console.log('still alive', time++) }, 2000)
;(async () => {
  await db.set('x', 'hi')
  // Each "thread" is an async loop issuing sequential reads of the same key.
  const thread = async (id) => {
    console.log('starting thread', id)
    for (let i = 0; i < 100000; i++) {
      await db.get('x')
      console.log(i, id)
    }
    console.log('thread done', id)
  }
  // Start 50 concurrent loops without awaiting them.
  for (let i = 0; i < 50; i++) {
    thread(i)
  }
})() |
Ah. Hahahaha I think I see the problem 😏 The issue is that sometimes foundationdb's future objects get resolved immediately, in the current thread. In particular, this happens with calls that are idempotent. For example, committing a read-only transaction. The order of operations which causes the issue is this:
... Node.js's main thread is now waiting on the queue to have room. But the queue will only have room when it finishes processing more results from fdb... which it can't do while it's blocked waiting on the queue to have room. Cycle (and resulting deadlock) is complete and the process hangs. |
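The cycle described above can be illustrated with a toy, single-threaded model. This is only a sketch: it does not call the real N-API or FoundationDB functions, and the names `BoundedQueue`, `enqueueBlocking`, and the thread labels `'network'` and `'main'` are hypothetical, made up for illustration. It assumes a bounded queue whose only consumer is the main thread, and reports what would happen if a given thread tried a blocking enqueue.

```javascript
// Toy model only: no real N-API or FoundationDB calls are made here.
// All names below are hypothetical and exist just for this illustration.
class BoundedQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }

  get full() {
    return this.items.length >= this.capacity;
  }
}

// Models a blocking enqueue. `caller` is the thread doing the enqueue and
// `consumer` is the only thread that ever drains the queue.
function enqueueBlocking(queue, item, caller, consumer) {
  if (queue.full) {
    if (caller === consumer) {
      // The only thread that could make room is the one about to block.
      return 'deadlock';
    }
    // Any other thread can wait safely: the consumer is still running.
    return 'blocked-until-drained';
  }
  queue.items.push(item);
  return 'queued';
}

// Queue of length 1, as in the reproduction described earlier.
const queue = new BoundedQueue(1);
const results = [];

// The network thread resolves a future and fills the queue.
results.push(enqueueBlocking(queue, 'result-1', 'network', 'main'));

// A future resolves immediately on the main thread while the queue is full.
results.push(enqueueBlocking(queue, 'result-2', 'main', 'main'));

console.log(results);
```

The second call is the problematic one: the main thread is both the producer that wants to block and the only consumer that could free a slot. Any fix has to break that cycle, for example by never doing a blocking enqueue from the consuming thread.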
📦 [email protected]. I'm 90% sure this will fix your issue. Give it a try and let me know! |
This is great, thank you! I will check it right now. |
The issue seems to be gone now. I appreciate your help very much! |
Awesome - glad to hear it! :) |
We're facing an issue where our services constantly hang, by which I mean that all JS code stops executing. This doesn't happen in our tests, nor in our staging environment, only in production. So I guess it's some random race condition that happens more often with increased load.
These services are only doing reads from FDB. This includes single key reads, range reads, and watches.
I tried to wrap all calls to the library with logs, but couldn't figure out a single point where it stops.
We're at NodeJS v12.16.1, FDB client 6.2.15. Running console.log(require("foundationdb").modType) returns napi.
I'm not very good at debugging native code, but here's a stack trace from a hanging process:
Please tell me if there's any additional info I can provide. Thanks in advance!