Deadlock by immediately invoking UnlockWallet after InitWallet #3631
Comments
Are you able to reproduce this reliably? It seems like it was stuck opening the wallet database, which didn't allow
It's 100% reliable on our setup. I'm not sure how long I waited, several minutes at least.
Unable to reproduce this locally, are you sure your backend is active and accepting connections? For example, if your
I just erased the wallet.db swapped back to lnd v0.7.1 and I was able to generate a new wallet and macaroons properly.
From the logs it looks like lnd is stuck before it can spin up the final gRPC and REST server. Macaroons are only created shortly before that. Can you please increase the log level to debug and post the last lines after unlock?
@guggero I set my logging level to
@stridentbean does using the
@wpaulino I added I've compiled 0.8.0 on my own and threw in some low tech Strangely, I can get all the way to https://github.com/lightningnetwork/lnd/blob/v0.8.0-beta/lnd.go#L1118. But, it never returns to execute this line https://github.com/lightningnetwork/lnd/blob/v0.8.0-beta/lnd.go#L318.
Interesting. Could you restart with
Running
This is when creating a new wallet? The profile shows that
Do you have some sort of unlock script that's being executed?
Interesting. In our stack, we do have an unlock immediately after wallet initialization. It is redundant, so I will remove it to unblock us. It's probably still worth it to fix that issue from lnd's side. Not sure if that locking could happen in other cases. Thanks so much for the help on this one.
Wanted to add some information to this issue. Not entirely sure if this bug is caused by using Unfortunately we haven't been able to reproduce. If we receive any logs we'll post them here.
@dannypaz
In this commit, we fix a deadlock that can happen if a user attempts to init a wallet and then rapidly unlock it right after. In my profiles, it seems that lnd gets caught up on the bbolt flock, which deadlocks the entire process. We fix this issue by making the Init/Unlock calls fully synchronous. Only a single outstanding request can now exist across the entire wallet unlocker service. Fixes lightningnetwork#4330. Fixes lightningnetwork#3631.
This commit adds an optional wait for lnd to unlock to LndServices. The wallet unlocker is not used, because polling the unlock endpoint exacerbates a known deadlock: lightningnetwork/lnd#3631. Instead, a call to the main grpc server is used. The wallet is considered locked when we receive a grpc unimplemented error, because the main server does not become active until the wallet is unlocked. Once the wallet is unlocked, there is a race condition where a query to the main server can return an unavailable code while the server is busy registering. This error is the same as when lnd is just not online at all, so we allow it to occur once (assuming our backoff period will be sufficient) to account for this race while still failing if lnd is consistently offline.
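The polling logic described in that commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `Code` and its constants are hypothetical stand-ins for the gRPC status codes, `probe` is a hypothetical callback representing a cheap call to the main server, and "allow it to occur once" is interpreted as once over the whole wait.

```go
package main

import (
	"errors"
	"time"
)

// Code is a hypothetical stand-in for a gRPC status code in this sketch.
type Code int

const (
	OK            Code = iota // the main server answered: wallet is unlocked
	Unimplemented             // the main server is not registered yet: wallet is locked
	Unavailable               // the server is registering, or lnd is offline
)

// Hypothetical errors for this sketch.
var (
	ErrOffline     = errors.New("lnd appears to be offline")
	ErrStillLocked = errors.New("wallet still locked after all attempts")
	ErrUnexpected  = errors.New("unexpected status code")
)

// WaitForUnlock polls probe until it reports OK. Unimplemented means the
// wallet is still locked, so we keep waiting. A single Unavailable is
// tolerated to cover the window where the main server is busy registering;
// a second one is treated as lnd being offline.
func WaitForUnlock(probe func() Code, backoff time.Duration, maxAttempts int) error {
	unavailableSeen := false
	for i := 0; i < maxAttempts; i++ {
		switch probe() {
		case OK:
			return nil
		case Unimplemented:
			// Still locked: fall through to the backoff and retry.
		case Unavailable:
			if unavailableSeen {
				return ErrOffline
			}
			unavailableSeen = true
		default:
			return ErrUnexpected
		}
		time.Sleep(backoff)
	}
	return ErrStillLocked
}
```

A caller would pass a `probe` that issues a lightweight request to the main server and maps the resulting status code onto these three values, with `backoff` chosen long enough for the server to finish registering.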
This commit adds an optional wait for lnd to unlock to LndServices. The wallet unlocker is not used, because polling the unlock endpoint exacerbates a known deadlock: lightningnetwork/lnd#3631. Instead, a call to the main grpc server is used. The wallet is considered locked when we receive a grpc unimplemented error, because the main server does not become active until the wallet is unlocked. Once the wallet is unlocked, there is a race condition where a query to the main server can return an unavailable code while the server is busy registering. We use the grpc wait until ready error which will allow the call to wait until our wait interval has elapsed before it fails. This allows lnd some time to come up.
Background
I'm upgrading our node software to use 0.8.0. It works fine if we are upgrading from 0.7.X to 0.8.0, but runs into an issue during wallet creation.
Your environment
Expected behaviour
Wallet creation via rpc should occur as per usual resulting in several macaroons created and a wallet.db file.
Actual behavior
We use an rpc call to create the wallet. The call returns successfully, but afterwards no macaroon files have been created. The wallet.db does exist. The logs at this point are listed below. It also looks like lnd syncing has not started.
After that I went to the command line and ran lncli unlock, which gave this error:
[lncli] rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:10009: connect: connection refused"
I then restarted the lnd docker container and tried lncli unlock again. This time it did work: all the macaroon files were created and everything works as normal.
Conf file is
Logs after wallet creation call