Fatal error in Raiden wizard testing #4451
Comments
are the relevant lines. |
This is almost impossible to debug without log files, no? |
I don't know if I can reproduce this. So the first step is to reproduce the problem using the versions used when the fatal error was spotted. |
Maybe fix raiden-network#4451
If a channel is not open, or is in a closing state (still open but waiting on the closing transaction result), the `set_total_deposit` function was throwing a `ValueError`. That is a valid and recoverable race condition, so this commit introduces an appropriate error and handles it in all places where it can be thrown.
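As a rough illustration of the approach described in that commit message, here is a minimal sketch. Only `set_total_deposit` and `ValueError` are names taken from this thread; `ChannelNotOpenError`, `ChannelStatus`, `Channel` and `try_deposit` are hypothetical stand-ins and not Raiden's actual classes or signatures.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelStatus(Enum):
    # Hypothetical subset of channel states, for illustration only.
    OPENED = "opened"
    CLOSING = "closing"
    CLOSED = "closed"


class ChannelNotOpenError(Exception):
    """Hypothetical recoverable error: the channel left the open state."""


@dataclass
class Channel:
    identifier: int
    status: ChannelStatus
    total_deposit: int = 0


def set_total_deposit(channel: Channel, total_deposit: int) -> None:
    # Raise a dedicated error instead of a bare ValueError so callers can
    # tell this race apart from a programming mistake.
    if channel.status is not ChannelStatus.OPENED:
        raise ChannelNotOpenError(
            f"Channel {channel.identifier} is {channel.status.value}, not open"
        )
    channel.total_deposit = total_deposit


def try_deposit(channel: Channel, total_deposit: int) -> bool:
    """Return True if the deposit was applied, False if the channel was not open."""
    try:
        set_total_deposit(channel, total_deposit)
    except ChannelNotOpenError:
        # Recoverable: skip this channel instead of crashing the node.
        return False
    return True
```

With this shape, a caller that hits a closing channel gets `False` back and can move on, while any other unexpected exception still propagates.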
I found the log file: https://gist.github.com/pirapira/210c4b883134d6f42f1c7a36faa6e0bb |
More info needed
All right, so I had a look at the log files and I am confused. This should not happen, according to neither the code nor the logs. But I can't be sure about the code, since I can't seem to find which release you were testing, @pirapira. I thought this was testing a particular release, but no release contains the
When I checked the logs, I saw the version shown as:
Which means this is not a release. Plus I can't find the commit
Can you help me and tell me the exact commit or release you were testing? Also, why were you not testing a release? I thought that the Raiden wizard downloads specific Raiden releases.
Current findings
Now on to why I can't see how this is possible, at least according to the code in current develop and the logs. I visualized the logs with our tool and the following parameters:
Yoichi's node is
At line 11286 of the generated logs, just before the crash, that channel is shown to be open when the channel list is queried via the API. Then at line 11296 the crash happens. The crash is due to the unhandled ValueError, which is thrown if the channel is not in an open state. The local state is what is queried, not the blockchain. This is what the query looks like: raiden/raiden/transfer/channel.py Lines 1227 to 1257 in a5dedfc
The exception contains the channel state:
and in there you can see that
The above, in combination with the mistaken source code lines, leads me to believe the problem may lie in the code of the commit tested, which is why I want to find out what commit that is. |
@taleldayekh sent me an email on 24.7.2019 and it contained the following link: |
but the link has expired. I'm sending @LefterisJP the executable I ran. |
It seems that the version of the Raiden wizard you sent me installs the Raiden nightly from 24/07/2019. This is the commit. The line number matches there, but the code is not that much different and the same question I posed above stands. I don't see how this can happen as both
Perhaps the DB contains more data? It should be under |
I sent the db files as well. |
Okay so I misunderstood the logs it seems. The following:
is not the channel state of the channel that led to the crash! It's part of the repr of the connection manager, which shows (I have no idea why...) the open channels. That log entry is so confusing and cost me a hell of a lot of time ... raiden/raiden/connection_manager.py Lines 448 to 462 in 2ba85cb
So in conclusion ... another channel was closed/closing. I can't say which one since:
I have a good guess, since a little less than 3 minutes before the crash Yoichi seems to have closed the channel with identifier 11. So, given the lack of logs, I can only assume that it was a timing problem. Before the closing was processed, the connection manager entered here: raiden/raiden/connection_manager.py Line 356 in 2ba85cb
and counted the channel among the open channels. Later down the line, a greenlet was spawned to further fund this channel here
By the time we got to the total deposit for the channel, the state had switched to closing. So in any case the PR I made should handle this. |
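The timing problem described above can be replayed deterministically in a few lines. This is only a sketch of the sequence of events, not the connection manager's code: the `channels` dict, the `Status` enum and channel 10 are made up for illustration, and channel 11 is just the guess from this comment.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical local channel states.
    OPENED = "opened"
    CLOSING = "closing"


# Hypothetical local state: channel identifier -> status.
channels = {10: Status.OPENED, 11: Status.OPENED}

# Step 1: the manager counts the channels that look open right now.
to_fund = [cid for cid, status in channels.items() if status is Status.OPENED]

# Step 2: before the deferred funding work runs, the close of channel 11
# is processed and its local state changes.
channels[11] = Status.CLOSING

# Step 3: the funding work re-checks the state at the time of use instead
# of trusting the earlier snapshot, and skips channels that are no longer open.
funded, skipped = [], []
for cid in to_fund:
    if channels[cid] is Status.OPENED:
        funded.append(cid)
    else:
        skipped.append(cid)
```

The snapshot taken in step 1 still lists channel 11, which is why a check (or a handled error) at the point of the deposit is needed.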
If I had these logs when debugging this: raiden-network#4451 (comment) things would have gone much faster.
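For illustration, a log entry along these lines would have named the affected channel directly. This is a sketch using the standard library `logging` module; the logger name, the `describe_failed_deposit` helper, the field names and the example values are all assumptions, not the format Raiden actually logs.

```python
import logging

log = logging.getLogger("deposit_example")  # hypothetical logger name


def describe_failed_deposit(
    channel_identifier: int, status: str, total_deposit: int
) -> str:
    """Build a log message that names the affected channel explicitly."""
    return (
        f"Deposit skipped: channel_identifier={channel_identifier} "
        f"status={status} requested_total_deposit={total_deposit}"
    )


# Example values only: channel 11 is the guess from the comment above,
# and the deposit amount is made up.
message = describe_failed_deposit(11, "closing", 100)
log.warning(message)
```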
Problem Definition
I saw a Raiden node crash with
ValueError: Channel is not in an open state.
The console output is copied in https://gist.github.com/pirapira/a59f2cd6ff736aef4233f242db0cd366
System Description
Here add a detailed description of your system, e.g. output of the following script: