ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (24) - ReadVarInt(): size too large: iostream error #76
Comments
I also had this exact same error today (about 6 hours ago), and not only that but at the exact same time and block!
Oh joy! Restart from snapshot time - again.
This is the same signature as the DisconnectTip/BlockUndo issue we have seen in previous versions. In 0.19.17 we ported over the recent serialization code from Bitcoin Core, which replaced some suspect serialization code used in previous versions. Since the issue is still occurring, the problem must lie elsewhere in the code.
I am writing a check in UndoWriteToDisk to verify that the CBlockUndo data can be deserialized correctly immediately after it is written to disk, and to throw a trap/int 3 if the two do not match so we can break in with gdb. The original CBlockUndo data will then be present on the stack for comparison. Since the check is on the write path, we can verify the block immediately when it is written rather than wait for the node to change tips. Reproduction time should be much less.
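The write-then-read-back idea described above can be illustrated with a small sketch. This is not the pocketnet.core code: it is Python rather than C++, the record layout is a made-up stand-in for CBlockUndo, and `serialize`, `deserialize` and `write_and_verify` are hypothetical names chosen for the example. It only shows the shape of the check, which is to append a record, read it back from the same offset, and fail loudly on any mismatch while the original data is still in memory.

```python
import os
import struct
import tempfile


def serialize(entries):
    """Encode a list of (amount, script) pairs; a stand-in for the real undo format."""
    out = [struct.pack("<I", len(entries))]
    for amount, script in entries:
        out.append(struct.pack("<qI", amount, len(script)))
        out.append(script)
    return b"".join(out)


def deserialize(data):
    """Inverse of serialize(); raises ValueError on truncated or oversized input."""
    (count,) = struct.unpack_from("<I", data, 0)
    pos = 4
    entries = []
    for _ in range(count):
        if pos + 12 > len(data):
            raise ValueError("truncated entry header")
        amount, size = struct.unpack_from("<qI", data, pos)
        pos += 12
        if pos + size > len(data):
            raise ValueError("script size exceeds remaining data")
        entries.append((amount, data[pos:pos + size]))
        pos += size
    return entries


def write_and_verify(path, entries):
    """Append a length-prefixed record, then read it back and compare (hypothetical helper)."""
    payload = serialize(entries)
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(struct.pack("<I", len(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())
    with open(path, "rb") as f:
        f.seek(offset)
        (size,) = struct.unpack("<I", f.read(4))
        readback = deserialize(f.read(size))
    if readback != entries:
        # In the debug build described above, this is where a trap would fire.
        raise RuntimeError(f"undo readback mismatch at offset {offset}")
    return offset


# Usage: write two records to a scratch file and verify each on the write path.
scratch = os.path.join(tempfile.mkdtemp(), "rev_example.dat")
first = [(5000, b"\x76\xa9"), (1, b"")]
first_offset = write_and_verify(scratch, first)
second_offset = write_and_verify(scratch, [(7, b"abc")])
```

Checking on the write path means a bad record is caught while the in-memory original is still available for comparison, instead of much later when the record is read during a tip change.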
@tawmaz I can run a node in gdb with these developments, if necessary.
I ran a node over the weekend with GDB attached but did not hit the UndoReadFromDisk error. I think it may be necessary for the node to be actively staking to reproduce this, so I will send some coins to it and let it continue. I plan to post a debug build this evening with the code to catch the UndoWriteToDisk error.
@tawmaz I think that may be an incorrect hypothesis. My node has a balance of less than 12 PKOIN - still hoping/waiting for the 70 PKOIN perseverance reward. It hasn't been staking for weeks because with an expected time to win in excess of 20 million there's really no point in me remembering to unlock the wallet on each restart. It absolutely was not staking when it hit the problem which prompted me to open this issue.
@walkjivefly Thanks for the datapoint regarding staking.
Here is the debug version with validation of CBlockUndo: https://github.com/tawmaz/pocketnet.core/releases/tag/v0.19.17blockundodbg
I've hit this several times now. I'm currently running 19.19; would it be helpful if I ran the debug version? I might need help with gdb since I'm new to Linux. I am more than happy to do/learn whatever I can for the community.
Here are the logs to show what I've encountered. 2021-11-06T08:50:32Z Warning! Block generate (CheckInputs): 59b18a5a2099cebc6f9eda8d593696fab90a975741c8c6b2d0dfb3ad91d3f390
I attempted to catch this issue on serialization of the BlockUndo data last week, but instead hit the Reindexer segfault described in issue 80. I will try creating a binary which verifies unserialize during serialize and run it to see if we can catch the issue while the Block object is still available.
In both occurrences of this issue that I have observed, the error occurs on decompression and deserialization of the script code. For the incident that occurred this morning at block 1427557, the nSize value here was deserialized with a value of 69095430 (69MB)! This causes CHashVerifier::ignore() to read past the end of the file. Relevant Locals: Call stack:
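To make the failure mode above concrete, here is a small Python sketch of a variable-length integer decode followed by a sized read. `write_varint` and `read_varint` follow the base-128 VarInt scheme from Bitcoin Core's serialize.h as I understand it, including the "size too large" overflow check. `read_sized_blob` is a hypothetical helper, not a function from the codebase, and it deliberately omits the special-script cases that the real script decompressor handles. It shows one possible guard: compare the decoded size with the bytes actually remaining before reading or skipping anything.

```python
import io

MAX_U64 = (1 << 64) - 1


def write_varint(n):
    """Encode n in base-128 VarInt form (most significant group first)."""
    tmp = []
    while True:
        # Every byte except the last one emitted carries the continuation bit.
        tmp.append((n & 0x7F) | (0x80 if tmp else 0x00))
        if n <= 0x7F:
            break
        n = (n >> 7) - 1
    return bytes(reversed(tmp))


def read_varint(stream):
    """Decode one VarInt from a binary stream, rejecting values above 64 bits."""
    n = 0
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("end of data while reading varint")
        if n > (MAX_U64 >> 7):
            raise ValueError("ReadVarInt(): size too large")
        n = (n << 7) | (b[0] & 0x7F)
        if b[0] & 0x80:
            if n == MAX_U64:
                raise ValueError("ReadVarInt(): size too large")
            n += 1
        else:
            return n


def read_sized_blob(data, pos=0):
    """Read a VarInt size, then that many bytes, refusing sizes past the end of data."""
    stream = io.BytesIO(data)
    stream.seek(pos)
    size = read_varint(stream)
    remaining = len(data) - stream.tell()
    if size > remaining:
        raise ValueError(f"declared size {size} exceeds {remaining} bytes remaining")
    return stream.read(size)


# Usage: a well-formed record, then one whose size field is implausibly large.
good = read_sized_blob(write_varint(3) + b"abc")
try:
    read_sized_blob(write_varint(69095430) + b"abc")
    bogus_rejected = False
except ValueError:
    bogus_rejected = True
```

A guard like this would not explain why the size was corrupted, but it turns a read past the end of the file into an immediate, descriptive error at the point where the bad size is decoded.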
The synchronization issue has been fixed in version 0.20.*, which is in pre-release state and is being prepared for release.
Problem
v0.19.17 daemon shut itself down with the following candidate errors in debug.log:
To Reproduce
Unknown
Expected behavior
The daemon should run without uncommanded shutdowns or fatal errors.
Desktop (please complete the following information):
Additional context
There was an RIDB error but it didn't claim to be fatal and didn't cause a crash. A few seconds later there was a successful RIDB rollback immediately followed by a fatal DisconnectTip() error, and then another fatal DisconnectTip() for the same block. A few lines later the node started an orderly shutdown.
I was running with only net,rpc,reindex logging enabled so there may have been something other than these candidate events which caused the shutdown.
Full log is available if required.