'tokio-runtime-worker' panicked at 'Externalities not allowed to fail within runtime: "Trie lookup error: Database missing expected key: …"' #6547
Comments
This means you have some storage corruption. There is not much we can do there.
Something similar happened to us today:
[Relaychain] ⚙️ Preparing 0.0 bps, target=#25875211 (17 peers), best: #25875205 (0x7e68…0363), finalized #25875203 (0x8312…b031)
It keeps failing from time to time today on my side as well. There was a similar issue a year ago: #663
@Anastasiia-Khab are you using the default network backend (libp2p)?
No problems with relay chains today?
@lexnv I was using the default network configuration, but the port is not 30333. In the logs I also have:
And restarting the node fixes it? |
If yes, please provide more logs from before the issue happens; 10 minutes should be enough.
It does fix it for a while, but then it fails again after some time.
@Anastasiia-Khab ty, then please provide logs. Did this start to appear with 1.16.2? What version did you use before? Are you using ParityDB?
Started to appear yesterday. Updated to 1.16.2 on the day of release. Before was using 1.16.1
I had a similar issue occur on one Kusama node today. ParityDB, running Polkadot 1.16.2 since the day it was released (downloaded from releases). Bare metal, 64GB RAM, i9-13900.
The Polkadot service restarted and ran successfully for several hours, then the node became non-operational, but in a peculiar way:
Attempted a stop/start of the polkadot service with a 180-second wait. The service stayed failed. Restarted the service again by restarting the server, and it came back up on the second attempt, with numerous errors like this:
Those may be expected, as 'transactions in queue' in telemetry was quite high while the node caught up, and transactions in queue then went to zero. No 'trying to notify' errors for the last eight minutes.
@Anastasiia-Khab @infrachris Could you provide all warnings and errors you got from running the node? Are there other types of warnings besides "transactions in queue"? (probably easiest with `grep -E 'WARN|ERROR' logs.txt`)
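For reference, a quick way to pull warnings and errors out of a node log (the file name `node.log` is a placeholder; point it at your actual log file):

```shell
# Filter a node log for warnings and errors.
# 'node.log' is a placeholder path; adjust to your setup.
grep -E 'WARN|ERROR' node.log

# Equivalent without -E, using multiple patterns instead of alternation:
grep -e 'WARN' -e 'ERROR' node.log
```

Note that plain `grep 'WARN|ERROR'` treats the `|` literally with basic regexes, so `-E` (or the two-pattern form) is needed for the alternation to work.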
Here are the immediate prior 100 logs of warn or above:
@Anastasiia-Khab ty! Can you also answer my other questions?
Yes, writing to the DB is asynchronous, so data that is not yet committed to disk is kept in memory. @MattHalpinParity could you look into it?
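To illustrate the point about asynchronous writes, here is a toy Rust sketch (not Substrate's actual database code) of a write-back overlay: writes land in an in-memory overlay and only become durable on commit, while reads consult the overlay before the committed state. If the in-memory side is lost or corrupted before the flush, lookups for keys that were "written" can fail, which is the flavor of "Database missing expected key":

```rust
use std::collections::HashMap;

/// Toy write-back buffer over a "disk" map. Names and structure are
/// illustrative only, not taken from the real implementation.
struct BufferedDb {
    disk: HashMap<Vec<u8>, Vec<u8>>,    // committed, durable state
    overlay: HashMap<Vec<u8>, Vec<u8>>, // pending, uncommitted writes
}

impl BufferedDb {
    fn new() -> Self {
        Self { disk: HashMap::new(), overlay: HashMap::new() }
    }

    /// Writes are buffered in memory; nothing is durable yet.
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.overlay.insert(key.to_vec(), value.to_vec());
    }

    /// Reads see uncommitted writes first, then committed state.
    fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.overlay.get(key).or_else(|| self.disk.get(key))
    }

    /// Commit flushes the overlay to "disk". Losing the overlay before
    /// this point is what makes previously written keys go missing.
    fn commit(&mut self) {
        for (k, v) in self.overlay.drain() {
            self.disk.insert(k, v);
        }
    }
}

fn main() {
    let mut db = BufferedDb::new();
    db.put(b"key", b"value");
    // Visible before commit, because reads go through the overlay.
    assert_eq!(db.get(b"key").map(|v| v.as_slice()), Some(&b"value"[..]));
    db.commit();
    // Now durable in the committed state.
    assert_eq!(db.disk.get(&b"key".to_vec()).map(|v| v.as_slice()), Some(&b"value"[..]));
}
```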
This was happening during the spamming of Kusama, so a lot of data modifications were happening around this time.
Seems like there were load tests yesterday.
Same here on Polkadot. I guess somehow we have all synchronized our storage corruption.
Related. Try restarting your nodes.
@LukeWheeldon did you also use
@bkchr nope, default settings on that side, RocksDB I believe.
@LukeWheeldon but have you seen the same logs? Can you post your logs, please?
@bkchr unfortunately I didn't have logging enabled when this happened. The system has been stable since then.
But are you 100% sure it was the same failure? How do you know, if you didn't have logging enabled? 😅 (I just want to make sure this was not something different that would otherwise invalidate the assumptions here.)
@bkchr apologies, I should have at least kept a screenshot of the error when it crashed. At this point all I can say is that I feel pretty sure I saw the same error as shown above:
I'll report with at least the crash logs next time.
I also don't understand how you could see error messages if you had logs 'disabled'. Can you share how you believe you disabled logs? Please try these commands one by one:
I think it's here: #6547 (comment). Not using ParityDB; using RocksDB.
I have not elected to run polkadot as a service; Prometheus informs me as soon as something is wrong, and I act on it if it is under my control. When the binary crashes, I generally do get some level of detail about why it crashed. What I meant is that I run polkadot with the default level of verbosity.
Might be related: I saw this in versi-net while testing some authority-disc improvements:
Is there an existing issue?
Experiencing problems? Have you tried our Stack Exchange first?
Description of bug
Two nodes failed on the Kusama network at the same time with a bug report:
Happened just once.
Steps to reproduce
No response