Unexpected error messages after upgrading from 0.17.0 to 0.21.0 #454
Thanks for the log output @bajtos. This is likely related to context handling across the parallel workloads, where cancellation needs to interrupt everything. There have been a few changes in that version range that subtly adjust the timing, so these errors may now pop up. I haven't seen these particular ones, but I have been bothered by some odd "context cancelled" log messages that shouldn't be seen!
What I can say is that neither of these errors is fatal; they're more of the "I probably shouldn't get into this state" type. They're annoying and they point to something that should be addressed, but I'm pretty confident they're just related to the retrieval bail state: the "shut it all down" signal not hitting all the right things in the right order. That doesn't indicate clean-up problems.
The first error is from bitswap: we got a peer from the indexer and fed it into bitswap, but at the same time we also told bitswap to stop bothering with this CID.
The second error is from the SP scoring mechanism, which keeps track of how SPs perform so we can make better choices about prioritising. It is saying: "you told me a candidate failed a retrieval, but you already told me that retrieval had ended".
I'll look into it and see if I can replicate these.
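To make the second error easier to picture, here is a minimal Go sketch of a "late event" arriving after a retrieval has been marked as ended. All names in it (`tracker`, `Start`, `End`, `RecordFailure`, `errRetrievalEnded`) are hypothetical and chosen for illustration; this is not Lassie's actual scoring code or API, only an assumption about the general shape of the problem.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRetrievalEnded is a hypothetical sentinel error for events that
// arrive after the retrieval they refer to has already ended.
var errRetrievalEnded = errors.New("retrieval already ended")

// tracker is a hypothetical stand-in for a per-retrieval scoring session.
type tracker struct {
	mu       sync.Mutex
	active   map[string]bool // retrieval ID -> still running?
	failures map[string]int  // provider -> recorded failure count
}

func newTracker() *tracker {
	return &tracker{active: map[string]bool{}, failures: map[string]int{}}
}

// Start marks a retrieval as running.
func (t *tracker) Start(retrievalID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.active[retrievalID] = true
}

// End marks a retrieval as finished, e.g. after a timeout or block limit.
func (t *tracker) End(retrievalID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.active, retrievalID)
}

// RecordFailure records a provider failure, but rejects it when the
// retrieval has already ended. The rejection is non-fatal by design.
func (t *tracker) RecordFailure(retrievalID, provider string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if !t.active[retrievalID] {
		return fmt.Errorf("%w: retrieval %s, provider %s",
			errRetrievalEnded, retrievalID, provider)
	}
	t.failures[provider]++
	return nil
}

func main() {
	t := newTracker()
	t.Start("r1")

	// Failure reported while the retrieval is still running: accepted.
	fmt.Println(t.RecordFailure("r1", "sp-a"))

	// The retrieval is shut down, then a straggling worker reports late.
	t.End("r1")
	err := t.RecordFailure("r1", "sp-b")
	fmt.Println(errors.Is(err, errRetrievalEnded), err)
}
```

In this sketch the late report changes no state and only produces an error value to log, which matches the description of the messages as annoying but not fatal. Whether the real fix is to reorder shutdown or to silently drop late events is left open here.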
Thank you, @rvagg, for the detailed explanation! ❤️ I also assumed these errors were not fatal, so I am not worried. I agree with you that these logs are annoying; that's why I opened this issue. I can reliably reproduce this issue on my machine when I run
(a note for myself that I started tinkering with test cases to try and capture this in my local branch
While upgrading rusty-lassie from Lassie 0.17.0 to 0.21.0 (see filecoin-station/rusty-lassie#68), I noticed new error messages in the test output.
These messages seem to be printed only when the retrieval is aborted via the "max blocks" or "global timeout" feature.
Is this a known issue? How can I get rid of these messages?
The messages are repeated many times in my test output; I am posting one example for each. I also changed the formatting to make things easier to read.
cc @hannahhoward @rvagg