Computation failed: Server disconnected #99
Comments
@cryptobench might be related, though this is not a single task failing but the whole computation. |
@mbenke Is it possible that you're running out of file descriptors? Please provide the output of the commands below while computing
|
@mfranciszkiewicz here you are:
|
@mbenke thanks, it looks like 47% of the
in each of the relevant terminals (running the daemon, yapapi) and observe whether that improves the situation? |
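For anyone following along, a minimal sketch of checking descriptor usage from inside a Python process. These are not the commands requested above (those did not survive in this thread); `fd_usage` is a hypothetical helper name, and the `/proc/self/fd` count is Linux-specific.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    open_fds is None where /proc/self/fd is unavailable (e.g. non-Linux).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open descriptor (Linux only).
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return open_fds, soft, hard


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print(f"open descriptors: {open_fds}, soft limit: {soft}, hard limit: {hard}")
```

If the open count keeps climbing toward the soft limit during a run, descriptor exhaustion is a plausible suspect; if it stays flat, the cause is probably elsewhere.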
@mfranciszkiewicz I did set the ulimit in all places (both terminals,
almost right off the bat. Number of open files reported by |
...and again in 10 minutes: |
@mbenke the logs above do not give enough information. Could you reproduce this with We also added tags with severity and impact, please check if you agree |
@maaktweluit here are the logs from yesterday with NB this log probably does not contain the |
This log (with |
Thanks for the logs @mbenke ! I have checked both of them and they do not contain the error in the title:
Could you provide matching yapapi and yagna logs from the same run? I did see another issue in these logs, so I reported a new issue, linked above |
@maaktweluit @tworec indeed, the 'Server disconnected' message seems to occur only in yapapi logs. Again, they are big (259M), so I am leaving them on the server for you, as indicated in the email:
There are also some occurrences in the most recent logs (using latest patch release:
|
Getting hit badly by this today:
|
The show goes on...
|
It seems fetching debit notes is more often the culprit now:
|
...but sometimes also invoices:
@azawlocki can you have a look at this (and the comment above)? Can we perhaps catch the exception and try to recover/retry instead of crashing the computation? |
@azawlocki you can definitely do it better but this diff shows what I mean in the comment above:
NB This just handles the exception, without attempting to retry, so something more should be done to handle this properly. |
@mbenke Thank you for the suggestion, this PR should fix the issue: golemfactory/yapapi#358 |
@azawlocki alas, it seems that even with the above PR I am getting
(the crucial call at line 252 is not wrapped) |
@mbenke I've added retrying for |
@azawlocki thanks, but looking at the code it seems it will help only for timeouts, not |
@mbenke yes, you're right! Thanks for catching this, I've added a new commit to fix this. And made you a reviewer of golemfactory/yapapi#358 (since you reviewed already). |
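For readers of this thread, a minimal sketch of the idea discussed above: retry a call on both timeouts and disconnects rather than on timeouts alone. This is not the code from golemfactory/yapapi#358. `with_retries` and `ServerDisconnected` are hypothetical names, and the exception class is only a stand-in for whatever disconnect error the HTTP client actually raises.

```python
import asyncio


class ServerDisconnected(Exception):
    """Hypothetical stand-in for the client library's disconnect error."""


async def with_retries(
    call,
    *,
    attempts=3,
    delay=0.0,
    retry_on=(asyncio.TimeoutError, ServerDisconnected),
):
    """Await call() up to `attempts` times, retrying only on `retry_on` errors.

    The last failure is re-raised so the caller can still decide what to do.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise
            # Linear back-off: wait a little longer after each failed attempt.
            await asyncio.sleep(delay * attempt)
```

Listing both exception types in `retry_on` is the point raised above: a wrapper that catches only timeouts would still let a disconnect crash the computation.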
@mbenke traditional question - we hope it has been fixed, are you observing this still? can we close it? |
Closing for now :) |