This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Nothing tells you when you're timing out on 30 second harvester proof checks #2651
Comments
Also, how is your laptop able to mount nearly 400 TB of disks? Are your disks on a NAS? |
Your NAS caused your task to time out. It is recommended to abandon this method. |
How slow does it have to be before it has an effect? |
We can't change the consensus algorithm anymore, but we can change the log level, and show the files. Please note that looking up qualities for plots passing the filter requires about 7 random reads in a plot, whereas actually looking up a proof requires 64 reads. It might not be feasible on a slow NAS, since these are sequential reads. Furthermore, you need to take into account network latency to propagate your proof and block to the network, so you should be under 5 seconds to reduce risk of losing rewards. Actually, the proof of space library does them sequentially, but they could be done in parallel, since it's a tree, so you could do 1 read, then 2, then 4, .. etc, for a total of around 7 sequential phases (one for each table in the plot). We haven't got around to doing this yet. |
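The read counts in the comment above can be turned into a rough latency estimate. This is a minimal sketch, not the proof of space library's code: the counts (7 reads for a quality lookup, 64 for a full proof, about 7 phases if reads were parallelized) come from the comment, while the per-read latency is a hypothetical figure you would measure on your own storage. It also assumes the storage can service all reads in a phase concurrently.

```python
# Figures quoted in the comment above.
QUALITY_READS = 7     # random reads to look up a quality
PROOF_READS = 64      # random reads to look up a full proof
PARALLEL_PHASES = 7   # phases if each table's reads ran in parallel


def sequential_seconds(reads: int, read_latency_s: float) -> float:
    """Time when every read waits for the previous one to finish."""
    return reads * read_latency_s


def phased_seconds(phases: int, read_latency_s: float) -> float:
    """Time when all reads in a phase overlap, so each phase costs one read."""
    return phases * read_latency_s


# HYPOTHETICAL per-read latency for a slow network share; measure your own.
latency_s = 0.25

print("quality lookup:", sequential_seconds(QUALITY_READS, latency_s), "s")
print("proof, sequential:", sequential_seconds(PROOF_READS, latency_s), "s")
print("proof, phased:", phased_seconds(PARALLEL_PHASES, latency_s), "s")
```

With the assumed 0.25 s per read, a sequential proof lookup alone takes 16 s, well over the 5 second target mentioned above, while the quality lookup stays under 2 s.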
Also another thing to point out is that the responses are returned to the full node as they come out from the drives, so the high time is probably only affecting the slow drive or the slow NAS. |
Yeah, this only started happening as I added multiple NAS devices to the network. With 1 or 2 NASes, it was all fine. Once you get to 5.. not so much, especially if the algorithm picks plots on 6 different devices. At the very least
thanks @mariano54 ! |
Is there any benefit to using larger plot files in this case? It would mean fewer files per directory in his case. |
I think the problem is inherent in how Windows manages mounted network drives. If you were to switch to a Linux-based farmer, it should work better. I know it's not an option for most people, but it'd be a good test to see if that's the cause. |
I think lots of people need to know that their whole system is just too slow to provide a valid answer in time. So please log these lookups as WARNING when they take longer than a certain threshold. |
This was quite a rough finding since I've been happily farming on my RPi on wifi to a remote storage with proof checking usually between 60-90 seconds. |
Could some drives be powering down when idle? |
no, all the NAS devices default to "never sleep drives".
|
Yeah, my one Synology NAS does the same if I hit it hard with other services such as Docker, Plex, etc. My Synology can sleep the HDDs, and I believe that is the default out of the box; no idea about yours. Synology HDD hibernation |
@mariano54 |
Yeah this is a critical issue for the project IMO, since a LOT of people are probably "farming" absolutely nothing due to the 30s harvester timeout, and the logs aren't WARN-ing or ERROR-ing them.. the GUI isn't telling them.. the only way to know this is happening is to intentionally set log level to INFO and scan for 30s or longer in the INFO messages 😱 😭 |
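Until the harvester flags slow lookups itself, the manual scan described above can be scripted. This is a minimal sketch under one loud assumption: that each INFO lookup line reports its duration as `Time: <seconds> s`. That pattern is not quoted anywhere in this thread, so adjust the regular expression to whatever your own debug.log shows. The 5 second and 30 second thresholds are the figures mentioned in this issue; `scan_file` and its path argument are illustrative names.

```python
import re
from typing import Iterable, List, Tuple

# ASSUMPTION: lookup lines contain "Time: <seconds> s". Check your own log.
TIME_RE = re.compile(r"Time:\s*(\d+(?:\.\d+)?)\s*s\b")

WARN_AFTER_S = 5.0    # "you should be under 5 seconds" (from this thread)
FAIL_AFTER_S = 30.0   # the 30 second limit this issue is about


def classify(seconds: float) -> str:
    """Map a lookup duration to the severity it arguably deserves."""
    if seconds >= FAIL_AFTER_S:
        return "ERROR"
    if seconds >= WARN_AFTER_S:
        return "WARNING"
    return "OK"


def slow_lookups(lines: Iterable[str]) -> List[Tuple[str, float, str]]:
    """Return (severity, seconds, line) for every lookup that was too slow."""
    hits = []
    for line in lines:
        match = TIME_RE.search(line)
        if match is None:
            continue  # not a lookup timing line
        seconds = float(match.group(1))
        level = classify(seconds)
        if level != "OK":
            hits.append((level, seconds, line.rstrip("\n")))
    return hits


def scan_file(path: str) -> None:
    """Print every slow lookup found in the log file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for level, seconds, line in slow_lookups(log):
            print(f"{level} {seconds:.1f}s: {line}")
```

Calling `scan_file` on your debug.log (with the log level set to INFO) prints only the lookups that exceeded a threshold, which is the signal this issue asks the software to surface on its own.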
I believe the small time window was designed to push tape drives out of the Chia ecosystem. Ideally there would be big flashing warnings if you're missing any rewards. Hopefully this can be addressed in future releases. |
I did a somewhat scalable workaround for this by running multiple harvesters in containers (on the RPi4), each with ONE plot directory with 100 plots. This gives around 2 seconds for proof checking per container (up to 100 plots). |
Isn't the simple answer to make each NAS its own farmer until Chia fixes this? Building the RAID array in performance mode on a Synology also causes this issue. |
It's soooooo sad that by "grep eligible" I found nothing there! |
You should try "farming on many machines" |
Your fast ones will still answer quickly, since it's all threaded. It will just display the time at which the slowest one finished. |
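The behavior described above (fast drives answer early, and the reported time is that of the slowest) can be illustrated with a small thread pool. This is only a sketch of the general pattern, not the harvester's actual code: `lookup`, `run_lookups`, and the sleep-based delays are hypothetical stand-ins for real plot reads.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, List, Tuple


def lookup(name: str, delay_s: float) -> str:
    """Stand-in for a plot lookup; the sleep simulates storage latency."""
    time.sleep(delay_s)
    return name


def run_lookups(delays: Dict[str, float]) -> Tuple[List[Tuple[str, float]], float]:
    """Run all lookups in parallel.

    Returns the results in completion order, each with the time at which
    it finished, plus the total elapsed time for the whole batch.
    """
    start = time.monotonic()
    finished = []
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(lookup, name, d) for name, d in delays.items()]
        for future in as_completed(futures):
            # Each result is available as soon as its own read completes.
            finished.append((future.result(), time.monotonic() - start))
    total = time.monotonic() - start
    return finished, total


if __name__ == "__main__":
    results, total = run_lookups({"local disk": 0.01, "slow NAS": 0.3})
    for name, at in results:
        print(f"{name} answered after {at:.2f}s")
    print(f"reported total: {total:.2f}s")
```

The local disk's result arrives long before the slow share's, yet the batch total is dominated by the slow share, which matches the explanation that a high logged time mostly penalizes plots on the slow device.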
It seems near criminal that we are conditioned to enable "INFO" level logs and routinely told to ignore the countless "ERROR" level scary sounding messages spamming the logs, yet not a whimper about anything in the logs when a proof challenge request lookup is nearing or fully exhausting some non-presented timeout value that only tribal knowledge or code inspection is aware of. I watched incredulously as my friends with far fewer plots reached, then far surpassed and doubled over my winnings with 1/2 the plots. Only once reaching out after the damage is done do we find out with an obtuse, "OF COURSE YOU WON'T WIN WITH LOOKUP DELAYS LIKE THIS" are we then made painfully aware of the worst of all actual "errors". Like I said, that these messages are not flagged as ERROR level or at least WARNING is essentially criminal, especially in the face of the constant stream of non-critical "ERROR" level messages spamming the logs. |
The problem
I thought I was farming, but I wasn't -- because something about my network caused the proof check to take more than the hard-coded 30 second limit.
I had an average time to win of 8 or 9 hours for more than 120 hours without a single win. This seemed statistically implausible, so I researched the logs, and cleared any errors or warnings in the logs (well done, all the warnings and errors in debug.log were indeed things I should fix!). Still no wins for a long time.
How to reproduce
Have a bunch of plots on slow storage media; when the proof check happens, verifying the proofs takes longer than the hard-coded 30 seconds allowed. You will never win a single Chia, but there's absolutely nothing in the GUI to inform you that this is happening. You can view the logs, but in the logs it is not even presented as a warning (!), but as an INFO message:
Of the above, the proofs that take longer than 30 seconds are not eligible to win, but this is not logged as an ERROR or WARNING or surfaced in the UI in any way.
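The "statistically implausible" remark under "The problem" can be checked with a quick calculation using the figures reported there (an expected time to win of 8 to 9 hours, more than 120 hours without a win). This is a minimal sketch that assumes wins arrive as a Poisson process at a constant rate, which is a modeling assumption and not something stated in this issue; `prob_no_win` is an illustrative name.

```python
import math


def prob_no_win(elapsed_hours: float, expected_hours_to_win: float) -> float:
    """Probability of zero wins in `elapsed_hours`.

    ASSUMPTION: wins follow a Poisson process with a constant rate of
    1 / expected_hours_to_win, so P(no win) = exp(-elapsed / expected).
    """
    return math.exp(-elapsed_hours / expected_hours_to_win)


for expected in (8.0, 9.0):
    p = prob_no_win(120.0, expected)
    print(f"expected time to win {expected:.0f} h: P(no win in 120 h) = {p:.2e}")
```

Under that assumption the chance of such a drought is on the order of one in a million or smaller, so a healthy harvester is a far less likely explanation than lookups that are being discarded for exceeding the limit.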
Expected behavior
The GUI will tell you "hey, your proof checks are too slow, there's absolutely no chance for you to win, even if you are farming infinity plots"
Screenshots
Desktop
Additional context
I followed up on the #support channel in Keybase, where I got the important advice to enable INFO level logging and check for the 30 second proof limit, and I wrote up a detailed account on the forum; if you need excruciating levels of detail, please check there 🙇‍♂️
Recommended solution