-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: clearer Synthetic PoRep separation #1720
Conversation
40c9797
to
4f46216
Compare
As this PR wasn't reviewed yet, I did dare to rebase it. I also pushed a new commit. It's easiest reviewed if you look at those two commits separately. The second one is again easiest with whitespace only changes hidden: 4f46216?diff=split&w=1 |
t_aux.synth_proofs_path(), | ||
partition_count, | ||
) | ||
.map_err(|error| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case where Synth-PoRep is enabled and the challenge seed is set, but synthetic challenge proofs do not exist on disk, I think the fallback behavior should be to run normal PoRep (rather than return an error). In the event where, while waiting for the challenge seed on chain, a user's synthetic challenge proofs were moved/removed from their expected location, returning an error here would run in an endless loop (assuming the user were to rerun prove_layers
following the read synth-proofs error).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user runs a Synthetic PoRep, you would do that in order to free space. So you would remove the tree files/layers once the proofs are generated. If you move/remove those proofs and fallback to non-Synthetic PoRep, it would likely fail, because the original files are not there anymore. I would find that behaviour more confusing/hard to follow that failing early on the Synthetic PoRep code path.
This is kind of a discussion, whether we want "fail early" behaviour, which I prefer as I find it easier to debug/finding root cases if it happend. Vs. "best effort", which might then make things fail later on in unexpected ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that it's totally debatable, and that the error makes sense too. Do you you have an opinion about falling back to standard PoRep only when the SDR layers weren't deleted? Or do you think an error should be returned in that case too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you you have an opinion about falling back to standard PoRep only when the SDR layers weren't deleted? Or do you think an error should be returned in that case too?
That might be a good middle ground. Though if it would be me, I think would still return early as I really see it as some misconfiguration/user error which under normal operations should never happen. And If someone did something wrong, they probably want to find out about it early. I'm also thinking about potential bug reports, which might be easier to tackle with early failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with erroring as it would force a user to immediately diagnose and fix the issue. My thinking for why standard PoRep should be run if the labels are still present on disk (with a logged warning saying "synth-porep was not run") is mainly because I think that re-sealing/labeling should be avoided when possible. However, if that fallback behavior is repeatedly taken, a user may be mislead into thinking that they are using synth-porep when in reality they aren't.
Maybe @cryptonemo has an opinion, but if not, I'm happy to approve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought on it is that we should fallback to PoRep if it's possible and warn/log very loudly that something is misconfigured and SynthPoRep cannot be used. I agree that will complicate potential code/testing paths, but shouldn't complicate much past having the user reading the logs and clearly seeing there's a problem.
Maybe the more ideal answer from a user perspective is to check if the synth_proofs data exists in the clear data call (when enabled of course) and log loudly there that they cannot be cleared because synth_proofs data doesn't exist -- and then we have the logic to fall back to PoRep because something is wildly misconfigured ???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that will complicate potential code/testing paths, but shouldn't complicate much past having the user reading the logs and clearly seeing there's a problem.
I disagree here. Especially with Ni-PoRep there's yet another code path. I find the current code pretty clean, you can clearly see the different PoRep cases and reason about what happens when. With certain fallback it likely be a lot of additional back and forth, which I think is just not worth it.
From a users perspective I see it as: if i've misconfigured wildly I want to immediately know it, rather then having a system that tries random things just in hope it might recover somehow, but might still end up in a broken state. An example for today: SupraSeal's PC2 does not support Synthetic PoRep. When run with Lotus and SyntheticPoRep enabled, it runs into a seemingly endless loop, but then does something (the wrong thing). Wouldn't it be much better if it would just say: "error, you try to run Synthetic PoRep although it's not supported"?
So if there's any chance you could live with the current solution I'd prefer keeping it the current way.
If you can't live with it, should we then also start thinking about how it will look like if we also add Ni-PoRep. Should there also be fallbacks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that's fair. Thanks!
This refactor looks good to me. I just have the one question regarding fallback behavior when synth proofs do not exist on disk. |
041b0ae
to
2779a36
Compare
The Synthetic PoRep code is interleaved with the original PoRep code. This commit refactors the code, so that it's clearer what the differences between Synthetic and Non-Synthetic PoRep is, without the need to read through large parts of the code, looking for if-statements. It also becomes more apparent in the code, that there are two "modes" of the `prove_layers` function in case of the Synthetic PoRep. One generation step in case there's no seed and one extracting step when there's a seed. There no longer is a fallback in case from Synthetic PoRep to Non-Synthetic PoRep in case the Synthetic PoRep Proofs file cannot be read, but when the labels are set. I consider that a feature as if this case happens, your system likelt has some problems, so it's better to error early.
`prove_layers_generate()` now operates on a single partition. This makes it clearer that Synthetic PoReps always operate on a single partition only. With the challenge generation moved outside of this function, it's also a clearer separation of concerns, as it now operates directly on the challenges given, without know anything about how they were generated. This also useful for the work of having small, special case binaries for certain operations.
0d4ef81
to
995c0c3
Compare
The force push has just been a rebase, without any other code changes. |
t_aux.synth_proofs_path(), | ||
partition_count, | ||
) | ||
.map_err(|error| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that's fair. Thanks!
Due to me understanding the code wrongly during the refactoring at #1720, there was a wrong assert inserted into the code. With removing this assert the lifecycle 32GiB test as well as the benchy synthetic proofs generation works as expected again. This commit also updates the README with the correct command to run benchy for synthetic proofs generation.
Due to me understanding the code wrongly during the refactoring at #1720, there was a wrong assert inserted into the code. With removing this assert the lifecycle 32GiB test as well as the benchy synthetic proofs generation works as expected again. This commit also updates the README with the correct command to run benchy for synthetic proofs generation. Fixes #1749.
The Synthetic PoRep code is interleaved with the original PoRep code. This commit refactors the code, so that it's clearer what the differences between Synthetic and Non-Synthetic PoRep is, without the need to read through large parts of the code, looking for if-statements.
It also becomes more apparent in the code, that there are two "modes" of the
prove_layers
function in case of the Synthetic PoRep. One generation step in case there's no seed and one extracting step when there's a seed.There no longer is a fallback in case from Synthetic PoRep to Non-Synthetic PoRep in case the Synthetic PoRep Proofs file cannot be read, but when the labels are set. I consider that a feature as if this case happens, your system likelt has some problems, so it's better to error early.