MSM optimization #40
The code coverage currently fails. Because everything else seems to run fine, I think it's because the code coverage runs with Does anyone know if the CI machine supports x86 instructions?
Could you also check why the bench is not running here? :)
During the halo2 automated benches implementation, we had agreed that the tests would be triggered from the kzg-rebase branch. Therefore, the PR that introduced the workflow code was only merged into kzg-rebase. Is this branch now obsolete? I thought that halo2 optimizations would be submitted via branches sourcing from kzg-rebase. Was that not the case? Regardless, the workflow still needs a fix. Where should it be directed?
Nah, we switched to shplonk; you can get that by just removing the flag, as it's the default.
Not sure; it's probably best to remove the kzg-rebase branch from the benchmarks and just use master.
@Brechtpd @barryWhiteHat
What's the status of this PR? It doesn't seem like there are any blockers, and a faster MSM would certainly be very nice. @Brechtpd
Ready for review, but yeah, it's been a while. :) However, I believe the fastest approach will very likely exploit the fact that in plonk many MSMs are done over the same bases (zcash#641), instead of just optimizing a single MSM as much as possible as in this PR. I think it's probably best to analyze the combined MSM approach and check its performance before continuing with this PR, and then see whether the approach in this PR still makes sense and is compatible with the combined approach.
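To make the "same bases" point concrete, here is a minimal sketch of one way to amortize work across MSMs that share bases: precompute windowed multiples of each base once, then answer every MSM with table lookups and additions only. This is not necessarily what zcash#641 proposes. The group is a toy stand-in (integers modulo a prime under addition, in place of curve points), and `SharedBases`, `naive_msm`, `WINDOW` and `SCALAR_BITS` are hypothetical names chosen for illustration.

```rust
// Toy additive group: integers modulo a prime stand in for curve points.
// Everything here is an illustrative assumption, not halo2 code.
const P: u64 = 1_000_000_007;

fn add(a: u64, b: u64) -> u64 {
    (a + b) % P
}

/// Scalar multiplication by double-and-add, using only group additions.
fn mul(mut base: u64, mut k: u64) -> u64 {
    let mut acc = 0;
    while k > 0 {
        if k & 1 == 1 {
            acc = add(acc, base);
        }
        base = add(base, base);
        k >>= 1;
    }
    acc
}

const WINDOW: u32 = 4; // bits per window
const SCALAR_BITS: u32 = 32; // toy scalar size

/// Precomputed once per set of bases:
/// tables[i][w][d] = d * 2^(w * WINDOW) * bases[i].
struct SharedBases {
    tables: Vec<Vec<Vec<u64>>>,
}

impl SharedBases {
    fn new(bases: &[u64]) -> Self {
        let windows = SCALAR_BITS / WINDOW;
        let tables: Vec<Vec<Vec<u64>>> = bases
            .iter()
            .map(|&g| {
                let mut shifted = g % P;
                (0..windows)
                    .map(|_| {
                        let mut row = Vec::with_capacity(1 << WINDOW);
                        let mut acc = 0;
                        for _ in 0..(1u64 << WINDOW) {
                            row.push(acc);
                            acc = add(acc, shifted);
                        }
                        // acc is now 2^WINDOW * shifted: the base for the next window.
                        shifted = acc;
                        row
                    })
                    .collect::<Vec<_>>()
            })
            .collect();
        Self { tables }
    }

    /// One MSM over the shared bases: lookups and additions only.
    fn msm(&self, scalars: &[u32]) -> u64 {
        assert_eq!(scalars.len(), self.tables.len());
        let mask = (1u32 << WINDOW) - 1;
        let mut acc = 0;
        for (table, &s) in self.tables.iter().zip(scalars) {
            for (w, row) in table.iter().enumerate() {
                let digit = (s >> (w as u32 * WINDOW)) & mask;
                acc = add(acc, row[digit as usize]);
            }
        }
        acc
    }
}

/// Reference implementation without any precomputation.
fn naive_msm(bases: &[u64], scalars: &[u32]) -> u64 {
    bases
        .iter()
        .zip(scalars)
        .fold(0, |acc, (&g, &s)| add(acc, mul(g, s as u64)))
}
```

Building `SharedBases` costs more than a single MSM, so the trade-off only pays off when the same bases are reused many times, which is the situation described for plonk. The tables also grow with the number of bases, so memory use would need checking at realistic sizes.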
Ah, thanks for pointing out that other issue. That is very interesting. I can't guess one way or the other which would end up being faster for the particular halo2 use case. Something to investigate in the future...
Should we take a look, @Brechtpd?
I believe some people (kilic, if I remember correctly) already started looking into this, but it is a bit complex and pretty low priority, so I guess the review never got finished. In any case, at this point I would definitely first look into the combined MSM approach and then potentially integrate that into this PR, as that will require a different set of changes which are potentially simpler.
@kilic, any input on that?
This PR is a bit old and uses pse/pairing as its backend, the one we used before we moved to pse/halo2curves. I'm now trying to rebase it, or possibly move it to pse/halo2curves with the recent endomorphism features, which cover both pasta and bn and are about to be added with privacy-scaling-explorations/halo2curves#24.
@Brechtpd Can you confirm the MSM tests are passing on your side? I cannot hit the expected results. I'm trying to debug it, but maybe you can spot the issue much faster.
I'm having some trouble getting things building again because of old packages for some reason. I still had an old checkout and for some reason that does have
I see. No worries. I think I'm getting closer.
Closing in favour of privacy-scaling-explorations/halo2curves#29
…imized mod inverse
Depends on privacy-scaling-explorations/pairing#7, which should be merged first so this doesn't need to depend on any external repos.
Credit for the core ideas behind these optimizations to Zac Williamson/Barretenberg. The implementation in this PR is based on barretenberg (and I think most other implementations doing similar things are as well) which contains comments that largely explain things pretty well, other references mainly used to complement those comments:
More info inside arithmetic_msm.rs, but the main speedups are from:
On my computer it's around twice as fast as best_multiexp, but I haven't done a lot of testing yet on other computers with more cores.
The new implementation is used inside the prover, although for the best performance a cache object needs to be shared:
Because the bases were already stored in Params, I just made that object mutable everywhere it was needed, but I'm not sure if that's the best idea. A separate object may make more sense.
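As a rough illustration of the "separate object" alternative, the sketch below keeps the parameters immutable and passes a caller-owned cache as `&mut` alongside them. All of it is hypothetical: `Params` here is a toy stand-in rather than halo2's type, `MsmCache` and its contents (each base multiplied by 2^16, so a 32-bit scalar can be split into two halves) are assumptions made for the example, and the group is integers modulo a prime rather than curve points. What the cache in this PR actually stores is not shown here.

```rust
// Toy additive group: integers modulo a prime stand in for curve points.
const P: u64 = 1_000_000_007;

fn add(a: u64, b: u64) -> u64 {
    (a + b) % P
}

/// Scalar multiplication by double-and-add.
fn mul(mut base: u64, mut k: u64) -> u64 {
    let mut acc = 0;
    while k > 0 {
        if k & 1 == 1 {
            acc = add(acc, base);
        }
        base = add(base, base);
        k >>= 1;
    }
    acc
}

/// Hypothetical stand-in for the parameters object: never mutated.
struct Params {
    bases: Vec<u64>,
}

/// Hypothetical cache owned by the caller instead of living inside Params.
/// Assumes one cache is only ever used with one set of bases.
#[derive(Default)]
struct MsmCache {
    high_bases: Option<Vec<u64>>, // 2^16 * base, built lazily
    builds: usize,                // how many times the table was built
}

impl MsmCache {
    fn high_bases(&mut self, params: &Params) -> &[u64] {
        if self.high_bases.is_none() {
            self.builds += 1;
            let table = params.bases.iter().map(|&g| mul(g, 1 << 16)).collect();
            self.high_bases = Some(table);
        }
        self.high_bases.as_deref().unwrap()
    }
}

/// MSM that borrows Params immutably and the cache mutably.
fn msm(params: &Params, cache: &mut MsmCache, scalars: &[u32]) -> u64 {
    let high = cache.high_bases(params);
    let mut acc = 0;
    for ((&g, &h), &s) in params.bases.iter().zip(high).zip(scalars) {
        acc = add(acc, mul(g, (s & 0xffff) as u64)); // low 16 bits
        acc = add(acc, mul(h, (s >> 16) as u64)); // high 16 bits
    }
    acc
}
```

With this shape, code that only reads the parameters keeps taking `&Params`, and only the MSM call sites need mutable access to the cache. The cost is an extra argument to thread through the prover.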