Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

Closed
mundaym opened this issue May 2, 2018 · 8 comments
Closed

x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

mundaym opened this issue May 2, 2018 · 8 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Proposal-Accepted Proposal-Crypto Proposal related to crypto packages or other security issues
Milestone

Comments

@mundaym
Copy link
Member

mundaym commented May 2, 2018

Currently the interface to poly1305, func Sum(out *[TagSize]byte, msg []byte, key *[32]byte), forces the user to copy everything they want to sum into a single slice (msg). It would be nice to have a way to append multiple slices and then generate the sum.

The main motivation would be to avoid an allocation in the x/crypto/chacha20poly1305 AEAD. We could avoid adding new API surface by implementing this in something like internal/poly1305 but I thought I'd create this proposal to see if we could just add it to the poly1305 package. I'm imagining the API would be used something like (not tested):

zeros := [16]byte{}
h := poly1305.New(key)
h.Write(additionalData)
if len(additionalData) % 16 != 0 {
     h.Write(zeros[:16 - (len(additionalData) % 16)] // pad to 16
}
h.Write(plaintext)
if len(plaintext) % 16 != 0 {
     h.Write(zeros[:16 - (len(plaintext) % 16)] // pad to 16
}
lengths := [16]byte{}
binary.LittleEndian.PutUint64(lengths[0:8], uint64(len(additionalData))
binary.LittleEndian.PutUint64(lengths[8:16], uint64(len(plaintext))
h.Write(lengths[:])
h.Sum(out[:len(plaintext)])

For reference this is the current code:

https://github.com/golang/crypto/blob/613d6eafa307c6881a737a3c35c0e312e8d3a8c5/chacha20poly1305/chacha20poly1305_generic.go#L31-L39

Also, @aead's poly1305 library already implements [EDIT: a subset of] hash.Hash if you want to take a look: https://github.com/aead/poly1305.

@mundaym mundaym added Proposal Proposal-Crypto Proposal related to crypto packages or other security issues labels May 2, 2018
@gopherbot gopherbot added this to the Proposal milestone May 2, 2018
@FiloSottile
Copy link
Contributor

Poly1305 is a MAC and not Hash, so the naming would be misleading (although there is precedent in HMAC), it can't be reused with the same key, so Reset would be pointless, and can't be reused after calling Sum as it would be a key reuse.

I don't think hash.Hash is a good fit. Just a simple API with io.Writer and MAC([]byte) []byte would do.

@aead
Copy link
Contributor

aead commented May 2, 2018

I completely agree with Filippo here. That's exactly the reason why I did not implement hash.Hash for poly1305. IIRC this was already part of the chacha20poly1305 work - but was not really needed after Vald's combined assembler implementation.

@mundaym
Copy link
Member Author

mundaym commented May 2, 2018

Poly1305 is a MAC and not Hash, so the naming would be misleading (although there is precedent in HMAC), it can't be reused with the same key, so Reset would be pointless, and can't be reused after calling Sum as it would be a key reuse.

I don't think hash.Hash is a good fit. Just a simple API with io.Writer and MAC([]byte) []byte would do.

SGTM. I think I'd use the function name Sum() to match HMAC (that would allow people to make their own MAC interface and switch between the two), but I don't really mind.

IIRC this was already part of the chacha20poly1305 - but was not really needed after Vlad's combined assembler implementation.

Yeah, I'm hoping this and the chacha20 API changes make it possible to get good performance without the complexity of the combined assembly approach.

@FiloSottile
Copy link
Contributor

I think I'd use the function name Sum() to match HMAC

Sure. But panic on anything after it is called once.

Yeah, I'm hoping this and the chacha20 API changes make it possible to get good performance without the complexity of the combined assembly approach.

You have all my support to make that happen BTW. Reducing the need for and amount of assembly is an explicit goal/wishlist item of mine. Let me know if you need anything from my side in that sense.

@mundaym mundaym changed the title proposal: x/crypto/poly1305: implement hash.Hash interface proposal: x/crypto/poly1305: implement a subset of the hash.Hash interface May 3, 2018
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/111335 mentions this issue: poly1305: implement a subset of the hash.Hash interface

gopherbot pushed a commit to golang/crypto that referenced this issue Mar 8, 2019
This CL adds the poly1305.MAC type which implements a
subset of the hash.Hash interface. With MAC it is possible
to compute an authentication tag of data without copying
it into a single byte slice.

This commit modifies the reference/generic and the
AMD64 assembler but not the ARM/s390x implementation
to support an io.Writer interface.

Updates golang/go#25219

Change-Id: I7ee5a9eadd43387cf3cd887d734c625575eee47d
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/111335
Run-TryBot: Filippo Valsorda <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Filippo Valsorda <[email protected]>
@FiloSottile FiloSottile added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 11, 2019
@FiloSottile FiloSottile modified the milestones: Proposal, Unreleased Nov 11, 2019
@FiloSottile FiloSottile changed the title proposal: x/crypto/poly1305: implement a subset of the hash.Hash interface x/crypto/poly1305: implement a subset of the hash.Hash interface Nov 11, 2019
@FiloSottile
Copy link
Contributor

This is still missing the arm and s390x implementations.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/219057 mentions this issue: poly1305: modify s390x assembly to implement MAC interface

gopherbot pushed a commit to golang/crypto that referenced this issue Apr 29, 2020
The vector (vx) implementation has been updated to read in the
state and update it - as opposed to being a single shot function.
This has allowed the new MAC interface can be implemented.

For performance reasons s390x uses a larger buffer than the generic
implementation. There is a relatively high fixed cost to read the
state, calculate the key coefficients and serialize the state, so
it makes sense to buffer more blocks before calling it.

For now I've had to remove the faster VMSL implementation. It is
too complex for me to update in time for Go 1.15. At some point
I'd like to revisit it but for now it looks like using the MAC
interface is more of a win than using VMSL.

The benchmarks show considerable improvements when using the MAC
interface. The Sum benchmarks show slowdown due to a combination
of the removal of the VMSL implementation and also the added
overhead from splitting the summation function into multiple parts.

poly1305:

name              old speed      new speed      delta
64                1.33GB/s ± 0%  0.80GB/s ± 1%   -39.51%  (p=0.000 n=16+20)
1K                4.04GB/s ± 0%  2.97GB/s ± 0%   -26.46%  (p=0.000 n=19+19)
2M                5.32GB/s ± 1%  3.63GB/s ± 0%   -31.76%  (p=0.000 n=20+19)
64Unaligned       1.33GB/s ± 0%  0.80GB/s ± 0%   -39.80%  (p=0.000 n=19+18)
1KUnaligned       4.09GB/s ± 1%  2.94GB/s ± 0%   -28.23%  (p=0.000 n=19+18)
2MUnaligned       5.33GB/s ± 1%  3.52GB/s ± 0%   -34.04%  (p=0.000 n=20+19)
Write64           1.03GB/s ± 1%  1.49GB/s ± 1%   +44.34%  (p=0.000 n=20+20)
Write1K           1.21GB/s ± 0%  3.24GB/s ± 0%  +169.02%  (p=0.000 n=20+17)
Write2M           1.24GB/s ± 1%  3.63GB/s ± 0%  +192.36%  (p=0.000 n=20+19)
Write64Unaligned  1.04GB/s ± 1%  1.50GB/s ± 0%   +44.16%  (p=0.000 n=19+14)
Write1KUnaligned  1.21GB/s ± 0%  3.20GB/s ± 0%  +164.55%  (p=0.000 n=20+16)
Write2MUnaligned  1.24GB/s ± 1%  3.51GB/s ± 0%  +183.96%  (p=0.000 n=20+19)

chacha20poly1305 (this vs. using generic MAC interface - post CL 206977):

name         old speed      new speed      delta
Open-64       147MB/s ± 2%   156MB/s ± 1%   +6.15%  (p=0.000 n=20+19)
Seal-64       151MB/s ± 0%   164MB/s ± 1%   +8.86%  (p=0.000 n=19+16)
Open-64-X     104MB/s ± 2%   111MB/s ± 1%   +6.24%  (p=0.000 n=20+20)
Seal-64-X     109MB/s ± 2%   111MB/s ± 1%   +2.11%  (p=0.000 n=20+19)
Open-1350     555MB/s ± 0%   751MB/s ± 1%  +35.19%  (p=0.000 n=20+20)
Seal-1350     557MB/s ± 0%   759MB/s ± 0%  +36.23%  (p=0.000 n=20+20)
Open-1350-X   517MB/s ± 1%   683MB/s ± 1%  +31.97%  (p=0.000 n=20+20)
Seal-1350-X   511MB/s ± 0%   683MB/s ± 0%  +33.77%  (p=0.000 n=18+19)
Open-8192     672MB/s ± 0%  1013MB/s ± 0%  +50.65%  (p=0.000 n=19+19)
Seal-8192     674MB/s ± 0%  1018MB/s ± 0%  +50.98%  (p=0.000 n=18+20)
Open-8192-X   663MB/s ± 0%   979MB/s ± 0%  +47.57%  (p=0.000 n=20+20)
Seal-8192-X   658MB/s ± 0%   985MB/s ± 0%  +49.62%  (p=0.000 n=18+20)

name         old allocs/op  new allocs/op  delta
Open-64          0.00           0.00          ~     (all equal)
Seal-64          0.00           0.00          ~     (all equal)
Open-64-X        0.00           0.00          ~     (all equal)
Seal-64-X        0.00           0.00          ~     (all equal)
Open-1350        0.00           0.00          ~     (all equal)
Seal-1350        0.00           0.00          ~     (all equal)
Open-1350-X      0.00           0.00          ~     (all equal)
Seal-1350-X      0.00           0.00          ~     (all equal)
Open-8192        0.00           0.00          ~     (all equal)
Seal-8192        0.00           0.00          ~     (all equal)
Open-8192-X      0.00           0.00          ~     (all equal)
Seal-8192-X      0.00           0.00          ~     (all equal)

chacha20poly1305 (this vs. using asm Sum interface - pre CL 206977):

name         old speed      new speed      delta
Open-64       144MB/s ± 0%   156MB/s ± 1%    +8.16%  (p=0.000 n=20+19)
Seal-64       150MB/s ± 0%   164MB/s ± 1%    +9.35%  (p=0.000 n=20+16)
Open-64-X     104MB/s ± 1%   111MB/s ± 1%    +6.15%  (p=0.000 n=19+20)
Seal-64-X     109MB/s ± 1%   111MB/s ± 1%    +1.43%  (p=0.000 n=19+19)
Open-1350     702MB/s ± 1%   751MB/s ± 1%    +6.98%  (p=0.000 n=20+20)
Seal-1350     715MB/s ± 0%   759MB/s ± 0%    +6.09%  (p=0.000 n=19+20)
Open-1350-X   642MB/s ± 0%   683MB/s ± 1%    +6.37%  (p=0.000 n=19+20)
Seal-1350-X   639MB/s ± 0%   683MB/s ± 0%    +6.98%  (p=0.000 n=20+19)
Open-8192     994MB/s ± 0%  1013MB/s ± 0%    +1.85%  (p=0.000 n=20+19)
Seal-8192    1.00GB/s ± 0%  1.02GB/s ± 0%    +1.90%  (p=0.000 n=20+20)
Open-8192-X   965MB/s ± 0%   979MB/s ± 0%    +1.43%  (p=0.000 n=19+20)
Seal-8192-X   962MB/s ± 0%   985MB/s ± 0%    +2.39%  (p=0.000 n=20+20)

name         old allocs/op  new allocs/op  delta
Open-64          1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Seal-64          1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Open-64-X        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Seal-64-X        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Open-1350        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Seal-1350        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Open-1350-X      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Seal-1350-X      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Open-8192        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Seal-8192        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Open-8192-X      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
Seal-8192-X      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)

Updates golang/go#25219.

Change-Id: Ib491e3a47b6b3ec8bbbe1f41f7bf42ad82f5c249
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/219057
Run-TryBot: Michael Munday <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Filippo Valsorda <[email protected]>
@mundaym
Copy link
Member Author

mundaym commented Nov 25, 2020

I think CL 219057 was the final part of this so closing the issue.

@mundaym mundaym closed this as completed Nov 25, 2020
@golang golang locked and limited conversation to collaborators Nov 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Proposal-Accepted Proposal-Crypto Proposal related to crypto packages or other security issues
Projects
None yet
Development

No branches or pull requests

4 participants