x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

mundaym · 2018-05-02T12:05:23Z

Currently the interface to poly1305, func Sum(out *[TagSize]byte, msg []byte, key *[32]byte), forces the user to copy everything they want to sum into a single slice (msg). It would be nice to have a way to append multiple slices and then generate the sum.

The main motivation would be to avoid an allocation in the x/crypto/chacha20poly1305 AEAD. We could avoid adding new API surface by implementing this in something like internal/poly1305 but I thought I'd create this proposal to see if we could just add it to the poly1305 package. I'm imagining the API would be used something like (not tested):

zeros := [16]byte{}
h := poly1305.New(key)
h.Write(additionalData)
if len(additionalData) % 16 != 0 {
     h.Write(zeros[:16 - (len(additionalData) % 16)] // pad to 16
}
h.Write(plaintext)
if len(plaintext) % 16 != 0 {
     h.Write(zeros[:16 - (len(plaintext) % 16)] // pad to 16
}
lengths := [16]byte{}
binary.LittleEndian.PutUint64(lengths[0:8], uint64(len(additionalData))
binary.LittleEndian.PutUint64(lengths[8:16], uint64(len(plaintext))
h.Write(lengths[:])
h.Sum(out[:len(plaintext)])

For reference this is the current code:

https://github.com/golang/crypto/blob/613d6eafa307c6881a737a3c35c0e312e8d3a8c5/chacha20poly1305/chacha20poly1305_generic.go#L31-L39

Also, @aead's poly1305 library already implements [EDIT: a subset of] hash.Hash if you want to take a look: https://github.com/aead/poly1305.

The text was updated successfully, but these errors were encountered:

FiloSottile · 2018-05-02T17:24:09Z

Poly1305 is a MAC and not Hash, so the naming would be misleading (although there is precedent in HMAC), it can't be reused with the same key, so Reset would be pointless, and can't be reused after calling Sum as it would be a key reuse.

I don't think hash.Hash is a good fit. Just a simple API with io.Writer and MAC([]byte) []byte would do.

aead · 2018-05-02T20:12:32Z

I completely agree with Filippo here. That's exactly the reason why I did not implement hash.Hash for poly1305. IIRC this was already part of the chacha20poly1305 work - but was not really needed after Vald's combined assembler implementation.

mundaym · 2018-05-02T20:40:51Z

Poly1305 is a MAC and not Hash, so the naming would be misleading (although there is precedent in HMAC), it can't be reused with the same key, so Reset would be pointless, and can't be reused after calling Sum as it would be a key reuse.

I don't think hash.Hash is a good fit. Just a simple API with io.Writer and MAC([]byte) []byte would do.

SGTM. I think I'd use the function name Sum() to match HMAC (that would allow people to make their own MAC interface and switch between the two), but I don't really mind.

IIRC this was already part of the chacha20poly1305 - but was not really needed after Vlad's combined assembler implementation.

Yeah, I'm hoping this and the chacha20 API changes make it possible to get good performance without the complexity of the combined assembly approach.

FiloSottile · 2018-05-02T20:48:38Z

I think I'd use the function name Sum() to match HMAC

Sure. But panic on anything after it is called once.

Yeah, I'm hoping this and the chacha20 API changes make it possible to get good performance without the complexity of the combined assembly approach.

You have all my support to make that happen BTW. Reducing the need for and amount of assembly is an explicit goal/wishlist item of mine. Let me know if you need anything from my side in that sense.

gopherbot · 2018-05-03T22:34:01Z

Change https://golang.org/cl/111335 mentions this issue: poly1305: implement a subset of the hash.Hash interface

This CL adds the poly1305.MAC type which implements a subset of the hash.Hash interface. With MAC it is possible to compute an authentication tag of data without copying it into a single byte slice. This commit modifies the reference/generic and the AMD64 assembler but not the ARM/s390x implementation to support an io.Writer interface. Updates golang/go#25219 Change-Id: I7ee5a9eadd43387cf3cd887d734c625575eee47d Reviewed-on: https://go-review.googlesource.com/c/crypto/+/111335 Run-TryBot: Filippo Valsorda <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Filippo Valsorda <[email protected]>

FiloSottile · 2019-11-11T21:30:29Z

This is still missing the arm and s390x implementations.

gopherbot · 2020-02-11T13:25:12Z

Change https://golang.org/cl/219057 mentions this issue: poly1305: modify s390x assembly to implement MAC interface

The vector (vx) implementation has been updated to read in the state and update it - as opposed to being a single shot function. This has allowed the new MAC interface can be implemented. For performance reasons s390x uses a larger buffer than the generic implementation. There is a relatively high fixed cost to read the state, calculate the key coefficients and serialize the state, so it makes sense to buffer more blocks before calling it. For now I've had to remove the faster VMSL implementation. It is too complex for me to update in time for Go 1.15. At some point I'd like to revisit it but for now it looks like using the MAC interface is more of a win than using VMSL. The benchmarks show considerable improvements when using the MAC interface. The Sum benchmarks show slowdown due to a combination of the removal of the VMSL implementation and also the added overhead from splitting the summation function into multiple parts. poly1305: name old speed new speed delta 64 1.33GB/s ± 0% 0.80GB/s ± 1% -39.51% (p=0.000 n=16+20) 1K 4.04GB/s ± 0% 2.97GB/s ± 0% -26.46% (p=0.000 n=19+19) 2M 5.32GB/s ± 1% 3.63GB/s ± 0% -31.76% (p=0.000 n=20+19) 64Unaligned 1.33GB/s ± 0% 0.80GB/s ± 0% -39.80% (p=0.000 n=19+18) 1KUnaligned 4.09GB/s ± 1% 2.94GB/s ± 0% -28.23% (p=0.000 n=19+18) 2MUnaligned 5.33GB/s ± 1% 3.52GB/s ± 0% -34.04% (p=0.000 n=20+19) Write64 1.03GB/s ± 1% 1.49GB/s ± 1% +44.34% (p=0.000 n=20+20) Write1K 1.21GB/s ± 0% 3.24GB/s ± 0% +169.02% (p=0.000 n=20+17) Write2M 1.24GB/s ± 1% 3.63GB/s ± 0% +192.36% (p=0.000 n=20+19) Write64Unaligned 1.04GB/s ± 1% 1.50GB/s ± 0% +44.16% (p=0.000 n=19+14) Write1KUnaligned 1.21GB/s ± 0% 3.20GB/s ± 0% +164.55% (p=0.000 n=20+16) Write2MUnaligned 1.24GB/s ± 1% 3.51GB/s ± 0% +183.96% (p=0.000 n=20+19) chacha20poly1305 (this vs. using generic MAC interface - post CL 206977): name old speed new speed delta Open-64 147MB/s ± 2% 156MB/s ± 1% +6.15% (p=0.000 n=20+19) Seal-64 151MB/s ± 0% 164MB/s ± 1% +8.86% (p=0.000 n=19+16) Open-64-X 104MB/s ± 2% 111MB/s ± 1% +6.24% (p=0.000 n=20+20) Seal-64-X 109MB/s ± 2% 111MB/s ± 1% +2.11% (p=0.000 n=20+19) Open-1350 555MB/s ± 0% 751MB/s ± 1% +35.19% (p=0.000 n=20+20) Seal-1350 557MB/s ± 0% 759MB/s ± 0% +36.23% (p=0.000 n=20+20) Open-1350-X 517MB/s ± 1% 683MB/s ± 1% +31.97% (p=0.000 n=20+20) Seal-1350-X 511MB/s ± 0% 683MB/s ± 0% +33.77% (p=0.000 n=18+19) Open-8192 672MB/s ± 0% 1013MB/s ± 0% +50.65% (p=0.000 n=19+19) Seal-8192 674MB/s ± 0% 1018MB/s ± 0% +50.98% (p=0.000 n=18+20) Open-8192-X 663MB/s ± 0% 979MB/s ± 0% +47.57% (p=0.000 n=20+20) Seal-8192-X 658MB/s ± 0% 985MB/s ± 0% +49.62% (p=0.000 n=18+20) name old allocs/op new allocs/op delta Open-64 0.00 0.00 ~ (all equal) Seal-64 0.00 0.00 ~ (all equal) Open-64-X 0.00 0.00 ~ (all equal) Seal-64-X 0.00 0.00 ~ (all equal) Open-1350 0.00 0.00 ~ (all equal) Seal-1350 0.00 0.00 ~ (all equal) Open-1350-X 0.00 0.00 ~ (all equal) Seal-1350-X 0.00 0.00 ~ (all equal) Open-8192 0.00 0.00 ~ (all equal) Seal-8192 0.00 0.00 ~ (all equal) Open-8192-X 0.00 0.00 ~ (all equal) Seal-8192-X 0.00 0.00 ~ (all equal) chacha20poly1305 (this vs. using asm Sum interface - pre CL 206977): name old speed new speed delta Open-64 144MB/s ± 0% 156MB/s ± 1% +8.16% (p=0.000 n=20+19) Seal-64 150MB/s ± 0% 164MB/s ± 1% +9.35% (p=0.000 n=20+16) Open-64-X 104MB/s ± 1% 111MB/s ± 1% +6.15% (p=0.000 n=19+20) Seal-64-X 109MB/s ± 1% 111MB/s ± 1% +1.43% (p=0.000 n=19+19) Open-1350 702MB/s ± 1% 751MB/s ± 1% +6.98% (p=0.000 n=20+20) Seal-1350 715MB/s ± 0% 759MB/s ± 0% +6.09% (p=0.000 n=19+20) Open-1350-X 642MB/s ± 0% 683MB/s ± 1% +6.37% (p=0.000 n=19+20) Seal-1350-X 639MB/s ± 0% 683MB/s ± 0% +6.98% (p=0.000 n=20+19) Open-8192 994MB/s ± 0% 1013MB/s ± 0% +1.85% (p=0.000 n=20+19) Seal-8192 1.00GB/s ± 0% 1.02GB/s ± 0% +1.90% (p=0.000 n=20+20) Open-8192-X 965MB/s ± 0% 979MB/s ± 0% +1.43% (p=0.000 n=19+20) Seal-8192-X 962MB/s ± 0% 985MB/s ± 0% +2.39% (p=0.000 n=20+20) name old allocs/op new allocs/op delta Open-64 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Seal-64 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Open-64-X 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Seal-64-X 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Open-1350 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Seal-1350 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Open-1350-X 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Seal-1350-X 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Open-8192 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Seal-8192 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Open-8192-X 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Seal-8192-X 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Updates golang/go#25219. Change-Id: Ib491e3a47b6b3ec8bbbe1f41f7bf42ad82f5c249 Reviewed-on: https://go-review.googlesource.com/c/crypto/+/219057 Run-TryBot: Michael Munday <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Filippo Valsorda <[email protected]>

mundaym · 2020-11-25T16:36:57Z

I think CL 219057 was the final part of this so closing the issue.

mundaym added Proposal Proposal-Crypto Proposal related to crypto packages or other security issues labels May 2, 2018

gopherbot added this to the Proposal milestone May 2, 2018

FiloSottile added the Proposal-Accepted label May 2, 2018

mundaym changed the title ~~proposal: x/crypto/poly1305: implement hash.Hash interface~~ proposal: x/crypto/poly1305: implement a subset of the hash.Hash interface May 3, 2018

FiloSottile added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 11, 2019

FiloSottile modified the milestones: Proposal, Unreleased Nov 11, 2019

FiloSottile changed the title ~~proposal: x/crypto/poly1305: implement a subset of the hash.Hash interface~~ x/crypto/poly1305: implement a subset of the hash.Hash interface Nov 11, 2019

FiloSottile removed the Proposal label Nov 13, 2019

mundaym closed this as completed Nov 25, 2020

golang locked and limited conversation to collaborators Nov 25, 2021

gopherbot added the FrozenDueToAge label Nov 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

mundaym commented May 2, 2018 •

edited

Loading

FiloSottile commented May 2, 2018

aead commented May 2, 2018

mundaym commented May 2, 2018

FiloSottile commented May 2, 2018

gopherbot commented May 3, 2018

FiloSottile commented Nov 11, 2019

gopherbot commented Feb 11, 2020

mundaym commented Nov 25, 2020

x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

x/crypto/poly1305: implement a subset of the hash.Hash interface #25219

Comments

mundaym commented May 2, 2018 • edited Loading

FiloSottile commented May 2, 2018

aead commented May 2, 2018

mundaym commented May 2, 2018

FiloSottile commented May 2, 2018

gopherbot commented May 3, 2018

FiloSottile commented Nov 11, 2019

gopherbot commented Feb 11, 2020

mundaym commented Nov 25, 2020

mundaym commented May 2, 2018 •

edited

Loading