Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pad to multiples of a fixed size, rather than padding randomly #54

Closed
rfk opened this issue Mar 22, 2021 · 5 comments · Fixed by #58
Closed

Pad to multiples of a fixed size, rather than padding randomly #54

rfk opened this issue Mar 22, 2021 · 5 comments · Fixed by #58

Comments

@rfk
Copy link
Contributor

rfk commented Mar 22, 2021

During encryption, the code in this crate currently selects a random integer between zero and the allowed record size, and adds that many bytes of padding to the plaintext.

It's my understanding that adding random padding isn't all that helpful in practice, because if your adversary collects enough samples they can statistically filter out the noise. IMHO we should either do something better here, or do what the Desktop code does and not pad at all because we don't have enough information about the usage context to do it well.

RFC 8188 Section 4.8 has some commentary on padding, and notes that "Developing a padding strategy is difficult". It lists three common strategies, all of which seem tractable and none of which are "add a random amount of padding":

  • padding to a small set of fixed lengths
  • padding to multiples of a value
  • padding to powers of 2

(That section also says "distributing non-padding data across records is recommended to avoid leaking size information", which is something that this crate does do, and which seems worth keeping).

I propose that we change the current padding calculation to round up the plaintext size to multiples of a fixed size; for the sake of argument let's say multiples of 128 bytes. If that doesn't seem worth the effort, then I propose that we remove the padding entirely in the interests of simplicity.

I'll reach out to @martinthomson to see if he'd like to weigh in on the particulars here.

@martinthomson
Copy link
Member

martinthomson commented Mar 22, 2021

Random padding isn't completely useless, depending on context. Here, it's not that much good.

It very much depends on your application when you come to choose a padding scheme. For something like push messaging, you still need to understand what information is being carried and what you want to protect. For instance, if it is price alerts for secret offers on secret stuff, you might like to pad item codes and prices individually so that they are all fixed length, but maybe you also have a variable number of alerts in each message and you want to hide that too.

As a baseline 128 is probably reasonable if you consider a maximum size of 4k. I can see cases for 32 or 64 as well.

Setting a default of 128 and letting people adjust is probably the best that this library can do.

@rfk
Copy link
Contributor Author

rfk commented Mar 22, 2021

I also have this sneaking suspicion that the padding in aesgcm mode doesn't work correctly. The unpad method has correctly-seeming code to read the padding length as a 2-byte integer and then remove that many zero bytes, but the pad method seems to just prefix the plaintext with zero bytes without writing the length (which produces the correct result if you're not adding any padding, will decrypt to a corrupted plaintext otherwise).

@martinthomson
Copy link
Member

Might be time to excise the aesgcm code then. Someone helped me do that and it made the code nicer.

@rfk
Copy link
Contributor Author

rfk commented Mar 22, 2021

Yeah :-/

I'm taking a brief look at #55 and as part of that, going to try to at least disentangle aesgcm from aes128gcm as a start towards one day removing it.

rfk added a commit that referenced this issue Mar 24, 2021
This is one of the padding techniques suggested in the RFC, and while
we can't choose a default scheme that will work well for all applications,
this one at least seems easier to reason about.

Fixes #54.
rfk added a commit that referenced this issue Mar 24, 2021
This is one of the padding techniques suggested in the RFC, and while
we can't choose a default scheme that will work well for all applications,
this one at least seems easier to reason about.

Fixes #54.
@rfk
Copy link
Contributor Author

rfk commented Mar 24, 2021

I'm taking a brief look at #55 and as part of that, going to try to at least disentangle aesgcm from aes128gcm
as a start towards one day removing it.

My attempt to disentangle to two schemes is here: #59

@martinthomson if you have a chance to look at this (probably just at the newly-standalone aes128gcm.rs and aesgcm.rs files) I'd appreciate it, but I understand it's a lot and likely not the highest-priority work. I'm just in a cleaning-up kind of mood.

@rfk rfk closed this as completed in #58 Mar 24, 2021
rfk added a commit that referenced this issue Mar 24, 2021
This is one of the padding techniques suggested in the RFC, and while
we can't choose a default scheme that will work well for all applications,
this one at least seems easier to reason about.

Fixes #54.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants