Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the public API to remove footguns, and document it. #52

Merged
merged 1 commit into from
Mar 22, 2021

Conversation

rfk
Copy link
Contributor

@rfk rfk commented Mar 18, 2021

This is a significant refactor of the public API of the crate, simplifying
the API surface and removing some of the footgun potential noted by Martin
in his review at mozilla/application-services#1068.

In particular:

  • The public encrypt functions no longer take a salt parameter. The
    right thing to do is to generate a new random salt for each encryption
    so we just do that for you automatically.
  • Many internal implementation details are now pub(crate) rather than pub,
    to avoid potential confusion from consumers.
  • We refuse to encrypt or decrypt across multiple records, because our only
    consumer in practice is webpush, and webpush restricts consumers to using
    only a single record.

We still have the code lying around to encrypt/decrypt across record
boundaries, but we don't have high confidence that it works correctly
and intend to remove it in a future commit. So, may as well adjust the
interface to reflect that while we're in here making breaking changes.

To go along with the revised interface, this commit also significantly
expands to docs in order to help set consumer expectations and context.

(Ref #27 for an earlier version of this change).

@rfk
Copy link
Contributor Author

rfk commented Mar 18, 2021

/cc @tiesselune - could you please check that this doesn't interfere with your use of the crate, apart from the obvious breakage from removing the salt param?

@rfk rfk force-pushed the interface-cleanups branch from 10786c3 to 237b3e2 Compare March 18, 2021 06:43
@rfk rfk requested a review from jrconlin March 18, 2021 06:44

// We have some existing test data in b64, and some in hex,
// and it's easy to make a second `try_decrypt` helper function
// than to re-encode all the data.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests have just moved from lib.rs into legacy.rs to be next to their other aesgcm friends, I haven't changed them at all.

@@ -181,61 +200,6 @@ mod aes128gcm_tests {
assert_eq!(ciphertext, "0c6bfaadad67958803092d454676f397000010004104fe33f4ab0dea71914db55823f73b54948f41306d920732dbb9a59a53286482200e597a7b7bc260ba1c227998580992e93973002f3012a28ae8f06bbb78e5ec0ff297de5b429bba7153d3a4ae0caa091fd425f3b4b5414add8ab37a19c1bbb05cf5cb5b2a2e0562d558635641ec52812c6c8ff42e95ccb86be7cd");
}

#[test]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had quite a few tests for for dealing with record boundaries etc. Per the discussion in mozilla/application-services#1068, we do not need this for webpush, and I don't think it works right in the legacy aesgcm scheme anyway.

I intend to push a follow-up PR that removes the multi-record stuff entirely; it seemed like too much churn to do all at once in this PR.

@rfk rfk requested a review from a team March 18, 2021 06:49
@rfk rfk force-pushed the interface-cleanups branch 2 times, most recently from 0542f86 to a50068a Compare March 18, 2021 07:33
///
/// You should only use this is you're implementing a custom `Cryptographer` and want to check
/// that it is working as intended. This function will panic if the tests fail.
///
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disadvantage of hiding things from the public API, is that it makes it harder to test our custom NSS-based ece backend over in the application-services repo. It currently has its own copy of all the aes128gcm tests which use a bunch of the now-hidden implementation details. What if instead, we exposed a function here specifically for testing custom backends?

(My theory with making it generic over T: Cryptographer rather than e.g. taking a Box<dyn Cryptographer> is that it won't produce any code unless you actually try to use a concrete instantiation of it, and since we only use it in tests, it won't contribute to the size of our production builds. I have done exactly zero work to confirm whether that's actually what will happen).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref mozilla/application-services#3941 for the corresponding appservices change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC: you might be able to use a feature flag to make a "test" version that does expose this as public. It's a bit hacky, but we've done something similar in other repos.

Copy link
Collaborator

@jrconlin jrconlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

  • I definitely think this rates a version bump in Cargo.toml
  • might be a bit of debugging detritus in one of the tests.
  • Thanks for adding all the docs!

[RFC8188](https://tools.ietf.org/html/rfc8188)
* `aesgcm`: the draft scheme described in
[draft-ietf-webpush-encryption-04](https://tools.ietf.org/html/draft-ietf-webpush-encryption-04) and
[draft-ietf-httpbis-encryption-encoding-03](https://tools.ietf.org/html/draft-ietf-httpbis-encryption-encoding-03_)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really be in favor of dropping the oldest one of these. I'll file a separate ticket. Most of what we're seeing is either 'aes128gcm' or 'aesgcm'.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can definitely consider dropping the really old aesgcm128 format:

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the code actually supports that format, this could just be a documentation issue or misunderstanding on my side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrconlin I'm not very familiar with the various versions of these specs, could you please help me understand what you mean here? The two linked "draft" documents only seem to mention "aesgcm", which IIUC is the "legacy" line in your graph and the corresponding "legacy" API in this crate. Neither contains the string "aesgcm128" that I can see.

I suspect we don't have anything to remove in the code, but I'd be happy to update the doc references here to make things more clear, if you can suggest any revisions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh. The fun of a long lived and somewhat popular draft is that you get stuff like this. I don't expect that this PR should drop support for the very old protocol, but I do think we should add it as a future issue.

In essence, three "versions" of the encryption format started being used over time. The very first version "aesgcm128" used a combination of header and body content in order to encrypt/decrypt the message. It was reasonably early so not a lot of libraries adopted it, but firefox said that they'd support it as part of the WebPush pre-cursor. The next version cleaned things up, but still used header/body format (albeit, with different headers and formats), and was labeled "aesgcm". This version was WIDELY adopted by third party encoder libraries and (as you can see) is still very much in use. The final RFC proposed "aes128gcm", which made encryption/decryption self contained within the body of the message. While folks have been encouraging subscription providers to update their libraries to use the RFC format (since it's both more efficient and far more secure), a lot of subscription services have not.

There are not a lot of changes between "aesgcm128" and "aesgcm", but they are present. Supporting the very old format just adds unneeded complexity and potential surface area for the library. For most client side / decryption models, we can definitely control this from the Push Server by simply blocking messages that use "aesgcm128" with a 415 or something. For encryption, however, it would be best to remove that particular format as a potential option.

All things must come to an end, and for "aesgcm128", it's way overdue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dug in a bit more here, and I don't believe this crate has support for aesgcm128. I've been using the desktop implementation for reference to help get my head around it.

In order to support aesgcm128, we would need to be:

  • Using a KDF info string of "Content-Encoding: aesgcm128"; I can't find such a string anywhere in the repo.
  • Emitting an Encryption-Key header as part of the encryption process; I can only find support for emitting the Crypto-Key and Encryption headers of the newer-but-still-legacy aesgcm scheme.

I've made the following updates to help clarify what I believe the situation is, but if I've misunderstood, please correct me:

  • I've delete a reference to the Encryption-Key header in one of the comments, which was the only place I could find such a string, and which is incorrect since the comment is specifically in the context of aesgcm which uses the Encryption field.
  • I've added an explicit note in the readme saying this scheme is not and will never be supported, along with a link to the older draft RFC for reference (draft-thomson-http-encryption-02, which I dug out of the comments in the desktop implementation).

///
/// You should only use this is you're implementing a custom `Cryptographer` and want to check
/// that it is working as intended. This function will panic if the tests fail.
///
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC: you might be able to use a feature flag to make a "test" version that does expose this as public. It's a bit hacky, but we've done something similar in other repos.

src/legacy.rs Outdated
// since it's a sampled random, endian doesn't really matter.
let pad = ((usize::from(padr[0]) + (usize::from(padr[1]) << 8)) % 4095) + 1;
let params = WebPushParams::new(4096, pad, Vec::from(salt));
let mut pad_length = ((usize::from(padr[0]) + (usize::from(padr[1]) << 8)) % 4095) + 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We do this here and in lib.rs. Would it be worth making this into a pub(crate) function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intend to propose a change to how this padding works in a follow-up PR, but yeah, it's probably worth pulling out a quick helper in the meantime.

src/lib.rs Outdated Show resolved Hide resolved
@rfk rfk requested review from lougeniaC64 and removed request for a team March 18, 2021 23:58
@rfk rfk force-pushed the interface-cleanups branch from 5ba0ddd to 347bdbe Compare March 19, 2021 07:06
@rfk rfk requested a review from jrconlin March 19, 2021 07:06
@rfk
Copy link
Contributor Author

rfk commented Mar 19, 2021

Thanks @jrconlin, updated with nits addressed.

@rfk rfk force-pushed the interface-cleanups branch from 347bdbe to c647c29 Compare March 19, 2021 07:21
@rfk
Copy link
Contributor Author

rfk commented Mar 19, 2021

We refuse to encrypt or decrypt across multiple records, because our only
consumer in practice is webpush, and webpush restricts consumers to using
only a single record.

Update: I'm concerned that we may be accidentally using multiple records when we use this encryption scheme for send-tab. Both the rust component and desktop appear to have support for arbitrarily-long payloads and splitting them across multiple records. I need to dig a big deeper into the implications there before we can land this :-/

@jrconlin
Copy link
Collaborator

AFAIK, there are several issues associated with multi-part records that lead to the decision to not support this aspect of the protocol. (no guarantee of delivery, potential for out of order delivery, potential drop via bridge systems, etc.)

For Send Tab, I believe that we created PushBox specifically to address large data associated with a push message, but I have no idea if there may be even more metadata that may exceed the size limits. I do know that the PushServer rejects messages that are too large.

Of course, if the UA wants to deal with the headaches associated with handling multi-part records, they are absolutely free to do so.

@tiesselune
Copy link
Collaborator

/cc @tiesselune - could you please check that this doesn't interfere with your use of the crate, apart from the obvious breakage from removing the salt param?

@rfk I haven't taken the time to actually test this on the newer rust-web-push crate (the pull request is not even merged so really if there is any breakage it's not a big deal, we'll be able to upgrade it gracefully) but well the only two interfaces we're using now are the encrypt_aesgcm and encrypt_aes128gcm which just have the salt removed. With my latest pull request, the salt is directly merged into the headers and hidden away so I see no objection to remove it from the interface entirely. The rest of the udpate shouldn't interfere with our use case.

Copy link

@lougeniaC64 lougeniaC64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit to being initially confused by aes128gcm vs aesgcm128 but otherwise the PR explanation and comments were helpful. 🚀

@jrconlin
Copy link
Collaborator

COMPLETELY understandable.

A lot of folks are.

One of the reasons I really want to get rid of the very old aesgcm128.

Possibly by dragging it behind a woodshed.

@rfk
Copy link
Contributor Author

rfk commented Mar 21, 2021

For Send Tab, I believe that we created PushBox specifically to address large data associated with a push message,

Right, yes. To add more context: Send Tab also encrypts the larger-payload data that it stores in PushBox, and it encrypts it using the same ECE scheme as we use for push (because, well, it's already there and works).

@jrconlin
Copy link
Collaborator

I am absolutely 100% behind the idea of killing off the old aesgcm128 format. I'm happy that there may not be anything to do here, and the push server can just refuse to accept them. The desktop code can degrade out however it wants to.

@rfk rfk force-pushed the interface-cleanups branch 2 times, most recently from a9d065b to e167d74 Compare March 22, 2021 01:32
@rfk
Copy link
Contributor Author

rfk commented Mar 22, 2021

After a bit of poking at send-tab, I've decided to continue to support multiple records when encrypting with aes128gcm, because I can't rule out that this would break and tab-sending use-cases. We should, however, proceed with dropping multi-record support for aesgcm. I've updated the implementation and the docs here accordingly.

@rfk rfk force-pushed the interface-cleanups branch from e167d74 to 60cd817 Compare March 22, 2021 01:34
This is a significant refactor of the public API of the crate, simplifying
the API surface and removing some of the footgun potential noted by Martin
in his review at mozilla/application-services#1068.

In particular:

* The public `encrypt` functions no longer take a `salt` parameter. The
  right thing to do is to generate a new random `salt` for each encryption
  so we just do that for you automatically.
* Many internal implementation details are now `pub(crate)` rather than `pub`,
  to avoid potential confusion from consumers.
* We refuse to encrypt or decrypt across multiple records in the legacy
  `aesgcm` scheme, because the only consumer of that schema is webpush,
  and webpush restricts consumers to using only a single record.

We still have the code lying around to encrypt/decrypt across record
boundaries, but we don't have high confidence that it works correctly
for `aesgcm` and intend to refactor that away in a future commit.
So, may as well adjust the interface to reflect that while we're in here
making breaking changes.

To go along with the revised interface, this commit also significantly
expands to docs in order to help set consumer expectations and context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants