Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AES-CBC with ISO10126 Padding #381

Merged
merged 6 commits into from
May 13, 2024
Merged

Conversation

amirhosv
Copy link
Contributor

@amirhosv amirhosv commented May 9, 2024

Description of changes:

AES-CBC with ISO10126 Padding

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

+ AWS-LC does not support ISO10126.
+ The padding and unpadding are handled in ACCP.
+ The padding is explained here: https://www.w3.org/TR/xmlenc-core1/#sec-Padding
@amirhosv amirhosv marked this pull request as ready for review May 9, 2024 14:26
@amirhosv amirhosv requested a review from a team as a code owner May 9, 2024 14:26
src/com/amazon/corretto/crypto/provider/AesCbcSpi.java Outdated Show resolved Hide resolved
src/com/amazon/corretto/crypto/provider/AesCbcSpi.java Outdated Show resolved Hide resolved
csrc/aes_cbc.cpp Outdated Show resolved Hide resolved
csrc/aes_cbc.cpp Outdated Show resolved Hide resolved
csrc/aes_cbc.cpp Outdated Show resolved Hide resolved
csrc/aes_cbc.cpp Outdated Show resolved Hide resolved
csrc/aes_cbc.cpp Outdated Show resolved Hide resolved
csrc/aes_cbc.cpp Outdated
}
// Record the new size of last_block.
// Since c <= 16, the following cast is fine.
last_block.get()[AES_CBC_BLOCK_SIZE_IN_BYTES] = (uint8_t)c;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not following how the last byte in the last_block is guaranteed to not be data. It seems like you're always holding onto the modulo of the last block (safe) or the last complete block in the case that we're aligned (possible data loss).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments in this section are expanded in the next commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... Now I get it, are we using this extra byte at the end of this last-block array as a secret variable that stores the current length stored in the intermediate last-block scratch buffer? That state really needs to be tracked elsewhere, that's really unclear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope the added comments help now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments help, but I'd still prefer this tracked elsewhere like as a class member and/or separate variable if possible. This is really unconventional and liable to turn into nasty bugs in the future. I won't push further than this and I'll defer to the rest of the team for whether we want to push for a refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be passed as a separate variable but it needs to be wrapped in something like an array. This value is read and written to. Having it as an integer or a primitive type that gets passed from Java to C++ is not going to work.

@geedo0
Copy link
Contributor

geedo0 commented May 9, 2024

I've done a partial pass, here's the feedback so far.

csrc/aes_cbc.cpp Outdated
return padding == com_amazon_corretto_crypto_provider_AesCbcSpi_ISO10126_PADDING;
}

static bool is_enc(jint op_mode) { return op_mode == com_amazon_corretto_crypto_provider_AesCbcSpi_ENC_MODE; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It's easier to read if function names use complete words. Maybe is_encrypt_mode() or is_encryption_mode()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'll fix it in the next commit.

Comment on lines +84 to +87
// Since the padding is random, the last block of the cipher texts are not necessarily the same.
for (int i = 0; i < accpCipherText.length - 16; i++) {
assertEquals(sunCipherText[i], accpCipherText[i]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the final byte be the same though? Can you add a check for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The padding is randomized and that's why the final blocks are not necessarily the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WShat Alex is saying is that since the last byte is the padding length, it should be the same when given the same plaintext.

Copy link
Contributor Author

@amirhosv amirhosv May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of the padding, which is stored as the last byte of input, is the same in padded plain texts, but then it gets encrypted as part of the last block and it's not predictable what AES encryption produces for the last byte in the Cipher text.

This also makes causing bad padding exception hard. One could modify the cipher text or use a different key, but somehow the last byte of the unpadded plain text ends up being a number less than or equal to 16.

One thing that I can add is to use AES-CBC with NoPadding and decrypt both sunCipherText and accpCipherText that results in unpadded plain texts: in this case the last bytes will be the same.

csrc/aes_cbc.cpp Outdated
@@ -110,7 +126,7 @@ class AesCbcCipher {
}

// This method always returns 1 and succeeds.
if (EVP_CIPHER_CTX_set_padding(ctx_, padding) != 1) {
if (EVP_CIPHER_CTX_set_padding(ctx_, padding_to_be_used(padding)) != 1) {
Copy link
Contributor

@alexw91 alexw91 May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this be simpler? Otherwise the reader needs to read into 2 nested functions to figure out what is going on.

if (padding == com_amazon_corretto_crypto_provider_AesCbcSpi_ISO10126_PADDING) {
    /* AWS-LC doesn't support ISO10126 padding so ensure AWS-LC does not perform any padding so we can do it ourselves. */
    padding = com_amazon_corretto_crypto_provider_AesCbcSpi_NO_PADDING; 
}

if (EVP_CIPHER_CTX_set_padding(ctx_, padding) != 1) {
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

csrc/aes_cbc.cpp Outdated
// The last_block is used to store this information.
JBinaryBlob last_block(jenv_, nullptr, j_last_block);
// The last byte of last_block stores the number of bytes in it from the previous call.
int last_block_len = last_block.get()[AES_CBC_BLOCK_SIZE_IN_BYTES];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this one past the end of the array? should it be AES_CBC_BLOCK_SIZE_IN_BYTES-1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added more comments that describes why this is not an issue.

csrc/aes_cbc.cpp Outdated
}
// Record the new size of last_block.
// Since c <= 16, the following cast is fine.
last_block.get()[AES_CBC_BLOCK_SIZE_IN_BYTES] = (uint8_t)c;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... Now I get it, are we using this extra byte at the end of this last-block array as a secret variable that stores the current length stored in the intermediate last-block scratch buffer? That state really needs to be tracked elsewhere, that's really unclear.

Comment on lines +84 to +87
// Since the padding is random, the last block of the cipher texts are not necessarily the same.
for (int i = 0; i < accpCipherText.length - 16; i++) {
assertEquals(sunCipherText[i], accpCipherText[i]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WShat Alex is saying is that since the last byte is the padding length, it should be the same when given the same plaintext.

+ Add more comments to the code.
+ Add more tests.
+ Update README.md and CHANGELOG.md
+ Sometimes MacOS JDK11 runs out of memory for CBC ByteBuffer tests
@@ -497,7 +485,7 @@ public void testMultiStepByteBuffer(
private static Stream<Arguments> byteBufferTestParams() {
final List<Arguments> result = new ArrayList<>();
for (final int keySize : new int[] {128}) {
for (int i = 0; i != 1024; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this reduced to 512?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Git commit I mentioned this. Some of the MacOS CI tasks sometimes run out of memory with this test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that this PR increases memory usage of AES-CBC? I'm wondering if this is a sign of a potential memory leak somewhere or not? I don't see any potential memory leaks myself, but I think it's worth it for everyone to take an extra 5-10 minutes to re-review with this in mind, and for us to be extra careful once this change is released and keep an eye on memory usage metrics.

Copy link
Contributor Author

@amirhosv amirhosv May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from the CBC tests and it's not part of this change. The ISO10126 implementation doesn't allocate anything on the heap and it relies on CBC implementation. If there is any leaks, it must be coming from the CBC implementation. I can see one potential leak from in the CBC implementation in the following two places: InitUpdate, and InitUpdateFinal
new NativeEvpCipherCtx registers the allocated EVP_CIPHER_CTX with the Janitor. In case the registration fails, we could have a leak. However, this style of allocate-then-register has been the norm in other parts of our code: https://github.com/corretto/amazon-corretto-crypto-provider/blob/main/src/com/amazon/corretto/crypto/provider/AesGcmSpi.java#L1004

csrc/aes_cbc.cpp Outdated
Comment on lines 63 to 67
// For ISO10126, we use CBC with no padding.
static int padding_to_be_used(int padding)
{
return is_iso10126_padding(padding) ? com_amazon_corretto_crypto_provider_AesCbcSpi_NO_PADDING : padding;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function isn't used any longer, it can be deleted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot that. I'll remove it in the next commit.

if (!is_iso10126_padding || is_enc) {
return update(input, input_len, output, unprocessed_input);
}
// We are decrypting and padding is ISO10126. In this case, we must not decrypt the last block until do_final.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a concrete request, but I'll say that I can't keep 11 different length variables in my head simultaneously. Is there any way to simplify this? Or provide a clearer explanation of the relationship between all these variables up front? Maybe an Ascii art diagram? Or a diagram image attached as a PR comment? I find myself wishing for something similar to s2n's stuffer diagram in s2n's documentation.

  1. input_len
  2. last_block_len
  3. all
  4. rem
  5. save_bytes
  6. process_bytes
  7. process_bytes_last_block
  8. process_bytes_input
  9. save_bytes_input
  10. save_bytes_last_block
  11. from_index

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified the logic and now the first five variables are used.

unsigned int from_index = last_block_len - save_bytes_last_block;
for (unsigned int i = 0; i < save_bytes_last_block; i++) {
last_block.get()[i] = last_block.get()[from_index + i];
if (all <= AES_CBC_BLOCK_SIZE_IN_BYTES) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This latest iteration is a lot easier to follow. I'm still not a fan of having to hold onto a full-block on each update. This is not a problem for the logic in any other block algorithm. This seems to be caused by not having the right flags/context in between invocations, but I understand that this is a pretty annoying refactor to address.

One thing you can do to simplify is to avoid special casing the all=16b case by just letting it flow through with a 0-byte update. That way we have one fewer edge case to track.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After an offline convo, Amir walked me through why we need to manage this last full block on the decryption path.

Copy link
Contributor

@geedo0 geedo0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but would really like to see us refactor how the scratch-buffer size is tracked. The way it's handled now is really unconventional and puts us at high-risk of odd bugs.

@amirhosv amirhosv merged commit 09151a3 into corretto:main May 13, 2024
10 checks passed
@amirhosv amirhosv deleted the aes-cbc-iso10126 branch May 13, 2024 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants