Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rom] Change the endianness of FORS indices for SPHINCS+. #22953

Merged
merged 6 commits into from
Jun 14, 2024

Conversation

jadephilipoom
Copy link
Contributor

@jadephilipoom jadephilipoom commented May 3, 2024

This backwards-incompatible change is part of the changes between the round 3 SPHINCS+ submission to the NIST PQC competition and the SLH-DSA draft standard. Part of #21944

Corresponds to this commit from the SPHINCS+ reference implementation (consistent-basew branch):
sphincs/sphincsplus@129b72c

Test vectors were re-generated from the consistent-basew branch of the reference implementation.

Note that I think this might break the cryptotest SPHINCS+ setup, because they are right now hard-coded to fetch a specific self-hosted file for the round3 submission:

"https://storage.googleapis.com/ot-crypto-test-vectors/sphincsplus/sphincsplus_shake256_128s_round3.zip ",

@RyanTorok @milesdai do either of you know if there is another file there, or how we can update it? I don't think NIST is actually hosting a corresponding KAT file for the draft standard yet, so we could also potentially wait until they release final test vectors (but in that case we should probably add a comment that this setup will fail tests on master, and explain why). The round3 tests are still useful for ES sival.

Edit: now addressed, thanks Ryan!

@RyanTorok
Copy link
Contributor

RyanTorok commented May 3, 2024

If we find a new archive with KAT tests for the new version, it would be easy to upload it to the cloud storage and use (we would likely upload it alongside the old one, to preserve compatibility for other branches).

The archive file in the GCP storage for SPHINCS+ is a bit special compared to the rest of the cryptotest framework. Due to size considerations, the SPHINCS+ archive is manually created by us (which is why the GCP storage is primary, unlike the other algorithms, which use the NIST site as primary and GCP as backup). "Manually created" here just means we un-compressed the archive, removed the KAT tests for all the variations of SPHINCS+ we don't support, and re-compressed it.

That original zip came from the NIST round 3 submission available directly on the SPHINCS+ site, and did not involve the SPHINCS+ repo at all. I took a look at the new branch matching the new draft standard, and it is possible to generate the test vectors for this version, but this won't be possible hermetically.

We may be able to create a new archive file to store in the GCP using the temporary files generated by vectors.py, but the process to create the archive would be more complicated than simply removing a bunch of irrelevant files, and I'm not sure what implications that has for reproducibility.

@RyanTorok
Copy link
Contributor

I generated a new archive file with KAT tests using the consistent_basew branch, and uploaded it to the GCP bucket. I updated the manifest to include instructions to reproduce the zip archive.

@RyanTorok
Copy link
Contributor

I updated #21681 to switch the source of the test vectors to the new GCP-backed archive for the consistent-basew version of SPHINCS+.

@moidx moidx requested review from cfrantz and removed request for cfrantz and a team May 6, 2024 18:12
Copy link
Contributor

@moidx moidx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jadephilipoom. I assume this will break the current rom e2e sigverify_spx tests.

Is there an equivalent update for opentitantool?

@cfrantz for visibility as this will also require updates on the HSM integration side.

@jadephilipoom
Copy link
Contributor Author

Thanks @jadephilipoom. I assume this will break the current rom e2e sigverify_spx tests.

Is there an equivalent update for opentitantool?

Good point. I looked at the pqcrypto rust crate we're using, and it looks like it pulls from PQClean, which pulls from the master branch of the reference implementation 😞 so it's hard to just change something "easy" there and update it. There's an open issue on PQClean to update to the to-be-standardized version, but the timeline is unclear. We could try switching to a different crate, but the ones I can see that advertise "slh-dsa" don't seem very well-established. Not sure what we should do about this. We could depend on one of the newer crates, since it's just for testing anyway, or we could always hold off on merging this PR until the last minute or disable the tests that would fail until pqcrypto updates, but it would be good to run end-to-end tests before ROM freeze.

@moidx
Copy link
Contributor

moidx commented May 6, 2024

Thanks for looking into the host side tooling @jadephilipoom. Should we be considering to do a similar thing to what we did on the device side an maintain our own host-side implementation? It seems that there are still additional changes in the pipeline which are going make it difficult for us to run integration testing.

Should we raise this issue to the Software WG?

@jadephilipoom
Copy link
Contributor Author

Thanks for looking into the host side tooling @jadephilipoom. Should we be considering to do a similar thing to what we did on the device side an maintain our own host-side implementation? It seems that there are still additional changes in the pipeline which are going make it difficult for us to run integration testing.

Should we raise this issue to the Software WG?

Yes, seems like a good topic for Software WG (and good timing too!)

@mundaym
Copy link
Contributor

mundaym commented May 7, 2024

This issue was discussed in the software working group today. The consensus was that we should write our own Rust wrapper for the reference implementation.

@jadephilipoom
Copy link
Contributor Author

For visibility: I've cherry-picked the last commit from #23326; this puts the host-side tooling in sync with the device for the endianness swap so that tests work. If it passes CI, I'll just merge this and close #23326,. Since both sets of changes have been reviewed and approved, I don't think it requires any additional review; just wanted to put an explanation for the new commit post-approve.

@jadephilipoom
Copy link
Contributor Author

There are a couple remaining test failures because (I think) some pre-signed binaries need to be manually re-generated. Can anyone tell me what the procedure is for that? I can see that commit c36ad25 re-generated them due to opentitantool updates, which looks like the same thing I need, but I can't figure out which commands to run to do that again. @timothytrippel , do you remember?

@jadephilipoom
Copy link
Contributor Author

jadephilipoom commented Jun 4, 2024

Update: after talking to @cfrantz and @moidx , I understand that none of the pre-signed binaries should need to be regenerated, and the tests are likely failing because the signatures opentitantool are producing are incorrect. I added a KAT on the opentitantool impl and it failed, so that is definitely the case and I'm working on figuring out what the bug is. (edit: it's not actually failing, I just forgot that SPHINCS+ signing is nondeterministic 🤦‍♀️ anyway, the verification KAT passes, so I'm not sure what's happening.)

@jadephilipoom
Copy link
Contributor Author

Re-generating the keys under sw/device/silicon_creator/rom/keys/fake/spx fixed some of the failing tests (last commit). I find this really odd, because the key-generation procedure really should not have been affected by the endianness switch. My best guesses are:

  • something subtle changed in the interpretation of the PEM files (maybe some endianness reversal somewhere that made the new host-side code incompatible with the old)
  • somewhere a signature was saved, and for some reason got re-generated when I updated the keys
  • I'm wrong about key generation not being affected (but I just checked again now and I really don't see it)

Let's see what CI says, and I'll keep investigating in the meantime; I don't really want to leave this as a question mark. Ideally I'll figure out where the switch is and remove the last commit that re-generates the fake keys, but I wanted to pass it through CI to gather some more information.

CC @moidx @cfrantz

@jadephilipoom
Copy link
Contributor Author

Update: CI result has two failing tests, but both appear to be failing on all CI jobs at the moment and I think they're unrelated.

  • //sw/device/silicon_creator/rom/e2e/boot_policy_flash_ecc_error:a_valid_b_corrupt_manifest_security_version_fpga_cw310_rom_with_fake_keys
  • //sw/device/tests:uart_smoketest_fpga_cw340_rom_with_fake_keys

I don't have a CW340, but I can replicate the first one on master and both are also failing on #23598, which only adds test data that's not used by anything. So I think it's safe to say that re-generating the keys fixed all the failures (although I'm not satisfied until I know why).

other types of keys:
- To generate ECDSA keys, replace every occurrence of `spx` with `ecdsa`.
- To generate prod or dev keys, replace every occurrenct of `test_key` with `prod_key` or `dev_key`.
- To generate real keys, use the `keys/real` directory instead of `keys/fake`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: For real keys we recommend using an offline HSM, so this procedure is only applicable to fake keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a way of generating those keys ahead of time and checking in simulation that the chip definitely can boot using them, so that we don't end up with a subtle encoding mismatch between the HSM and the ROM?

@jadephilipoom
Copy link
Contributor Author

Okay, I've figured out what's going on. Long story short: we should regenerate the test keys, so I'll leave the commit in. The old pqcrypto setup was using an outdated implementation of keygen/sign. Verification works as long as key generation and signing agree, but not if they disagree, which is what happened when I introduced updated signing code but left the old keys.

Here's a more detailed debugging log:

  • I was able to observe using the keypair_from_seed function that our bindgen for the SPHINCS+ reference implementation produced different public key data (PK.root) given the seed from the test keys.
  • I checked the same bindgen code against a seed from the modern SPHINCS+ KATs and found that they agreed.
  • I looked into the pqcrypto version we're using and found it's 0.6.4, which was released April 2022.
  • It wasn't straightforward to trace through pqcrypto and PQClean and figure out exactly which SPHINCS+ commit that was, so I just went back in history in my local clone of sphincsplus and kept re-generating KATs with the same seed values until I saw the value change.
  • The change happens in this commit from sphincsplus from June 2022, when the signing and key generation code is modified to have slightly different address types:
    sphincs/sphincsplus@8405245
  • Verification code is not changed in that commit, which matches our observations (the modern verification routine worked with the outdated keys/signatures).

@jadephilipoom
Copy link
Contributor Author

Fixed to address @moidx's comment on real keys and also to re-generate "unauthorized" keys. I'll consider this OK to merge once CI passes, unless anyone objects.

This backwards-incompatible change is part of the changes between the
round 3 SPHINCS+ submission to the NIST PQC competition and the SLH-DSA
draft standard.

Corresponds to this commit from the SPHINCS+ reference implementation
(consistent-basew branch):
sphincs/sphincsplus@129b72c

Signed-off-by: Jade Philipoom <[email protected]>
The test vectors were re-generated from the `consistent-basew` branch of
the reference implementation.

Signed-off-by: Jade Philipoom <[email protected]>
Instead of using the pqcrypto crate to indirectly access the reference
implementation, we now directly invoke our own bindings for the
reference implementation.

Signed-off-by: Jade Philipoom <[email protected]>
Also small fixups to the opentitantool instructions so they're easier to
follow.

Signed-off-by: Jade Philipoom <[email protected]>
@jadephilipoom jadephilipoom merged commit 5617058 into lowRISC:master Jun 14, 2024
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants