Proposal: domain separation for Fulcio-issued certificates #1131
Comments
CC @znewman01, @haydentherapper, and @asraa in particular for thoughts here! |
Releasing from the same workflow is effectively reusing the same signing key. I would recommend a separation of workflows. You can use the same underlying reusable workflow but with different workflow inputs, which would make the certificates distinguishable. It's really no different than the example you gave for conditionals in a single workflow.
Won't this have to be a part of a verification policy regardless of domain separation? The package manager needs to know the mapping between signing workflow and artifact.
This is a very significant downside to this approach in my opinion. Too easy to misuse, especially since everything else is authenticated. This also comes with the risk of spam, especially considering that these certificates will go into an immutable log. We could mitigate this by having a predefined set of domains ("nightly", "staging", "prod", etc), but this might not work for everyone. |
Cc @laurentsimon, you might be interested in this |
Yep, I fully agree that this is optimal (and would be best practice, including using a reusable workflow + containing workflow for distinguishability). That being said, I think in practice this will be a difficult hurdle for a lot of packagers: it requires a degree of discipline + commitment to idioms/best practices that aren't widely held (at least in Python packaging).
Yes and no -- I think it's application specific. For contexts like PyPI, I'd argue that default verification policies probably shouldn't include things like the OTOH, for specialized users (or other applications of Sigstore), those will probably be important parts of the verification policy.
Agreed 100% with both of these concerns, although I think that both are mitigable:
|
Hi, as a package maintainer, I'm in the process of setting up release workflows with Sigstore across a couple of my projects. My 2 cents:
(Take these with a grain of salt, the code signing part of the PKI ecosystem is still fairly new to me, so if this doesn't make any sense, carry on) |
I’m not sure these are going to change frequently, as they’re the names, not the digests, of the builder workflows. They really can’t change frequently, otherwise how else does the package repository authenticate the identity in the certificate? FWIW, no package index has implemented this yet, so maybe we will see issues pop up as they do, but I’d assume the workflow looks something like:
You can also add TUF into the mix to make this mapping publicly auditable. Going back to the first comment, what is the threat we are defending against? I don’t disagree that there is an opportunity for a package to be swapped out with similar-looking provenance, but again, this seems like a builder-signing problem. The more likely threat is source repo compromise - now as an attacker, I’ll focus on switching out the unauthenticated value so that the vulnerable package’s destination points to whatever index the attacker wants it to go to. This gets into SLSA discussions too, because we now have user-provided values, rather than non-forgeable values provided only by the builder. Edit: to add some context to my SLSA comment, you could imagine an index giving different check marks based on the SLSA attestation level. You couldn’t trust unauthenticated values because they might be tampered with (by an attacker, or insider/malicious maintainer). |
It should generalize! There is a new set of CI-platform-agnostic claims, check out oid-info.md. |
It's important to distinguish between trusted publishers, which are an authentication mechanism, and therefore only need to be validated by PyPI itself, and Sigstore signatures, which need to be end-user verifiable. If an end user needs to verify a property, they need to know what the correct value is. End users do not have a way of knowing what the correct workflow or environment is. |
Thanks!
Point taken. I was implicitly assuming a human validator checking whether or not the name "looked reasonable" at validation time, but that indeed doesn't scale. Perhaps this disadvantage is less pronounced with OID markers? If the OIDs used for "test builds" and "production builds" are uniform across any given packaging ecosystem, that's something one could standardise on in package managers for end users (or other user-facing validation tooling). And if advanced users need more granular domain separation, they could still inject their own OIDs regardless. |
It's confusing, but we have to distinguish between the authenticated and unauthenticated components here: the certificate's embedded claims might include the package version in the form of a similar-looking In other words: an attacker Mallory could take a package |
PEP480 (which I believe is being rewritten to include Sigstore) touches on this. You could use TUF to provide this mapping, whether it be user managed keys, identities, or CI identities. |
Yeah, I'm perhaps being overly conservative here, based on how complex I'm expecting this implementation to be for PyPI 🙂 I'm realizing there are two separate concerns here:
Both of those are positive concerns, in the sense that I don't think either is what users ought to do, but probably will end up doing due to the size and diversity of most packaging ecosystems. For concern (1), I think I agree with you -- this is handled by the conjunction of trusted publishing and signing, and public auditability concerns can be resolved by TUF. For concern (2) the threat model is the kind of index confusion I mentioned in #1131 (comment):
I agree as a point of order that Bob ought not publish to both Additionally, this has all been in the context of machine identities, when email identities are also something that PyPI will likely support. In that context, domain confusion is an even more salient concern: Bob is unlikely to have two separate email identities for production and staging signatures, so he'll need some other way to convey his signing intent. |
That attack seems like typosquatting with TOFU. Mallory could also take foo-1.0 and publish foo-2.0 on the same index with the same contents and signature and try to convince users her copy is the “real” one. This seems like it’s the responsibility of the package index to mitigate this, that versions (and across indices) should be linked together under one signer. What about using a proof of possession? That would mitigate this risk of signature reuse because you’d have to prove ownership of the private key, whether that be a key, identity (do an OIDC dance), or repo (maybe put some value in the repo ACME-style?). |
I think it's similar, but a distinct attack: Mallory makes herself visible by trying to convince users that her signature is the "real" one, but remains stealthy when trying to convince users that Bob's "staging" domain signature was really intended for the "production" domain.
Could you say some more about this? Is the idea here that each mirroring operation would require Bob (or Mallory, unsuccessfully) to "re-prove" themselves? |
Only on initial upload, since you require the same index identity after that (I think, worst case it’s every upload). Currently, an unsigned package can be renamed, modified and reuploaded without detection. Now let’s add a signature, generated by a key. Now the package can’t be tampered with, but it could be reuploaded under a different name, assuming that name hasn’t been claimed. If the index were to ask for a proof of possession of the signing key for an initial registration, this would prevent anyone from copying a signature. The same procedure can be done for identity based signing, just with a different PoP, asking for proof of ownership of a repo, or of an identity. |
That's an interesting idea, although we'd have to figure out what it means to provide an "equivalent" PoP/proof of ownership for different identity types: for emails it's straightforward, but for machine identities it's a little fiddly (e.g. you can imagine this pessimistically forbidding uploads of a different package that uses the same release workflow legitimately). I also think this wouldn't address a "remirroring" case, e.g.:
In that case, the entity doing the remirroring isn't necessarily the original signer, so they can't provide a PoP/ownership. Instead, they'll probably want to be able to make policy statements like "I accept artifacts accepted by this domain/audience." |
Also, I just realized that "domain" was a bad choice of word by me, since it's very easily confusable with DNS domains 😅 What I'm proposing is effectively the same thing as the OIDC audience ( |
Can I ask a possibly silly question? Why does this have to happen at the Fulcio layer? I think domain separation is a good idea, but I'd prefer to implement it at the level of what you're signing. So instead of signing the artifacts directly, you'd sign a "release attestation" which is repo-scoped. When setting up the workflow, you'd have to specify the In general, I think this comes from a lack of clarity in how we're thinking about the PyPI publication policy. It seems like we want to check something like:
Confusion around (3) seems to be the problem in this issue. I'd really prefer to check these things each separately, rather than together. So we can have signed build provenance for (1), check the OIDC GitHub repo for (2), and check some release attestation for (3). |
That's a great suggestion! That provides a clean separation between the identity layer of Fulcio and user-provided artifact metadata. One last thing about typosquatting cause I'm thinking about that still, I do think it's in the public index's interest to validate ownership over anything provided to it. If the index were to validate repo ownership (which effectively verifies machine identity ownership, given the workflow should be in the repo), then it prevents an attacker from grabbing an old, vulnerable (but still signed!) version and reuploading it under a new name (or to a different index. For the same index, domain separation doesn’t solve this). Provenance gives you where the source is (which must be from the registered repo) and where the build config is (which also must be in the registered repo). |
I'm not sure! My motivation for putting it at the Fulcio layer is mostly expedience 🙂 -- there's a clear roadmap for getting Sigstore signatures into PyPI and similar ecosystems, which will almost certainly look like a tuple of (
Could you say some more about what this would look like? My first thought would be that it would essentially be a digital signature with the same key that the certificate attests to, but I'm not 100% clear on where that attestation would "live":
|
I think the idea would be that the platform would mandate a release attestation, so removing it and using a raw signature would not be allowed. It could be ecosystem dependent, so some ecosystems may not require any attestation, some could mandate certain claims. This also works nicely with the idea of using DSSE rather than raw signatures, something that was proposed to me recently. |
Precisely. That would prevent downgrade attacks (or rather, turn them into DoS attacks), which are possible anyway. @kommendorkapten makes a note in the Sigstore Slack about how this works for npm, which is similar to what I proposed:
|
Here is the publish attestation predicate: https://github.com/npm/attestation/tree/main/specs/publish/v0.1 It was also discussed at the last Sigstore clients meeting. Other ecosystems are interested in a similar attestation. My belief is that the one already used by npm is generic enough, and we can see if e.g. the in-toto project is interested in taking ownership of that predicate (in-toto already defines a few predicates). Sorry if this is off-topic, I don't wish to derail the discussion, just add more context on the publish attestation. |
Sorry for these questions, I think I've lost the thread a little 😅
Making sure I understand: is this release attestation included in the Sigstore bundle, in lieu of an ordinary raw signature (and therefore bound to the same original signing identity), or something else? If that understanding is correct, then this makes a lot of sense to me! |
Ha, we've been a little all-over-the-place. No worries.
Precisely :) (it's not important that the attestation go in the bundle, but it can!)
Good to hear! Yeah, I think that's a lot simpler than adding this feature to Fulcio. |
Agreed! We're now firmly outside of the domain of Fulcio so maybe it makes sense to relocate this conversation, but: what does the expected UX for these attestations look like, versus an ordinary signing flow? For example, here's how I currently produce a Sigstore bundle (including raw signature) for a Python package distribution:
    # produces some-dist.whl.sigstore
    sigstore sign some-dist.whl
What would it look like to embed a release attestation instead? Do we expect there to be a registry of common attestations, e.g. could I do something like this?
    sigstore sign --attest-for pypi some-dist.whl
|
Meh, here's fine pending a better spot.
I think it's a non-goal to have users explicitly type out the actual attestations. So I'm okay if there's some command (doesn't need to be baked directly into pip) that de-sugars to something like:
Something like that is probably a good idea. I've hinted at similar in the past: sigstore/cosign#2892 |
That makes a lot of sense, thanks for explaining! So, to tie this all together in my head: where does DSSE fit in? With |
Yea, that sounds correct, and the DSSE payload would be uploaded to Rekor (there's a new DSSE type being worked on, though intoto mostly works too). |
DSSE wraps the publish "statement" (unauthenticated claim like "i wanna publish package foo @ v2.5.6 hash sha256:abcde to test.pypi.org") in an envelope for signing. The signed envelope is now a full "attestation". The raw signed bytes are specified by DSSE, so the signing tool will ideally know about DSSE. But you could also use an intermediate command like |
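To make the "raw signed bytes" point concrete, here is a minimal Python sketch of DSSE's pre-authentication encoding (PAE) and envelope layout. The PAE format and the `application/vnd.in-toto+json` payload type come from the DSSE and in-toto specifications; the publish statement itself (its `predicateType` URL and `registry` field) is purely hypothetical, and `sign` is a caller-supplied stand-in rather than any real Sigstore client API.

```python
import base64
import json
from typing import Callable

IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload
    )


def make_envelope(statement: dict, sign: Callable[[bytes], bytes], keyid: str = "") -> dict:
    """Wrap a statement in a DSSE envelope; `sign` is an assumed signer callback."""
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    signature = sign(pae(IN_TOTO_PAYLOAD_TYPE, payload))
    return {
        "payloadType": IN_TOTO_PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
        ],
    }


# Hypothetical publish statement, mirroring the example claim in the comment above.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{"name": "foo-2.5.6-py3-none-any.whl", "digest": {"sha256": "abcde"}}],
    "predicateType": "https://example.com/publish/v0.1",  # hypothetical predicate type
    "predicate": {"registry": "https://test.pypi.org"},  # hypothetical field name
}
```

The signature covers the PAE bytes rather than the bare payload, which is why the signing tool (or whatever feeds it) has to be DSSE-aware.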
(Dropped back to Slack because the context has switched from Fulcio to the underlying problem.) I'm going to close this out, since I think the underlying question has been resolved. Thanks a ton @znewman01, @kommendorkapten, and @haydentherapper! |
To summarize:
To make things concrete, PyPI and other ecosystems will probably want a client-side attestation with (at least) the following pieces:
To accomplish this, sigstore-python will need support for DSSE-style signatures. I've opened sigstore/sigstore-python#628 to track that. |
On the PyPI side, it's also worth considering how to prevent reupload of an artifact. A few thoughts:
|
Something else we did for npm, which also adds a layer of protection for a typosquatting-like attack, is to add the package name, version and tarball digest in the attestation subject (signed payload). We then verify this matches the published package at the registry before accepting and creating a publish attestation, and also verify in the CLI when auditing downloaded packages. This means you can't re-use an expired identity certificate from some other package, e.g. The attestation store where we keep these bundles and the npm registry also enforce that published |
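The subject check described for npm could be sketched in Python roughly as follows. This is an illustration only, not npm's implementation: the `name@version` subject naming and the use of SHA-256 are assumptions made for the example, and the statement is assumed to have already passed signature verification.

```python
import hashlib


def subject_matches(statement: dict, name: str, version: str, artifact: bytes) -> bool:
    """Return True if a verified statement names exactly this package, version, and content."""
    expected_name = f"{name}@{version}"  # hypothetical subject naming scheme
    expected_digest = hashlib.sha256(artifact).hexdigest()
    for subject in statement.get("subject", []):
        digest = subject.get("digest", {}).get("sha256")
        if subject.get("name") == expected_name and digest == expected_digest:
            return True
    return False


# Example: a registry (or client) comparing an upload against its attestation.
tarball = b"example package contents"
statement = {
    "subject": [
        {"name": "foo@1.0.0", "digest": {"sha256": hashlib.sha256(tarball).hexdigest()}}
    ]
}
```

Because the name and version are part of the signed payload, the same attestation stops matching as soon as the artifact is re-uploaded under a different name or version, even if the bytes are unchanged.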
This is a summary of a question/potential enhancement that I originally posted to the Sigstore Slack (ref).
Problem statement
One of Sigstore's expected use cases is package indices. Package indices like PyPI are expected to host Sigstore materials (in the form of Sigstore bundles), which can then be used to verify associated artifacts that are also hosted (or referenced) on the index.
Some packaging ecosystems (like PyPI) offer two (or more) indices: a "production" index for ordinary releases to go to, and an (optional) "beta" or "staging" index for packagers to test against. In the case of PyPI, these are pypi.org and test.pypi.org respectively.
In these ecosystems, it's common to combine publishing to both indices into a single workflow, with interior logic for determining which index the release should go to. For example, here is PyCA Cryptography's logic for selecting the index to publish to:
(Permalink: https://github.com/pyca/cryptography/blob/30525e82c77b91963c4f2e8931d2b0257689d364/.github/workflows/pypi-publish.yml#L34-L45)
Consequently, the Sigstore signatures produced for both PyPI and TestPyPI releases look very similar: they might have slightly different repository states, but their workflow claims are identical.
This represents a potential security risk, under the following scenario:
1. Project foo performs stable releases to index Production and nightly releases to index Staging, using the same workflow for both.
2. A nightly release of foo contains a vulnerability. Users would ordinarily not be exposed to this vulnerability, because foo's nightly would only be pushed to Staging, which is explicitly not used in production.
3. An attacker takes the vulnerable nightly release of foo and re-hosts it on index Production, including its valid signature.
4. Users who install foo via index Production now receive a correctly signed but unintended (and exploitable) version of foo.
[…] Production, a name takeover of foo (arguably out of scope here), or a re-hosting of foo under a different name. […] foo entirely with only exploitable versions from Staging.
This can be summarized as a "domain separation" problem (in the cryptographic sense of "domain," not the DNS sense): the packager's intent about which index the package ends up on is not communicated in the verification materials for that package, giving an attacker some ambiguity to play with.
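The ambiguity can be illustrated with a small Python sketch. The claim names loosely follow Fulcio's GitHub workflow extensions, but every value below is hypothetical, and `naive_policy` is only an assumed example of a policy that pins repository and workflow name.

```python
# Hypothetical certificate claims for two releases produced by the same workflow file.
PRODUCTION_CLAIMS = {
    "issuer": "https://token.actions.githubusercontent.com",
    "workflow_repository": "example/foo",
    "workflow_name": "release",
    "workflow_ref": "refs/tags/v1.0.0",
    "workflow_sha": "a" * 40,
}
STAGING_CLAIMS = {
    **PRODUCTION_CLAIMS,
    "workflow_ref": "refs/heads/main",
    "workflow_sha": "b" * 40,
}


def naive_policy(claims: dict, repository: str, workflow: str) -> bool:
    """A verifier that pins only the repository and workflow name."""
    return (
        claims["workflow_repository"] == repository
        and claims["workflow_name"] == workflow
    )


def differing_claims(a: dict, b: dict) -> list:
    """List the claim names whose values differ between two certificates."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

Both claim sets satisfy `naive_policy`, and the only differences are in repository state, none of which says which index the release was meant for.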
Proposed solution
Sigstore could enable domain separation for signatures via changes to Fulcio, resulting in changes to Fulcio-issued certificates.
In particular, Fulcio could support an additional, optional extension of the following (rough) form:
(wrapped, in turn, as an X.509v3 extension).
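As a rough illustration of what such a value might look like on the wire, the following Python sketch DER-encodes a list of domains, assuming the extension body is a `SEQUENCE OF OCTET STRING` (the shape mentioned under "Problems identified"). The function names are made up for this example, and the X.509v3 extension wrapper (OID, criticality flag, outer OCTET STRING) is deliberately left out.

```python
def der_length(n: int) -> bytes:
    """Encode a DER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_tlv(tag: int, content: bytes) -> bytes:
    """Build a single tag-length-value element."""
    return bytes([tag]) + der_length(len(content)) + content


def encode_domains(domains: list) -> bytes:
    """Encode domains as SEQUENCE (0x30) OF OCTET STRING (0x04), preserving order."""
    return der_tlv(0x30, b"".join(der_tlv(0x04, d) for d in domains))
```

For instance, `encode_domains([b"pypi.org"])` yields the twelve bytes `30 0a 04 08` followed by the ASCII bytes of `pypi.org`.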
Each Domain would be an unstructured value; individual Sigstore-consuming ecosystems would be responsible for interpreting them in context-appropriate ways.
This extension would be signed over like all other extensions, but would be "unauthenticated" with respect to the identity token (since it would be derived from something in the CSR, rather than the OIDC token).
This would be surfaced to the user via individual Sigstore clients. For example, for sigstore-python, adding domains to a certificate might look like this:
    # Domains = { hamilcar, hannibal, hasdrubal }
    sigstore sign --domain hamilcar --domain hannibal --domain hasdrubal important.txt
For ecosystems like PyPI, these domains could correspond to DNS names. For example, PyPI could require that uploaded bundles contain certificates with pypi.org in their domains; other domains (or the lack of domains, if desired) would be rejected. Similarly, pip could reject an otherwise valid bundle retrieved from {domain} if the bundle's certificate does not list {domain} as a valid domain (or {domain} does not explicitly list the indices it is mirroring from).
Alternatives considered
These have not been fully considered and need to be discussed more; I'm only listing them here to preserve conversation state!
Alternative: Packagers should use distinct GitHub workflows (or other identifiable claim state) for their staging and production releases. This would result in distinguishable certificates for different kinds of releases.
Problems: Distinct workflows for staging and production releases make it difficult to test the correctness of the production workflow without actually creating a production release. Similarly, the (small) distinctions between claims for different workflows may not be immediately actionable/easily consumable on a policy level (e.g., PyPI cannot reasonably enforce that all certificates contain claims for release.yml, not beta-release.yml or any other name).
Problems identified
This approach is not without problems:
- Having Domains be a SEQUENCE OF OCTET STRING is very flexible, and delegates a lot of interpretative power to individual Sigstore-consuming ecosystems. IMO this is necessary given the wide range of expected Sigstore deployments (it's hard to be more opinionated here), but it also means that Sigstore should offer normative guidance about how to use this extension correctly (e.g., discouraging people from adding sub-languages like wildcards to it).
- Domains would be signed over like the rest of the certificate, but is "unauthenticated" with respect to the binding identity token. This is arguably confusing, since all other extensions present in Fulcio-issued certificates are authenticated. This could be addressed in documentation, or (ideally) by refactoring how we handle the claim representation in certificates (via something like [RFC] Should Fulcio put the critical bit on its OIDC extensions? #981 (comment)).
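To make the intended interpretation concrete, here is a minimal Python sketch of the index-side and client-side checks from the proposed solution, using exact byte comparison only (no wildcards or other sub-languages). The function and parameter names are hypothetical, and the domain list is assumed to have already been extracted from a verified certificate.

```python
from typing import Iterable, Sequence


def index_accepts_upload(
    cert_domains: Sequence[bytes], index_domain: bytes, allow_missing: bool = False
) -> bool:
    """Index-side check: the certificate must list this index's own domain."""
    if not cert_domains:
        # Whether certificates without any domains are accepted is a policy choice.
        return allow_missing
    return index_domain in cert_domains


def client_accepts_bundle(
    cert_domains: Sequence[bytes],
    retrieved_from: bytes,
    mirrored_indices: Iterable[bytes] = (),
) -> bool:
    """Client-side check: accept if the certificate lists the index the bundle came
    from, or an index that this source explicitly declares it mirrors."""
    acceptable = {retrieved_from, *mirrored_indices}
    return any(domain in acceptable for domain in cert_domains)
```

Under this sketch a certificate scoped to `test.pypi.org` is rejected by `pypi.org`, and a value such as `*.pypi.org` is treated as an opaque string that matches nothing but itself.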