Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support custom signing implementations in Metadata.sign method #1263

Closed
Tracked by #10672
lukpueh opened this issue Jan 14, 2021 · 25 comments · Fixed by secure-systems-lab/securesystemslib#319
Closed
Tracked by #10672
Labels
repository Related to the repository implementation securesystemslib Requires corresponding implementation in securesystemslib

Comments

@lukpueh
Copy link
Member

lukpueh commented Jan 14, 2021

Description of issue or feature request:
Allow TUF integrators to easily use custom metadata signing implementations instead of the currently supported securesystemslib one.

An urgent use case for this is the PEP458/Warehouse integration, where metadata must be signed with keys stored in a Hashicorp Vault and the implementation is already available.

NOTE: This feature request only concerns the tuf.api.Metadata.sign method and not the legacy signing facilities write and writeall in repository tool.

Current behavior:
tuf.api.Metadata.sign(key, ...) calls securesystemslib.keys.create_signature with the passed sslib/json-formatted private key.

Expected behavior:
tuf.api.Metadata.sign takes an abstract signer parameter, which implements signing. A default signer may resolve to the currently used securesystemslib.keys.create_signature.

Implementation considerations:

  • Different signers require completely different "keys", e.g. the currently used generic implementation requires an sslib/json-formatted private key, whereas signing implementations that don't have direct access to the key may only need a keyid (see GPG or HSM references below). As a consequence the key parameter in Metadata.sign should probably be encapsulated within the signer.

  • The abstract signer interface must strictly define the format of the returned signature for metadata interoperability.

  • The abstract signer interface design should also keep in mind other key-related tasks, such as signature verification, public key export to sslib metadata format, and keyid generation.

  • securesystemslib.keys.create_signature is already an abstract interface, where the json-formatted private key parameter format is generic enough to hold different key types and to choose one of many supported signing algorithms, identified by the key's scheme field. (see public key metadata format reference below). However, it is not suited to extend to custom implementations. A new abstract signer interface should integrate well with the existing securesystemslib infrastructure and/or aim to replace it. It should not become yet another co-existing interface and/or layer of abstraction (see public API overview reference below).


Related work and references: (for the very ambitious reader)

@lukpueh lukpueh added repository Related to the repository implementation securesystemslib Requires corresponding implementation in securesystemslib labels Jan 14, 2021
@lukpueh
Copy link
Member Author

lukpueh commented Jan 14, 2021

A very basic implementation could look something like this:

# in securesystemslib
class Signer:
    @abc.abstractmethod
    def sign(bytes: payload) -> Signature:
        raise NotImplementedError

class SSlibSigner(Signer):
    def __init__(self, sslib_private_key):
        self.key = sslib_private_key

    def sign(self, payload):
        return sslib_keys.create_signature(self.key, payload)

class GPGSigner(Signer):
    def __init__(self, gpg_keyid):
        self.keyid = gpg_keyid

    def sign(self, payload):
        return sslib_gpg.create_signature(payload, self.keyid)

# in TUF
class Metadata:
    def sign(sslib.Signer: signer):
        signer.sign(self.signed.to_canonical_bytes())

# in Warehouse
class VaultSigner(Signer):
    def sign(self, payload):
        # ... you get the idea

metadata = tuf.Metadata(...)
metadata.sign(VaultSigner(...))

@trishankatdatadog
Copy link
Member

class VaultSigner(Signer):

Great idea. I would add only that it might be useful to add in SSLib itself...

@lukpueh
Copy link
Member Author

lukpueh commented Jan 14, 2021

I would add only that it might be useful to add in SSLib itself...

Agreed.

@joshuagl
Copy link
Member

Thank you for the detailed issue @lukpueh.

Bit of an aside, but do you imagine the GPG and HSM signing interfaces in securesystemslib becoming implementations of the interface we create here? That seems logical to me, but I'm not familiar with the GPG and HSM signing interfaces.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 15, 2021

Bit of an aside, but do you imagine the GPG and HSM signing interfaces in securesystemslib becoming implementations of the interface we create here?

I would suggest so, yes. Do you have reservations?

@joshuagl
Copy link
Member

No reservations, just checking whether my expectation was reasonable. Thank you.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 22, 2021

I'd like to add some ontological thoughts here. In particular, whether the Signer class could instead be a SigningKey class, given that a Signer, as described above, is associated with exactly one key (or key identifier) anyway.

What speaks against combining Signer and Key in one class:
In pure modeling terms a key data container and an entity that implements a signing protocol are quite different things. Also, a private key might be used with different signers (protocols, algorithms, schemes, etc.), which would require duplicating key data on multiple signers.

What speaks for it:
securesystemslib needs private keys only for their signing capabilities and it is unlikely that any given private key will be used with different signers during its in-memory life time.

Outlook
Maybe we should at least tentatively explore the expected use cases for private keys in a new repository library (#1136).

I have already summarized the main tasks for keys (private and public!) in secure-systems-lab/securesystemslib#310, albeit with a focus on public keys, which need to implement serialization to and deserialization from TUF metadata format (secure-systems-lab/securesystemslib#308) and verification. Both may or may not be tied to the protocol that also implements the signing.

@trishankatdatadog, maybe you can share your experience from implementing tuf-on-a-plane?

@MVrachev
Copy link
Collaborator

MVrachev commented Jan 25, 2021

What speaks against combining Signer and Key in one class:
In pure modeling terms a key data container and an entity that implements a signing protocol are quite different things. Also, a private key might be used with different signers (protocols, algorithms, schemes, etc.), which would require duplicating key data on multiple signers.

I am asking myself does the Key class has meaning outside the signing operations? If not, then probably it makes sense to keep it there?
Also, I am not sure if there is a modeling contradiction. SignerKey will be a container class for key data and at the same time that class would have functionality only related to that data, so they are already in a sense strongly connected.

I am not sure if we have a SignerKey for the signing operations, then how we are going to do the verification operations?
There is no sense to have a separate VerifierKey class.

Maybe we can look the other way around and have a Key class with signature and verification functionality?
This seems more logical to me.

@trishankatdatadog
Copy link
Member

@trishankatdatadog, maybe you can share your experience from implementing tuf-on-a-plane?

It's a good question. I didn't implement any signing interface, only verification interfaces. But, if it helps, what I found is that rather than designing on pen and paper, writing, trying, and feeling the code worked out very well. So try writing what comes naturally, and see if it makes sense? I know I sound like a mystic, but that's what worked for me.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 26, 2021

Thanks for your comments, @MVrachev and @trishankatdatadog!

SignerKey will be a container class for key data and at the same time that class would have functionality only related to that data, so they are already in a sense strongly connected.

Yes they are, but there are some variables that change for the signer but not for the key, such as padding, hashing, scheme, etc. To me it feels a bit odd to modify the key object, in order to use a different signing scheme, e.g.:

k = SigningKey(...) # Initialize once e.g. with RSA secret exponent
k.scheme = "rsassa-pss-sha256"
k.sign(data) # Sign using one scheme

k.scheme = "rsa-pkcs1v15-sha512"
k.sign(data) # Sign using another scheme

But maybe switching schemes during life time is just not a use case, and even then, the odd looks alone don't seem to be a strong argument against it.

An alternative would be to accept these variables as a parameters to SigningKey.sign. But given that our Metadata.sign will call the sign method of whichever passed SigningKey subclass, the interface needs to be generic for all key types and as simple as possible.

Maybe we can look the other way around and have a Key class with signature and verification functionality?

Yes, but I think there is an argument for separating public and private key in the model. If they were combined, we'd often have half-empty objects, e.g. on the client where only the public parts are available. And even in the signing context we only really need the private portion.

Furthermore, given that public keys are included in TUF metadata their class representation should look similar to the metadata representation for recognizability and have a focus on type/schema checking and serialization tasks. Whereas we care a lot less about what the private key class looks like as long as it can be used for signing.

Does this make sense? I have a feeling that hands-on @trishankatdatadog might say, I'm overthinking this. :P

@trishankatdatadog
Copy link
Member

Does this make sense? I have a feeling that hands-on @trishankatdatadog might say, I'm overthinking this. :P

This requires some meditation TBH. There are generic cryptographic algorithms and then there are specific schemes. What code do we expect delegators and delegatees to write? I would approach it that way.

@MVrachev
Copy link
Collaborator

MVrachev commented Jan 27, 2021

Does this make sense? I have a feeling that hands-on @trishankatdatadog might say, I'm overthinking this. :P

It makes sense and those are valid points if you ask me.
After all of your thoughts, I feel like the best option is to have a Key class in tuf/metadata/api representing exactly what is written in the spec and a different Signer class which contains a Key class instance, basically the way you initially proposed it.
That way you have a model literally representing a key the way it's defined in the spec, you can easily change the key schema, won't have the problem with half-empty objects, and have a logical encapsulation for each of the classes.

Additional thinks we discussed with @jku:

  1. The interface @lukpueh proposed is fairly clear and easy to implement.
  2. The important question in the interface seems to be what should Signature be or the return type of the sign() operation?

Jussi looked into what is the returned value of some of the possible implementors of the Signer interface:

So, to summarize it seems achievable to return bytes instead of a custom Signature object which is unclear what should it be.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 28, 2021

So, to summarize it seems achievable to return bytes instead of a custom Signature object which is unclear what should it be.

That does make sense. But then every SignerKey must also expose a keyid. Because the caller is likely to store the signature bytes together with a keyid, otherwise the signature can't be mapped to a public key for later verification.

class Metadata:
  ...
  def sign(self, signer):
    # NOTE: Let's assume signatures are still in the old/current dictionary format
    self.signatures.append({
        "keyid": signer.keyid,
        "sig": signer.sign(self.signed.to_canonical_bytes())
      })

I think your proposal makes the SignerKey.sign interface a bit cleaner, but the overall SignerKey class a bit less blackboxy.

Gpg: Return value is also a dict: it is used to return multiple other pieces of data as well -- it is unclear if these are actually useful to create_signature() callers or only for intermediate processing before that

Yeah, unfortunately these "other_headers" are required for verification. Although it might be possible to return them as part of the signature bytes and then parse upon verification.

EDIT: FYI, this is where we parse GPG signature packets.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 28, 2021

...custom fields like the one on gpg signatures might also be an argument for SigningKey.sign returning a complex signature object.

@MVrachev
Copy link
Collaborator

It makes sense and those are valid points if you ask me.
After all of your thoughts, I feel like the best option is to have a Key class in tuf/metadata/api representing exactly what is written in the spec and a different Signer class which contains a Key class instance, basically the way you initially proposed it.
That way you have a model literally representing a key the way it's defined in the spec, you can easily change the key schema, won't have the problem with half-empty objects, and have a logical encapsulation for each of the classes.

It seems I misunderstand @lukpueh. Lukas seems to mean to combine Signer and the private part from the key.
I am only not sure about the SignerKey name. It feels a strange concatenation of the functionality it provides - signing and the data it stores - key data. Probably it makes sense just to call it Signer as it was originally proposed and it will have attributes representing key data.

That does make sense. But then every SignerKey must also expose a keyid. Because the caller is likely to store the signature bytes together with a keyid, otherwise the signature can't be mapped to a public key for later verification.

You have a point here. We can indeed create a Signature class and store there both the bytes signature and keyid. Then we would return an instance of it when calling sign().
We can also use this Signature to add functionality connected to the signature.
Like verification operations?

@lukpueh
Copy link
Member Author

lukpueh commented Jan 29, 2021

Probably it makes sense just to call it Signer as it was originally proposed and it will have attributes representing key data.

I'm fine with Signer. :)

We can indeed create a Signature class and store there both the bytes signature and keyid. Then we would return an instance of it when calling sign().

SGTM! :)

We can also use this Signature to add functionality connected to the signature. Like verification operations?

I'm a bit unsure about that. I think in most cases sign can just return a generic Signature with a keyid and some raw signature bytes, independently of the key type, scheme, etc. used to create the signature. If we implement verify on the Signature class it has to account for all those variables, or, more likely, we'd use different Signature subclasses for different verify implementations.

As a consequence a concrete Signer subclass implementation must know which concrete Signature subclass to return, in order to provide the correct verify method. I don't know if we want to put that extra responsibility on the Signer. To me, this feels more like a task for the public key.

@MVrachev
Copy link
Collaborator

MVrachev commented Jan 29, 2021

I'm a bit unsure about that. I think in most cases sign can just return a generic Signature with a keyid and some raw signature bytes, independently of the key type, scheme, etc. used to create the signature. If we implement verify on the Signature class it has to account for all those variables, or, more likely, we'd use different Signature subclasses for different verify implementations.

As a consequence a concrete Signer subclass implementation must know which concrete Signature subclass to return, in order to provide the correct verify method. I don't know if we want to put that extra responsibility on the Signer. To me, this feels more like a task for the public key.

Yes, you are right. Those arguments are logical.

Today we had a quick discussion with @jku about the Signer return type and he shared some concerns if we return our own Signature type.
Jussi, can you please add your opinion to the thread?

@jku
Copy link
Member

jku commented Feb 1, 2021

As a consequence a concrete Signer subclass implementation must know which concrete Signature subclass to return, in order to provide the correct verify method. I don't know if we want to put that extra responsibility on the Signer. To me, this feels more like a task for the public key.

Yes, you are right. Those arguments are logical.

Today we had a quick discussion with @jku about the Signer return type and he shared some concerns if we return our own Signature type.

I think it was mostly that it seemed like an object isn't needed in this case at all (I'm assuming that usually when you sign something you are doing that to create metadata for the purpose of serializing it: an array of bytes is fine for that)?

Also I don't know how the opposite case (verifying portions of metadata while de-serializing it) is going to look like and that seemed more relevant for a potential Signature object? As in how is Metadata API used with e.g. Vault in that case?

@lukpueh
Copy link
Member Author

lukpueh commented Feb 3, 2021

@jku, this is roughly how canonicalization, signing/verifying and de/serialization currently play together:

# On the server
1. create canonical bytes for payload
2. create signature over canonical bytes
3. serialize non-canonical payload + signature(s)

# On the client
1. deserialize non-canonical payload + signature(s)
2. create canonical bytes for payload
3. verify signature over canonical bytes

Note that this might change in the future. Currently we are discussing an alternative protocol, which does not require parsing the full payload before verifying the signature (see Secure Systems Lab signing specification).

Especially, in the light of that new proposal and the likely short-livedness of signature objects I agree with you that a class model might not be needed.

However, I think a class helps to clearly define the required signature format, even if it only consists of two fields (keyid and signature bytes). And it also aligns with our decision in ADR 4 to create classes for all complex metadata attributes, e.g. in order to add self-validation behavior.

@MVrachev
Copy link
Collaborator

MVrachev commented Feb 3, 2021

I have pushed the initial version of the signing interface and the necessary TUF changes to support it in the prs linked just above this. comment.

@MVrachev
Copy link
Collaborator

MVrachev commented Feb 5, 2021

While working on SSlibSigner and GPGSigner I realized that if we want to store the necessary information to verify the signatures with securesystemslib.kets.verify_signature() and securesystemslib.gpg.functions.verify_signature() we would need to add more key metadata in Signature.

For the SSlibSigner, we would need to add the key_dict: https://github.com/secure-systems-lab/securesystemslib/blob/dff4425e5663c58c954447a698efb17c4b23b0f8/securesystemslib/keys.py#L738

and for the GPGSigner, we would need to add the pubkey info https://github.com/secure-systems-lab/securesystemslib/blob/dff4425e5663c58c954447a698efb17c4b23b0f8/securesystemslib/gpg/functions.py#L170
the way we are doing this in the tests: https://github.com/secure-systems-lab/securesystemslib/blob/dff4425e5663c58c954447a698efb17c4b23b0f8/tests/test_gpg.py#L571

@MVrachev
Copy link
Collaborator

MVrachev commented Feb 5, 2021

Because of the above problem, I am not sure if I can create a relationship between the Signature class and the GPGSignature.
Before, I thought I can just inherit the Signature class used by the SSLibSigner and add specific fields related to GPG, but I am not that sure anymore.

With the observations from the above comment, I made the Signature class like this:
image

the problem is that for GPGSignature I won't need the self.keydict = keydict field or it will be totally different.
The pubkey info could be one of three other schemas
image
and each one of them with its own specifics...

The logical solution will be to create a separate Signature interface implemented by the SSlibSignature and GPGSignature, but do we want yet another interface?

What should we do from here? @lukpueh?

@lukpueh
Copy link
Member Author

lukpueh commented Feb 8, 2021

@MVrachev, I thought you agreed with my arguments above for not implementing verification as method of the Signature class? (And even if we did, I wouldn't store the public key on the Signature object but rather pass it to the verification method as argument.)

@MVrachev
Copy link
Collaborator

MVrachev commented Feb 8, 2021

Yes, we agreed that we would do the verification elsewhere.
I just thought we want to store all of the information needed for the signer verification in the Signer class.
We discussed that with Lukas and we would store only the keyid and the signature.
The keyid will give us the necessary additional information for the verification.

@MVrachev
Copy link
Collaborator

I will add a GPGSigner in another pr, but I decided to mark this issue as fixed in my pr #319 because it fixed the main point of this issue - creating a Signer interface and a securesystemslib implementation of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
repository Related to the repository implementation securesystemslib Requires corresponding implementation in securesystemslib
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants