Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata Attribute research: KEY #1438

Closed
MVrachev opened this issue Jun 7, 2021 · 11 comments · Fixed by #1449
Closed

Metadata Attribute research: KEY #1438

MVrachev opened this issue Jun 7, 2021 · 11 comments · Fixed by #1449
Assignees
Labels
backlog Issues to address with priority for current development goals discussion Discussions related to the design, implementation and operation of the project
Milestone

Comments

@MVrachev
Copy link
Collaborator

MVrachev commented Jun 7, 2021

This issue aims to document thoughts about the KEY attribute from the Root and Delegations classes.
The goal is to understand how we use that attribute, what might go wrong with it and how we can validate it.

When thinking about how we use KEY worth reading what is implemented in pr #1408 and pr #1436 implementing threshold validation.

We want to answer/address the following 6 questions/points based on my comment here:

  1. How do we use it?
  2. What might go wrong?
  3. Rethink how we store it?
  4. If, after the changes in step 3 there are concerns, then add a validation function
  5. Make the necessary changes for points 1 - 4, think and propose a way to integrate it into the class/classes (decorators, descriptors, etc.)
  6. create a pr for all additional changes on top of the validation functions

PS: The 7-th point is covered by documenting this issue.

@MVrachev MVrachev added the backlog Issues to address with priority for current development goals label Jun 7, 2021
@MVrachev
Copy link
Collaborator Author

MVrachev commented Jun 8, 2021

Should I add thoughts about keyid here considering in #1436 we are adding it to the Key class?

@MVrachev
Copy link
Collaborator Author

MVrachev commented Jun 8, 2021

  1. How do we use it?
  • During Root and Delegations initialization.
  • When passing a Key argument in Root.add_key().
  1. What might go wrong?
  • If somebody passes a Key without the required fields it will just fail during initialization when Key.from_dict() is called. So, no problem here.
  • If somebody passes invalid keytype, scheme or keyval -> those attributes are validated through securesystemslib.schemas.
    Currently, in the MetadataBundle we are using keys when calling verify_with_threshold and there we actually call securesystemslib.keys.format_metadata_to_key() which checks the formats of all of the three keyval, keytype and keytype.
    The situation is the same we call any function from securesystemslib.keys accepting key_dict as an argument.
  • Duplicating keys Metadata API: (maybe) Enforce public key content uniqueness in keys #1429 for context-> verifying that keys are uniques can be done during verification Metadata API: Implement threshold verification #1436 or in Root and Targets.
  • Do we have to worry if somebody passes a wrong Key class instance when calling Root.add_key()? Can this be used to validate wrong metadata format? -> If wrong key is passed, then when they are using during verification (Metadata API: Implement threshold verification #1436) the key itself will be verified.
  1. Rethink how we store it?
  • This depends if we want to add any type of validation. It's possible that the securesystemslib validation is enough in this case.
  • Do we want to validate keys uniqueness in Root and Targets?
  1. If, after the changes in step 3 there are concerns, then add a validation function
  • None for now
  1. Make the necessary changes for points 1 - 4, think and propose a way to integrate it into the class/classes (decorators, descriptors, etc.)
  • None for now

@jku
Copy link
Member

jku commented Jun 8, 2021

  • If somebody passes invalid keytype, scheme or keyval -> those attributes are validated through securesystemslib.schemas.

When does this happen? I think we do want some validation at Key init time

@MVrachev
Copy link
Collaborator Author

MVrachev commented Jun 8, 2021

The question is do we trust securesystemslib for validation?
Right now, each of the times we do operations with keys we end up calling a securesystemslib function which makes argument validation.
The problem is that tomorrow this behavior could change and we can end up with no validation at all.

I agree that we should add some validation in __init__ time.

  1. I think in issue Metadata API: (maybe) Enforce public key content uniqueness in keys #1429 we agreed we would want to throw an exception if we have duplicate keys. This check should be done in Root and Targets.
  2. Do we want to validate the values of keytype, scheme and keyval?
    a) In securesystemslib keytype is one of rsa, ed25519, ecdsa, ecdsa-sha2-nistp256 or ecdsa-sha2-nistp384.
    do we want to hardcode those values somewhere in securesystemslib and use them?
    b) schema in the spec is defined as a string, maybe we can at least check that?
    c) keyval["public"] is defined as a string in PEM format. Can we do any checks there?

PS: @jku do you think we should discuss keyid here considering you are adding it to the Key class in #1436?

@MVrachev MVrachev added the discussion Discussions related to the design, implementation and operation of the project label Jun 9, 2021
@sechkova sechkova added this to the weeks24-25 milestone Jun 9, 2021
@MVrachev
Copy link
Collaborator Author

MVrachev commented Jun 9, 2021

Now that keyid is officially part of the Key class we should think about it as well.

@jku
Copy link
Member

jku commented Jun 10, 2021

My recollection is that the real validity checks in SSLib are hidden inside verify_signature() and other functions like that, so we can't really use them. We could run things like sslib_formats.KEYTYPE_SCHEMA.check_match(keytype) (which would at least check for the possible values) but my understanding is that there's no way to check that e.g. the schema is valid for the keytype without more code. But do have a look, if you find a way to make SSLib do the validation that would be good.

keyval["public"] is defined as a string in PEM format

I don't think this is true for all keytypes. I believe we should consider this just a string (content is internal implementation detail of the specific key).

So maybe

  • use sslib schema validation iff it brings actual value and is simple to do (keytype looks like one?)
  • otherwise I guess we do want to check if the variable type is str? I'm not fond of type checking but I guess it's better than pushing random objects as arguments to library calls we make...

As in most other cases, I think we're primarily interested in the deserialization case... Making sure the validation happens on init() should make this useful for new created keys as well (we don't have code that does that yet but that's what I imagine at least).

@MVrachev
Copy link
Collaborator Author

I experimented with a solution testing all of the attributes reusing the securesystemslib schemas.
Here what I got:
MVrachev@e7cdab0

@jku
Copy link
Member

jku commented Jun 10, 2021

Yeah I think that reinforces my opinion:

  • the moment we start writing constants like "rsa" or "ecdsa-sha2-nistp256" in TUF code we're doing something wrong, we shouldn't have to care about details like that (as we should not even know all valid values).
  • In fact I wonder if we want to use even KEYTYPE_SCHEMA at deserialization time: nothing in the spec promises that securesystemslib can recognise the key type that a specific key uses -- maybe we only want to fail if that key is actually used for something. I'm thinking of a situation where some key uses a new type/scheme that our older client does not know about but we don't actually end up using that key (and the keys we do use are of type that we do understand) -- this is maybe a contrived example but I think according to spec we should not actually assume we know about the keytypes/schemes

@MVrachev
Copy link
Collaborator Author

MVrachev commented Jun 10, 2021

As I understand it, the source of knowledge about supported keys and schemas is securesystemslib.
The best we can do is make sure securesystemslib continues to be the source of truth for supported keytypes and schemas.
What we can do is probably provide a utility function in securesystemslib which for a given keytype calls the
correct schema check.
That way adding/removing new keys in securesystemslib won't require any new code in TUF and we would have validation in Key initialization time.

@jku
Copy link
Member

jku commented Jun 14, 2021

I've not responded since I wasn't quite sure why this felt wrong, now I think I've got it: Metadata validity and our implementations ability to sign or verify signatures are two different things. For a client or even a repository implementation it might be reasonable to consider both instant failures but I do not think that applies to Metadata API.

In other words, I think it is wrong for Metadata API to fail de/serialization of data just because the data contains an algorithm that our securesystemslib does not recognize: New/unknown algorithms do not make the metadata invalid in any way: they just mean we can't currently verify some specific signatures or hashes in it.

@MVrachev
Copy link
Collaborator Author

MVrachev commented Jun 14, 2021

We had a discussion with @jku and agreed that we shouldn't focus on checking if keytype, scheme, keyval, or keyid have valid semantics, but rather check that all of those are with the expected types and we can compare them the way
we expect.

The reasons are that:

  1. we could have a metadata file with one or a couple of keys with unsupported or invalid values for their attributes, but that doesn't mean we have to stop instantiating the Key class.
    When verifying with threshold with keys it's allowed to have multiple invalid keys until the required threshold is achieved.
  2. it's better to throw an exception during verification that certain keys have unsupported algorithms instead of doing that when nobody is expecting - during initialization.

In order to consider this issue as do I see three tasks:

  • Add basic validation that all of the Key fields are instances of the expected types.
  • Add a simple comment that Key is not validated semantically during initialization.
  • Make sure we have a test verifying that a list of keys containing an element with an unsupported algorithm can
    be verified successfully if it threshold of keys is satisfied.

MVrachev added a commit to MVrachev/tuf that referenced this issue Jun 15, 2021
In our discussion with Jussi we come to the conclusion that we want
to verify that all Key attributes contain values in the expected types,
but at the same time we don't want to focus validating the semantics
behind them.
The reason is that having a Key instance with invalid attributes is
possible and supported by the spec.
That's why we have "threshold" for the roles meaning we can have up to
a certain number of invalid Keys until we satisfy
the required threshold.

Also, for deeper semantic validation it's better to be done in
securesystemslib which does the actual work with keys.

For context see: theupdateframework#1438

Signed-off-by: Martin Vrachev <[email protected]>
MVrachev added a commit to MVrachev/tuf that referenced this issue Jun 15, 2021
In our discussion with Jussi we come to the conclusion that we want
to verify that all Key attributes contain values in the expected types,
but at the same time, we don't want to focus on validating the semantics
behind them.
The reason is that having a Key instance with invalid attributes is
possible and supported by the spec.
That's why we have a "threshold" for the roles meaning we can have up to
a certain number of invalid Keys until we satisfy
the required threshold.

Also, for deeper semantic validation it's better to be done in
securesystemslib which does the actual work with keys.

For context see: theupdateframework#1438

Signed-off-by: Martin Vrachev <[email protected]>
@jku jku closed this as completed in #1449 Jun 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlog Issues to address with priority for current development goals discussion Discussions related to the design, implementation and operation of the project
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants