Metadata Attribute research: KEY #1438

MVrachev · 2021-06-07T15:47:58Z

This issue aims to document thoughts about the KEY attribute from the Root and Delegations classes.
The goal is to understand how we use that attribute, what might go wrong with it and how we can validate it.

When thinking about how we use KEY worth reading what is implemented in pr #1408 and pr #1436 implementing threshold validation.

We want to answer/address the following 6 questions/points based on my comment here:

How do we use it?
What might go wrong?
Rethink how we store it?
If, after the changes in step 3 there are concerns, then add a validation function
Make the necessary changes for points 1 - 4, think and propose a way to integrate it into the class/classes (decorators, descriptors, etc.)
create a pr for all additional changes on top of the validation functions

PS: The 7-th point is covered by documenting this issue.

The text was updated successfully, but these errors were encountered:

MVrachev · 2021-06-08T10:33:34Z

Should I add thoughts about keyid here considering in #1436 we are adding it to the Key class?

MVrachev · 2021-06-08T11:28:42Z

How do we use it?

During Root and Delegations initialization.
When passing a Key argument in Root.add_key().

What might go wrong?

If somebody passes a Key without the required fields it will just fail during initialization when Key.from_dict() is called. So, no problem here.
If somebody passes invalid keytype, scheme or keyval -> those attributes are validated through securesystemslib.schemas.
Currently, in the MetadataBundle we are using keys when calling verify_with_threshold and there we actually call securesystemslib.keys.format_metadata_to_key() which checks the formats of all of the three keyval, keytype and keytype.
The situation is the same we call any function from securesystemslib.keys accepting key_dict as an argument.
Duplicating keys Metadata API: (maybe) Enforce public key content uniqueness in keys #1429 for context-> verifying that keys are uniques can be done during verification Metadata API: Implement threshold verification #1436 or in Root and Targets.
Do we have to worry if somebody passes a wrong Key class instance when calling Root.add_key()? Can this be used to validate wrong metadata format? -> If wrong key is passed, then when they are using during verification (Metadata API: Implement threshold verification #1436) the key itself will be verified.

Rethink how we store it?

This depends if we want to add any type of validation. It's possible that the securesystemslib validation is enough in this case.
Do we want to validate keys uniqueness in Root and Targets?

If, after the changes in step 3 there are concerns, then add a validation function

None for now

Make the necessary changes for points 1 - 4, think and propose a way to integrate it into the class/classes (decorators, descriptors, etc.)

None for now

jku · 2021-06-08T12:52:10Z

If somebody passes invalid keytype, scheme or keyval -> those attributes are validated through securesystemslib.schemas.

When does this happen? I think we do want some validation at Key init time

MVrachev · 2021-06-08T14:42:15Z

The question is do we trust securesystemslib for validation?
Right now, each of the times we do operations with keys we end up calling a securesystemslib function which makes argument validation.
The problem is that tomorrow this behavior could change and we can end up with no validation at all.

I agree that we should add some validation in __init__ time.

I think in issue Metadata API: (maybe) Enforce public key content uniqueness in keys #1429 we agreed we would want to throw an exception if we have duplicate keys. This check should be done in Root and Targets.
Do we want to validate the values of keytype, scheme and keyval?
a) In securesystemslib keytype is one of rsa, ed25519, ecdsa, ecdsa-sha2-nistp256 or ecdsa-sha2-nistp384.
do we want to hardcode those values somewhere in securesystemslib and use them?
b) schema in the spec is defined as a string, maybe we can at least check that?
c) keyval["public"] is defined as a string in PEM format. Can we do any checks there?

PS: @jku do you think we should discuss keyid here considering you are adding it to the Key class in #1436?

MVrachev · 2021-06-09T12:46:53Z

Now that keyid is officially part of the Key class we should think about it as well.

jku · 2021-06-10T06:38:57Z

My recollection is that the real validity checks in SSLib are hidden inside verify_signature() and other functions like that, so we can't really use them. We could run things like sslib_formats.KEYTYPE_SCHEMA.check_match(keytype) (which would at least check for the possible values) but my understanding is that there's no way to check that e.g. the schema is valid for the keytype without more code. But do have a look, if you find a way to make SSLib do the validation that would be good.

keyval["public"] is defined as a string in PEM format

I don't think this is true for all keytypes. I believe we should consider this just a string (content is internal implementation detail of the specific key).

So maybe

use sslib schema validation iff it brings actual value and is simple to do (keytype looks like one?)
otherwise I guess we do want to check if the variable type is str? I'm not fond of type checking but I guess it's better than pushing random objects as arguments to library calls we make...

As in most other cases, I think we're primarily interested in the deserialization case... Making sure the validation happens on init() should make this useful for new created keys as well (we don't have code that does that yet but that's what I imagine at least).

MVrachev · 2021-06-10T14:32:59Z

I experimented with a solution testing all of the attributes reusing the securesystemslib schemas.
Here what I got:
MVrachev@e7cdab0

jku · 2021-06-10T15:44:15Z

Yeah I think that reinforces my opinion:

the moment we start writing constants like "rsa" or "ecdsa-sha2-nistp256" in TUF code we're doing something wrong, we shouldn't have to care about details like that (as we should not even know all valid values).
In fact I wonder if we want to use even KEYTYPE_SCHEMA at deserialization time: nothing in the spec promises that securesystemslib can recognise the key type that a specific key uses -- maybe we only want to fail if that key is actually used for something. I'm thinking of a situation where some key uses a new type/scheme that our older client does not know about but we don't actually end up using that key (and the keys we do use are of type that we do understand) -- this is maybe a contrived example but I think according to spec we should not actually assume we know about the keytypes/schemes

MVrachev · 2021-06-10T16:41:42Z

As I understand it, the source of knowledge about supported keys and schemas is securesystemslib.
The best we can do is make sure securesystemslib continues to be the source of truth for supported keytypes and schemas.
What we can do is probably provide a utility function in securesystemslib which for a given keytype calls the
correct schema check.
That way adding/removing new keys in securesystemslib won't require any new code in TUF and we would have validation in Key initialization time.

jku · 2021-06-14T06:40:36Z

I've not responded since I wasn't quite sure why this felt wrong, now I think I've got it: Metadata validity and our implementations ability to sign or verify signatures are two different things. For a client or even a repository implementation it might be reasonable to consider both instant failures but I do not think that applies to Metadata API.

In other words, I think it is wrong for Metadata API to fail de/serialization of data just because the data contains an algorithm that our securesystemslib does not recognize: New/unknown algorithms do not make the metadata invalid in any way: they just mean we can't currently verify some specific signatures or hashes in it.

MVrachev · 2021-06-14T15:37:00Z

We had a discussion with @jku and agreed that we shouldn't focus on checking if keytype, scheme, keyval, or keyid have valid semantics, but rather check that all of those are with the expected types and we can compare them the way
we expect.

The reasons are that:

we could have a metadata file with one or a couple of keys with unsupported or invalid values for their attributes, but that doesn't mean we have to stop instantiating the Key class.
When verifying with threshold with keys it's allowed to have multiple invalid keys until the required threshold is achieved.
it's better to throw an exception during verification that certain keys have unsupported algorithms instead of doing that when nobody is expecting - during initialization.

In order to consider this issue as do I see three tasks:

Add basic validation that all of the Key fields are instances of the expected types.
Add a simple comment that Key is not validated semantically during initialization.
Make sure we have a test verifying that a list of keys containing an element with an unsupported algorithm can
be verified successfully if it threshold of keys is satisfied.

In our discussion with Jussi we come to the conclusion that we want to verify that all Key attributes contain values in the expected types, but at the same time we don't want to focus validating the semantics behind them. The reason is that having a Key instance with invalid attributes is possible and supported by the spec. That's why we have "threshold" for the roles meaning we can have up to a certain number of invalid Keys until we satisfy the required threshold. Also, for deeper semantic validation it's better to be done in securesystemslib which does the actual work with keys. For context see: theupdateframework#1438 Signed-off-by: Martin Vrachev <[email protected]>

In our discussion with Jussi we come to the conclusion that we want to verify that all Key attributes contain values in the expected types, but at the same time, we don't want to focus on validating the semantics behind them. The reason is that having a Key instance with invalid attributes is possible and supported by the spec. That's why we have a "threshold" for the roles meaning we can have up to a certain number of invalid Keys until we satisfy the required threshold. Also, for deeper semantic validation it's better to be done in securesystemslib which does the actual work with keys. For context see: theupdateframework#1438 Signed-off-by: Martin Vrachev <[email protected]>

MVrachev added the backlog Issues to address with priority for current development goals label Jun 7, 2021

MVrachev added the discussion Discussions related to the design, implementation and operation of the project label Jun 9, 2021

sechkova added this to the weeks24-25 milestone Jun 9, 2021

sechkova assigned MVrachev Jun 9, 2021

jku mentioned this issue Jun 14, 2021

api/metadata input validation: hashes #1441

Closed

This was referenced Jun 15, 2021

Metadata API: Add Key attributes types validation #1449

Merged

Metadata API: Implement threshold verification #1436

Merged

jku mentioned this issue Jun 16, 2021

Add hash and length verification to MetaFile and TargetFile #1437

Merged

3 tasks

jku closed this as completed in #1449 Jun 16, 2021

MVrachev mentioned this issue Nov 17, 2021

Add input validation to simple metadata api #1140

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Metadata Attribute research: KEY #1438

Metadata Attribute research: KEY #1438

MVrachev commented Jun 7, 2021 •

edited

Loading

MVrachev commented Jun 8, 2021 •

edited

Loading

MVrachev commented Jun 8, 2021

jku commented Jun 8, 2021

MVrachev commented Jun 8, 2021

MVrachev commented Jun 9, 2021

jku commented Jun 10, 2021 •

edited

Loading

MVrachev commented Jun 10, 2021

jku commented Jun 10, 2021

MVrachev commented Jun 10, 2021 •

edited

Loading

jku commented Jun 14, 2021 •

edited

Loading

MVrachev commented Jun 14, 2021 •

edited

Loading

Metadata Attribute research: KEY #1438

Metadata Attribute research: KEY #1438

Comments

MVrachev commented Jun 7, 2021 • edited Loading

MVrachev commented Jun 8, 2021 • edited Loading

MVrachev commented Jun 8, 2021

jku commented Jun 8, 2021

MVrachev commented Jun 8, 2021

MVrachev commented Jun 9, 2021

jku commented Jun 10, 2021 • edited Loading

MVrachev commented Jun 10, 2021

jku commented Jun 10, 2021

MVrachev commented Jun 10, 2021 • edited Loading

jku commented Jun 14, 2021 • edited Loading

MVrachev commented Jun 14, 2021 • edited Loading

MVrachev commented Jun 7, 2021 •

edited

Loading

MVrachev commented Jun 8, 2021 •

edited

Loading

jku commented Jun 10, 2021 •

edited

Loading

MVrachev commented Jun 10, 2021 •

edited

Loading

jku commented Jun 14, 2021 •

edited

Loading

MVrachev commented Jun 14, 2021 •

edited

Loading