Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API : Add hash verification for timestamp/snapshot/targets #1361

Closed
jku opened this issue Apr 21, 2021 · 4 comments · Fixed by #1437
Closed

API : Add hash verification for timestamp/snapshot/targets #1361

jku opened this issue Apr 21, 2021 · 4 comments · Fixed by #1437
Assignees
Labels
backlog Issues to address with priority for current development goals enhancement
Milestone

Comments

@jku
Copy link
Member

jku commented Apr 21, 2021

Feature request: Would be nice to have hash verification builtin to the API: it's not a massive amount of code but a single call to a function with a descriptive name is a lot better than 6 lines of sslib-calling code...

I can see a couple of options:

Timestamp.verify_snapshot_hashes(data: bytes)
Snapshot.verify_hashes(role_name: str, data: bytes)

or, as alternative to both of those, with MetadataInfo Martin is building in #1329 :

MetadataInfo.verify_hashes(data: bytes)

Last one is probably the correct place.

I'm not sure what should happen on failure though: Exception probably makes sense but then we probably want to change Metadata.verify() to do that as well

@jku
Copy link
Member Author

jku commented Apr 26, 2021

Targets/TargetInfo (#1329) should support something similar but not necessarily the same function: with targets we should avoid loading the whole file so the function needs to accept a fileobject -- otherwise it's the same idea

So updating the examples: The high level objects option:

Timestamp.verify_snapshot_hashes(data: bytes)
Snapshot.verify_hashes(role_name: str, data: bytes)
Targets.verify_hashes(target_name, file_object: BinaryIO)

MetadataInfo and TargetInfo option:

MetadataInfo.verify_hashes(data: bytes)
TargetInfo.verify_hashes(file_object: BinaryIO)

I still like the latter I think.

Also: both could offer multiple methods for the different argument types, but at least the listed argument versions should exist

@jku jku changed the title API : Add hash verification for timestamp/snapshot API : Add hash verification for timestamp/snapshot/targets Apr 28, 2021
@jku
Copy link
Member Author

jku commented May 7, 2021

Status here is that #1329 needs to progress first (assuming we implement this functionality in MetadataInfo and TargetInfo).

Thinking about the API, are variable type arguments frowned upon? I would have objected to this before type hinting (and suggested duplicate methods) but I'm not so sure now:

MetadataInfo.verify_hashes(data: Union[bytes, BinaryIO])
TargetInfo.verify_hashes(data: Union[bytes, BinaryIO])

This keeps the api tidy and provides verification for both immutable byte arrays and binary streams (aka file like objects). Looks ok to me.
In the implementation side, recognising binary streams / file like objects is pretty impossible but luckily bytes are always bytes:

if isinstance(data, bytes):
    # use securesystemslib.hash.digest()
else:
    # use securesystemslib.hash.digest_fileobject()

@joshuagl
Copy link
Member

Agree that MetadataInfo and TargetInfo seem to be the right places. We should figure out what happens if MetadataInfo.verify_hashes() is called and the optional hashes are not in use? Is that an exception? I think so, because the caller is clearly expecting there to be hashes to verify and their absence is unexpected.

Thinking about the API, are variable type arguments frowned upon?

I would have objected to functions with the same name on different classes having different signatures, but the typing and Union makes this quite nice. In fact, this seems to be a common enough pattern in Python that the Union type was added to support it.
From PEP 484:

Since accepting a small, limited set of expected types for a single argument is common, there is a new special factory called Union.

@jku
Copy link
Member Author

jku commented May 20, 2021

I just realized that we may want to extend this to length checks as well:

TargetFile.verify_length_and_hashes(data: Union[bytes, BinaryIO]) # checks first length, then hashes

This is basically an optimization for the case where the file is the wrong length.

MetaFile is a more interesting case:

  • first of all neither length nor hashes are required (as Joshua pointed out). I don't think I want an exception if they are not present though: the client cannot choose whether they are there and metadata can be perfectly valid without hashes or length
  • second, there's also 'version' which I think Metafile should not verify itself (because that requires deserializing the data)

So good docs are needed at least to make this all clear

@jku jku added the backlog Issues to address with priority for current development goals label May 26, 2021
@sechkova sechkova added this to the weeks22-23 milestone May 26, 2021
@sechkova sechkova self-assigned this May 26, 2021
@sechkova sechkova modified the milestones: weeks22-23, weeks24-25 Jun 9, 2021
@jku jku closed this as completed in #1437 Jun 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlog Issues to address with priority for current development goals enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants