Generic Payload Parser for DSSE #3

Closed

Conversation

PradyumnaKrishna

Fixes: #

Description of the changes being introduced by the pull request:

The payload is the byte sequence of the serialized body, stored in the Envelope,
and needs to be parsed according to the payload_type given in the Envelope.

A generic Parser is added so that specific parsers can be derived from it
according to the requirements of in-toto/tuf.

As an example, a JSONParser is added that parses serialized JSON
payloads and returns them as a dict. It will be used to test the
payload parsing capabilities of the generic Parser.

Please verify and check that the pull request fulfils the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Author

@PradyumnaKrishna PradyumnaKrishna left a comment

@adityasaky, take a look at this generic payload parser for DSSE, which could be used in in-toto when we transition between wrappers.

What's your opinion on this type of parser?

@@ -97,3 +99,79 @@ def pae(self) -> bytes:
            len(self.payload),
            self.payload,
        )


class Parser:
Author

I developed this generic payload parser, which can be subclassed according to tuf/in-toto requirements.
The parser has two methods: check_type, to check the payload_type, and parse, to parse the payload.
Similarly, a serialize method can be added to this parser to generate the Envelope by serializing the payload.
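
For illustration, here is a minimal sketch of the shape described here, not the code in this diff. The Envelope below is a reduced stand-in without signatures, and the payload type string and the use of classmethods are assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Envelope:
    """Hypothetical stand-in for the DSSE Envelope (signatures omitted)."""

    payload: bytes
    payload_type: str


class Parser:
    """Generic payload parser; subclasses declare the types they support."""

    _supported_payload_types: List[str] = []

    @classmethod
    def check_type(cls, payload_type: str) -> None:
        """Raise ValueError if payload_type is not supported by this parser."""
        if payload_type not in cls._supported_payload_types:
            raise ValueError(f"unsupported payload type: {payload_type!r}")

    @classmethod
    def parse(cls, envelope: Envelope) -> Any:
        """Turn envelope.payload into an object; subclasses implement this."""
        raise NotImplementedError


class JSONParser(Parser):
    """Example subclass that parses a JSON payload into a dict."""

    # Illustrative application-specific type, chosen only for this sketch.
    _supported_payload_types = ["application/vnd.example+json"]

    @classmethod
    def parse(cls, envelope: Envelope) -> Dict[str, Any]:
        # Reject envelopes whose payload type this parser does not handle.
        cls.check_type(envelope.payload_type)
        return json.loads(envelope.payload.decode("utf-8"))
```

A caller would build an Envelope and hand it to JSONParser.parse; an envelope with any other payload type raises ValueError before the payload is decoded.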

@adityasaky adityasaky self-requested a review July 3, 2022 13:58
Member

@adityasaky adityasaky left a comment

I like the idea of a parser of this sort. We can use them in in-toto at least. 😄

class JSONParser(Parser):
    """A JSON/dict Parser for DSSE Envelope."""

    _supported_payload_types: List[str] = ["json"]
Member

IIRC this isn't a legal payload type as per the DSSE spec.

Member

https://github.com/secure-systems-lab/dsse/blob/master/protocol.md

PAYLOAD_TYPE: Opaque, case-sensitive string that uniquely and unambiguously identifies how to interpret payload. This includes both the encoding (JSON, CBOR, etc.) as well as the meaning/schema. To prevent collisions, the value SHOULD be either:

    [Media Type](https://www.iana.org/assignments/media-types/), a.k.a. MIME type or Content Type
        Example: application/vnd.in-toto+json.
        IMPORTANT: This SHOULD be an application-specific type describing both encoding and schema, NOT a generic type like application/json. The problem with generic types is that two different applications could use the same encoding (e.g. JSON) but interpret the payload differently.
        SHOULD be lowercase.
    [URI](https://tools.ietf.org/html/rfc3986)
        Example: https://example.com/MyMessage/v1-json.
        SHOULD resolve to a human-readable description but MAY be unresolvable.
        SHOULD be case-normalized (section 6.2.2.1)
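
To make the recommendation concrete, here is a rough heuristic check. The function name, the short list of generic types, and the simplified media type pattern are all assumptions for this sketch. The spec words these rules as SHOULD, so a False result means "not following the advice" rather than "invalid".

```python
import re
from urllib.parse import urlparse

# Generic media types that describe only an encoding, not a schema.
# This short list is an assumption for illustration, not taken from the spec.
GENERIC_MEDIA_TYPES = {"application/json", "application/cbor", "text/plain"}

# Loose lowercase "type/subtype" pattern; the real media type grammar is richer.
_MEDIA_TYPE_RE = re.compile(
    r"^[a-z0-9][a-z0-9!#$&^_.+-]*/[a-z0-9][a-z0-9!#$&^_.+-]*$"
)


def follows_payload_type_advice(payload_type: str) -> bool:
    """Heuristically check a payload type against the quoted recommendations."""
    if _MEDIA_TYPE_RE.match(payload_type):
        # A media type should be application-specific, not a generic encoding.
        return payload_type not in GENERIC_MEDIA_TYPES
    # Otherwise accept anything that at least has a URI scheme.
    return bool(urlparse(payload_type).scheme)


for candidate in (
    "application/vnd.in-toto+json",
    "https://example.com/MyMessage/v1-json",
    "application/json",
    "json",
):
    print(candidate, follows_payload_type_advice(candidate))
```

Under this heuristic the two examples from the spec pass, while "application/json" and the bare string "json" do not.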

@PradyumnaKrishna
Author

@lukpueh please review it.

The payload is the byte sequence of the serialized body, stored in the Envelope,
and needs to be parsed according to the payload_type given in the Envelope.

A generic Parser is added so that specific parsers can be derived from it
according to the requirements of in-toto/tuf.

As an example, a JSONParser is added that parses serialized JSON
payloads and returns them as a dict. It will be used to test the
payload parsing capabilities of the generic Parser.

Signed-off-by: Pradyumna Krishna <[email protected]>
@lukpueh
Member

lukpueh commented Jul 11, 2022

Solid work, @PradyumnaKrishna! I'm curious to see how this will be used downstream, e.g. to parse in-toto links/layouts. How do you want to proceed with this PR? Do you also plan to provide a generic signature wrapper parser that can handle both DSSE envelopes and traditional in-toto/tuf envelopes, as described in ITE-5?

On a related side-note, may I suggest you take a look at theupdateframework/python-tuf#1279 for reference? It implements a de/serialization subpackage for the traditional metadata wrapper, used in python-tuf's Metadata API. Maybe there is some inspiration in it (e.g. tuf@3d8cade4 lists some thoughts about naming).

@PradyumnaKrishna
Author

How do you want to proceed with this PR?

A payload parser for in-toto will be created that deserializes the payload into a Link|Layout object. Similarly, a serializer will be created that serializes a Link|Layout into either a DSSE Envelope or a payload.

Here is some pseudocode for in-toto:

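from typing import List, Union

# Parser, Envelope, Link and Layout are assumed to be defined elsewhere.


class DSSE(Parser):
    _supported_payload_types: List[str] = ["application/vnd.in-toto+json"]

    def deserialize(self, envelope: Envelope) -> Union[Link, Layout]:
        # check payload type using check_type
        # decode the payload
        # identify the type (Link or Layout)
        # construct and return Link or Layout
        raise NotImplementedError  # pseudocode: body still to be written

    def serialize(self, obj: Union[Link, Layout]) -> Envelope:
        # convert the Link or Layout into an object of a supported payload type
        # encode that object to payload
        # construct and return the envelope
        raise NotImplementedError  # pseudocode: body still to be written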

This class can be split into two classes if required, and we can decide the naming later, since this is only a draft that could be implemented in DSSE and in-toto. You are welcome to suggest edits to this draft.
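
A runnable version of the pseudocode above might look like the following sketch. Link, Layout and Envelope are tiny hypothetical stand-ins, not the real in-toto or DSSE classes, and using a "_type" field in the JSON body to tell links from layouts is an assumption of the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Union

PAYLOAD_TYPE = "application/vnd.in-toto+json"


@dataclass
class Envelope:
    """Hypothetical stand-in for the DSSE Envelope (signatures omitted)."""

    payload: bytes
    payload_type: str


@dataclass
class Link:
    """Heavily reduced stand-in for an in-toto link."""

    name: str


@dataclass
class Layout:
    """Heavily reduced stand-in for an in-toto layout."""

    expires: str


# Maps the "_type" value in the payload to the class to construct.
_METADATA_CLASSES = {"link": Link, "layout": Layout}


def deserialize(envelope: Envelope) -> Union[Link, Layout]:
    # Check the payload type before touching the payload.
    if envelope.payload_type != PAYLOAD_TYPE:
        raise ValueError(f"unsupported payload type: {envelope.payload_type!r}")
    # Decode the payload, then identify and construct the metadata object.
    body: Dict[str, Any] = json.loads(envelope.payload.decode("utf-8"))
    type_name = body.pop("_type", None)
    if type_name not in _METADATA_CLASSES:
        raise ValueError(f"unknown metadata type: {type_name!r}")
    return _METADATA_CLASSES[type_name](**body)


def serialize(obj: Union[Link, Layout]) -> Envelope:
    # Convert the object to a dict, tag it with its type, and encode it.
    body = {"_type": type(obj).__name__.lower(), **asdict(obj)}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return Envelope(payload=payload, payload_type=PAYLOAD_TYPE)
```

Here serialize and deserialize are plain functions rather than methods, which sidesteps the question of whether they belong in one class or two.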

Do you also plan to provide a generic signature wrapper parser that can handle both DSSE envelopes and traditional in-toto/tuf envelopes, as described in ITE-5?

No. The current in-toto envelope doesn't have a payload; the metadata stores the Link/Layout as JSON in files, so there is nothing to parse. It therefore doesn't make sense to use this kind of parser to handle both wrappers.
Also, once the transition period is complete, in-toto will no longer use the old signature wrapper.

On a related side-note, may I suggest you take a look at theupdateframework/python-tuf#1279 for reference? It implements a de/serialization subpackage for the traditional metadata wrapper, used in python-tuf's Metadata API. Maybe there is some inspiration in it (e.g. tuf@3d8cade4 lists some thoughts about naming).

Thank you for this information. We can decide on the naming in the next weekly meeting.

@PradyumnaKrishna
Author

The aim of this PR was to provide a common parser base class. Integrating DSSE into in-toto requires a converter that converts payload -> Link|Layout.
This parser is not a requirement: we can create our own payload parsers independently, based on the requirements of in-toto and tuf.

Are we going to continue with this, or drop the idea and start creating a parser in in-toto itself?

@lukpueh
Member

lukpueh commented Aug 2, 2022

The current in-toto envelope doesn't have a payload; the metadata stores the Link/Layout as JSON in files, so there is nothing to parse.

I would argue that taking the signed contents from the current envelope and creating a Link or Layout object from them can also be called "parsing the payload". But yes, I'm not sure it makes sense to use the same parser infrastructure for the traditional envelope and the DSSE envelope.

That said, given the way the DSSE Envelope is implemented (see its {from, to}_dict methods), we could use the same JSONSerializer and JSONDeserializer that we use for the traditional tuf metadata (ignoring type hints).
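
As a rough sketch of that idea: a serializer pair that relies only on to_dict and from_dict works for any class offering those two methods. The classes below are simplified stand-ins written for this example, not python-tuf's JSONSerializer/JSONDeserializer, and the reduced Envelope (no signatures) and its dict keys are assumptions.

```python
import base64
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Envelope:
    """Hypothetical stand-in for the DSSE Envelope (signatures omitted)."""

    payload: bytes
    payload_type: str

    def to_dict(self) -> Dict[str, Any]:
        # Bytes are not JSON-serializable, so the payload is base64-encoded.
        return {
            "payload": base64.b64encode(self.payload).decode("utf-8"),
            "payloadType": self.payload_type,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Envelope":
        return cls(
            payload=base64.b64decode(data["payload"]),
            payload_type=data["payloadType"],
        )


class JSONSerializer:
    """Serializes any object that provides to_dict()."""

    def serialize(self, obj: Any) -> bytes:
        return json.dumps(obj.to_dict(), sort_keys=True).encode("utf-8")


class JSONDeserializer:
    """Deserializes bytes into an instance of the class given at construction."""

    def __init__(self, cls: Any) -> None:
        self._cls = cls

    def deserialize(self, raw: bytes) -> Any:
        return self._cls.from_dict(json.loads(raw.decode("utf-8")))
```

Because neither class names Envelope directly, the same pair could in principle be pointed at a traditional metadata class that offers the same two methods.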

@lukpueh
Member

lukpueh commented Aug 2, 2022

Are we going to continue with this, or drop the idea and start creating a parser in in-toto itself?

I suggest we explore whether we can use a common serialization infrastructure for both traditional and DSSE envelopes and payloads in in-toto and tuf, and then decide how to continue with this PR. In the meantime, we can leave it here as a draft.

@lukpueh
Member

lukpueh commented Aug 25, 2022

Closing in favor of #9.

@lukpueh lukpueh closed this Aug 25, 2022
3 participants