Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise schema and formats facility #183

Closed
lukpueh opened this issue Sep 19, 2019 · 4 comments · Fixed by #776
Closed

Revise schema and formats facility #183

lukpueh opened this issue Sep 19, 2019 · 4 comments · Fixed by #776
Labels
legacy Issues related to legacy interfaces (obsolete with #731)

Comments

@lukpueh
Copy link
Member

lukpueh commented Sep 19, 2019

Description of issue or feature request:
Here are several loosely ordered observations in regards to securesystemslib's schema/formats facility:

Current behavior:

  • Some schemas sound more specific than they are, e.g. RELPATH_SCHEMA == PATH_SCHEMA == AnyString. This is misleading, when assessing a function's capability of sanitizing inputs.

  • Some schemas are an odd replacement for constants, e.g. schemas that define hard-coded strings, like "rsassa-pss-sha256", and are then used to check the same strings, which have been hardcoded into function default arguments.

  • Schema validation seems generally overused. In TUF nearly every function runs check_match for each argument. IMHO argument sanitizing is mostly important in public-facing interfaces. Programming errors, on the other, should be caught through extensive testing and code review.
    Using it everywhere makes the code bloated, also because there are a lot of generic comments describing the checks, and the obligatory and also quite generic FormatError, if <arg> is not properly formatted- blocks in the <Exceptions> section of docstrings.

  • Schema checking sometimes makes execution branches unreachable (also see Unreachable else best practice code-style-guidelines#18):

    X_OR_Y_SCHEMA.check_match(arg)
    
    if arg == X:
      # ...
    elif arg == Y:
      # ...
    else:
      raise WillNeverBeRaisedError()

    And I've even seen cases where WillNeverBeRaisedError is also listed in the docstring as Exception that might be raised (documentation rot)

  • The error messages from checking schemas with the check_match method are often not helpful, because they usually don't show the value of the checked variable or lack context.

Expected behavior:

  • Review existing schemas for their validity/strictness, especially when chained.

  • Only use in public facing interfaces (open for discussion).
    This would also require a clearer definition of what functions should be public interfaces, which btw. TUF/in-toto integrators would greatly benefit from.

  • At least don't blindly add <arg>.<SCHEMA>.check_match to every function. Coordinate with the rest of the function, and its purpose.

  • Make sure the error message is helpful, e.g. by:

    • supporting an error message override argument in check_match, or by
    • using matches plus raise FormatError (or maybe even just ValueError) with a custom message instead of check_match.
  • Disambiguate schemas and constants.

@lukpueh
Copy link
Member Author

lukpueh commented Dec 16, 2019

  • The error messages from checking schemas with the check_match method are often not helpful, because they usually don't show the value of the checked variable or lack context.

Similar observations were made in theupdateframework/python-tuf#243

@lukpueh
Copy link
Member Author

lukpueh commented Mar 10, 2020

The many check_match-calls also have an impact on performance. E.g. when delegating 16K hashed bins for warehouse, >5% of the execution time was just check_match-ing.

lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 2, 2020
Note: This commit tries to blend in with the current sslib design.

In future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
- re-think "key type" vs. "signature scheme"

This commit also removes the obsolete REQUIRED_LIBRARIES_SCHEMA,
which only used to be used in the long-gone check_crypto_libraries
tuf function.
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 3, 2020
Note: This commit tries to blend in with the current sslib design.

In future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
- re-think "key type" vs. "signature scheme"

This commit also removes the obsolete REQUIRED_LIBRARIES_SCHEMA,
which only used to be used in the long-gone check_crypto_libraries
tuf function.
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 6, 2020
Support nistp384 in addition to nistp256 in ECDSA signing scheme
and key formats, required for PEP458.

This commit also removes the related but obsolete
REQUIRED_LIBRARIES_SCHEMA, which was only used in the
long-gone check_crypto_libraries tuf function.

Note: This commit tries to blend in with the current sslib design.
I future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
- re-think "key type" vs. "signature scheme"
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 6, 2020
This commit adds support for verifying ecdsa signature on the
nistp384 curve with sha384 digests to the internal ecdsa_keys
module.

It does so by adding module global helper dictionary to map schemes
to hash algorithms.

Note: This commit tries to blend in with the current sslib design.
In future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 6, 2020
Support nistp384 in addition to nistp256 in the
public keys.verify_signature interface.

Note: This commit tries to blend in with the current sslib design.
In future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
- re-think "key type" vs. "signature scheme"
woodruffw pushed a commit to trail-of-forks/securesystemslib that referenced this issue Jun 18, 2020
Support nistp384 in addition to nistp256 in ECDSA signing scheme
and key formats, required for PEP458.

This commit also removes the related but obsolete
REQUIRED_LIBRARIES_SCHEMA, which was only used in the
long-gone check_crypto_libraries tuf function.

Note: This commit tries to blend in with the current sslib design.
I future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
- re-think "key type" vs. "signature scheme"
woodruffw pushed a commit to trail-of-forks/securesystemslib that referenced this issue Jun 18, 2020
This commit adds support for verifying ecdsa signature on the
nistp384 curve with sha384 digests to the internal ecdsa_keys
module.

It does so by adding module global helper dictionary to map schemes
to hash algorithms.

Note: This commit tries to blend in with the current sslib design.
In future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
woodruffw pushed a commit to trail-of-forks/securesystemslib that referenced this issue Jun 18, 2020
Support nistp384 in addition to nistp256 in the
public keys.verify_signature interface.

Note: This commit tries to blend in with the current sslib design.
In future work we should:
- define securesystemslib-wide constants instead of hardcoding
  strings over and over again (see item 3 in secure-systems-lab#183)
- re-think "key type" vs. "signature scheme"
@lukpueh
Copy link
Member Author

lukpueh commented Jul 6, 2020

Just discovered another documented, albeit (to me) unexpected behavior:

  • Unrecognized keys in an object schema are allowed
>>> from securesystemslib import schema
>>> FOO_SCHEMA = schema.Object("FOO_SCHEMA", foo=schema.AnyString())
>>> FOO_SCHEMA.matches({"foo": "ABC", "bar": "interesting..."}) # Would expect to return False due to "bar"
True

@lukpueh
Copy link
Member Author

lukpueh commented Oct 2, 2020

While testing a format errors on different Python versions, I just discovered another issue with the Object(Schema):

Given that in Python <3.7 the dict key order is not guaranteed, iterating over the "required" fields ...

self._required = list(required.items())

... of a schema object in check_match may yield different error messages, if multiple required fields are missing.

for key, schema in self._required:
# Check if 'object' has all the required dict keys. If not one of the
# required keys, check if it is an Optional().
try:
item = object[key]
except KeyError:
# If not an Optional schema, raise an exception.
if not isinstance(schema, Optional):
message = 'Missing key ' + repr(key) + ' in ' + repr(self._object_name)

@lukpueh lukpueh added the legacy Issues related to legacy interfaces (obsolete with #731) label Mar 14, 2024
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 10, 2024
The util.py module provides a mix of minimal functions related to local files
and directories. These functions are mostly unused, exceptions are
(according to sourcegraph.com):

- persist_temp_file in python-tuf (cc @jku)
- get_file_hashes in scai-demos (cc @marcelamelara, @adityasaky)
- get_file_details, load_json_file in taf (cc @renatav, @adityasaky)

It seems feasible to reimplement the desired functionality downstream.

The benefits of removing util.py are:

- minimizes the API surface
- removes one dependent of the about to be removed schema.py (secure-systems-lab#183)
- makes unittest_toolbox obsolete

Ping python-tuf (will need to port one function) and scai-demo

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 10, 2024
The util.py module provides a mix of minimal functions related to local files
and directories. These functions are mostly unused, exceptions are
(according to sourcegraph.com):

- persist_temp_file in python-tuf (cc @jku)
- get_file_hashes in scai-demos (cc @marcelamelara, @adityasaky)
- get_file_details, load_json_file in taf (cc @renatav, @adityasaky)

It seems feasible to reimplement the desired functionality downstream.

The benefits of removing util.py are:

- minimizes the API surface
- removes one dependent of the about to be removed schema.py (secure-systems-lab#183)
- makes unittest_toolbox obsolete

Ping python-tuf (will need to port one function) and scai-demo

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 10, 2024
The util.py module provides a mix of minimal functions related to local files
and directories. These functions are mostly unused, exceptions are
(according to sourcegraph.com):

- persist_temp_file in python-tuf (cc @jku)
- get_file_hashes in scai-demos (cc @marcelamelara, @adityasaky)
- get_file_details, load_json_file in taf (cc @renatav, @adityasaky)

It seems feasible to reimplement the desired functionality downstream.

The benefits of removing util.py are:

- minimizes the API surface
- removes one dependent of the about to be removed schema.py (secure-systems-lab#183)
- makes unittest_toolbox obsolete

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 10, 2024
In prepartion for the removal of schema.py (secure-systems-lab#183), this patch removes
schema checks of function arguments in hash.py and one entire function.

The removed checks are obfuscated "is string" checks, and without them
invalid args are still caught in the `digest` function, where they all
end up and raise a more meaningful UnsupportedLibraryError or
UnsupportedAlgorithmError if invalid.

The removed function `digest_from_rsa_scheme` doesn't seem to be used
anywhere (according to sourcegraph.com) not even in
securesystemslib.signer, where the same functionality is replicated
several times (see secure-systems-lab#594). Removing it here allows to ignore a
slightly more complex schema check.

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 10, 2024
In preparation for the removal of schema.py (secure-systems-lab#183), this patch removes
schema checks in the following modules of the `securesystemslib.gpg`
subpackage:

* internal modules `rsa`, `dsa`, `eddsa`, `common`. These checks are
  redundant with schema checks that are already performed in the calling
  functions of the `functions` module.

* in previously public `functions` module: * keyid in `create_signature`
  and `export_pubkey` functions * public key and signature dict in
  `verify_signature` function

This is okay for two reasons:

1. the preferred way of interacting with
   `securesystemslib.gpg.functions` is via `GPGSigner`, which controls
   the format of the passed arguments to some extent

2. securesystemslib.gpg still raises meaningful and even more consistent
   errors for invalid arguments anyway, than it did before. E.g. a keyid
   passed to `export_pubkey` that doesn't conform to the previously
   checked hex schema, now raises a `KeyNotFoundError`.

Other changes include:

* move string literal `GPG_HASH_ALGORITHM_STRING` from
  `securesystemslib.schema` to the better suited
  `secureystemslib.gpg.constants` module.
* remove mentions of schema definitions in docstrings
* adopt changes in tests

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 10, 2024
The usage of the securesystemslib schema/formats facility has been
little by little phased out and is now nowhere used anymore. Several
reasons to not use it are listed in secure-systems-lab#183.

This patch removes schema.py along with all securesystemslib schema
definitions in formats.py.

Additionally, it removes unrelated and unused helper functions in
formats.py.

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 15, 2024
The util.py module provides a mix of minimal functions related to local files
and directories. These functions are mostly unused, exceptions are
(according to sourcegraph.com):

- persist_temp_file in python-tuf (cc @jku)
- get_file_hashes in scai-demos (cc @marcelamelara, @adityasaky)
- get_file_details, load_json_file in taf (cc @renatav, @adityasaky)

It seems feasible to reimplement the desired functionality downstream.

The benefits of removing util.py are:

- minimizes the API surface
- removes one dependent of the about to be removed schema.py (secure-systems-lab#183)
- makes unittest_toolbox obsolete

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 15, 2024
In prepartion for the removal of schema.py (secure-systems-lab#183), this patch removes
schema checks of function arguments in hash.py and one entire function.

The removed checks are obfuscated "is string" checks, and without them
invalid args are still caught in the `digest` function, where they all
end up and raise a more meaningful UnsupportedLibraryError or
UnsupportedAlgorithmError if invalid.

The removed function `digest_from_rsa_scheme` doesn't seem to be used
anywhere (according to sourcegraph.com) not even in
securesystemslib.signer, where the same functionality is replicated
several times (see secure-systems-lab#594). Removing it here allows to ignore a
slightly more complex schema check.

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 15, 2024
In preparation for the removal of schema.py (secure-systems-lab#183), this patch removes
schema checks in the following modules of the `securesystemslib.gpg`
subpackage:

* internal modules `rsa`, `dsa`, `eddsa`, `common`. These checks are
  redundant with schema checks that are already performed in the calling
  functions of the `functions` module.

* in previously public `functions` module: * keyid in `create_signature`
  and `export_pubkey` functions * public key and signature dict in
  `verify_signature` function

This is okay for two reasons:

1. the preferred way of interacting with
   `securesystemslib.gpg.functions` is via `GPGSigner`, which controls
   the format of the passed arguments to some extent

2. securesystemslib.gpg still raises meaningful and even more consistent
   errors for invalid arguments anyway, than it did before. E.g. a keyid
   passed to `export_pubkey` that doesn't conform to the previously
   checked hex schema, now raises a `KeyNotFoundError`.

Other changes include:

* move string literal `GPG_HASH_ALGORITHM_STRING` from
  `securesystemslib.schema` to the better suited
  `secureystemslib.gpg.constants` module.
* remove mentions of schema definitions in docstrings
* adopt changes in tests

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Apr 16, 2024
The usage of the securesystemslib schema/formats facility has been
little by little phased out and is now nowhere used anymore. Several
reasons to not use it are listed in secure-systems-lab#183.

This patch removes schema.py along with all securesystemslib schema
definitions in formats.py.

Additionally, it removes unrelated and unused helper functions in
formats.py.

Signed-off-by: Lukas Puehringer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
legacy Issues related to legacy interfaces (obsolete with #731)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant