From 5c29e1e2de48c35edfccf6b0af5d10ba42bed22d Mon Sep 17 00:00:00 2001
From: Martin Vrachev <mvrachev@vmware.com>
Date: Wed, 10 Mar 2021 17:10:30 +0200
Subject: [PATCH] ADR0007: How we are going to validate the new API codebase

Add a decision record about the different options which we can use
to validate the new code under tuf.api.

TODO: Make decision.

Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
---
 docs/adr/0007-validation-guideliness.md | 142 ++++++++++++++++++++++++
 docs/adr/index.md                       |   1 +
 2 files changed, 143 insertions(+)
 create mode 100644 docs/adr/0007-validation-guideliness.md

diff --git a/docs/adr/0007-validation-guideliness.md b/docs/adr/0007-validation-guideliness.md
new file mode 100644
index 0000000000..f0a8ad0a75
--- /dev/null
+++ b/docs/adr/0007-validation-guideliness.md
@@ -0,0 +1,142 @@
+# How we are going to validate the new API codebase
+
+* Date: 2021-03-10
+
+Technical Story:
+- [securesystemslib schema checker issues](https://github.com/secure-systems-lab/securesystemslib/issues/183)
+- [new TUF validation guidelines](https://github.com/theupdateframework/tuf/issues/1130)
+
+## Context and Problem Statement
+
+1. Some schemas sound more specific than they are.
+2. Some schemas are an odd replacement for constants.
+3. Schema validation is generally **overused**. Together with user input,
+we are validating input programmatically generated from our private functions.
+4. There are instances where some attributes are validated multiple times
+when executing one API call.
+5. Schema checking sometimes makes execution branches unreachable.
+Multiple checks of the same attribute
+6. The error messages from checking schemas are often not helpful.
+
+## Decision Drivers and Requirements
+Some of the requirements we want to meet are:
+1. The ability to decide which functions to validate and which not.
+2. Allow for custom deeper validation beyond type check.
+3. As little as possible performance overhead.
+4. Add as minimal number of dependencies as possible.
+5. Support for all python versions we are using.
+
+## Considered Options
+1. Usage of a `ValidationMixin`.
+2. Usage of a third-party library called `pydantic`.
+
+## Pros, Cons, and Considerations of the Options
+
+### Option 1: Usage of a ValidationMixin
+
+**Note:** All pros, cons, and considerations are documented with the assumption
+we would implement the `ValidationMixin` the same way it is implemented in
+[in-toto](https://github.com/in-toto) until version 1.0.1 (the latest
+version at the time of writing.)
+
+* Good, because it's shorter by calling one function and validating
+multiple fields.
+
+* Good, because it allows reuse of the validation code through
+`securesystemslib.schemas` or another schema of our choice.
+
+* Bad, because there could be different code paths and return statements, and as
+a consequence we can easily forget to call `validate()` at the end of our function.
+
+Examle:
+```python
+class User(ValidationMixin):
+
+  def __init__(self, id: int, nickname: str) -> None:
+      self.id = id
+      self.nickname = nickname
+      self.pro_user = False
+
+      self.validate()
+
+  def _validate_id(self):
+    if not isinstance(self.id, int):
+          raise FormatError(f'id should be from type int')
+
+    if self.id < 0:
+      raise ValueError(f'id is expected to be a positive number')
+
+  def update_profile(self, new_id: int, new_nickname: str):
+    self.id = new_id
+
+    if not self.pro_user:
+      print(f'Standart users can only change their id! '
+            f'If you want to change your nickname become a pro user.)
+
+      return
+
+    self.nickname = new_nickname
+    # Be careful if you rely on _validate_id() to verify self.id!
+    # This won't be called if new_name is "".
+    self.validate()
+```
+
+* *Personal opinion*: bad, because it's not a clean solution from an OOP
+perspective to inherit `ValidationMixin` from classes without a "IS A"
+relationship with it.
+
+* Consideration: if we use this option, we are limited on what can be validated.
+With the `in-toto` implementation of the `ValidationMixin`, we can only validate
+class attributes inside class methods.
+If we want to validate functions outside classes or function arguments we would
+have to enhance this solution.
+
+* Consideration: if we use this option, we would be responsible for the code
+and all identified issues related to `securesystemslib.schemas` should be
+resolved by us.
+
+* Consideration: if we want to enforce assignment validation, this solution
+should be combined with custom "setter" properties.
+
+### Option 2: Usage of a third-party library called "pydantic"
+
+* Good, because it's flexible:
+1. There is a `@validate_arguments` decorator which allows us to decide which
+functions do we want to validate and to validate functions outside classes.
+2. There is a `@validator` decorator which allows us to make a deeper validation
+beyond type check for our class attributes.
+3. We can use an embedded `Config` class inside our classes which allows for
+even more customization (for example enforce assignment validation).
+
+* Good, because (according to their documentation) `pydantic` is the fastest
+validation library compared to others (including our other third-party library
+option `marshmallow`).
+See: https://pydantic-docs.helpmanual.io/benchmarks/
+
+* Good, because it uses the built-in types from `python 3.6` onwards.
+
+* Bad, because this library **has not yet implemented** a `strict` mode and
+the default behaviour when validating a certain argument or field is to **try
+a cast to the expected type from the received type**.
+To enable strict mode, we would have to add this manually through
+`validators` that are called before the cast.
+See: https://github.com/samuelcolvin/pydantic/issues/1098
+
+* Bad, because there is a learning curve when using `pydantic`.
+1. For example, when I had to handle the `_type` attribute in `Signed` it took me
+a lot of reading to understand that standard attributes whose name begin with
+"_" are ignored. The `_type` attribute can only be `PrivateAttr`
+(defined in `pydantic`) even though we don't handle it as a typical private
+attribute.
+2. Also, I had difficulties using pydantic when there is inheritance.
+The initialization and validation of new objects was tricky.
+
+* Bad, because it adds `2` new dependencies: `pydantic` and `typing-extensions`.
+This was concluded by performing the following steps:
+1. Creating a fresh virtual environment with python3.8.
+2. Installing all dependencies in `requirements-dev.txt` from `tuf`.
+3. Install `pydantic` with `pip install pydantic`.
+
+## Decision Outcome
+
+*TODO: Make and describe the decision*
diff --git a/docs/adr/index.md b/docs/adr/index.md
index 914c681b96..da0324d49d 100644
--- a/docs/adr/index.md
+++ b/docs/adr/index.md
@@ -11,6 +11,7 @@ This log lists the architectural decisions for tuf.
 - [ADR-0004](0004-extent-of-OOP-in-metadata-model.md) - Add classes for complex metadata attributes
 - [ADR-0005](0005-use-google-python-style-guide.md) - Use Google Python style guide with minimal refinements
 
+TODO: ADD TO INDEX
 <!-- adrlogstop -->
 
 For new ADRs, please use [template.md](template.md) as basis.