Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add model constraints #961

Open
bterlson opened this issue Sep 1, 2022 · 7 comments
Open

Add model constraints #961

bterlson opened this issue Sep 1, 2022 · 7 comments
Labels
design:needed A design request has been raised that needs a proposal
Milestone

Comments

@bterlson
Copy link
Member

bterlson commented Sep 1, 2022

A feature that Bosque has that I covet is the ability to describe constraints on the values of a model in order to e.g. generate code to construct a model that ensures that the data inside the model is correct. We have decorators for simple scenarios but there are many cases where the validations are more complex than the decorators can handle.

Imagine syntax like the following:

model AccountOwner {
  name: string;
  age: int8;

  ensure age > 18;
  ensure name.length > 2;
}

model Account {
  balance: int32;
  kind: "joint" | "single";
  owners: AccountOwner[];
  ensure balance > 0;
  ensure if (kind == "joint") { owners.length > 1 }
}

Syntax wise I think we leverage what we've already done for projections. Semantically, I think these clauses are validated but not run in Cadl. Emitters can access the AST and convert these expressions into target languages.

There are various constraints that don't necessarily apply to projections to keep in mind, i.e. these expressions must be pure in the sense that they should not make modifications to the cadl program at all.

@mrkmarron
Copy link

This looks like a great feature. Tons of value in being able to add and check these constraints. Bosque is also working on tools to use them to do structured input fuzzing, semantic diffing for versioning, and auto-generating mocks of services. Would be great to expose a hook for using other languages (e.g. Bosque) for writing these logical constraints.

I opened issue #505 in the Bosque repo to track things on the Bosque side.

@timotheeguerin
Copy link
Member

Think we'd need to make sure we can do this for model is, maybe just using self:

model Foo is int32 {
  ensure self > 32;
  ensure self < 64;
}

this feels like partially related to this prospal #602 but we have this constraint as cadl language

@bterlson
Copy link
Member Author

bterlson commented Sep 1, 2022

That's a good point. @mrkmarron is that something Bosque can express, say an integer type with values between 32 and 64?

@mrkmarron
Copy link

Yeah, definitely possible. In fact Bosque has special support for doing typedecl on most primitive types (Nat, Int, String, ...):

typedecl Foo = Int & {
    invariant $value > 32i;
    invariant $value < 64i;
}

We are basically using the special name $value as the "self". Just keeping in line with the motivation that no invalid instantiations of Foo can ever exist -- i.e. the type of $value is Int not Foo.

@allenjzhang allenjzhang added design:needed A design request has been raised that needs a proposal and removed Needs Triage labels Sep 2, 2022
@allenjzhang allenjzhang added this to the Backlog milestone Sep 2, 2022
@markcowl markcowl removed this from the Backlog milestone Sep 13, 2022
@markcowl
Copy link
Contributor

I wonder if we can express similar kinds of constraints for operations? For example, asserting that casing / values are identical across a request and response, or asserting conditional response values based on request values. This strikes me a s useful in service testing.

@mrkmarron
Copy link

mrkmarron commented Sep 13, 2022

I spent a bit of time on a proof-of-concept implementation of this feature last week. The fork is here. I took the example @bterlson had and tweaked it a bit to try out some various things, including pre/post constraints like @markcowl mentioned.

@invariant("$age > 18i")
@invariant("!$name.empty()")
model AccountOwner {
  name: string;
  age: int8;
}

enum AccountKind {
  joint,
  single,
}

@invariant("$balance > 0i")
@invariant("$kind === AccountKind::joint ==> $owners.size() > 1n")
model Account {
  @pattern("^[0-9]{4}-[a-z0-9]{7}$") accountID: string;
  balance: int32;
  kind: AccountKind;
  owners: AccountOwner[];
}

enum ErrorKind {
  network,
  access,
  invalid,
}

model Error {
  info: ErrorKind;
}

interface AccountService {
  list(): Account[] | Error;

  read(@pattern("^[0-9]{4}-[a-z0-9]{7}$") id: string): Account | Error;

  @ensures("$return.is<Account>() ==> $return.accountID === id && $return.kind === AccountKind::joint && $return.owners.someOf(pred(oo) => oo.name === owner)")
  addOwner(@pattern("^[0-9]{4}-[a-z0-9]{7}$") id: string, owner: string): Account;
}

The @invariant, @requires, and @ensures annotations and the code they run should be pretty self explanatory. I haven't checked the full projections expression language but from my experiments it looks like the expression language and CADL shapes map pretty well into Bosque.

From this spec we can (of course) generate executable bosque code to validate the pre/post/invariant conditions at runtime. In addition Bosque also is also designed explicitly with solver support so we there are (at least) 3 scenarios that this would also light up -- fuzz input generation, auto service test mock generation, and API delta generation. I have implemented the first 2 of these and can explain the third.

Fuzzing: Given just the specs we can solve for valid inputs, e.g., for read we might generate "0000-0000000 or 1234-a999bcd there are ways we can easily explore generating diverse input sets -- via fixed value region sampling, genetic mutation operators, or GPT-3 style natural input sampling.

Auto-Mocking: From the specs we can solve for a valid output given a concrete input. So, if you want to test against a service you use, instead of writing mocks by hand, you can auto generate them on demand as your test runs and makes calls to various service interface endpoints. In this example you might call addOwner and want to get back a result that satisfies all the constraints of the @ensures clause given the values of the input. Again we will probably want some ways to control or sample from the possible outputs, i.e., always assume success vs. allow errors or generate adversarial values vs. natural happy path results.

I have a little animation of these scenarios working here:

In the third scenario you can imagine semantic changes to a model occurring, like allowing new account ids say where they may have more digits or allow an optional prefix. In this case we can use the same underlying solver to generate inputs (or outputs) that are not valid in version A but are now valid in version B to help test for regressions when updating dependecies.

@markcowl markcowl added this to the [2022] December milestone Oct 10, 2022
@abatishchev
Copy link
Contributor

Here's a scenario that might fall under this category too.

Given:

{
  "properties": {
    "auth": {
      "cert": { },
      "aad": { }
    }
  }
}

I want to be able to require at least one properly specified: either cert or aad or (what's important) both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design:needed A design request has been raised that needs a proposal
Projects
None yet
Development

No branches or pull requests

6 participants