Proposal: Ballerina Constraint Package #2850

Open
ldclakmal opened this issue Apr 20, 2022 · 22 comments
Labels
module/constraint Status/Accepted Accepted proposals Team/PCM Protocol connector packages related issues Type/NewFeature Type/Proposal

Comments

@ldclakmal
Member

ldclakmal commented Apr 20, 2022

Summary

The Ballerina Constraint package will provide features to validate the values that have been assigned to Ballerina types. This proposal introduces the new package that supports this validation.

Goals

Introduce a new standard library package that provides APIs to validate the values that have been assigned to Ballerina types.

Motivation

Currently, the values assigned to Ballerina types cannot be validated further. For example, the Ballerina specification defines the int type as follows:

The int type consists of integers between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 (i.e. signed integers than can fit into 64 bits using a two's complement representation).

It cannot be constrained further as the user wishes. For example, the age of a Person cannot be validated as a positive integer. In general, there is currently no way to constrain the values assigned to Ballerina types. With the proposed package, this can be done with an annotation that is bound to the type.

Similar support is also available in other specifications such as XML Schema Part 2, JSON Schema validation, the OpenAPI specification and JSR 303.

Description

XML Schema Part 2, JSON Schema validation, the OpenAPI specification and JSR 303 were considered as references for designing this package. The highlighted validation rules/keywords are used in the proposed design for Ballerina.

Constraints of XML Schema

| type | validation rule |
|---|---|
| string | length, minLength, maxLength, pattern, enumeration, whiteSpace |
| boolean | pattern, whiteSpace |
| float, double | pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive |
| decimal | totalDigits, fractionDigits, pattern, whiteSpace, enumeration, maxInclusive, maxExclusive, minInclusive, minExclusive |
| duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth | pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive |
| hexBinary, base64Binary, anyURI, QName, NOTATION | length, minLength, maxLength, pattern, enumeration, whiteSpace |

Example:

<simpleType name='password-string'>
 <restriction base='string'>
   <minLength value='8'/>
   <maxLength value='12'/>
 </restriction>
</simpleType>

Constraints of OpenAPI Specification

| type | validation keyword | values for format keyword |
|---|---|---|
| integer | minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf | int32, int64 |
| number | minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf | float, double |
| string | minLength, maxLength, pattern | byte, binary, date, date-time, password |
| array | minItems, maxItems, uniqueItems | - |
| object | minProperties, maxProperties | - |
| boolean | - | - |

Example:

components:
  schema:
    type: string
    minLength: 8
    maxLength: 12
    format: password

Constraints of JSON Schema

| type | validation keyword | values for format keyword |
|---|---|---|
| integer | minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf | - |
| number | minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf | - |
| string | minLength, maxLength, pattern | date, date-time, time, duration, email, idn-email, hostname, idn-hostname, ipv4, ipv6, uri, uri-reference, iri, iri-reference, uuid, uri-template, json-pointer, relative-json-pointer, regex |
| array | minItems, maxItems, uniqueItems, maxContains, minContains | - |
| object | minProperties, maxProperties, required, dependentRequired | - |
| boolean | - | - |

Example:

{
   "type": "string",
   "minLength": 8,
   "maxLength": 12,
   "pattern": "^(?=.*[A-Za-z])(?=.*\\d)[A-Za-z\\d]{8,12}$"
}

Constraints of Java

NOTE: A space has been added between the @ symbol and the constraint name to avoid tagging GitHub users and organizations.

@ Null
@ NotNull
@ AssertTrue
@ AssertFalse
@ Min
@ Max
@ DecimalMin
@ DecimalMax
@ Negative
@ NegativeOrZero
@ Positive
@ PositiveOrZero
@ Size
@ Digits
@ Past
@ PastOrPresent
@ Future
@ FutureOrPresent
@ Pattern
@ NotEmpty
@ NotBlank
@ Email

public class User {
    private String name;

    @Min(value = 18, message = "Age should not be less than 18")
    private int age;

    @Email(message = "Email should be valid")
    private String email;

    // standard setters and getters 
}

Proposed Constraints for Ballerina

The following constraints are proposed for Ballerina.

| Constraint name | Applies to type | Constraint value type | Semantics (v is value being constrained, c is constraint value) |
|---|---|---|---|
| minValue | any ordered type T | T | v >= c |
| maxValue | any ordered type T | T | v <= c |
| minValueExclusive | any ordered type T | T | v > c |
| maxValueExclusive | any ordered type T | T | v < c |
| multipleOf | int, decimal | int, decimal | v % c == 0 |
| length | string, xml, table, list, map | int | v.length() == c |
| minLength | string, xml, table, list, map | int | v.length() >= c |
| maxLength | string, xml, table, list, map | int | v.length() <= c |
| uniqueMembers | anydata[], map<anydata> | boolean | for any distinct keys k and k' of v, v[k] != v[k'] |
| pattern | string | regexp | v matches c (need to decide whether match is anchored or not) |
| schemaValid | xml | SchemaValid record (defined below) | v must be valid according to an XSD schema as described by the SchemaValid record c |
| fractionDigits | decimal | int | v must have no more than c fraction digits |
| oneOf | mapping | string[][] | protobuf oneof semantics; [["a", "b"], ["c", "d"]] allowed when a, b, c, d are optional fields; must have exactly one of a and b, and exactly one of c and d |
| dependentRequired | mapping | map<string[]> | if field k is present in v, then all fields in c[k] must be present |

SchemaValid record definition used for the schemaValid constraint in the table above:

type SchemaValid record {|
     // top-level can contain pi,comment, whitespace before/after element
     boolean document = true;
     // "{ns}localName" (works with xmlns declaration)
     string elementName;
     map<string> schemaLocation?;
     string noNamespaceSchemaLocation?;
|};
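
To make the two mapping constraints in the table more concrete, here is a minimal sketch of their semantics as plain Ballerina functions. The helper names `satisfiesDependentRequired` and `satisfiesOneOf`, and the `payment` sample data, are hypothetical and exist only for illustration; they are not part of the proposed package API.

```ballerina
import ballerina/io;

// Hypothetical helpers for illustration only; not part of the proposed package.

// dependentRequired: if field k is present in v, every field listed in c[k]
// must be present in v as well.
function satisfiesDependentRequired(map<anydata> v, map<string[]> c) returns boolean {
    foreach string k in c.keys() {
        if v.hasKey(k) {
            foreach string requiredField in c.get(k) {
                if !v.hasKey(requiredField) {
                    return false;
                }
            }
        }
    }
    return true;
}

// oneOf: for each list of field names in c, exactly one of them must be present in v.
function satisfiesOneOf(map<anydata> v, string[][] c) returns boolean {
    foreach string[] fieldNames in c {
        int presentCount = 0;
        foreach string fieldName in fieldNames {
            if v.hasKey(fieldName) {
                presentCount += 1;
            }
        }
        if presentCount != 1 {
            return false;
        }
    }
    return true;
}

public function main() {
    // Sample data (an assumption for this sketch).
    map<anydata> payment = {card: "1234", billingAddress: "Colombo"};
    io:println(satisfiesDependentRequired(payment, {card: ["billingAddress"]}));
    io:println(satisfiesOneOf(payment, [["card", "cash"]]));
}
```

In the package itself these checks would presumably be driven by the annotation value rather than passed in as arguments.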

Proposed APIs

The ballerina/constraint package provides different annotations for different basic types, e.g. @constraint:String for strings and @constraint:Map for maps. Each annotation defines a separate associated record type. These annotations can be attached at the type and record field attachment points.

Annotation

public annotation IntConstraints Int on type, record field;
public annotation FloatConstraints Float on type, record field;
public annotation NumberConstraints Number on type, record field;
public annotation StringConstraints String on type, record field;
public annotation ArrayConstraints Array on type, record field;
// ... rest of the annotation definitions

Associated Record Types

type IntConstraints record {|
   int minValue?;
   int maxValue?;
   int minValueExclusive?;
   int maxValueExclusive?;
   // ... all the finalized constraints for int type should go here
|};

type FloatConstraints record {|
   float minValue?;
   float maxValue?;
   float minValueExclusive?;
   float maxValueExclusive?;
   // ... all the finalized constraints for float type should go here
|};

type NumberConstraints record {|
   decimal minValue?;
   decimal maxValue?;
   decimal minValueExclusive?;
   decimal maxValueExclusive?;
   // ... all the finalized constraints for decimal type should go here
|};

type StringConstraints record {|
   int length?;
   int minLength?;
   int maxLength?;
   string pattern?;
   // ... all the finalized constraints for string type should go here
|};

type ArrayConstraints record {|
   int length?;
   int minLength?;
   int maxLength?;
   // ... all the finalized constraints for any[] type should go here
|};

// ... rest of the associated record types

Annotation Mappings

| type | annotation |
|---|---|
| int | @constraint:Int |
| float | @constraint:Float |
| int\|float\|decimal | @constraint:Number |
| string | @constraint:String |
| any[] | @constraint:Array |
| ... | ... |

Function

The package has a public function that the developer is expected to call with the value to be validated, along with its type descriptor. It returns the value, typed as td, if the validation is successful, or an error if the validation is unsuccessful or if there is an issue with a constraint value.

public function validate(anydata v, typedesc<anydata> td = <>) returns td|error {
   // ...
}

NOTE: In general the constraint checker code will need to do some checking on the annotation with the attached basic data type. It won't all be done declaratively by the annotation mechanism.

Examples

import ballerina/constraint;
import ballerina/log;
 
type Person record {|
    string name;
    @constraint:Int {
        minValue: 18
    }
    int age;
    @constraint:String {
        pattern: "^[\\w-\\.]+@([\\w-]+\\.)+[\\w-]{2,4}$"
    }
    string email;
|};
 
public function main() {
    Person person = {name: "Chanaka", age: 16, email: "[email protected]"}; 
    Person|error validation = constraint:validate(person);
    if validation is error {
        log:printError("Failed to validate person details", validation);
    }
    // business logic
}

Related issue: #2788

@ldclakmal ldclakmal self-assigned this Apr 20, 2022
@ldclakmal ldclakmal added Type/Proposal Team/PCM Protocol connector packages related issues Status/Active Proposals that are under review labels Apr 20, 2022
@jclark

jclark commented Apr 21, 2022

Ordered type is defined here: https://ballerina.io/spec/lang/2022R1/#ordering. It cannot be described by a Ballerina type definition. OrderedType is an approximation: any Ballerina ordered type will be an OrderedType.

@jclark

jclark commented Apr 21, 2022

uniqueItems only makes sense when member type is subtype of anydata, i.e. applicable only to subtype of map<anydata> or anydata[]: Items is not the right word: Ballerina terminology is members.

@jclark

jclark commented Apr 21, 2022

I don't understand what minContains and maxContains mean.

Is there a real use case for multipleOf?

This syntax is wrong:

@constrain {
        minValue = 18
    }

Needs a colon not an equals.

@jclark

jclark commented Apr 21, 2022

Constraints defined in the Constraints record is valid only for some types. Validity of constraints for a particular type have to be validated separately. How can it be done?

You might be able to handle some cases with a constraint on the annotation record type.

An annotation is a qualified name. So you can have a constrain module that provides different annotations for different basic types e.g. @constrain:String for strings, @constrain:Map for maps etc. each of these will define a separate associated record type.

But in general the constraint checker code will need to do some checking. It won't all be done declaratively by the annotation mechanism.

@jclark

jclark commented Apr 21, 2022

Hence, the constraint value type of pattern constraint is extended to regexp or predefined formats. Predefined formats have a regexp assigned internally.

This is a terrible idea. They are totally different things.

format is not a constraint in the way all these other things are.

@jclark

jclark commented Apr 21, 2022

This is wrong pattern = "^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$": you would need to use \\w not \w.

@ldclakmal
Member Author

uniqueItems only makes sense when member type is subtype of anydata, i.e. applicable only to subtype of map<anydata> or anydata[]: Items is not the right word: Ballerina terminology is members.

Updated the proposal for the applicable types with the correct terminology.

@ldclakmal
Member Author

ldclakmal commented Apr 21, 2022

I don't understand what minContains and maxContains mean.

These two constraints were added with the idea of validating the minimum and maximum values that an array or map of any ordered type can contain.

Is there a real use case for multipleOf?

// Cash withdrawal from an ATM
type CashWithdrawal record {|
    string currency;
    @constrain {
         multipleOf: 100
    }
    int amount;
|};

This syntax is wrong:

@constrain {
        minValue = 18
    }

Needs a colon not an equals.

Updated the proposal.

@ldclakmal
Member Author

This is wrong pattern = "^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$": you would need to use \\w not \w.

Updated the proposal.

@ldclakmal
Member Author

Hence, the constraint value type of pattern constraint is extended to regexp or predefined formats. Predefined formats have a regexp assigned internally.

This is a terrible idea. They are totally different things.

format is not a constraint in the way all these other things are.

Agree. It acts as a metadata / hint of the content. So, we will remove this format thing from the proposal and revisit that later. Updated the proposal.

@ldclakmal
Member Author

ldclakmal commented Apr 21, 2022

Constraints defined in the Constraints record is valid only for some types. Validity of constraints for a particular type have to be validated separately. How can it be done?

You might be able to handle some cases with a constraint on the annotation record type.

An annotation is a qualified name. So you can have a constrain module that provides different annotations for different basic types e.g. @constrain:String for strings, @constrain:Map for maps etc. each of these will define a separate associated record type.

But in general the constraint checker code will need to do some checking. It won't all be done declaratively by the annotation mechanism.

This is much better than the proposed method IMO. As explained, we can have different annotations for different basic types where each of these will define a separate associated record type.

public annotation StringConstraints String on type, record field;
public annotation IntConstraints Int on type, record field;
...
type StringConstraints record {|
   int length?;
   int minLength?;
   int maxLength?;
   string pattern?;
   // ... all the finalized constraints for string type should go here
|};

type IntConstraints record {|
   int minValue?;
   int maxValue?;
   int minValueExclusive?;
   int maxValueExclusive?;
   // ... all the finalized constraints for int type should go here
|};

@jclark Since package name is ballerina/constraint, shouldn't the annotation-tag be like @constraint:String/@constraint:Map instead of @constrain:String/@constrain:Map?

If so, the example would be as follows:

import ballerina/constraint;
import ballerina/log;
 
type Person record {|
    string name;
    @constraint:Int {
        minValue: 18
    }
    int age;
    @constraint:String {
        pattern: "^[\\w-\\.]+@([\\w-]+\\.)+[\\w-]{2,4}$"
    }
    string email;
|};
 
public function main() {
    Person person = {name: "Chanaka", age: 16, email: "[email protected]"}; 
    error? validation = constraint:validate(person);
    if validation is error {
        log:printError("Failed to validate person details", validation);
    }
    // business logic
}

@shafreenAnfar
Contributor

@sameerajayasoma do let us know if you have any feedback.

@ldclakmal
Member Author

ldclakmal commented Apr 26, 2022

We had a meeting today with @jclark to discuss the proposal and decided the following:

  • different annotations for different basic types where each of these will define a separate associated record type

    public annotation StringConstraints String on type, record field;
    public annotation IntConstraints Int on type, record field;
    ...
  • agreed to have the package named as ballerina/constraint

  • decided to work in phases as follows:

    Phase 1
    constraints

    1. minValue
    2. maxValue
    3. minValueExclusive
    4. maxValueExclusive
    5. length
    6. minLength
    7. maxLength

    annotations

    | type | annotation |
    |---|---|
    | int | @constraint:Int |
    | float | @constraint:Float |
    | int\|float\|decimal | @constraint:Number |
    | string | @constraint:String |
    | any[] | @constraint:Array |

    Following Phases
    In the following phases, we are planning to discuss and implement other constraints, for example:

    • pattern: depends on regexp
      • applies to constraint:String
    • structural constraints that constrain the use of constraints
      @constraint:Map {
         atMostOneOf: [["minValue", "minValueExclusive"]]
      }
      type Float record {|
          float minValue;
          float minValueExclusive;
      |};

Updated the proposal according to these.

@jclark

jclark commented Apr 26, 2022

I thought we were going to have e.g. constraint:Int and constraint:String. Numbers require some subtlety to deal with the case where a value might allow a union:

  • constraint:Int is a constraint that applies when the value is an int; the annotation record declares the min/maxValue fields as type int
  • constraint:Float is a constraint that applies when the value is a float; the annotation record declares the min/maxValue fields as type float
  • constraint:Number is a constraint that applies when the value is int|float|decimal; the annotation record declares the min/maxValue fields as type decimal; the semantics are that the value when converted to decimal is within the specified limits

decimal is special because it includes the range of the other numeric types, and it represents floating point decimal values exactly.

multipleOf makes sense for only int and decimal.
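
The constraint:Number semantics described above can be sketched as follows. This is only an illustration under stated assumptions: `withinNumberLimits` is a hypothetical helper, not part of the package, and it handles only the inclusive minValue/maxValue pair.

```ballerina
import ballerina/io;

// Hypothetical helper illustrating the constraint:Number semantics; not the package's code.
// The value is first converted to decimal and then compared against decimal limits.
function withinNumberLimits(int|float|decimal v, decimal minValue, decimal maxValue) returns boolean {
    // Numeric conversion via type cast. A float NaN or infinity has no decimal
    // representation, so this cast would panic for those values.
    decimal d = <decimal>v;
    return minValue <= d && d <= maxValue;
}

public function main() {
    // An int and a float value checked against the same decimal limits.
    io:println(withinNumberLimits(5, 0.5d, 10.0d));
    io:println(withinNumberLimits(0.25f, 0.5d, 10.0d));
}
```

Because the limits are declared once as decimal, the same annotation value can apply to any member of the int|float|decimal union.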

@ldclakmal
Member Author

Yes. That is true. I meant to express the same but sorry for not explaining it properly. Thanks for the clarification. Updated my last comment to be less ambiguous.

@shafreenAnfar shafreenAnfar added Status/Accepted Accepted proposals and removed Status/Active Proposals that are under review labels May 7, 2022
@ldclakmal
Member Author

@jclark The proposed API works successfully with the record field annotation attachment point, but there is an issue with retrieving the annotations from the type annotation attachment point.

Case 1: record field annotation attachment point

import ballerina/constraint;
import ballerina/log;

type Foo record {
    @constraint:String {
        length: 6
    }
    string value;
};

public function main() {
    Foo foo = {value: "s3cr3t"};
    error? validation = constraint:validate(foo);
    if validation is error {
        log:printError("Failed to validate details", validation);
    }
    // business logic
}

Case 2: type annotation attachment point.

import ballerina/constraint;
import ballerina/log;

@constraint:Int { minValue: 0 }
type PositiveInt int;

public function main() {
    PositiveInt age = 18;
    error? validation = constraint:validate(age);
    if validation is error {
        log:printError("Failed to validate age", validation);
    }
    // business logic
}

In Case 2, there is no way to get the annotations attached to the PositiveInt type at runtime, because the value (18) is passed to the validate function and its type is resolved to the inherent type (int in this case). Therefore, we might have to come up with a different API to solve this.

@jclark

jclark commented May 17, 2022

I agree the API isn't right.

Does it really make sense to get the constraints from the value being validated? I don't think so. I think you should pass in the typedesc to be used for validation as a parameter, something like:

function validate(anydata value, typedesc<anydata> t = <>) returns t|error;

e.g.

Foo foo = {value: "s3cr3t"};
Foo validFoo = check constraint:validate(foo);

@ldclakmal
Member Author

Thanks @jclark. This API is better and solves the problem we have. It also works with both of the cases mentioned above.
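
For reference, Case 2 under the revised signature might look like the sketch below. It assumes the `validate(anydata value, typedesc<anydata> t = <>) returns t|error` signature suggested above, with the typedesc defaulted from the expected type of the result; the variable name `input` is only illustrative.

```ballerina
import ballerina/constraint;
import ballerina/io;

@constraint:Int {minValue: 0}
type PositiveInt int;

public function main() returns error? {
    int input = 18;
    // The typedesc parameter is inferred from the expected type (PositiveInt),
    // so the constraints come from the type rather than from the value.
    PositiveInt age = check constraint:validate(input);
    io:println(age);
}
```

Here the annotations are looked up through the PositiveInt typedesc, which avoids the inherent-type problem described in Case 2.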

@sameerajayasoma
Contributor

I would not return the validated value cg

@sameerajayasoma
Contributor

I understand why we had to design the validate method to return the value it validates.

But I would use the following signature for the validation method. I know that we don't have the syntax to specify some typedesc values.

function validate(anydata value, typedesc<anydata> t) returns error?;

e.g.,

@constraint:Int { minValue: 0 }
type PositiveInt int;

public function main() returns error?{
    PositiveInt age = 18;
    check constraint:validate(age, PositiveInt);
    // ...
}

@jclark

jclark commented Jun 9, 2022

My experience of writing Ballerina code is that it works better to have a function validFoo that returns Foo|error rather than validateFoo that returns error?. The former works better with an expression-oriented, functional style of programming, which tries to do more work in each expression, rather than a procedural style, which breaks things down into lots of little statements. Think of validFoo as a function that given a possibly invalid Foo gives you a known valid Foo.

Your example is not realistic: you almost always want to do something with the value you have validated (otherwise why would you validate it?).

Apart from that, this approach nicely allows the typedesc value to be defaulted.

@TharmiganK
Contributor

Regarding the regexp match for string types, we would go with the name pattern for the constraint since the same name is used in the references mentioned in the proposal.

Sample :

string:RegExp regExp = re `([0-9]{10})|(\+[0-9]{11})`;

@constraint:String {pattern: regExp}
type PhoneNumber string;

type User record {|
    string name;
    @constraint:String {pattern: re `male|female`}
    string gender?;
    int age;
    @constraint:String {pattern: re `([0-9]{9}[vV]|[0-9]{12})`}
    string nic;
|};

type UserAdvanced record {|
    *User;
    PhoneNumber contactNumber;
    @constraint:String {pattern: re `([a-zA-Z0-9._%\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,6})*`}
    string email;
|};


7 participants