TOSCA 2.0 custom function usage and definition #123

Closed
calincurescu opened this issue Jul 5, 2022 · 13 comments

calincurescu commented Jul 5, 2022

Custom function usage syntax and semantics:

  • any identifier that starts with $ is going to be treated as a function name
  • a function must be of the key-value pair form, where key is a string starting with $ and the value is a yaml list
    • $function_a1: [arg1, arg2, …]
    • $function_a2: []
      • this is a function with no arguments
  • nested functions are supported
    • i.e. functions in the arguments of another function are recognized by their $ prefix and resolved, and the result is passed as an argument to the outer function
    • e.g.: $function_a3: [{$func_a1: […]}, arg2, …]
      • this is a nested usage of function in a function
  • when a $ prefix has to be passed as a literal, it needs to be escaped using $$
    • for example:
      properties:
        prop1: $$myid
        prop2: my_value
    • prop1 will be assigned the string value “$myid”
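The usage rules above can be sketched as a small recursive evaluator. This is only an illustration in Python: the `FUNCTIONS` registry, the `$concat` and `$upper` functions, and the `is_call` and `evaluate` helpers are hypothetical names, not part of TOSCA. It follows the rules exactly as stated in this comment, including unescaping `$$` on string values.

```python
from typing import Any, Callable, Dict

# Hypothetical registry of custom functions; the names are illustrative only.
FUNCTIONS: Dict[str, Callable[..., Any]] = {
    "$concat": lambda *args: "".join(str(a) for a in args),
    "$upper": lambda s: str(s).upper(),
}

def is_call(value: Any) -> bool:
    """A call is a single-key map whose key starts with one $ and whose value is a list."""
    if not isinstance(value, dict) or len(value) != 1:
        return False
    key, args = next(iter(value.items()))
    return (isinstance(key, str) and key.startswith("$")
            and not key.startswith("$$") and isinstance(args, list))

def evaluate(value: Any) -> Any:
    """Resolve function calls bottom-up and unescape $$ on string values."""
    if is_call(value):
        name, args = next(iter(value.items()))
        # Nested calls are resolved first; their results become arguments.
        return FUNCTIONS[name](*[evaluate(a) for a in args])
    if isinstance(value, str) and value.startswith("$$"):
        return value[1:]  # "$$myid" becomes "$myid"
    if isinstance(value, dict):
        return {k: evaluate(v) for k, v in value.items()}
    if isinstance(value, list):
        return [evaluate(v) for v in value]
    return value
```

For instance, `evaluate({"$concat": [{"$upper": ["ab"]}, "-", "cd"]})` resolves the inner `$upper` call before `$concat` receives its arguments.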
Author

calincurescu commented Jul 5, 2022

Custom function definition syntax and semantics:

Specification of the functions is a YAML map under the functions keyname:

functions:
  <function_def>
  <function_def>
  …
  <function_def>

Specification of a <function_def> is a list of signature definitions for each function name:

  <function_name>: 
    - <signature_def>
    - <signature_def>
    - <signature_def>
    - …

Each <signature_def> is a map of the following keyname definitions (only result is mandatory):

parameters:
  - <schema_def>
  - <schema_def>
  - <schema_def>
  - …
unbounded: <boolean>
result: <schema_def>
description: <string>
implementation: <implementation_def>
  • The unbounded keyname defines if the last defined parameter in the parameters list is fixed or unbounded.
    • As a general rule all parameters before the last parameter must appear exactly once in the function usage
      • and in the order of their definition in the parameters list.
    • If unbounded is false, the last defined parameter is considered fixed
      • i.e. it must appear exactly once in the function usage.
    • If unbounded is true, any number (including 0) of the last parameter type may appear in the function usage.
      • Note that there is no possibility of confusing which parameter is which, since their position uniquely identifies them.
    • Default value of unbounded is false
  • If no implementation is specified then it’s assumed that the orchestrator needs to be preconfigured to handle the function call
  • The functions section can be defined either outside or inside a service_template section
    • Function definitions outside a service_template can be within a profile TOSCA file or imported TOSCA file
      • Namespacing works as it does for types
        • Overlapping definitions under the same <function_name> are not allowed
    • Function definitions inside a service_template that have the same <function_name> are considered a refinement of the definition with the same name outside the service_template
    • This allows function design to happen at two separate moments:
      • At profile design time (outside service_template)
        • when e.g. the parameters and result are defined and can be used in the type definitions
      • At service template design time (inside service_template)
        • when the implementation or description may be added/changed
        • file references within a current CSAR can be calculated
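The unbounded rule above can be sketched as an argument-count check against a single signature. This is a minimal Python sketch under stated assumptions: `arity_matches` and `expected_types` are hypothetical helper names, parameters are given in the compact type-name notation, and a signature with unbounded set to true but no parameters is treated as taking no arguments (the proposal does not cover that case).

```python
from typing import List, Sequence

def arity_matches(parameters: Sequence[str], unbounded: bool, n_args: int) -> bool:
    """Check only the number of arguments against one signature."""
    if unbounded and parameters:
        # Every parameter before the last appears exactly once;
        # the last one may appear any number of times, including zero.
        return n_args >= len(parameters) - 1
    return n_args == len(parameters)

def expected_types(parameters: Sequence[str], unbounded: bool, n_args: int) -> List[str]:
    """Return the type expected at each argument position."""
    if not arity_matches(parameters, unbounded, n_args):
        raise ValueError("wrong number of arguments for this signature")
    if unbounded and parameters:
        fixed = list(parameters[:-1])
        # Positions past the fixed parameters all take the last parameter's type.
        return fixed + [parameters[-1]] * (n_args - len(fixed))
    return list(parameters)
```

With `parameters = ["MyType1", "string", "string", "MyType2"]` and unbounded set to true, three arguments are accepted (zero occurrences of the last parameter), and every argument from the fourth position onward is expected to be a `MyType2`.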

Author

calincurescu commented Jul 5, 2022

Examples

functions:
  sqrt:
    - parameters:
      - type: integer
        description: This is the parameter
        constraints: [ greater_or_equal: 0] 
      result:
        type: float
        description: Returns a float
      unbounded: false
      description: takes the square root of an integer, returns a float
      implementation: scripts/sqrt.py
    - parameters:
      - type: float
        constraints: [ greater_or_equal: 0.0] 
      result:
        type: float
        description: Returns a float
      description: Takes the square root of a float, returns a float
      implementation: scripts/sqrt.py

The next sqrt is similar to above, but uses a simplified type notation (no constraints check can be expressed):

functions:
  sqrt:
    - parameters: [ integer ]
      result: float
      implementation: scripts/sqrt.py
    - parameters: [ float ]
      result: float
      implementation: scripts/sqrt.py
functions:
  my_func_with_diff_param_types:
    - parameters:
      - type: MyType1
        description: "this is the first parameter reprez..."
      - type: string
        description: "this is the second parameter reprez..."
      - type: string
        description: "this is the third parameter reprez..."
      - type: MyType2
        description: "this is the unbounded parameter (zero to infinite) reprez..."
      unbounded: true
      result: 
        type: MyTypeRez
      implementation: scripts/my.py

Same as the above, but in compact notation:

functions:
  my_func_with_diff_param_types:
    - parameters: [MyType1, string, string, MyType2]
      unbounded: true
      result: MyTypeRez
      implementation: scripts/my.py

Polymorphism allows a function to accept parameters and return results of different kinds:

functions:
  union:
    - parameters:
      - type: list
        entry_schema: integer
      unbounded: true
      result:
        type: list
        entry_schema: integer
      implementation: scripts/libpi.py
    - parameters:
      - type: list
        entry_schema: float
      unbounded: true
      result:
        type: list
        entry_schema: float
      implementation: scripts/libpi.py

Defining a list in a map parameter:

functions:
  complex_arg_function:
    - parameters: 
      - string
      - type: list
        entry_schema: integer
      - integer
      - type: map
        key_schema: string
        entry_schema:
          type: list
          entry_schema: MyType5
      - string
      unbounded: true
      result: string
      implementation: scripts/complex.py

Contributor

tliron commented Jul 7, 2022

I'm not in favor of adding this complex DSL for declaring function signatures. I'd prefer that we leave function validation up to the implementation. This proposal limits the usability of custom functions.

Indeed, some built-in functions can't be expressed by this DSL, specifically get_property and get_attribute which use a complex TOSCA path and a return type that is dependent on resolving that path. Furthermore, I imagine that a common use case for custom functions would involve accessing environmental data, where parameter and return types are not known until Day 1 or Day 2, or maybe it otherwise depends on an external tool.

So, if the implementation gets a bad parameter, it should emit an error. Likewise, if the return value does not match the type and constraints of its value site, there should be an error. The advantage of checking types at design time is small and limited: the values are often just as important as the types, and these cannot always be validated by TOSCA constraints (e.g. the ID for an external object). Validation happens in the way the TOSCA processor will implement these connected functions. This extra layer of syntactical validation is overly complex, limited in scope, and has limited benefits. Note that most dynamic programming languages do not require function signatures.

Assuming others insist on having this complex DSL, I would insist that we make it entirely optional. So, if a function is used without a declared signature, it will not be validated by parsers. Call it a "signature-less" function if you will. Since we are requiring the $ prefix, there is no ambiguity as to whether a map-with-one-key is a function call or not.

Also, a fix for the initial post. I think the proposed syntax is imprecise. Here's my proposal:

For TOSCA values, for any YAML map at any nested depth, the following parsing rules apply to its keys:

  • Is the key a string starting with $ and with length > 1? If yes:
    • Is the second character $? If so, discard the first $ and stop here (escape). If not:
      • Is this the only key? If not, emit a parsing syntax error ("malformed function"). If yes:
        • Is the value of the key a seq? If not, emit a parsing syntax error ("malformed function"). If yes:
          • This is a function call!
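The decision chain above can be sketched as a key classifier. This is an illustrative Python sketch, not a normative parser: `classify_key` and `MalformedFunction` are hypothetical names, and the escape check is read as ending the process for `$$` keys and continuing to the remaining checks otherwise.

```python
from typing import Any, Dict, Tuple

class MalformedFunction(ValueError):
    """Raised for the "malformed function" parsing syntax error."""

def classify_key(mapping: Dict[Any, Any], key: Any) -> Tuple[str, Any]:
    """Return ("plain", key), ("escaped", unescaped_key) or ("call", key)."""
    # Only string keys that start with $ and are longer than one character qualify.
    if not (isinstance(key, str) and key.startswith("$") and len(key) > 1):
        return ("plain", key)
    # A second $ is an escape: drop the first $ and stop.
    if key[1] == "$":
        return ("escaped", key[1:])
    # A function key must be the only key in its map.
    if len(mapping) != 1:
        raise MalformedFunction(f"{key!r} must be the only key in its map")
    # The value of a function key must be a sequence.
    if not isinstance(mapping[key], list):
        raise MalformedFunction(f"the value of {key!r} must be a sequence")
    return ("call", key)
```

Because only keys are inspected, a string value such as `$myid` never reaches this check.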

Note that these rules do not apply to string values that are not map keys. Thus there is no need to escape a $ prefix for them. To fix your initial example:

properties:
  prop1: $myid # valid as is, no need to escape
  prop2: my_value

But here we do need to escape:

properties:
  $$prop1: myid # if we don't escape we will get a "malformed function" syntax error
  prop2: my_value

Author

calincurescu commented Jul 11, 2022

  • I fully agree that the function definition should not be mandatory. If the parser encounters a function that has no definition, it neither validates nor invalidates it, and it is assumed that the TOSCA processor implementation knows how to deal with it.
    • Moreover function names in function definitions should not overlap with the built-in TOSCA functions.
  • In Ericsson we need such a custom function definition feature since we would like to ship custom function implementations in the CSAR, so that there is a certainty that the functions will be supported at the destination.
  • I think there is value in having a signature-less function definition w.r.t. the parameters. I propose that we use a new non-required boolean keyname: freeform.
    • If freeform is true and no parameters are defined then any list can be the input to the function.
    • If freeform is false and parameters are not defined then we have a function without parameters.
    • It is not allowed for freeform to be true while parameters are defined.
    • Default value for freeform is false.
    • A function with freeform true can have only one signature.
    • If we applied freeform to the result, there would be a complexity problem with nested functions, since it would be hard to decide whether the inner function's freeform result should be cast to a certain type or whether a different function signature should be used for the outer function.
  • I think that if functions are defined then the result type must be defined, since these are textual functions, used for data transformation, so there is no reason to have no clearly typed result or no result at all. Note that the built-in functions (such as get_property) can be connected to TOSCA processor behavior, which is different from these functions.
  • I am fine with only checking for the $ prefix and applying escaping in the key of a key-value pair to decide whether it's a function. Users will then need to understand, though, that escaping works only for keys, not for values. That is, if they have a map as the value of, say, a property, then only the $ in the keys of that map needs to be escaped.
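The freeform rules proposed above can be sketched as a consistency check over one function's list of signatures. This is a hedged Python sketch: `check_freeform` is a hypothetical helper, signatures are modelled as plain dictionaries, and "parameters defined" is assumed to mean that the `parameters` key is present at all.

```python
from typing import Any, Dict, List

def check_freeform(signatures: List[Dict[str, Any]]) -> List[str]:
    """Return a list of rule violations for one function's signatures."""
    errors: List[str] = []
    for index, sig in enumerate(signatures):
        freeform = sig.get("freeform", False)  # the default value is false
        if freeform and "parameters" in sig:
            errors.append(f"signature {index}: freeform is true but parameters are defined")
        if freeform and len(signatures) > 1:
            errors.append(f"signature {index}: a freeform function can have only one signature")
        if "result" not in sig:
            # The result type stays mandatory even for freeform functions.
            errors.append(f"signature {index}: result is mandatory")
    return errors
```

A definition such as `[{"freeform": True, "result": "string"}]` passes, while adding `parameters` or a second signature to it is reported as an error.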

Contributor

tliron commented Jul 11, 2022

* In Ericsson we need such a custom function definition feature since we would like to **ship custom function implementations in the CSAR**, so that there is a certainty that the functions will be supported at the destination.

"Certainty" is a very big word. :) Note that we have not established (and I don't think we should) the mechanism for calling that "implementation" artifact, whatever it is (is it a TOSCA artifact?). Khutulun, for example, uses gRPC for all delegates/plugins. At best you have certainty that other parsers would just check parameter and return types (without constraints).

Also, I fail to see how this complex signature DSL really helps Ericsson. I think all you need here is a map connecting function names to the implementation "artifacts". When the implementation is called, the function will be fully validated, including actual values (and not just types) with constraints. This proposed DSL adds a lot of complexity for not much tangible benefit in my view -- some limited Day 0 syntax validation for a subset of functions.

 * I think there is value in having a signature-less function definition w.r.t. the parameters. I propose that we use a new non-required boolean keyname: **freeform**.

Maybe "dynamic"? That's what it's called in programming languages.

  * If freeform is true and no parameters are defined then any list can be the input to the function.

"dynamic" functions should also have an unknown-typed return value. That's probably the more important differentiator.

  * If we applied freeform to the result, there would be a complexity problem with nested functions, since it would be hard to decide whether the inner function's freeform result should be cast to a certain type or whether a different function signature should be used for the outer function.

Of course. But we already have such dynamic functions in TOSCA, and they are actually the most important functions: get_property and get_attribute. You would have to follow the TOSCA path, which might only be available on Day 1.

* I think that if functions are defined then **the result type must be defined**, since these are textual functions, used for data transformation, so there is no reason to have no clearly typed result or no result at all. Note that the built-in functions (such as get_property) can be connected to TOSCA processor behavior, which is different from these functions.

This might be our disconnect. You are thinking of functions like concat. I am thinking of functions like get_security_context.

Though even many textual functions are "dynamic". For example, a str function that converts any value to a string, or a sprintf function, etc. It seems to me that dynamic functions will be more common than strictly-typed signatures.

Author

calincurescu commented Jul 11, 2022

In this definition of custom functions I meant only concat-type functions.

Though even many textual functions are "dynamic". For example, a str function that converts any value to a string, or a sprintf function, etc. It seems to me that dynamic functions will be more common than strictly-typed signatures.

Agree, this was the idea with the freeform keyname.

Contributor

tliron commented Jul 12, 2022

See discussion in issue #68

Contributor

lauwers commented Sep 1, 2022

How do we deal with optional parameters in custom function definitions? Let's use the current get_artifact intrinsic function as an example. It defines two mandatory arguments (the modellable entity name and the artifact name) and two optional arguments (the location and the remove flag). I imagine we could define multiple signatures, one for each of the possible combinations of optional arguments, as follows:

functions:
  get_artifact:
    - parameters:
      - type: string
      - type: string
    - parameters:
      - type: string
      - type: string
      - type: string
    - parameters:
      - type: string
      - type: string
      - type: string
      - type: boolean

That seems a bit cumbersome. Could we instead add support for the required keyname to the parameter definitions as follows:

functions:
  get_artifact:
    - parameters:
      - type: string
      - type: string
      - type: string
        required: False
      - type: boolean
        required: False

The validator would of course have to make sure that no required parameters are defined after optional parameters.
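The ordering check mentioned above could look roughly like the following Python sketch. The helper name `check_required_order` is hypothetical, parameter definitions are modelled as plain dictionaries, and `required` is assumed to default to true when omitted.

```python
from typing import Any, Dict, List

def check_required_order(parameters: List[Dict[str, Any]]) -> range:
    """Reject required parameters that follow optional ones.

    Returns the range of argument counts the signature would accept.
    """
    seen_optional = False
    n_required = 0
    for index, param in enumerate(parameters):
        if param.get("required", True):  # assumption: required defaults to true
            if seen_optional:
                raise ValueError(
                    f"parameter {index} is required but follows an optional parameter"
                )
            n_required += 1
        else:
            seen_optional = True
    # Optional parameters can only be dropped from the end.
    return range(n_required, len(parameters) + 1)
```

For the get_artifact definition above, the accepted argument counts come out as 2, 3 and 4, which matches the three explicit signatures in the first variant.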

Contributor

lauwers commented Sep 1, 2022

Even when a function defines multiple signatures, I find that in most cases I want to use the same implementation for all signatures of the same function. Would it make more sense to associate the implementation with the function as a whole rather than (or in addition to) each individual signature?

Contributor

lauwers commented Sep 1, 2022

We currently support description on function signatures only, not on functions as a whole. I find that in most cases I want to associate a description with a function as a whole, rather than (or in addition to) each individual signature. On a related note, we should also support metadata on functions and function signatures.

Author

calincurescu commented Sep 5, 2022

Regarding optional parameters for a function: all parameters after the first optional parameter must be optional, and if the optional parameter at position x is used, then all the optional parameters at positions before x must be used too (exactly as in the multi-signature get_artifact example). Letting designers set optional on each parameter would let them mark an "inner" parameter as optional, which is a problem because the usage would not be distinguishable.

I propose to use one signature-level keyname, optional_from, that designates the index from which the parameters are optional (index 0 means from the first parameter inclusive, index 1 from the second parameter inclusive, etc.). Taking the get_artifact example:

functions:
  get_artifact:
    - parameters: [string, string, string, boolean]
      optional_from: 2
      unbounded: false
      result: string
      implementation: scripts/get_artifact.py
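To show how optional_from relates to the multi-signature form, here is a small Python sketch that expands one signature into the equivalent fixed signatures. The helpers `accepted_counts` and `equivalent_signatures` are hypothetical, and the interaction with unbounded is deliberately left out because the proposal does not spell it out.

```python
from typing import List, Optional, Sequence

def accepted_counts(parameters: Sequence[str], optional_from: Optional[int] = None) -> range:
    """Argument counts accepted when parameters from index optional_from on are optional."""
    n = len(parameters)
    start = n if optional_from is None else optional_from  # no keyname: all required
    if not 0 <= start <= n:
        raise ValueError("optional_from must lie between 0 and the number of parameters")
    return range(start, n + 1)

def equivalent_signatures(parameters: Sequence[str], optional_from: Optional[int] = None) -> List[List[str]]:
    """Expand into the fixed parameter lists that the single signature stands for."""
    return [list(parameters[:count]) for count in accepted_counts(parameters, optional_from)]
```

Called with `["string", "string", "string", "boolean"]` and `optional_from=2`, this yields the two-, three- and four-parameter lists from the earlier multi-signature get_artifact example.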

Author

calincurescu commented Sep 5, 2022

Regarding the implementation, I think it's useful to also have the possibility to give it per signature, e.g. when we don't want to touch the existing implementation later but want to extend the signature. Several signatures can still share the same artifact in the implementation specification.

If we allow description, metadata, and implementation at the function level, then we would need to gather all the signatures under a keyname, e.g. signatures:

functions:
  get_artifact:
    signatures:
      - parameters: [string, string, string, boolean]
        optional_from: 2
        unbounded: false
        result: string
    implementation: scripts/get_artifact.py
    description: "function level description"

Contributor

lauwers commented Feb 20, 2023

The proposed function definition syntax has been approved and is documented in https://docs.oasis-open.org/tosca/TOSCA/v2.0/csd05/TOSCA-v2.0-csd05.html#_Toc125468802. Issue #140 has been created to track discussions about function refinement.

@lauwers lauwers closed this as completed Feb 20, 2023