TOSCA 2.0 custom function usage and definition #123

calincurescu · 2022-07-05T09:05:23Z

Custom function usage syntax and semantics:

any identifier that starts with $ is treated as a function name
a function must have the key-value pair form, where the key is a string starting with $ and the value is a YAML list
- $function_a1: [arg1, arg2, …]
- $function_a2: []
  - this is a function with no arguments
nested functions are supported
- i.e. functions in the arguments of another function will be recognized by their $ prefix and resolved, with the result passed as an argument to the outer function
- e.g.: $function_a3: [{$func_a1: […]}, arg2, …]
  - this is a nested usage of a function in a function
when the $ character needs to be passed as a literal, it must be escaped using $$
- for example:
  properties:
    prop1: $$myid
    prop2: my_value
- prop1 will be assigned the string value “$myid”
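
The nesting and escaping rules above can be sketched as a small recursive resolver. This is only an illustration under stated assumptions: the `FUNCTIONS` registry and the `concat` and `upper` functions are hypothetical, and the input is taken to be an already-parsed YAML value (Python dicts, lists, and scalars).

```python
from typing import Any, Callable, Dict

# Hypothetical registry of implementations, keyed by the name without "$".
FUNCTIONS: Dict[str, Callable[..., Any]] = {
    "concat": lambda *parts: "".join(str(p) for p in parts),
    "upper": lambda text: str(text).upper(),
}


def resolve(value: Any) -> Any:
    """Recursively resolve $-prefixed function calls in a parsed YAML value."""
    if isinstance(value, str):
        # A leading "$$" is the escape for a literal "$".
        return value[1:] if value.startswith("$$") else value
    if isinstance(value, list):
        return [resolve(item) for item in value]
    if isinstance(value, dict):
        if len(value) == 1:
            (key, args), = value.items()
            if isinstance(key, str) and key.startswith("$") and not key.startswith("$$"):
                if not isinstance(args, list):
                    raise ValueError(f"arguments of {key} must be a list")
                # Inner calls are resolved first; their results become arguments.
                return FUNCTIONS[key[1:]](*[resolve(arg) for arg in args])
        return {key: resolve(item) for key, item in value.items()}
    return value
```

With this sketch, `resolve({"$concat": ["a", {"$upper": ["b"]}, "c"]})` evaluates the inner call before the outer one.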

calincurescu · 2022-07-05T09:07:23Z

Custom function definition syntax and semantics:

Specification of the functions is a YAML map under the functions keyname:

functions:
  <function_def>
  <function_def>
  …
  <function_def>

Specification of a <function_def> is a list of signature definitions for each function name:

  <function_name>: 
    - <signature_def>
    - <signature_def>
    - <signature_def>
    - …

Each <signature_def> is a map of the following keyname definitions (where only the result is mandatory):

parameters:
  - <schema_def>
  - <schema_def>
  - <schema_def>
  - …
unbounded: <boolean>
result: <schema_def>
description: <string>
implementation: <implementation_def>

The unbounded keyname defines if the last defined parameter in the parameters list is fixed or unbounded.
- As a general rule all parameters before the last parameter must appear exactly once in the function usage
  - and in the order of their definition in the parameters list.
- If unbounded is false, the last defined parameter is considered fixed
  - i.e. it must appear exactly once in the function usage.
- If unbounded is true, any number (including 0) of the last parameter type may appear in the function usage.
  - Note that there is no possibility of confusing which parameter is which, since their position uniquely identifies them.
- Default value of unbounded is false
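
The unbounded rules above amount to an argument-count check plus a position-to-parameter mapping. A minimal sketch follows; the function names are hypothetical, and treating a signature with no parameters as fixed even when unbounded is true is an assumption, since the proposal does not cover that case.

```python
def arity_matches(param_count: int, unbounded: bool, arg_count: int) -> bool:
    """Check whether a call with arg_count arguments fits a signature."""
    if not unbounded or param_count == 0:
        # Fixed signature: every parameter appears exactly once.
        return arg_count == param_count
    # All parameters before the last appear once; the last appears 0..n times.
    return arg_count >= param_count - 1


def parameter_index(position: int, param_count: int) -> int:
    """Index of the parameter definition governing the argument at position."""
    # Arguments past the end of the list all map to the last parameter.
    return min(position, param_count - 1)
```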
If no implementation is specified then it’s assumed that the orchestrator needs to be preconfigured to handle the function call
The functions section can be defined either outside or inside a service_template section
- Function definitions outside a service_template can be within a profile TOSCA file or imported TOSCA file
  - Namespacing works like for types
    - Overlapping definitions under the same <function_name> are not allowed
- Function definitions inside a service_template that have the same <function_name> are considered a refinement of the homonymous definition outside the service_template
- This allows for two separate moments in function design:
  - At profile design time (outside service_template)
    - when e.g. the parameters and result are defined and can be used in the type definitions
  - At service template design time (inside service_template)
    - when the implementation or description may be added/changed
    - file references within a current CSAR can be calculated

calincurescu · 2022-07-05T09:38:28Z

Examples

functions:
  sqrt:
    - parameters:
      - type: integer
        description: This is the parameter
        constraints: [ greater_or_equal: 0] 
      result:
        type: float
        description: Returns a float
      unbounded: false
      description: takes the square root of an integer, returns a float
      implementation: scripts/sqrt.py
    - parameters:
      - type: float
        constraints: [ greater_or_equal: 0.0] 
      result:
        type: float
        description: Returns a float
      description: Takes the square root of a float, returns a float
      implementation: scripts/sqrt.py

The next sqrt is similar to above, but uses a simplified type notation (no constraints check can be expressed):

functions:
  sqrt:
    - parameters: [ integer ]
      result: float
      implementation: scripts/sqrt.py
    - parameters: [ float ]
      result: float
      implementation: scripts/sqrt.py

functions:
  my_func_with_diff_param_types:
    - parameters:
      - type: MyType1
        description: "this is the first parameter reprez..."
      - type: string
        description: "this is the second parameter reprez..."
      - type: string
        description: "this is the third parameter reprez..."
      - type: MyType2
        description: "this is the unbounded parameter (zero to infinite) reprez..."
      unbounded: true
      result: 
        type: MyTypeRez
      implementation: scripts/my.py

Same as the above, but in compact notation:

functions:
  my_func_with_diff_param_types:
    - parameters: [MyType1, string, string, MyType2]
      unbounded: true
      result: MyTypeRez
      implementation: scripts/my.py
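
One way a parser might treat the compact notation is to expand it into the full form before validation. The sketch below assumes that a bare type name is shorthand for a schema with only `type` set; the helper names are hypothetical and the signature is an already-parsed YAML map.

```python
from typing import Any, Dict, Union

SchemaDef = Union[str, Dict[str, Any]]


def normalize_schema(schema: SchemaDef) -> Dict[str, Any]:
    """Expand a bare type name such as "integer" into {"type": "integer"}."""
    if isinstance(schema, str):
        return {"type": schema}
    return dict(schema)


def normalize_signature(signature: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a signature with parameters and result in full form."""
    normalized = dict(signature)
    normalized["parameters"] = [
        normalize_schema(param) for param in signature.get("parameters", [])
    ]
    normalized["result"] = normalize_schema(signature["result"])
    # unbounded defaults to false when omitted.
    normalized.setdefault("unbounded", False)
    return normalized
```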

Polymorphism allows a function to accept parameters and return results of different kinds:

functions:
  union:
    - parameters:
      - type: list
        entry_schema: integer
      unbounded: true
      result:
        type: list
        entry_schema: integer
      implementation: scripts/libpi.py
    - parameters:
      - type: list
        entry_schema: float
      unbounded: true
      result:
        type: list
        entry_schema: float
      implementation: scripts/libpi.py
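
A processor could pick between polymorphic signatures by trying them in order and taking the first whose parameter types accept the arguments. The following is a rough sketch under that assumption (the proposal does not define a resolution order); the type checker is hypothetical and covers only integer, float, and list.

```python
from typing import Any, Dict, List, Optional

Schema = Dict[str, Any]


def value_matches(schema: Schema, value: Any) -> bool:
    """Very small type check; any other TOSCA type is out of scope here."""
    kind = schema["type"]
    if kind == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if kind == "float":
        return isinstance(value, float)
    if kind == "list":
        entry = schema["entry_schema"]
        if isinstance(entry, str):
            entry = {"type": entry}
        return isinstance(value, list) and all(value_matches(entry, v) for v in value)
    return False


def select_signature(signatures: List[Schema], args: List[Any]) -> Optional[Schema]:
    """Return the first signature that accepts args, or None."""
    for signature in signatures:
        params = signature.get("parameters", [])
        if not params:
            if not args:
                return signature
            continue
        if signature.get("unbounded", False):
            if len(args) < len(params) - 1:
                continue
        elif len(args) != len(params):
            continue
        last = len(params) - 1
        # Arguments past the end of the list are checked against the last parameter.
        if all(value_matches(params[min(i, last)], arg) for i, arg in enumerate(args)):
            return signature
    return None


# The two union signatures from the example above, as parsed YAML.
UNION = [
    {"parameters": [{"type": "list", "entry_schema": "integer"}], "unbounded": True,
     "result": {"type": "list", "entry_schema": "integer"}},
    {"parameters": [{"type": "list", "entry_schema": "float"}], "unbounded": True,
     "result": {"type": "list", "entry_schema": "float"}},
]
```

Note that a call mixing integer lists and float lists matches neither signature in this sketch.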

Defining a list in a map parameter:

functions:
  complex_arg_function:
    - parameters: 
      - string
      - type: list
        entry_schema: integer
      - integer
      - type: map
        key_schema: string
        entry_schema:
          type: list
          entry_schema: MyType5
      - string
      unbounded: true
      result: string
      implementation: scripts/complex.py

tliron · 2022-07-07T01:15:50Z

I'm not in favor of adding this complex DSL for declaring function signatures. I'd prefer that we leave function validation up to the implementation. This proposal limits the usability of custom functions.

Indeed, some built-in functions can't be expressed by this DSL, specifically get_property and get_attribute which use a complex TOSCA path and a return type that is dependent on resolving that path. Furthermore, I imagine that a common use case for custom functions would involve accessing environmental data, where parameter and return types are not known until Day 1 or Day 2, or maybe it otherwise depends on an external tool.

So, if the implementation gets a bad parameter, it should emit an error. Also, if the return value does not match the type and constraints of its value site, there should also be an error. The advantages of checking types at design time are small and limited: the values are often just as important as the types, and these cannot always be validated by TOSCA constraints (e.g. the ID for an external object). Validation happens in the way the TOSCA processor will implement these connected functions. This extra layer of syntactical validation is overly complex, limited in scope, and has limited benefits. Note that most dynamic programming languages do not require function signatures.

Assuming others insist on having this complex DSL, I would insist that we make it entirely optional. So, if a function is used without a declared signature it will not be validated by parsers. Call it a "signature-less" function if you will. Since we are requiring the $ prefix there is no ambiguity as to whether a map-with-one-key is a function call or not.

Also, a fix for the initial post. I think the proposed syntax is imprecise. Here's my proposal:

For TOSCA values, for any YAML map at any nested depth, the following parsing rules apply to its keys:

Is the key a string starting with $ and with length > 1? If yes:
- Is the second character $? If so, discard the first $ and stop here (escape). If not:
  - Is this the only key? If not, emit a parsing syntax error ("malformed function"). If yes:
    - Is the value of the key a seq? If not, emit a parsing syntax error ("malformed function"). If yes:
      - This is a function call!

Note that these rules do not apply to string values that are not map keys. Thus there is no need to escape a $ prefix for them. [To fix your initial example]:

properties:
  prop1: $myid # valid as is, no need to escape
  prop2: my_value

But here we do need to escape:

properties:
  $$prop1: myid # if we don't escape we will get a "malformed function" syntax error
  prop2: my_value
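
The key-parsing rules above can be sketched for a single parsed YAML map as follows. The `parse_map` name, the `MalformedFunction` exception, and returning the function name without its $ prefix are all choices made for this illustration, not part of the proposal.

```python
from typing import Any, Dict, Optional, Tuple


class MalformedFunction(ValueError):
    """Raised for the "malformed function" parsing syntax error."""


def parse_map(mapping: Dict[Any, Any]) -> Tuple[Optional[str], Any]:
    """Return (name, args) for a function call, else (None, map with keys unescaped)."""
    plain: Dict[Any, Any] = {}
    for key, value in mapping.items():
        if isinstance(key, str) and key.startswith("$") and len(key) > 1:
            if key[1] == "$":
                # Escape: discard the first "$" and keep the key as a plain key.
                plain[key[1:]] = value
                continue
            if len(mapping) != 1:
                raise MalformedFunction(f"{key!r} must be the only key in its map")
            if not isinstance(value, list):
                raise MalformedFunction(f"the value of {key!r} must be a sequence")
            return key[1:], value  # This is a function call.
        plain[key] = value
    return None, plain
```

String values are never inspected, so `prop1: $myid` passes through unchanged, while an unescaped `$prop1` key next to `prop2` raises `MalformedFunction`.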

calincurescu · 2022-07-11T14:11:46Z

I fully agree that the function definition should not be mandatory: if a function without a definition is encountered by the parser, it is neither validated nor invalidated, and it is assumed that the TOSCA processor implementation knows how to deal with it.
- Moreover, function names in function definitions should not overlap with the built-in TOSCA functions.
In Ericsson we need such a custom function definition feature since we would like to ship custom function implementations in the CSAR, so that there is a certainty that the functions will be supported at the destination.
I think that there is a value to have a signature-less function definition w.r.t. the parameters. I propose that we use a new non-required boolean keyname: freeform.
- If freeform is true and no parameters are defined then any list can be the input to the function.
- If freeform is false and parameters are not defined then we have a function without parameters.
- freeform must not be true when parameters are defined.
- Default value for freeform is false.
- A function with freeform true can have only one signature.
- If we would apply a freeform to the result then there would be a complexity problem with nested functions since it would be hard to decide if the inner function freeform result should be cast to a certain type or a different function signature should be used for the outer function.
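
The freeform rules listed above could be checked per function roughly as follows. This is a sketch only: `check_freeform` is a hypothetical helper, and it assumes "parameters defined" means that the parameters keyname is present in the signature.

```python
from typing import Any, Dict, List


def check_freeform(signatures: List[Dict[str, Any]]) -> List[str]:
    """Return the freeform rule violations for one function's signatures."""
    errors: List[str] = []
    for index, signature in enumerate(signatures):
        # freeform defaults to false when omitted.
        if not signature.get("freeform", False):
            continue
        if "parameters" in signature:
            errors.append(f"signature {index}: freeform is true but parameters are defined")
        if len(signatures) > 1:
            errors.append(f"signature {index}: a freeform function can have only one signature")
    return errors
```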
I think that if functions are defined then the result type must be defined, since these are textual functions, used for data transformation, so there is no reason to have no clearly typed result or having no result. Note that the built-in functions (such as get_property) can be connected to TOSCA processor behavior which is different from these functions.
I am fine with only checking the $ and applying escaping in the key of a key-value pair to determine whether it is a function. Users will then need to understand, though, that the escaping works only for keys, not for values. That is, if they have a map as a value for, say, a property, then only the $ in the keys of that map needs to be escaped.

tliron · 2022-07-11T15:45:42Z

* In Ericsson we need such a custom function definition feature since we would like to **ship custom function implementations in the CSAR**, so that there is a certainty that the functions will be supported at the destination.

"Certainty" is a very big word. :) Note that we have not established (and I don't think we should) the mechanism for calling that "implementation" artifact, whatever it is (is it a TOSCA artifact?). Khutulun, for example, uses gRPC for all delegates/plugins. At best you have certainty that other parsers would just check parameter and return types (without constraints).

And also I fail to see how this complex signature DSL really helps Ericsson. I think all you need here is a map connecting function names to the implementation "artifacts". When the implementation is called the function will be fully validated, including actual values (and not just types) with constraints. This proposed DSL adds a lot of complexity for not much tangible benefit in my view: some limited Day 0 syntax validation for a subset of functions.

 * I think that there is a value to have a signature-less function definition w.r.t. the parameters. I propose that we use a new non-required boolean keyname: **freeform**.

Maybe "dynamic"? That's what it's called in programming languages.

  * If freeform is true and no parameters are defined then any list can be the input to the function.

"dynamic" functions should also have an unknown-typed return value. That's probably the more important differentiator.

  * If we would apply a freeform to the result then there would be a complexity problem with nested functions since it would be hard to decide if the inner function freeform result should be cast to a certain type or a different function signature should be used for the outer function.

Of course. But we already have such dynamic functions in TOSCA, actually the most important functions: get_property and get_attribute. You would have to follow the TOSCA path, which might only be available in Day 1.

* I think that if functions are defined then **the result type must be defined**, since these are textual functions, used for data transformation, so there is no reason to have no clearly typed result or having no result. Note that the built-in functions (such as get_property) can be connected to TOSCA processor behavior which is different from these functions.

This might be our disconnect. You are thinking of functions like concat. I am thinking of functions like get_security_context.

Though even many textual functions are "dynamic". For example, a str function that converts any value to a string, or a sprintf function, etc. It seems to me that dynamic functions will be more common than strictly-typed signatures.

calincurescu · 2022-07-11T16:31:40Z

In this definition of custom functions I meant only the concat type of functions.

* Though even many textual functions are "dynamic". For example, a str function that converts any value to a string, or a sprintf function, etc. It seems to me that dynamic functions will be more common than strictly-typed signatures.

Agree, this was the idea with the freeform keyname.

tliron · 2022-07-12T15:42:50Z

See discussion in issue #68

lauwers · 2022-09-01T22:05:23Z

How do we deal with optional parameters in custom function definitions? Let's use the current get_artifact intrinsic function as an example. It defines two mandatory arguments (the modellable entity name and the artifact name) and two optional arguments (the location and the remove flag). I imagine we could define multiple signatures, one for each of the possible combinations of optional arguments, as follows:

functions:
  get_artifact:
    - parameters:
      - type: string
      - type: string
    - parameters:
      - type: string
      - type: string
      - type: string
    - parameters:
      - type: string
      - type: string
      - type: string
      - type: boolean

That seems a bit cumbersome. Could we instead add support for the required keyname to the parameter definitions as follows:

functions:
  get_artifact:
    - parameters:
      - type: string
      - type: string
      - type: string
        required: False
      - type: boolean
        required: False

The validator would of course have to make sure that no required parameters are defined after optional parameters.
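
The ordering check mentioned above might look like the sketch below. The helper name is hypothetical, and defaulting `required` to true when omitted is an assumption borrowed from how TOSCA property definitions behave.

```python
from typing import Any, Dict, List, Optional


def first_misplaced_required(parameters: List[Dict[str, Any]]) -> Optional[int]:
    """Return the index of the first required parameter after an optional one."""
    seen_optional = False
    for index, parameter in enumerate(parameters):
        if not parameter.get("required", True):
            seen_optional = True
        elif seen_optional:
            # A required parameter follows an optional one: invalid ordering.
            return index
    return None
```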

lauwers · 2022-09-01T22:30:04Z

Even when a function defines multiple signatures, I find that in most cases I want to use the same implementation for all signatures of the same function. Would it make more sense to associate the implementation with the function as a whole rather than (or in addition to) each individual signature?

lauwers · 2022-09-01T22:32:43Z

We currently support description on function signatures only, not on functions as a whole. I find that in most cases I want to associate a description with a function as a whole, rather than (or in addition to) each individual signature. On a related note, we should also support metadata on functions and function signatures.

calincurescu · 2022-09-05T10:09:05Z

Regarding optional parameters for a function: all parameters after the first optional parameter must be optional, and if the optional parameter at position x is used, then all the optional parameters at positions before x must be used too (exactly as in the multi-signature get_artifact example). Letting designers set optional on each parameter would let them mark an "inner" parameter as optional, in which case the usage would not be distinguishable.

I propose to use one signature-level keyname, optional_from, that designates the index from which the parameters are optional (index 0 means from the first parameter inclusive, index 1 from the second parameter inclusive, etc.). Taking the get_artifact example:

functions:
  get_artifact:
    - parameters: [string, string, string, boolean]
      optional_from: 2
      unbounded: false
      result: string
      implementation: scripts/get_artifact.py
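
To illustrate how optional_from relates to the earlier multi-signature form, the sketch below expands one parameter list into the equivalent set of fixed parameter lists. The `expand_optional_from` helper is hypothetical and ignores unbounded.

```python
from typing import List


def expand_optional_from(parameters: List[str], optional_from: int) -> List[List[str]]:
    """List every allowed parameter prefix, from optional_from up to all parameters."""
    if not 0 <= optional_from <= len(parameters):
        raise ValueError("optional_from must lie between 0 and the number of parameters")
    # Optional parameters can only be dropped from the end, never from the middle.
    return [parameters[:count] for count in range(optional_from, len(parameters) + 1)]
```

For `[string, string, string, boolean]` with `optional_from: 2` this produces the same three parameter lists as the multi-signature get_artifact example earlier in the thread.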

calincurescu · 2022-09-05T10:22:45Z

Regarding the implementation, I think it's useful to also be able to give it per signature, e.g. when we don't want to touch the implementation later but want to extend the signature. Several signatures can still share the same artifact in the implementation specification.

If we allow the description, metadata, and implementation at the function level, then we would need to gather all the signatures under a keyname, e.g. signatures:

functions:
  get_artifact:
    signatures:
      - parameters: [string, string, string, boolean]
        optional_from: 2
        unbounded: false
        result: string
    implementation: scripts/get_artifact.py
    description: "function level description"

lauwers · 2023-02-20T00:38:06Z

The proposed function definition syntax has been approved and is documented in https://docs.oasis-open.org/tosca/TOSCA/v2.0/csd05/TOSCA-v2.0-csd05.html#_Toc125468802. Issue #140 has been created to track discussions about function refinement.

tliron mentioned this issue Jul 12, 2022

Function notation: prefix character #67

Closed

pmbruun mentioned this issue Jul 12, 2022

Harmonize condition syntax across workflows and policies #122

Open

lauwers closed this as completed Feb 20, 2023
