Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/ottl] Add condition parser #14783

Closed

Conversation

TylerHelmuth
Copy link
Member

@TylerHelmuth TylerHelmuth commented Oct 7, 2022

Description:
Adds a new function to Parser that allows parsing conditionStatements, which is a new type of statement that represents only a booleanExpression. These types of statements do not include an Invocation or where token.

This new functionality can be used by any component to create complex boolean expressions and then invoke the expression during the hot path to make decisions. The routingprocessor is an example of a component that could use this functionality instead of the existing functionality, where it is forced to add a noop function to the statements.

Link to tracking Issue:
Closes #13545

Testing:
Updated unit tests

Documentation:
If we like this approach I will update documentation.

@kovrus
Copy link
Member

kovrus commented Oct 10, 2022

@TylerHelmuth Thanks for doing that! :) I think this change was inspired by OTTL usage in the routing processor #13158 (comment).

We added the OTTL support to the routing processor, as you mentioned above, by adding the Noop function to the only currently supported OTTL statement type. In the future, ideally, it would be great if the routing processor can have support for both statement types: invocation + boolean expression and boolean expression, so we can express routing conditions like resource.attributes["X-Tenant"] == "acme" or delete_key(resource.attributes, "X-Tenant") where resource.attributes["X-Tenant"] == "acme". That will be harder to do with the parser instance per statement type approach (what is introduced by this PR); one of the ways to implement it would be first parsing a routing condition with one parser and then trying another parser if the first fails. Therefore, I was thinking whether it is worth looking into changing the parser structure in a long perspective.

So currently, in this PR, we have two mutually exclusive ASTs that are expressed via conditionStatement and transformationStatement structs, two separate parser instances that handle them, and ParseConditions and ParseStamenets methods that perform statements parsing plus "analysis" of those statements, e.g. in case of transformationStatement we are checking whether a function name that was parsed corresponds to registered/implemented functions, etc. After that, we perform an evaluation of those statements in the transform, routing processors, or elsewhere.

I was thinking about the following, what if we can have a single instance of the parser that is able to parse all supported by OTTL grammar statements? So we will have one AST with all possible OTTL statements, so the grammar can look as follows:

statement:
    condition (boolean expression) 
  | transformation (ident + optional boolean expression) 

So we will have two types of parsed statements (maybe more later). After statements parsing, we can introduce a new layer of abstraction ("analysis") that would basically do all required validations of the parsed statements, e.g. functions look-ups, validation of whether the parsed statement type can be used in a certain context/component (we cannot use the condition statement in the transform processor), etc. It can be some kind of interface that has to be implemented when we create an instance of OTTL in components, but it is probably impl details. After the "analysis" phase, we will have some kind of analyzed statements that can be already evaluated. So essentially,

ottl package

ottl:
- parser -> parsed statement(s)
- analyzer -> analyzed statement(s) 


components

ottl:
- evaluation -> based on an analyzed statement

I guess this can introduce more flexibility in the adoption and usage of OTTL. Let's look at two examples:

  1. OTTL in routing processor: See the requirements for OTTL usage in this component above. Basically, the routing condition can be a transformation or condition statement, then after parsing and analyzing the routing condition we will end up with either condition or transformation analyzed statement that we will later evaluate based on its type.
  2. Future OTTL extensions: Maybe at some point, there will be a need to extend OTTL with a new statement type, e.g. some kind of a groupingStatement, etc. Imho, then we would need to extend the grammar and the analysis part without adding a new type of parser. (Maybe it is not the best example of a new statement :) since it can be a part of the transform statement).

I hope some of it makes some sense, however, I am sure that I am missing a lot of OTTL design discussions and some decisions, so this input might sound weird :)

@TylerHelmuth
Copy link
Member Author

TylerHelmuth commented Oct 10, 2022

@kovrus thanks for the detailed feedback.

This PR proposes adding parsing for condition functions via a new Parser member function ParseConditions. As a result, users of the OTTL have to know what type of OTTL statement is being used (conditionStatement or transformationStatement) so that either ParseConditions or ParseStatements (which I meant to rename to ParseTransformations) can be called correctly.

If I am understanding correctly, from the routing processor's perspective, having to differentiate between a conditionStatement and transformationStatement is not preferred. You'd like the routing processor to be able to mix conditionStatements and transformationStatements in the same list, passing the full list to a designated parsing function. This is because the routing processor would like to be able to, when making a routing decision, transform the data using a condition and also making a routing condition with the condition.

The OTTL grammar could definitely be updated to make the Invocation and where keyword conditional and the parsing logic could definitely be updated to return Statements with a nil Function field. I purposely went a different route to keep the grammar simpler and because I liked the separation of concerns.

I am worried if we try to combine statement types into 1 return type from ParseStatements things could get complicated quickly and we'd end up with struct that has a lot of conditionally nil fields. Or are you suggesting the value returned by ParseStatements would be an interface that could call Execute and Execute would handle the condition and transformation itself, like @bogdandrutu suggested? That would solve the ugly pattern we have today in the transform processor of calling Condition and then Function.

I was instead anticipating that components that rely on OTTL would use either different config sections, or in the case of the routing processor, a type field that would identify an expression as Transformation or Condition.

From a technical perspective, I think it will be tricky to update parsedStatement to conditionally require Invocation, as the presence of the invocation will dictate whether or not where is necessary. I'm not sure if the grammar library allows look ahead/look behind.
There are ways to handle this.

@kovrus
Copy link
Member

kovrus commented Oct 11, 2022

If I am understanding correctly, from the routing processor's perspective, having to differentiate between a conditionStatement and transformationStatement is not preferred. You'd like the routing processor to be able to mix conditionStatements and transformationStatements in the same list, passing the full list to a designated parsing function. This is because the routing processor would like to be able to, when making a routing decision, transform the data using a condition and also making a routing condition with the condition.

Yes, that's correct.

The OTTL grammar could definitely be updated to make the Invocation and where keyword conditional and the parsing logic could definitely be updated to return Statements with a nil Function field. I purposely went a different route to keep the grammar simpler and because I liked the separation of concerns.

Not exactly what I was trying to suggest. I was thinking more of modifying the grammar in the way that the root struct of the grammar would contain valid ottl statements (invocation + bool condition, bool condition, etc.) no need to modify valid statements by nulling some of its parts. So I guess, it is basically what you have in this PR, but just with placing conditionStatement and transformationStatement under the common root struct, so then the parser can capture those structs (statements) recursively. In the next step, we can convert those parsed statements to corresponding structs that now (your recent PR with changes to the Statement struct) will implement the Statement interface. I am not sure how the conversion of ParsedStatement to Statement can look like, you are right it can be a bit ugly.

I was instead anticipating that components that rely on OTTL would use either different config sections, or in the case of the routing processor, a type field that would identify an expression as Transformation or Condition.

That sounds like a solution.

@TylerHelmuth
Copy link
Member Author

TylerHelmuth commented Oct 11, 2022

So I guess, it is basically what you have in this PR, but just with placing conditionStatement and transformationStatement under the common root struct, so then the parser can capture those structs (statements) recursively.

This is interesting. Were you thinking something like

type statement struct {
	TransformationStatement transformationStatement `parser:"( @@"`
	ConditionStatement      conditionStatement      `parser:"| @@ )"`
}

type transformationStatement struct {
	Invocation  invocation         `parser:"@@"`
	WhereClause *booleanExpression `parser:"( 'where' @@ )?"`
}

type conditionStatement struct {
	BooleanExpression *booleanExpression `parser:"@@"`
}

With this structure statement acts exactly like the grammar's Value struct, where exactly one of the types of statements would be matched. I think with the new Statement interface we could remove the need for the user to know what type of statement was just parsed; .Execute could handle the logic. And .Execute would continue to handle proper execution if more statement types were added.

One thing I don't like with this type of struct is when a new statement type requires new functions on the Statement interface that transformationStatement and conditionStatement don't require. standardStatement wouldn't want to define the "one-off" function, and the interface can't be implemented by the user because ParseStatements needs to know how to use it. If standardStatement did implement it, it would be no-op for some types of statements.

@kovrus
Copy link
Member

kovrus commented Oct 11, 2022

type statement struct {
	TransformationStatement transformationStatement `parser:"( @@"`
	ConditionStatement      conditionStatement      `parser:"| @@ )"`
}

Exactly!

With this structure statement acts exactly like the grammar's Value struct, where exactly one of the types of statements would be matched.

Yes, exactly one (valid in OTTL terms) statement type would be matched.

I think with the new Statement interface we could remove the need for the user to know what type of statement was just parsed; .Execute could handle the logic. And .Execute would continue to handle proper execution if more statement types were added.

Yes, components should not be aware of the grammar and parser implementation details (parsed statements, types, etc.). However, using some statements might not make sense for some components (e.g. conditionStatement in transform processor) or even can be forbidden, thus, I was proposing some kind of "analysis" phase that can verify such cases. (i kind of briefly described it in the first message)

One thing I don't like with this type of struct is when a new statement type requires new functions on the Statement interface that transformationStatement and conditionStatement don't require.

I guess we should be fine with this abstractionStatement#Execute for most statements, but I agree, it is a valid concern.

All in all, with this pattern it would be possible to avoid adding Parser member functions like ParseConditions and other similar functions if we want to add support for new statements. Also, we wouldn't have to create an instance of participle.Parser in OTTL Parser for each new statement as well.

@TylerHelmuth
Copy link
Member Author

Once the Statement Interface PR is merged I will play with this concept.

@bogdandrutu
Copy link
Member

@TylerHelmuth expect significant changes on this to make it consistent with the Statement PR.

@TylerHelmuth
Copy link
Member Author

@bogdandrutu are you referring to the conversation @kovrus and I have been having or simply merging in main and adjusting the 2 Parse* functions?

@TylerHelmuth
Copy link
Member Author

@kovrus here is a quick draft of what it could look like if the parser knows how to handle both types of statements: #14918

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 27, 2022
@github-actions
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[pkg/ottl] expose a parser explicitly for parsing conditions
5 participants