-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pkg/ottl] Update statements to reflect their context #29017
Comments
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
For this issue, I was thinking of proposing the following syntax:
You could then allow statements to either be what they are today, or this new syntax which carries the context. Example:
Note: the |
@jsuereth for context, during a working session at Kubecon we imagine the solution would be to include the context in all paths. So something like: transform:
error_mode: ignore
trace_statements:
- context: resource
statements:
- keep_keys(attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
- context: span
statements:
- set(status.code, 1) where attributes["http.path"] == "/health" becomes transform:
error_mode: ignore
traces_statements:
- keep_keys(resource.attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
- set(span.status.code, 1) where span.attributes["http.path"] == "/health" And the parser could infer the context based on the "lowest" specified context in the statement. A big benefit to this solution is the ability to build the new statements from existing config. I believe there is a way to implement this solution without incurring a breaking change, leaving both configurations available until OTTL goes stable. |
Hi @TylerHelmuth! I'm not sure if you already have an implementation in mind, but I did prototype a solution for this issue locally, and it would be nice to discuss it with you. The main changes I'm proposing would be: Transform Processor Configuration: For supporting both configuration styles, we could implement a custom
Example of mixed configuration that would also work with the spiked Unmarshal function: trace_statements:
- set(span.status.code, 1) where span.attributes["http.path"] == "/health"
- set(span.status.code, 1) where span.attributes["http.path"] == "/health"
- context: spanevent
statements:
- set(attributes["domain"], "test.com") OTTL contexts: Given the context name would become part of the path, all OTTL contexts An alternative to skipping the path's context would be changing the grammar to extract the context name to another field, which demonstrated to be much more complex giving that valid contexts names vary depending on the parent's context. In addition to that, It wouldn't be useful to infer the context, as the transform processor needs to know it beforehand to then execute the proper parser. Expressing that on participle is also challenging. If we had a different syntax in mind, e.g. the suggested That extra context condition wouldn't affect other processor's OTTL usages ( Transform processor:
Considering we know at parsing time which contexts names are valid, and that each flat configuration statement line becomes a separate For the given configuration:
It would use pre-compiled regexes (per statement type) to find all valid context names on the statement. For this (?:[^"])(?P<context>span|spanevent|resource|scope)(?:\.(?P<field>\w+)) Given the above regex captured context names, it would infer the Trying parsing the statement in each context would probably be easier & safer, but in terms of performance, I think the regex-based infer mechanism - although not optimal - would be faster depending on the statement. For example, parsing a statement with a lower context
As you can see on the benchmark, the regex has an overhead, but it's still much smaller than trying to parse the statement in another context ( As I mentioned, I've ran this proposal code locally and it seems to be working fine so far, I'm still testing it, and I'm probably missing some points. It would be great if you could give it a thought and share your opinion. I'd be happy to implement any other idea you might have in mind. Thank you! |
@edmocosta thanks for the excellent investigation into this issue.
The original discussion we had at Kubecon NA 2023 was that we do want to build this concept into the grammar. We liked this idea because it makes OTTL syntax consistent everywhere. One of the problems today is that an OTTL statement (or condition) cannot be copy and pasted from 1 component to another unless you're use the context is the same. There is also nothing enforcing to components express the desired context in the same way. Requiring the context as part of the grammar would solve that problem. Another reason we liked this idea is it made paths more readable. When the current context is defined somewhere else in the config (or even the statement), then a reader must remember what the current context is when reading a path that has not prefix. But if the context must always prefix a path, then there is no ambiguity. As for the technical implementation I believe it is possible to add all this business logic somewhere in the parser/ottlcontext and not the grammar itself. I believe our idea was something like "give a path has been parsed, the first part MUST be the context. If a path is only 1 part, it is invalid", or something like that. An actual change to the grammar is probably a better solution bc we could enforce all paths have a context and capture it in its own field type path struct {
Context string `parser:"@Lowercase '.'"`
Fields []field `parser:"@@ ( '.' @@ )*"`
} Or something like that.
I believe this issue could be solved by adding some more features to OTTL's package. At Kubecon we discussed a brute force solution (because it happens at startup) where the parser infers the context to be the "lowest" context defined in the statement. In addition, the parser would need to be told which contexts to consider (and their order), and it would try parsing at the "highest" context, returning the parsed statements for the fist context that was able to successfully parse all the statements. It would probably look pretty similar to the transformprocessor's existing Using regex to identify the concept also seems reasonable, but I believe it should be a OTTL concern, not a component concern. Performance will end up being a secondary concern since this is all happening at startup. I like the concept of not needing to brute force to check which context should be used.
I really like the sound of this idea, at least for some amount of transition time. Our goal with this change was to try and not break existing OTTL users - we figured we'd be able to build the new statements using the existing config structs. It sounds like you've confirmed this is possible. As for breaking changes in OTTL's public API, we can do those all we want. If you're interested in continuing this work, I am very interested in seeing what it is like to implement a solution via the grammar instead of via the transformprocessor. |
Hi @TylerHelmuth!
Yes, I completely agree with that idea, it's simpler to read and to understand and I think that's the way to go. That's definitely not a blocker for implementing the proposed solution in a BC way, during the transition time, we could maybe handle it on the parser, and instead of using the valid contexts list to validate the path prefix, we could use it to temporary keep BC, changing the parser fields := path.Fields
...
if path.Context != "" {
if _, ok := p.validContexts[path.Context]; !ok {
// if the context is not recognised, it's likely to be part of the field name (old structure without context).
// when the transition is complete, we can just return an error here.
fields = append([]field{{Name: path.Context}}, fields...)
// maybe log some warn/deprecation message about paths without context?
}
} Pros of changing this function instead of adding the logic into the ottlcontexts are backward compatibility from the source (parser), no need to change the OTTL context's On the other other hand, the components consuming those paths would still need to handle both versions, with and without the context prefix. No validations would happen at the transition time. Do you have any thoughts on this and how it should be handled?
I see, that makes sense! As you mentioned, I think we could come up with a very similar I'm not sure yet if we can implement it totally generic, without any dependency on the ottlcontexts for example (probably not). Either way, IMO, having the ottlcontext as a dependency wouldn't be an issue considering the parser collection's purpose. I'll investigate it further during the implementation.
Agreed.
Correct! that's possible. Please let me know your thoughts! Thanks! |
I agree that a change in the grammar would be a breaking change for parsing a statement like transform:
trace_statements:
- context: span
statements:
- set(attributes["test"], "pass") Then the string passed into the parser would be changed to If we can modify the input string using the configured context then I believe we can do this change with no breaking changes to user config. There will definitely be breaking changes to OTTL APIs in places, but we aren't worried about breaking those. With this strategy components can then expose support for both the old style and the new style allowing a transition period. The component could even print out the modified statements as a way to help users transition. An additional idea to play around with would be making the
I would definitely love to avoid having to change parsePath functions. I think adding the |
Hi @TylerHelmuth! Thank you again for your patience and support, I'm still getting up to speed with the OTel codebase/way of working/standards/etc, so apologise for so many questions and perhaps suggestions that might not make much sense :)
I forgot to mention it, but that was the idea behind my previous suggestion. If we make the
We would be able to make it non-breaking to all components - without modifying them - by changing the func (p *Parser[K]) newPath(path *path) (*basePath[K], error) {
fields := path.Fields
...
if path.Context != "" {
if p.validContexts == nil || len(p.validPathContexts) == 0 {
fields = append([]field{{Name: path.Context}}, fields...)
} else {
if _, ok := p.validContexts[path.Context]; !ok {
return nil, fmt.Errorf("invalid path context %s.%s: %[1]s", path.Context, originalText)
}
}
}
...
} Given that all existing I might be wrong, but I think this should be enough to keep the parser and grammar backward compatible, and we could introduce these changes in an individual PR. The next step would be building the statements rewrite utility, so components would be able to change their config text adding the missing path's context. I'm still unsure how this utility would be implemented: using regex then appending the context perhaps, or maybe parsing the statements text into the grammar AST, changing the paths, and then parsing it back to string (somehow). Any ideas on this? Alternatively to the rewrite utility, when I mentioned
I meant (coded like this to compare to the above version): func (p *Parser[K]) newPath(path *path) (*basePath[K], error) {
fields := path.Fields
...
if path.Context != "" {
if p.validContexts == nil || len(p.validContexts) == 0 {
fields = append([]field{{Name: path.Context}}, fields...)
} else {
if _, ok := p.validContexts[path.Context]; !ok {
- return nil, fmt.Errorf("invalid path context %s.%s: %[1]s", path.Context, originalText)
+ fields = append([]field{{Name: path.Context}}, fields...)
}
}
}
...
} This change would make the parser support both versions, with and without the path's context prefix. When we finish migrating everything, we could change it back to the validation mode. Please note that the Finally, we would need to add the parser wrapper implementation, that would infer the context from the statements, pick the right parser, and execute the parse statements operation. At this point, we could introduce an extra parser config for complementing the p, err := ottl.NewParser[ottllog.TransformContext](
functions,
pep.parsePath,
telemetrySettings,
ottl.WithEnumParser[ottllog.TransformContext](parseEnum),
ottl.WithValidContexts[ottllog.TransformContext]([]string{"log"})
ottl.RequirePathsContext[ottllog.TransformContext](),
) Last change would be on the |
I am definitely interested in trying this approach.
I think it is solvable using regex and some assumptions. Like if we know the context configured is
I think this would also work, but I'd really like to be able to print out for the user what the new statement syntax should be to help them transition. Maybe the ottlcontext could do that instead of the transformprocessor manipulating the input and printing it?
This part is still the most hazy aspect of this change in my mind, but if you have and idea I'm looking forward to see it implemented. @edmocosta at this point I'm interested in seeing an implementation - it doesn't need to pass tests and can include as many breaking changes as you want. |
…a path names (#34875) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This PR is part of the #29017, and introduces the necessary OTTL Grammar/Parser changes to make it possible for statements express their context via path names. This first iteration changes the grammar to parse the first `path` segment as the `Context` name, for example, the path `foo.bar.field` would be parsed as the following AST example: `{Context: "foo", Fields: [{Name: "bar"}, {Name: "field"}]}`, instead of `{Fields: [{Name: "foo"},{Name: "bar"}, {Name: "field"}]}`. To make it non-breaking to all components during the transition time and development, this PR also changes the `ottl.Parser[k].newPath` function to, by default, fall this new behavior back to the previous one, using the grammar's parsed `Context` value as the first path segment (`Fields[0]`). Two new `Parser[K]` options were added, `WithPathContextNames` and `WithPathContextNameValidation`, The first option will be used to enable the statements context support, parsing the first path segment as Context when the value maches any configured context names, or falling it back otherwise. The second option will be used to enable the validation and turning off the backward compatible mode, raising an error when paths have no context prefix, or when it's invalid. For more details, please check the #29017 discussion out. **Link to tracking Issue:** #29017 **Testing:** - Unit tests **Documentation:** - No documentation changes were made at this point.
…a path names (open-telemetry#34875) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This PR is part of the open-telemetry#29017, and introduces the necessary OTTL Grammar/Parser changes to make it possible for statements express their context via path names. This first iteration changes the grammar to parse the first `path` segment as the `Context` name, for example, the path `foo.bar.field` would be parsed as the following AST example: `{Context: "foo", Fields: [{Name: "bar"}, {Name: "field"}]}`, instead of `{Fields: [{Name: "foo"},{Name: "bar"}, {Name: "field"}]}`. To make it non-breaking to all components during the transition time and development, this PR also changes the `ottl.Parser[k].newPath` function to, by default, fall this new behavior back to the previous one, using the grammar's parsed `Context` value as the first path segment (`Fields[0]`). Two new `Parser[K]` options were added, `WithPathContextNames` and `WithPathContextNameValidation`, The first option will be used to enable the statements context support, parsing the first path segment as Context when the value maches any configured context names, or falling it back otherwise. The second option will be used to enable the validation and turning off the backward compatible mode, raising an error when paths have no context prefix, or when it's invalid. For more details, please check the open-telemetry#29017 discussion out. **Link to tracking Issue:** open-telemetry#29017 **Testing:** - Unit tests **Documentation:** - No documentation changes were made at this point.
…a path names (open-telemetry#34875) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This PR is part of the open-telemetry#29017, and introduces the necessary OTTL Grammar/Parser changes to make it possible for statements express their context via path names. This first iteration changes the grammar to parse the first `path` segment as the `Context` name, for example, the path `foo.bar.field` would be parsed as the following AST example: `{Context: "foo", Fields: [{Name: "bar"}, {Name: "field"}]}`, instead of `{Fields: [{Name: "foo"},{Name: "bar"}, {Name: "field"}]}`. To make it non-breaking to all components during the transition time and development, this PR also changes the `ottl.Parser[k].newPath` function to, by default, fall this new behavior back to the previous one, using the grammar's parsed `Context` value as the first path segment (`Fields[0]`). Two new `Parser[K]` options were added, `WithPathContextNames` and `WithPathContextNameValidation`, The first option will be used to enable the statements context support, parsing the first path segment as Context when the value maches any configured context names, or falling it back otherwise. The second option will be used to enable the validation and turning off the backward compatible mode, raising an error when paths have no context prefix, or when it's invalid. For more details, please check the open-telemetry#29017 discussion out. **Link to tracking Issue:** open-telemetry#29017 **Testing:** - Unit tests **Documentation:** - No documentation changes were made at this point.
**Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This PR is part of #29017, and adds a grammar utility to allow extracting `ottl.path` from parsed statements and expressions. The new functions will power the context inferrer utility (see [draft](#35050)), which implementation will basically gather all declared statement's `Path.Context`, and choose the highest priority one. For the statements rewriter utility purpose, it changes the `grammar.go` file including a new field `Pos lexer.Position` into the `ottl.path` struct. This value is automatically set by the `participle` lexer and holds the path's offsets, being useful for identifying their positions in the raw statement, without using regexes an being coupled to the grammar's definition. **Additional changes**: This proposal uses the visitor pattern to gather all path's from the parsed statements. Considering the grammar's custom error mechanism (`checkForCustomError`) also works with a similar pattern, I've added the visitor implementation as part of the grammar objects, and refactored all `checkForCustomError` functions to be part of a validator visitor. Differently of the current implementation, it joins and displays all errors at once instead of one by one, IMO, it's specially useful for statements with more than one error, for example: `set(name, int(2), float(1))`. *** We can change it back do be error-by-error if necessary. *** If modifying the custom validator is not desired, I can roll those changes back keeping only the necessary code to extract the path's + `Pos` field. **Link to tracking Issue:** #29017 **Testing:** Unit tests were added **Documentation:** No changes
…-telemetry#35174) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This PR is part of open-telemetry#29017, and adds a grammar utility to allow extracting `ottl.path` from parsed statements and expressions. The new functions will power the context inferrer utility (see [draft](open-telemetry#35050)), which implementation will basically gather all declared statement's `Path.Context`, and choose the highest priority one. For the statements rewriter utility purpose, it changes the `grammar.go` file including a new field `Pos lexer.Position` into the `ottl.path` struct. This value is automatically set by the `participle` lexer and holds the path's offsets, being useful for identifying their positions in the raw statement, without using regexes an being coupled to the grammar's definition. **Additional changes**: This proposal uses the visitor pattern to gather all path's from the parsed statements. Considering the grammar's custom error mechanism (`checkForCustomError`) also works with a similar pattern, I've added the visitor implementation as part of the grammar objects, and refactored all `checkForCustomError` functions to be part of a validator visitor. Differently of the current implementation, it joins and displays all errors at once instead of one by one, IMO, it's specially useful for statements with more than one error, for example: `set(name, int(2), float(1))`. *** We can change it back do be error-by-error if necessary. *** If modifying the custom validator is not desired, I can roll those changes back keeping only the necessary code to extract the path's + `Pos` field. **Link to tracking Issue:** open-telemetry#29017 **Testing:** Unit tests were added **Documentation:** No changes
…-telemetry#35174) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This PR is part of open-telemetry#29017, and adds a grammar utility to allow extracting `ottl.path` from parsed statements and expressions. The new functions will power the context inferrer utility (see [draft](open-telemetry#35050)), which implementation will basically gather all declared statement's `Path.Context`, and choose the highest priority one. For the statements rewriter utility purpose, it changes the `grammar.go` file including a new field `Pos lexer.Position` into the `ottl.path` struct. This value is automatically set by the `participle` lexer and holds the path's offsets, being useful for identifying their positions in the raw statement, without using regexes an being coupled to the grammar's definition. **Additional changes**: This proposal uses the visitor pattern to gather all path's from the parsed statements. Considering the grammar's custom error mechanism (`checkForCustomError`) also works with a similar pattern, I've added the visitor implementation as part of the grammar objects, and refactored all `checkForCustomError` functions to be part of a validator visitor. Differently of the current implementation, it joins and displays all errors at once instead of one by one, IMO, it's specially useful for statements with more than one error, for example: `set(name, int(2), float(1))`. *** We can change it back do be error-by-error if necessary. *** If modifying the custom validator is not desired, I can roll those changes back keeping only the necessary code to extract the path's + `Pos` field. **Link to tracking Issue:** open-telemetry#29017 **Testing:** Unit tests were added **Documentation:** No changes
… paths context (#35716) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PR is part of #29017, and adds the `ottl.Parser[K].AppendStatementPathsContext` function, allowing components to rewrite statements appending missing `ottl.path` context names. For examples, the following context-less statement: ``` set(value, 1) where name == attributes["foo.name"] ``` Would be rewritten using the `span` context as: ``` set(span.value, 1) where span.name == span.attributes["foo.name"] ``` **Why do we need to rewrite statements?** This utility will be used during the transition from structured OTTL statements to flat statements. Components such as the `transformprocessor` will leverage it to support both configuration styles, without forcing users to adapt/rewrite their existing config files. Once the component turns on the `ottl.Parser[K]` path's context validation, new configuration style usages will be validated, requiring all paths to have a context prefix, and old configuration styles will automatically rewrite the statements using this function. For more details, please have a look at the complete [draft](#35050) implementation. <!-- Issue number (e.g. #1234) or full URL to issue, if applicable. --> #### Link to tracking issue #29017 <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests <!--Describe the documentation added.--> #### Documentation No changes <!--Please delete paragraphs that you did not use before submitting.-->
… paths context (open-telemetry#35716) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PR is part of open-telemetry#29017, and adds the `ottl.Parser[K].AppendStatementPathsContext` function, allowing components to rewrite statements appending missing `ottl.path` context names. For examples, the following context-less statement: ``` set(value, 1) where name == attributes["foo.name"] ``` Would be rewritten using the `span` context as: ``` set(span.value, 1) where span.name == span.attributes["foo.name"] ``` **Why do we need to rewrite statements?** This utility will be used during the transition from structured OTTL statements to flat statements. Components such as the `transformprocessor` will leverage it to support both configuration styles, without forcing users to adapt/rewrite their existing config files. Once the component turns on the `ottl.Parser[K]` path's context validation, new configuration style usages will be validated, requiring all paths to have a context prefix, and old configuration styles will automatically rewrite the statements using this function. For more details, please have a look at the complete [draft](open-telemetry#35050) implementation. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue open-telemetry#29017 <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests <!--Describe the documentation added.--> #### Documentation No changes <!--Please delete paragraphs that you did not use before submitting.-->
Component(s)
pkg/ottl
Is your feature request related to a problem? Please describe.
Currently OTTL statements are independent of any context. Users must tell OTTL the context in which the statement is should be run.
This results in OTTL statements not being directly portable between components. Unless the config implementing OTTL is the same between components, the OTTL statement may not be able to be copied from one component to another.
If the OTTL statement expresses the context itself then there is no ambiguity and the statement would mean the same thing in any component.
Describe the solution you'd like
Update OTTL statements to express the context, probably via the path names. When parsing the statement, infer the context to be the lowest context defined in the statement.
The Parser needs updated to be told which contexts to consider. To start, one implementation would be to try parsing the statement in each context, starting with the "highest" context. The first context to pass parsing is used.
Describe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: