Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/transform] Unable to get item from slice resource attributes value #16811

Closed
ralphgj opened this issue Dec 8, 2022 · 15 comments · Fixed by #20754
Closed

[processor/transform] Unable to get item from slice resource attributes value #16811

ralphgj opened this issue Dec 8, 2022 · 15 comments · Fixed by #20754
Assignees
Labels
enhancement New feature or request priority:p2 Medium processor/transform Transform processor

Comments

@ralphgj
Copy link
Contributor

ralphgj commented Dec 8, 2022

Component(s)

processor/transform

Is your feature request related to a problem? Please describe.

Hi, I'm trying to use the generated slice result by the Split function in transform processor statements. But I didn't find any working solution to access the item in slice result.

Describe the solution you'd like

Expected statements:

      - context: resource
        statements:
          - set(attributes["db.instance"], Split(attributes["net.peer.name"], ".")[0]) where attributes["net.peer.name"] != nil

or

      - context: resource
        statements:
          - set(attributes["db.instance"], Split(attributes["net.peer.name"], ".").Get(0)) where attributes["net.peer.name"] != nil

Describe alternatives you've considered

No response

Additional context

No response

@ralphgj ralphgj added enhancement New feature or request needs triage New item requiring triage labels Dec 8, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Dec 8, 2022

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@ralphgj ralphgj changed the title [processor/transform] Unable to get item from slice resource value [processor/transform] Unable to get item from slice resource attributes value Dec 8, 2022
@evan-bradley
Copy link
Contributor

Thanks for reporting this. You're right that the OTTL currently doesn't support indexing slices. I think the solution that would be most obvious to most users would be the first statement you suggested that uses array indexing notation. Calling a method on a slice is an interesting concept, but I'm not sure whether we'd want to introduce OOP concepts to the OTTL.

I see two potential solutions to offer slice indexing:

  1. Implement slice indexing in the OTTL grammar. This would need to work with paths with slice values and with invocations that return slices. I don't think it would need to work with slice literals. It would likely be implemented as another type of Getter.
  2. Implement an index factory function that returns the value at a particular index from a slice Getter. This usage would be less obvious to users who are used to array indexing in most programming languages, but would be a little simpler to implement.

I think the first option is likely the best path forward, but would be interested in others' opinions.

@evan-bradley evan-bradley added help wanted Extra attention is needed priority:p2 Medium processor/transform Transform processor and removed needs triage New item requiring triage labels Dec 8, 2022
@TylerHelmuth
Copy link
Member

index operator fits in with the language since we already use it with map access. Will need to determine how to make sure the grammar can differentiate between [] as a map access vs a slice access.

@ralphgj
Copy link
Contributor Author

ralphgj commented Dec 9, 2022

Thanks for providing the potential solutions. I prefer the first option too. Are there any suggestions to differentiate the slice from the map in OTTL index operator grammar?

@TylerHelmuth
Copy link
Member

It shouldn't be difficult since the invocation struct is already separate from the path struct that uses indexing. I was more talking about reading the language and comprehending that we're accessing a slice instead of a map. We need to make sure statements are easy to understand.

@ralphgj
Copy link
Contributor Author

ralphgj commented Dec 9, 2022

Great!

@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Feb 8, 2023
@TylerHelmuth TylerHelmuth removed the Stale label Feb 8, 2023
@TylerHelmuth TylerHelmuth removed the help wanted Extra attention is needed label Mar 24, 2023
@TylerHelmuth TylerHelmuth self-assigned this Mar 24, 2023
@TylerHelmuth
Copy link
Member

@evan-bradley I've started working on this and need some opinions.

When talking about indexing with OTTL I think their are 2 categories: map indexing and slice indexing. I think OTTL should make these assumptions:

  • only allow indexing with a string on pcommon.Map and map[string]interface{} types.
  • only allow indexing with an int on pcommon.Slice and []interface{}.
  • only allow indexing path and Converters.

Allowing any number of "keys" after any path or Converter is very easy for the grammar. I am running into some trouble with actually parsing the result into a meaningful Statement.

Handling indexing and Converters was pretty simple inside the parser itself (see proof of concept here), but paths raise a problem. At the moment the indexing of a path, like attributes["test"], is handled by the contexts via PathExpressionParser. This make it difficult to handle indexing on paths within the parser. Options I want to discuss:

  1. Should we continue to handle path indexing via the context? This would mean Converter indexing is handled in the parser and path indexing is handled in the contexts. This would also make it difficult to do non-literal string indexing like set(attributes[cache["temp"]], "foo"), which is something I've seen someone ask for.

  2. Should we move path indexing out of the contexts and into the parser? I think this is possible, but depending on how we do it could be easy or hard.

    If we restrict the grammar to disallow . after [], then it is quite easy, but this would be a breaking change. Doing this would also not allow stuff like metric context: set(name, datapoints[0].attributes["test"]), which feels like it could be good to allow.

    If we continue to allow . after [] then I think path parsing has to happen in both the context and the parser. The context would handle the paths and .s, and the parser would have to handle indexing. What I haven't figured out yet is how to pass the result of the indexing back into the path parser. That is hot-path execution but at the moment path parsing is during startup. I believe parts of this will need to change and we'll end up wrapping Getters in Getters and calling path parser multiple times.

@TylerHelmuth
Copy link
Member

Reflecting on this a little more, I think disallowing . after indexing is appropriate. As interesting as something like set(name, datapoints[0].attributes["test"]) looks, it breaks a current fundamental concept we have in OTTL, which is that you can go "down" in the hierarchical structure of pdata.

datapoints[0].attributes["test"] implies accessing the an attribute named "test" on first datapoint of the metric. But this would mean going from the metric context "down" into the datapoint context, which is something we have disallowed. I think it would be possible to implement (especially if we made indexing paths a responsible of the context), but it does go against our current rules.

If we do choose to continue disallowing "downward" access it would mean that [] has to terminate a path and could be handled during hot path execution similar to Converters. It would also open up a clear path to allowing other paths or Converters as index values.

@evan-bradley
Copy link
Contributor

@TylerHelmuth Thanks for the detailed writeup. I agree with the indexing assumptions you outlined, I think those are all reasonable. Based on what you've written and the decision to remove the support for sub-pathing on slice/map values in paths, I think moving path indexing into the parser is also a good call.

@TylerHelmuth
Copy link
Member

TylerHelmuth commented Mar 31, 2023

@TylerHelmuth I worked on this some more and have more realizations to discuss:

  1. Converters continue to be easy

  2. Setting is EXTREMELY messy (so far) when indexing is moved into the parser.

    To do a set for a path that uses indexes we end up having to to a Get first so we have an object to manipulate, do all the manipulation, and then pass the final object into the context's setter. For attributes this looks like:

    • getting the entire attribute map (contexts would know only return entire Maps for the parser to index)
    • manipulating the map based on the index keys
    • passing the entire map to the context's setter to set the entire map.

    I haven't done any benchmarking yet, but I am afraid we're gonna take a performance hit for this.

  3. TraceState has to be indexible as well. Exposing this concept to the parser feels grosser than exposing pcommon.Map and map[string]interface{}

Here is a link to the setting in PoC branch.. I tested it briefly via the transformprocessor's "e2e" tests and it works, but I don't like this.

There is a lot of "leaking" happening right now with this feature that I'm beginning to feel uncomfortable with. More and more logic that I feel has to do with the underlying telemetry is seeping into the parser and I don't think it will scale well.

Before we move forward with this approach I want to try another. Instead of moving indexing into the parser for paths, we could leave the contexts as the source of truth. At the moment we only pass in string literals, but in the future if we choose to allow paths to be used as keys, we could pass in Getters which could be executed during hot path execution.

Leaving the indexing of paths to the contexts is pretty straight forward, but we'd have to decide how to handle Converters. Options I'm pondering

  1. Handle them separately. Converters would only ever be a Get and the logic for index a Get is simple. Downside is we'd be maintainer multiple index logic solutions (hopefully some of it is reuseable?) and the solution seems inconsistent. As OTLP changes the types our grammar has to handle changes and this indexing logic would have to scale as the contexts scale. Risk would be them getting out of sync.
  2. Don't let Converters be indexed. Instead, set the result of the Converter in cache and then access the value you want from cache in a second statement. This approach gives us the most consistency in OTTL but is probably a poorer user experience.

@TylerHelmuth
Copy link
Member

TylerHelmuth commented Mar 31, 2023

Here is a very quick example of letting the context handle the indexing: 5e77d32#diff-b6a04373af227be3fc277b1626e64736223ed0f678265774a18be3cf6c6c6dbb. In the future dynamic keys could be supported by making the []keys a []StringGetters or something like that.

@kentquirk
Copy link
Member

I've just spoken with Tyler about this at length, and our takeaways are:

  • Context should handle indexing as in the previous comment; that's far preferable.
  • Converters can be indexed; the duplication of logic is worth it, but we should work to consolidate it as as much as possible.
  • We should disallow . after [] because it violates the context model; there are aggregations we can write (as in the context of metrics) for such things. But that's handled at the context level, not in the parser. We don't need to or want to change the parser about that.

@TylerHelmuth
Copy link
Member

I think we're in agreement to keep Path indexing a responsibility of the Contexts. I will work on getting that implemented and then see what is shareable between Path indexing and Converter indexing.

On the topic of . after [] in a path, after discussing with Kent and in light of keeping Path decisions up to the Context, I think we can ignore #20528, keep the grammar as is, and enforce any Path needs we have in our Contexts. Doing this avoids a breaking change, allows users of OTTL to interpret paths however they want, but still allows our contexts to enforce their own rules.

Our contexts are actually bad at doing this today. For example, you can do set(name["foo"], "bar") and the metric context won't complain, it'll simply ignore the keys and return the name. As part of enforcing not . after [], we should also start enforcing no indexing of paths that aren't allowed to be indexing. This would be start-time validation that would provide a fast-fail user experience.

@evan-bradley what do you think?

@evan-bradley
Copy link
Contributor

On the topic of . after [] in a path, after discussing with Kent and in light of keeping Path decisions up to the Context, I think we can ignore #20528, keep the grammar as is, and enforce any Path needs we have in our Contexts.

I think that makes sense given the decision to keep path parsing in the contexts. I'm not sure there will be a need for it anytime soon given the value types supported in pdata Maps, but it's good to provide more flexibility for the contexts where we can. If we can have the contexts validate on startup that no sub-pathing or indexing happens where it doesn't make sense based on the types, then there's really no motivation to do it in the parser.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority:p2 Medium processor/transform Transform processor
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants