[pkg/ottl] Add ParseJSONIntoMap function #16340

Closed

Conversation

TylerHelmuth
Member

Description:
Adds a new function named ParseJSONIntoMap that takes a JSON string, unmarshals it, and adds the resulting JSON object's root-level fields to the given map.

This is an initial attempt to start moving logstransformprocessor capabilities into the transform processor. It keeps the functionality to a minimum to keep things simple as we dive into the complexity of log processing. Features like specifying exactly which attributes to add or skip (with and without wildcard/regex support) can be added later.

While we work on those enhanced features, this function gives us the basic capabilities:

  • convert a log's body into attributes
  • convert an attribute into more attributes

Removal of unwanted attributes can be handled with delete_key and delete_matching_keys in the meantime.
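As a rough illustration of the behavior described above, here is a minimal standalone sketch. It uses a plain `map[string]any` instead of the collector's map type, and the name `parseJSONIntoMap` and the overwrite-on-conflict behavior are assumptions made for this example, not the PR's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseJSONIntoMap is a hypothetical stand-in for the proposed function.
// It unmarshals value as a JSON object and copies the object's root-level
// fields into target. Existing keys are overwritten (an assumption).
func parseJSONIntoMap(target map[string]any, value string) error {
	var parsed map[string]any
	if err := json.Unmarshal([]byte(value), &parsed); err != nil {
		return fmt.Errorf("value is not a JSON object: %w", err)
	}
	for k, v := range parsed {
		target[k] = v
	}
	return nil
}

func main() {
	// attributes plays the role of a log record's attribute map.
	attributes := map[string]any{"service": "checkout"}
	body := `{"http_method": "GET", "http_status": 200, "nested": {"a": 1}}`

	if err := parseJSONIntoMap(attributes, body); err != nil {
		fmt.Println("error:", err)
		return
	}
	// Nested objects are kept as a single value under their root key.
	fmt.Println(attributes)
}
```

Only root-level fields become keys in the target; nested objects are stored whole under their root key rather than flattened.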

Link to tracking Issue:
Related to #9410
Related to #14938

Testing:
Added unit tests

Documentation:
Updated function README

@evan-bradley
Contributor

My initial thought for handling this functionality is that we could do JSON parsing in a factory function so we can later use other factory functions to handle additional processing. So something like:

copy_map_keys(attributes, KeepMatchingKeys(ParseJSON(body), "http_.+"))

What do you think?
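To make the composition concrete, here is a small sketch of the three steps as plain Go functions. The names `parseJSON`, `keepMatchingKeys`, and `copyMapKeys` are hypothetical stand-ins for the functions in the statement above, and the unanchored regex match is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// parseJSON decodes a JSON object into a map (hypothetical ParseJSON).
func parseJSON(s string) (map[string]any, error) {
	var m map[string]any
	if err := json.Unmarshal([]byte(s), &m); err != nil {
		return nil, err
	}
	return m, nil
}

// keepMatchingKeys returns a new map holding only the entries whose key
// matches pattern (hypothetical KeepMatchingKeys). The match is unanchored,
// which is an assumption of this sketch.
func keepMatchingKeys(m map[string]any, pattern string) (map[string]any, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	out := make(map[string]any)
	for k, v := range m {
		if re.MatchString(k) {
			out[k] = v
		}
	}
	return out, nil
}

// copyMapKeys copies every entry of src into dst (hypothetical copy_map_keys).
func copyMapKeys(dst, src map[string]any) {
	for k, v := range src {
		dst[k] = v
	}
}

func main() {
	attributes := map[string]any{}
	body := `{"http_method": "GET", "http_status": 200, "user": "amy"}`

	parsed, err := parseJSON(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	kept, err := keepMatchingKeys(parsed, "http_.+")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	copyMapKeys(attributes, kept)
	fmt.Println(attributes) // only the http_ keys are copied
}
```

Each step returns a value that the next step consumes, so further filters or transforms could be slotted in without changing the others.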

@TylerHelmuth
Member Author

@evan-bradley we definitely are gonna need a "merge maps" function, some "drop/keep keys" functions, a way to parse json, and a way to turn that json back into a string. But I am worried about the complexity of all those parts and users needing to know how to use them. For common operations I'd like to avoid having to use too many functions, although I think for true flexibility we'll need to be able to handle each step via its own function/path.

I am also wondering if we can avoid parsing the body multiple times by using the Context and an accessor like body.json to get and set the body via json. The parsing could be saved in the context so the next statement that uses body.json wouldn't have to parse it again. Could have a similar accessor for attributes, something like attributes["kubernetes"].json.
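A minimal sketch of the caching idea, assuming a simplified context type: `logContext`, `BodyJSON`, and `SetBody` are hypothetical names for this example and are not part of OTTL.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logContext is a hypothetical, simplified stand-in for an OTTL context.
// It caches the parsed form of the body so repeated reads parse only once.
type logContext struct {
	body       string
	parsedBody map[string]any
	parseErr   error
	parsed     bool
	parseCount int // counts real parses, kept only for demonstration
}

// BodyJSON plays the role of a body.json getter: it parses the body on the
// first call and returns the cached result on later calls.
func (c *logContext) BodyJSON() (map[string]any, error) {
	if !c.parsed {
		c.parseCount++
		c.parseErr = json.Unmarshal([]byte(c.body), &c.parsedBody)
		c.parsed = true
	}
	return c.parsedBody, c.parseErr
}

// SetBody replaces the raw body and invalidates the cached parse.
func (c *logContext) SetBody(body string) {
	c.body = body
	c.parsedBody = nil
	c.parseErr = nil
	c.parsed = false
}

func main() {
	ctx := &logContext{body: `{"level": "info", "msg": "started"}`}

	// Two statements reading body.json trigger a single parse.
	first, _ := ctx.BodyJSON()
	second, _ := ctx.BodyJSON()
	fmt.Println(first["level"], second["msg"], ctx.parseCount)
}
```

The cache has to be invalidated whenever the raw body changes, which is why the setter resets it in this sketch.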

For this PR I think we have to decide if we only want to expose the building blocks of log/json interaction and require users to put the pieces together to transform their logs or if it is also ok to have some functions that hide some of the complexity. No matter what we are going to have to create those building-block functions for the complex use cases.

@TylerHelmuth
Member Author

@djaglowski what is stanza's approach? Does it provide any functions that encapsulate complexity exposed by other functions?

Member

@kentquirk left a comment


I like this, but I think it should be recursive and based on #16352.

Contributor

@codeboten left a comment


Does this need updating now that #16352 is merged?

@evan-bradley
Contributor

> I am also wondering if we can avoid parsing the body multiple times by using the Context and an accessor like body.json to get and set the body via json.

I totally agree. I wasn't initially going to suggest it for this PR, but I would support starting with that implementation here.

> For this PR I think we have to decide if we only want to expose the building blocks of log/json interaction and require users to put the pieces together to transform their logs or if it is also ok to have some functions that hide some of the complexity.

In general I would prefer to keep the OTTL fairly lean and prefer composable blocks instead of functions that perform particular operations, though I don't have a good picture of what level of complexity most users expect. It's possible that having a relatively comprehensive suite of examples for the functions could help give a picture of how the pieces fit together.

> No matter what we are going to have to create those building-block functions for the complex use cases.

Could we start with those, and implement functions that simplify common use cases once we see how things look?

@@ -220,6 +221,35 @@ Examples:

- `limit(resource.attributes, 50, ["http.host", "http.method"])`

## parse_json_into_map

`parse_json_into_map(target, value)`
Member


Just an idea: I think we will end up with multiple parseIntoMap functions that accept different formats (JSON, YAML, etc.). Do you think it is worth accepting the format as an argument instead?
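One way the format-as-argument idea could look, sketched with a decoder registry. The `parseIntoMap` signature, the `decoders` table, and the format names are assumptions for illustration; only JSON is registered because the Go standard library has no YAML decoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decoder turns raw text in some format into a map of root-level fields.
type decoder func(data []byte) (map[string]any, error)

// decoders maps a format name to its decoder. A YAML entry could be added
// here with a third-party library; it is left out of this sketch.
var decoders = map[string]decoder{
	"json": func(data []byte) (map[string]any, error) {
		var m map[string]any
		err := json.Unmarshal(data, &m)
		return m, err
	},
}

// parseIntoMap is a hypothetical single entry point that takes the format
// as an argument instead of having one function per format.
func parseIntoMap(target map[string]any, value, format string) error {
	decode, ok := decoders[format]
	if !ok {
		return fmt.Errorf("unsupported format %q", format)
	}
	parsed, err := decode([]byte(value))
	if err != nil {
		return err
	}
	for k, v := range parsed {
		target[k] = v
	}
	return nil
}

func main() {
	attributes := map[string]any{}
	fmt.Println(parseIntoMap(attributes, `{"a": "b"}`, "json"), attributes)
	fmt.Println(parseIntoMap(attributes, "a: b", "yaml"))
}
```

The trade-off is that an unknown format becomes a runtime error instead of an unknown function name caught when the statement is parsed.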

Member

@MikeGoldsmith left a comment


Added a query regarding a default case for the switch.

}

func setValue(value pcommon.Value, val interface{}) error {
switch v := val.(type) {
Member


Should we add a default case to guard against unknown types and return an error?
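A sketch of what such a guard might look like. The PR's `setValue` writes into a `pcommon.Value`; to stay self-contained this example uses a hypothetical `value` struct instead, so the field names and kind strings here are assumptions, not the PR's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// value is a hypothetical, simplified stand-in for pcommon.Value.
type value struct {
	kind string
	data any
}

// setValue stores val in dst. The cases cover the Go types encoding/json
// produces when decoding into an empty interface; the default case rejects
// anything else instead of silently dropping it.
func setValue(dst *value, val any) error {
	switch v := val.(type) {
	case nil:
		dst.kind, dst.data = "empty", nil
	case string:
		dst.kind, dst.data = "string", v
	case float64:
		dst.kind, dst.data = "double", v
	case bool:
		dst.kind, dst.data = "bool", v
	case map[string]any:
		m := make(map[string]value, len(v))
		for k, elem := range v {
			var child value
			if err := setValue(&child, elem); err != nil {
				return err
			}
			m[k] = child
		}
		dst.kind, dst.data = "map", m
	case []any:
		s := make([]value, len(v))
		for i, elem := range v {
			if err := setValue(&s[i], elem); err != nil {
				return err
			}
		}
		dst.kind, dst.data = "slice", s
	default:
		return fmt.Errorf("unsupported type %T", v)
	}
	return nil
}

func main() {
	var decoded any
	if err := json.Unmarshal([]byte(`{"a": [1, "two", true, null]}`), &decoded); err != nil {
		fmt.Println("error:", err)
		return
	}
	var root value
	err := setValue(&root, decoded)
	fmt.Println(root.kind, err)

	// An int never comes out of encoding/json, so it hits the default case.
	err = setValue(&root, 42)
	fmt.Println(err)
}
```

With input that comes straight from `encoding/json` the default case should be unreachable, but it protects against callers that pass other Go values.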

@TylerHelmuth
Member Author

> Could we start with those, and implement functions that simplify common use cases once we see how things look?

@evan-bradley I'll open some PRs with the building block functions.

@TylerHelmuth
Member Author

Closing this for now in favor of #16444

@TylerHelmuth deleted the ottl-parse-json branch on April 14, 2023 14:59