[pkg/ottl] Experiment with Factory Function definition of existing Function #16571

TylerHelmuth · 2022-12-01T17:30:37Z

Description:
This PR is mainly to have a discussion about how OTTL/transformprocessor should support situations where a functionality is needed both as a factory function and as a standard function. I'm using KeepKeys to center the conversation but there will be other functions/factory functions that fit as well.

(KeepKeys should do regex matching like DeleteAllMatches but let's ignore that for now)

At the moment the OTTL has a function called KeepKeys, referenced as keep_keys in the transform processor, that takes a map and a list of string keys and drops all entries from the map that aren't in the list.

There is also a need to be able to adjust a map returned from parsing (#14938 and #9410). For this we need a Factory Function so that we can do something like merge_maps(attributes, KeepKeys(ParseJSON(body), ["log_type"]), "upsert").

Both keep_keys and KeepKeys have the same intention, but keep_keys returns nil and KeepKeys returns the map that was modified.

Non-exhaustive Options

Introduce "keep keys" functionality as a function and a factory function. The function's implementation can rely on the factory function's. This is the implementation in this PR. This solution require's users to understand the difference between a Factory Function and a Function, which is currently only a conceptual difference; nothing in the grammar enforces this (but it could).
Update the existing keep_keys function to return the value it modifies. This will allow statements like merge_maps(attributes, keep_keys(ParseJSON(body), ["log_type"]), "upsert") but it also allows stuff like set(attributes["test"], keep_keys(resource.attributes, ["log"])). Since keep_keys is not pure, it would allow modifying multiple paths in the same statement, which could get users into bad situations unexpectedly.
Remove keep_keys and only allow modifying the map with the factory function of KeepKeys. This would require KeepKeys to always be used with set. keep_keys(attributes, ["test"]) would become set(attributes, KeepKeys(attributes, ["test"])).

Personally I like option 1 in combination with a grammar/package change to make factory functions a core concept in the language. I don't think I want to go to the extreme of removing standard function's ability to return yet. But I do think we can restrict the grammar's value type to functions that start with a capital letter and the Invocation function name to start with a lower case letter. This starts to enforce a difference between the 2 types of functions (although there is still a problem of the transformprocessor being the one that actually defines the function map, but that can also be adjusted).

runforesight · 2022-12-01T17:31:51Z

Foresight Summary

Major Impacts

build-and-test-windows duration(4 seconds) has decreased 33 minutes 36 seconds compared to main branch avg(33 minutes 40 seconds).

View More Details

⭕ build-and-test-windows workflow has finished in 4 seconds (33 minutes 36 seconds less than `main` branch avg.) and finished at 5th Dec, 2022.

Job	Failed Steps	Tests
windows-unittest-matrix	- 🔗	N/A	See Details
windows-unittest	- 🔗	N/A	See Details

✅ tracegen workflow has finished in 8 minutes 34 seconds (⚠️ 5 minutes 53 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

Job	Failed Steps	Tests
build-dev	- 🔗	N/A	See Details
publish-latest	- 🔗	N/A	See Details
publish-stable	- 🔗	N/A	See Details

❌ changelog workflow has finished in 8 minutes 58 seconds (⚠️ 3 minutes 48 seconds more than `main` branch avg.) and finished at 5th Dec, 2022. 1 job failed.

Job	Failed Steps	Tests
changelog	Ensure ./.chloggen/*.yaml addition(s) 🔗	N/A	See Details

✅ check-links workflow has finished in 8 minutes 48 seconds (⚠️ 6 minutes 30 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

Job	Failed Steps	Tests
changed files	- 🔗	N/A	See Details
check-links	- 🔗	N/A	See Details

✅ load-tests workflow has finished in 21 minutes 27 seconds (⚠️ 6 minutes 31 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

Job	Failed Steps	Tests
loadtest (TestIdleMode)	- 🔗	✅ 1 ❌ 0 ⏭ 0 🔗	See Details
loadtest (TestTraceAttributesProcessor)	- 🔗	✅ 3 ❌ 0 ⏭ 0 🔗	See Details
loadtest (TestMetric10kDPS\|TestMetricsFromFile)	- 🔗	✅ 6 ❌ 0 ⏭ 0 🔗	See Details
loadtest (TestTraceBallast1kSPSWithAttrs\|TestTraceBallast1kSPSAddAttrs)	- 🔗	✅ 10 ❌ 0 ⏭ 0 🔗	See Details
loadtest (TestTraceNoBackend10kSPS\|TestTrace1kSPSWithAttrs)	- 🔗	✅ 8 ❌ 0 ⏭ 0 🔗	See Details
loadtest (TestMetricResourceProcessor\|TestTrace10kSPS)	- 🔗	✅ 12 ❌ 0 ⏭ 0 🔗	See Details
loadtest (TestBallastMemory\|TestLog10kDPS)	- 🔗	✅ 19 ❌ 0 ⏭ 0 🔗	See Details
setup-environment	- 🔗	N/A	See Details

✅ prometheus-compliance-tests workflow has finished in 20 minutes 25 seconds (⚠️ 12 minutes 4 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

Job	Failed Steps	Tests
prometheus-compliance-tests	- 🔗	✅ 21 ❌ 0 ⏭ 0 🔗	See Details

❌ build-and-test workflow has finished in 44 minutes 36 seconds (11 minutes 17 seconds less than `main` branch avg.) and finished at 5th Dec, 2022. 3 jobs failed.

Job	Failed Steps	Tests
unittest-matrix (1.19, internal)	- 🔗	✅ 592 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, extension)	- 🔗	✅ 528 ❌ 0 ⏭ 0 🔗	See Details
correctness-traces	- 🔗	✅ 0 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, internal)	- 🔗	✅ 592 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.19, processor)	- 🔗	✅ 1464 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, processor)	- 🔗	✅ 1464 ❌ 0 ⏭ 0 🔗	See Details
correctness-metrics	- 🔗	✅ 0 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.19, extension)	- 🔗	✅ 528 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.19, receiver-0)	- 🔗	✅ 2533 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, receiver-0)	- 🔗	✅ 2532 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, other)	- 🔗	✅ 4363 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, exporter)	- 🔗	✅ 2412 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.19, exporter)	- 🔗	✅ 2412 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.19, other)	- 🔗	✅ 4362 ❌ 0 ⏭ 0 🔗	See Details
unittest-matrix (1.18, receiver-1)	- 🔗	✅ 1845 ❌ 0 ⏭ 0 🔗	See Details
integration-tests	- 🔗	✅ 59 ❌ 0 ⏭ 0 🔗	See Details
setup-environment	- 🔗	N/A	See Details
check-collector-module-version	- 🔗	N/A	See Details
checks	Impi 🔗	N/A	See Details
check-codeowners	- 🔗	N/A	See Details
lint-matrix (receiver-0)	- 🔗	N/A	See Details
lint-matrix (receiver-1)	- 🔗	N/A	See Details
lint-matrix (processor)	- 🔗	N/A	See Details
lint-matrix (exporter)	- 🔗	N/A	See Details
lint-matrix (extension)	- 🔗	N/A	See Details
lint-matrix (internal)	- 🔗	N/A	See Details
lint-matrix (other)	Lint 🔗	N/A	See Details
unittest-matrix (1.19, receiver-1)	- 🔗	✅ 1845 ❌ 0 ⏭ 0 🔗	See Details
build-examples	- 🔗	N/A	See Details
lint	Interpret result 🔗	N/A	See Details
unittest (1.19)	- 🔗	N/A	See Details
unittest (1.18)	- 🔗	N/A	See Details
cross-compile	- 🔗	N/A	See Details
build-package	- 🔗	N/A	See Details
windows-msi	- 🔗	N/A	See Details
publish-dev	- 🔗	N/A	See Details
publish-check	- 🔗	N/A	See Details
publish-stable	- 🔗	N/A	See Details

🔎 See details on Foresight

^{*You can configure Foresight comments in your organization settings page.}

evan-bradley

I like this approach overall, and I agree that some guard rails would help prevent confusion. Is there a way to provide better error messages if we make the two function types distinct within the grammar? I find the grammar's error messages to be pretty cryptic in many cases.

evan-bradley · 2022-12-01T18:01:12Z

pkg/ottl/ottlfuncs/func_factory_keep_keys.go

+			return nil, fmt.Errorf("target must be a map but got %T", val)
+		}
+
+		mapVal.RemoveIf(func(key string, value pcommon.Value) bool {


Could you make it so that the factory function doesn't mutate the input? I'd like to see what the code overlap looks like if we keep it pure.

Ya I can try that out.

@evan-bradley @MikeGoldsmith done

MikeGoldsmith

I think this looks really good as it both simplifies the logic and enables it be used in other locations. I agree with @evan-bradley's suggestion of the factory itself should not mutate input params 👍🏻

TylerHelmuth · 2022-12-05T17:24:30Z

@evan-bradley I'll submit a PR with a grammar change to start enforcing function types

evan-bradley · 2022-12-05T17:43:57Z

I'll submit a PR with a grammar change to start enforcing function types

If the grammar can't be configured to print better error messages, I'd be alright with enforcing it in the parser instead.

TylerHelmuth · 2022-12-12T22:29:58Z

I am going to close this PR for now in favor of @kentquirk "temp storage" idea. I'll open a PR for that soon. We can come back to this PR in the future if needed.

We've had discussion recently around how to handle functions and converters (previously factory functions) that share the same functionality (see #16571) This PR suggests a different approach that should reduce the need to chain converters/functions. Before we chained converters together because OTTL had no place to store data except for the telemetry payload itself. While attributes could be used, it results in the need for cleanup and potentially unwanted transformations based on condition resolution. This PR introduces the concept of tmp to the logs contexts (and future contexts if we like the solution) that statements can use as a "staging" location for complex transformations. Before to handle situations like the one in #14938 we would have to chain together functions. Pretending we had KeepKeys converter the statement would look like merge_maps(attributes, KeepKeys(ParseJSON(body), ["log_type"]), "upsert"). These types of statements are tricky to write and can difficult to comprehend, especially for new users. Each Converter we add on increased the burden. Adding a single extra function, like Flatten, really makes a difference: merge_maps(attributes, Flatten(KeepKeys(ParseJSON(body), ["log_type"]), "."), "upsert") Co-authored-by: Evan Bradley <[email protected]>

TylerHelmuth requested a review from a team December 1, 2022 17:30

TylerHelmuth requested review from bogdandrutu and evan-bradley as code owners December 1, 2022 17:30

github-actions bot assigned djaglowski Dec 1, 2022

github-actions bot added the pkg/ottl label Dec 1, 2022

github-actions bot requested a review from kentquirk December 1, 2022 17:31

TylerHelmuth mentioned this pull request Dec 2, 2022

[pkg/ottl] Experiment with function map definition #16592

Closed

evan-bradley reviewed Dec 2, 2022

View reviewed changes

MikeGoldsmith reviewed Dec 5, 2022

View reviewed changes

TylerHelmuth added 2 commits December 5, 2022 10:23

Experiment with duplicate function

4092910

Make factory function pure

7469edf

TylerHelmuth force-pushed the ottl-keepkeys branch from d533ade to 7469edf Compare December 5, 2022 17:23

TylerHelmuth mentioned this pull request Dec 5, 2022

[pkg/ottl] Enforce function case #16718

Merged

TylerHelmuth closed this Dec 12, 2022

TylerHelmuth mentioned this pull request Dec 13, 2022

[pkg/ottl] Add cache path #16994

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pkg/ottl] Experiment with Factory Function definition of existing Function #16571

[pkg/ottl] Experiment with Factory Function definition of existing Function #16571

TylerHelmuth commented Dec 1, 2022

runforesight bot commented Dec 1, 2022 •

edited

Loading

`build-and-test-windows` duration(4 seconds) has decreased 33 minutes 36 seconds compared to main branch avg(33 minutes 40 seconds).

⭕ build-and-test-windows workflow has finished in 4 seconds (33 minutes 36 seconds less than `main` branch avg.) and finished at 5th Dec, 2022.

✅ tracegen workflow has finished in 8 minutes 34 seconds (⚠️ 5 minutes 53 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

❌ changelog workflow has finished in 8 minutes 58 seconds (⚠️ 3 minutes 48 seconds more than `main` branch avg.) and finished at 5th Dec, 2022. 1 job failed.

✅ check-links workflow has finished in 8 minutes 48 seconds (⚠️ 6 minutes 30 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

✅ load-tests workflow has finished in 21 minutes 27 seconds (⚠️ 6 minutes 31 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

✅ prometheus-compliance-tests workflow has finished in 20 minutes 25 seconds (⚠️ 12 minutes 4 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

❌ build-and-test workflow has finished in 44 minutes 36 seconds (11 minutes 17 seconds less than `main` branch avg.) and finished at 5th Dec, 2022. 3 jobs failed.

evan-bradley left a comment

evan-bradley Dec 1, 2022

TylerHelmuth Dec 2, 2022

TylerHelmuth Dec 5, 2022

MikeGoldsmith left a comment

TylerHelmuth commented Dec 5, 2022

evan-bradley commented Dec 5, 2022

TylerHelmuth commented Dec 12, 2022

[pkg/ottl] Experiment with Factory Function definition of existing Function #16571

[pkg/ottl] Experiment with Factory Function definition of existing Function #16571

Conversation

TylerHelmuth commented Dec 1, 2022

Non-exhaustive Options

runforesight bot commented Dec 1, 2022 • edited Loading

Foresight Summary

build-and-test-windows duration(4 seconds) has decreased 33 minutes 36 seconds compared to main branch avg(33 minutes 40 seconds).

⭕ build-and-test-windows workflow has finished in 4 seconds (33 minutes 36 seconds less than main branch avg.) and finished at 5th Dec, 2022.

✅ tracegen workflow has finished in 8 minutes 34 seconds (⚠️ 5 minutes 53 seconds more than main branch avg.) and finished at 5th Dec, 2022.

❌ changelog workflow has finished in 8 minutes 58 seconds (⚠️ 3 minutes 48 seconds more than main branch avg.) and finished at 5th Dec, 2022. 1 job failed.

✅ check-links workflow has finished in 8 minutes 48 seconds (⚠️ 6 minutes 30 seconds more than main branch avg.) and finished at 5th Dec, 2022.

✅ load-tests workflow has finished in 21 minutes 27 seconds (⚠️ 6 minutes 31 seconds more than main branch avg.) and finished at 5th Dec, 2022.

✅ prometheus-compliance-tests workflow has finished in 20 minutes 25 seconds (⚠️ 12 minutes 4 seconds more than main branch avg.) and finished at 5th Dec, 2022.

❌ build-and-test workflow has finished in 44 minutes 36 seconds (11 minutes 17 seconds less than main branch avg.) and finished at 5th Dec, 2022. 3 jobs failed.

evan-bradley left a comment

Choose a reason for hiding this comment

evan-bradley Dec 1, 2022

Choose a reason for hiding this comment

TylerHelmuth Dec 2, 2022

Choose a reason for hiding this comment

TylerHelmuth Dec 5, 2022

Choose a reason for hiding this comment

MikeGoldsmith left a comment

Choose a reason for hiding this comment

TylerHelmuth commented Dec 5, 2022

evan-bradley commented Dec 5, 2022

TylerHelmuth commented Dec 12, 2022

runforesight bot commented Dec 1, 2022 •

edited

Loading

`build-and-test-windows` duration(4 seconds) has decreased 33 minutes 36 seconds compared to main branch avg(33 minutes 40 seconds).

⭕ build-and-test-windows workflow has finished in 4 seconds (33 minutes 36 seconds less than `main` branch avg.) and finished at 5th Dec, 2022.

✅ tracegen workflow has finished in 8 minutes 34 seconds (⚠️ 5 minutes 53 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

❌ changelog workflow has finished in 8 minutes 58 seconds (⚠️ 3 minutes 48 seconds more than `main` branch avg.) and finished at 5th Dec, 2022. 1 job failed.

✅ check-links workflow has finished in 8 minutes 48 seconds (⚠️ 6 minutes 30 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

✅ load-tests workflow has finished in 21 minutes 27 seconds (⚠️ 6 minutes 31 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

✅ prometheus-compliance-tests workflow has finished in 20 minutes 25 seconds (⚠️ 12 minutes 4 seconds more than `main` branch avg.) and finished at 5th Dec, 2022.

❌ build-and-test workflow has finished in 44 minutes 36 seconds (11 minutes 17 seconds less than `main` branch avg.) and finished at 5th Dec, 2022. 3 jobs failed.