Skip to content
This repository has been archived by the owner on May 25, 2022. It is now read-only.

feat(recombine): add combine_with option #315

Merged
merged 5 commits into from
Nov 30, 2021
Merged

feat(recombine): add combine_with option #315

merged 5 commits into from
Nov 30, 2021

Conversation

andrzej-stencel
Copy link
Member

Fixes #314

Copy link
Contributor

@pmalek-sumo pmalek-sumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@codecov
Copy link

codecov bot commented Nov 28, 2021

Codecov Report

Merging #315 (929671d) into main (2645572) will increase coverage by 0.0%.
The diff coverage is 100.0%.

Impacted file tree graph

@@          Coverage Diff          @@
##            main    #315   +/-   ##
=====================================
  Coverage   77.0%   77.0%           
=====================================
  Files         94      94           
  Lines       4408    4410    +2     
=====================================
+ Hits        3395    3397    +2     
  Misses       697     697           
  Partials     316     316           
Impacted Files Coverage Δ
...perator/builtin/transformer/recombine/recombine.go 71.1% <100.0%> (+0.5%) ⬆️

@djaglowski
Copy link
Member

@astencel-sumo This is a great improvement and I am fully in support of it.

That said, I think that we should have a test or two that demonstrate the configuration as it is read from an actual file, so that we can be sure there are not issues with the way a user would represent a newline character or any other character.

Would you mind adding a couple golden config tests to demonstrate the way a user would configure this?

@andrzej-stencel
Copy link
Member Author

Thanks @djaglowski for pointing this out, will add the config tests.

Please not that for those to work, I will need this change: #317. Without it, I'm getting this failure:

$ go test
--- FAIL: TestConfig (0.00s)
    --- FAIL: TestConfig/default (0.00s)
        operatortest.go:85: 
                Error Trace:    operatortest.go:85
                                                        config_test.go:33
                Error:          Not equal: 
                                expected: &recombine.RecombineOperatorConfig{TransformerConfig:helper.TransformerConfig{WriterConfig:helper.WriterConfig{BasicConfig:helper.BasicConfig{OperatorID:"recombine", OperatorType:"metadata"}, OutputIDs:helper.OutputIDs(nil)}, OnError:"send", IfExpr:""}, IsFirstEntry:"", IsLastEntry:"", MaxBatchSize:1000, CombineField:entry.Field{FieldInterface:entry.FieldInterface(nil)}, CombineWith:"\n", OverwriteWith:"oldest"}
                                actual  : &recombine.RecombineOperatorConfig{TransformerConfig:helper.TransformerConfig{WriterConfig:helper.WriterConfig{BasicConfig:helper.BasicConfig{OperatorID:"recombine", OperatorType:"recombine"}, OutputIDs:helper.OutputIDs(nil)}, OnError:"send", IfExpr:""}, IsFirstEntry:"", IsLastEntry:"", MaxBatchSize:1000, CombineField:entry.Field{FieldInterface:entry.FieldInterface(nil)}, CombineWith:"\n", OverwriteWith:"oldest"}
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -5,3 +5,3 @@
                                     OperatorID: (string) (len=9) "recombine",
                                -    OperatorType: (string) (len=8) "metadata"
                                +    OperatorType: (string) (len=9) "recombine"
                                    },
                Test:           TestConfig/default
FAIL
exit status 1
FAIL    github.com/open-telemetry/opentelemetry-log-collection/operator/builtin/transformer/recombine   1.438s

@andrzej-stencel
Copy link
Member Author

@djaglowski See my commit with config tests 8864cae.

Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding those config tests @astencel-sumo. Just one doc nit.

@@ -11,9 +11,10 @@ The `recombine` operator combines consecutive logs into single logs based on sim
| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](/docs/types/on_error.md). |
| `is_first_entry` | | An [expression](/docs/types/expression.md) that returns true if the entry being processed is the first entry in a multiline series. |
| `is_last_entry` | | An [expression](/docs/types/expression.md) that returns true if the entry being processed is the last entry in a multiline series. |
| `combine_field` | required | The [field](/docs/types/field.md) from all the entries that will recombined with newlines. |
| `combine_field` | required | The [field](/docs/types/field.md) from all the entries that will recombined. |
| `combine_with` | `\n` | The string that is put between the combined entries. This can be an empty string as well. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - since a user should expect different results based on whether or not this value is quoted, I think we need to be precise here. Specifically, the user would have to specify "\n" in order to replicate the default behavior, so I think we should show that exact value here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I fully agree, we should be explicit about the types of the values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically, it doesn't matter if the user writes "foo" or foo. When read, the value becomes the same either way.

In this case, it does matter whether the user writes "\n" or \n. I think it is enough to show the value here that is actually interpreted the same as the default value.

Suggested change
| `combine_with` | `\n` | The string that is put between the combined entries. This can be an empty string as well. |
| `combine_with` | `"\n"` | The string that is put between the combined entries. This can be an empty string as well. |

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 649bfc8.

Also updated examples in the docs, please take a look @djaglowski.

This adds a basic test plus tests for the new `combine_with` option.
The CRI example was assuming the 'P' entries are separate lines,
which is incorrect - the entries are part of a single long line
and should be merged without the newlines in between.
Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @astencel-sumo. This is very nice improvement.

@djaglowski djaglowski merged commit 0106b0d into open-telemetry:main Nov 30, 2021
@andrzej-stencel andrzej-stencel deleted the feat-recombine-add-combine-with-option branch November 30, 2021 14:36
@andrzej-stencel
Copy link
Member Author

Any plans for a new release @djaglowski? I'd like to use it in OTC's filelog receiver.

This change should probably also go into the Changelog, shouldn't it?

@djaglowski
Copy link
Member

Yes, it should go in the changelog. I'll make a PR to prep the changelog for the release, and will cut a release as soon as that's merged, most likely can be today.

djaglowski added a commit to djaglowski/opentelemetry-log-collection that referenced this pull request Nov 30, 2021
- `combine_with` setting to `recombine` operator, to allow for joining on custom delimiter ([PR315](open-telemetry#315))
- Issue where `force_flush_period` could cause line splitting to be skipped ([PR303](open-telemetry#303))
- Issue where `tcp_input` and `udp_input` could panic when stopping ([PR273](open-telemetry#273))
- Syslog severity mapping is now aligned with log specification ([PR300](open-telemetry#300))

- Improve error message when timezone database is not found ([PR289](open-telemetry#289))
djaglowski added a commit that referenced this pull request Nov 30, 2021
- `combine_with` setting to `recombine` operator, to allow for joining on custom delimiter ([PR315](#315))
- Issue where `force_flush_period` could cause line splitting to be skipped ([PR303](#303))
- Issue where `tcp_input` and `udp_input` could panic when stopping ([PR273](#273))
- Syslog severity mapping is now aligned with log specification ([PR300](#300))

- Improve error message when timezone database is not found ([PR289](#289))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

recombine operator: Add option to combine without newlines
3 participants