This repository has been archived by the owner on May 25, 2022. It is now read-only.

recombine operator: Add option to combine without newlines #314

Closed
andrzej-stencel opened this issue Nov 25, 2021 · 1 comment · Fixed by #315

Comments

@andrzej-stencel
Member

Currently (v0.22.0) the recombine operator always stitches entries together using a newline character. This is described here:

combine_field | required | The field from all the entries that will be recombined with newlines.

and implemented here.

I propose to add an option to stitch the entries without the newline, or perhaps an option to change the stitching character/string to a different one (including none).

The reason for this proposal is to support stitching together the logs split by the Kubernetes CRI log format. For example, given this input file:

2016-10-06T00:17:09.669794202Z stdout P This is a very long line that has been spl
2016-10-06T00:17:09.669794202Z stdout F it according to the Kubernetes CRI format.

I want to be able to use the recombine operator to stitch the above two lines into one output line: `This is a very long line that has been split according to the Kubernetes CRI format.` Currently I'm not able to do this; what I get instead is `This is a very long line that has been spl\nit according to the Kubernetes CRI format.`
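
To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is only an illustration, not the operator's actual Go implementation: the function `recombine_cri` is hypothetical, and its `combine_with` parameter simply stands in for the proposed "stitching string" option.

```python
def recombine_cri(lines, combine_with="\n"):
    """Stitch CRI partial ('P') entries onto the final ('F') entry.

    Hypothetical helper for illustration only; `combine_with` is the
    string placed between the stitched message fragments.
    """
    combined = []
    pending = []
    for line in lines:
        # CRI log format: <timestamp> <stream> <tag> <message>
        _timestamp, _stream, tag, message = line.split(" ", 3)
        pending.append(message)
        if tag == "F":
            # 'F' marks the last fragment of a line, so flush the batch.
            combined.append(combine_with.join(pending))
            pending = []
    # Fragments without a closing 'F' entry are dropped in this sketch.
    return combined


lines = [
    "2016-10-06T00:17:09.669794202Z stdout P This is a very long line that has been spl",
    "2016-10-06T00:17:09.669794202Z stdout F it according to the Kubernetes CRI format.",
]

current = recombine_cri(lines)                     # joins with a newline, as in v0.22.0
proposed = recombine_cri(lines, combine_with="")   # joins with nothing in between
```

With the default newline, `current` holds a single entry that still contains `spl\nit`; with an empty stitching string, `proposed` holds the original one-line message.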

This exact CRI scenario is in fact described in the operator's documentation, but I believe the description is misleading. It suggests that the log lines with the P tag and the log line with the F tag should be combined into a single multiline log message, but I believe this is incorrect: these CRI log lines should be combined into a single one-line log message. In my experience, it is a common mistake to mix up these two concepts: stitching together one long line that the container runtime has split into multiple entries vs. combining multiple separate log lines into a single multiline log message.

If there is another way to achieve my goal that I've missed, please kindly point me in the right direction. Otherwise, I'm happy to contribute this change as a pull request, including fixing the documentation.

@andrzej-stencel
Member Author

I've added a proposed solution in this pull request: #315. Please let me know if it makes sense.

If this is accepted, I'll follow up with the update of the documentation for recombine, fixing the CRI format example according to my understanding and adding a separate multiline example.

djaglowski pushed a commit that referenced this issue Nov 30, 2021
* feat(recombine): add `combine_with` option

Fixes #314

* test(recombine): add config tests

This adds a basic test plus tests for the new `combine_with` option.

* docs(recombine): add double quotes around special characters

* docs(recombine): fix CRI example

The CRI example was assuming the 'P' entries are separate lines,
which is incorrect - the entries are part of a single long line
and should be merged without the newlines in between.

* docs(recombine): add multiline stack trace example