Promtail: Autodetect log format #380

Closed

@steven-sheehy commented Mar 7, 2019

Fixes #357

  • Adds a new EntryParser "auto" that tries each parser and chooses the first one that parses successfully
  • Order of autodetection is: Docker -> CRI -> Raw
  • Autodetection runs only on the first log entry of each file; the result is then cached to avoid the performance overhead
  • Creates a new EntryHandler for every file target, since each file could be in a different log format
  • Adds test cases for auto and raw parser
  • Switches default parser to auto both in code and in Helm chart
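
The detection order in the list above can be illustrated with a minimal sketch. This is not the code from this PR: `parseFunc`, `namedParser`, `detectionOrder`, and `detect` are hypothetical names, and the Docker and CRI line layouts are assumed to be the usual json-file and `<timestamp> <stream> <rest>` shapes.

```go
package main

import (
	"encoding/json"
	"errors"
	"regexp"
	"time"
)

// parseFunc is a hypothetical signature for this sketch: it returns the entry
// timestamp and content, or an error if the line is not in the parser's format.
type parseFunc func(line string) (time.Time, string, error)

// parseDocker handles Docker json-file lines (assumed layout), for example
// {"log":"hello\n","stream":"stdout","time":"2019-03-07T10:00:00.000000001Z"}.
func parseDocker(line string) (time.Time, string, error) {
	var entry struct {
		Log  string    `json:"log"`
		Time time.Time `json:"time"`
	}
	if err := json.Unmarshal([]byte(line), &entry); err != nil {
		return time.Time{}, "", err
	}
	if entry.Time.IsZero() {
		return time.Time{}, "", errors.New("docker: missing time field")
	}
	return entry.Time, entry.Log, nil
}

// criLine matches "<timestamp> <stdout|stderr> <rest>" (assumed layout).
// Any partial/full tag is left in the content in this sketch.
var criLine = regexp.MustCompile(`(?s)^(\S+) (stdout|stderr) (.*)$`)

func parseCRI(line string) (time.Time, string, error) {
	m := criLine.FindStringSubmatch(line)
	if m == nil {
		return time.Time{}, "", errors.New("cri: line does not match")
	}
	ts, err := time.Parse(time.RFC3339Nano, m[1])
	if err != nil {
		return time.Time{}, "", err
	}
	return ts, m[3], nil
}

// parseRaw never fails: it stamps the line with the current time.
func parseRaw(line string) (time.Time, string, error) {
	return time.Now(), line, nil
}

type namedParser struct {
	name  string
	parse parseFunc
}

// detectionOrder mirrors the order given in the description: Docker -> CRI -> Raw.
var detectionOrder = []namedParser{
	{"docker", parseDocker},
	{"cri", parseCRI},
	{"raw", parseRaw},
}

// detect returns the first parser that accepts the line.
func detect(line string) namedParser {
	for _, p := range detectionOrder {
		if _, _, err := p.parse(line); err == nil {
			return p
		}
	}
	// Unreachable while raw is last, since raw accepts everything.
	return detectionOrder[len(detectionOrder)-1]
}
```

Calling `detect` on the first line of a file yields the parser to use for the rest of that file. Putting the strictest format first matters because the raw parser accepts any input.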

cc @slim-bean @tomwilkie

@steven-sheehy
Contributor Author

Not sure why it failed linting. Looks to be complaining about code in vendor. Can someone re-trigger?

@steven-sheehy
Contributor Author

steven-sheehy commented Mar 8, 2019

These tests are pretty flaky. Lint was failing, and now it passes. Tests were passing, and now they fail. Tests succeed locally. 😞

@slim-bean
Collaborator

Yeah, I fixed one intermittent test problem yesterday, and there seems to be another one. Additionally, the linter has always seemed flaky.

I kicked it again and hopefully it passes this time.

I may be working on something else today and won't get to dig into this, but at first glance my only feedback would be to add some kind of persistence or caching to the detected format so that we don't have to iterate through the formats for every log entry. Maybe there is a way to store which parser worked and keep using it for the duration?

@steven-sheehy
Contributor Author

Failed again. 😢

> add some kind of persistence or caching to the detected format so that we don't have to iterate through the formats for every log entry?

The current implementation does cache it for the remainder of the log file. It stores the autodetected parser as a variable in the closure.
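
A minimal sketch of that closure-based caching, under assumptions: `parser`, `handler`, `wrapAuto`, and `errNoParser` are hypothetical stand-ins, not Promtail's actual types or this PR's code. Each call to `wrapAuto` produces an independent handler, which is why one is created per file target.

```go
package main

import "errors"

// parser is a hypothetical stand-in for an entry parser in this sketch.
type parser struct {
	name  string
	parse func(line string) (content string, err error)
}

// handler receives the detected format name and the parsed content;
// a hypothetical stand-in for an entry handler.
type handler func(format, content string) error

var errNoParser = errors.New("no parser matched the first entry")

// wrapAuto returns a function that detects the format on the first line it
// sees and reuses the chosen parser for every later line.
func wrapAuto(parsers []parser, next handler) func(line string) error {
	var chosen *parser // captured by the closure, so cached per wrapped handler

	return func(line string) error {
		// Fast path: a parser was already detected for this file.
		if chosen != nil {
			content, err := chosen.parse(line)
			if err != nil {
				return err
			}
			return next(chosen.name, content)
		}
		// Slow path, taken until detection succeeds: try each parser in order.
		for i := range parsers {
			content, err := parsers[i].parse(line)
			if err != nil {
				continue
			}
			chosen = &parsers[i]
			return next(chosen.name, content)
		}
		return errNoParser
	}
}
```

With a catch-all parser last in the slice, detection always succeeds on the first line, so the slow path runs once per file. This sketch returns an error if the cached parser later rejects a line; whether to re-detect in that case is a separate design decision.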

@slim-bean
Collaborator

@steven-sheehy I would like to chat with you about this and some other ongoing work to coordinate things a little. Would you be able to join our public Slack at grafana.slack.com and ping me (@ewelch)?

@steven-sheehy
Contributor Author

Tests finally successful! 🎉

@daixiang0 daixiang0 mentioned this pull request Apr 2, 2019
Signed-off-by: Steven Sheehy <[email protected]>
@@ -201,7 +203,7 @@ func (s *syncer) sync(groups []*targetgroup.Group) {
 }

 func (s *syncer) newTarget(path string, labels model.LabelSet) (*FileTarget, error) {
-	return NewFileTarget(s.log, s.entryHandler, s.positions, path, labels, s.targetConfig)
+	return NewFileTarget(s.log, s.entryParser.Wrap(s.entryHandler), s.positions, path, labels, s.targetConfig)
Contributor Author

Should I re-detect at a deeper level than FileTarget?

@steven-sheehy changed the title from "Autodetect log format" to "Promtail: Autodetect log format" on Apr 26, 2019
@steven-sheehy
Contributor Author

Closing due to the complexity needed to support this for pipeline stages, and because configuring the default parser has improved since this PR was opened.

periklis pushed a commit to periklis/loki that referenced this pull request Oct 24, 2024
Successfully merging this pull request may close these issues.

Auto detect log format