[promtail] Add support for journal matches (only conjunctions) #6656

Merged Aug 12, 2022 (1 commit)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -27,6 +27,7 @@
 ##### Enhancements
 * [6395](https://github.com/grafana/loki/pull/6395) **DylanGuedes**: Add encoding support
 * [6828](https://github.com/grafana/loki/pull/6828) **alexandre1984rj** Add the BotScore and BotScoreSrc fields once the Cloudflare API returns those two fields on the list of all available log fields.
+* [6656](https://github.com/grafana/loki/pull/6656) **carlospeon**: Allow promtail to add matches to the journal reader

 ##### Fixes
 * [6766](https://github.com/grafana/loki/pull/6766) **kavirajk**: fix(logql): Make `LabelSampleExtractor` ignore processing the line if it doesn't contain that specific label. Fixes unwrap behavior explained in the issue https://github.com/grafana/loki/issues/6713
4 changes: 4 additions & 0 deletions clients/pkg/promtail/scrapeconfig/scrapeconfig.go
@@ -160,6 +160,10 @@ type JournalTargetConfig struct {
 	// Path to a directory to read journal entries from. Defaults to system path
 	// if empty.
 	Path string `yaml:"path"`
+
+	// Journal matches to filter. Character (+) is not supported, only logical AND
+	// matches will be added.
+	Matches string `yaml:"matches"`
@dannykopping (Contributor), Aug 8, 2022:

This is pluralised but it only seems to take a single entry.
Instead of using strings.Fields can we just allow a list to be specified in YAML?

@carlospeon (Contributor Author), Aug 9, 2022:

I thought about it before submitting this PR, but if "github.com/coreos/go-systemd/sdjournal" ever supports logical OR (some work toward it can be seen in their source code), it looked a bit weird to me to add a single + character as a list item.

So in the end I decided to go with a string, with the idea of setting here the same arguments that you can use with journalctl, for example "_TRANSPORT=kernel + PRIORITY=3", but I don't have a strong opinion on which of the two options is better.

Contributor:

Thanks for the detail 👍
I think we shouldn't let the implementation detail of a library we use bubble up to the end-user, so perhaps we can specify the matches as a list of conditions (I think this is the most ergonomic / idiomatic "YAML way"), and then provide them to the library in the way it requires.

Contributor Author:

I think the problem is precisely the implementation detail of the library that promtail uses. It uses an array of matches, and this decision limits the way you can set AND/OR matches. Maybe that is why the library doesn't implement them at all, even though the journal supports AND/OR matches and the library is just a binding of the journald code.

The journal supports filters with logical AND and/or logical OR, so I just thought about how to support that kind of filter expression in promtail without carrying a non-ideal (IMO) decision of the library over into promtail.

Anyway, the proposed change is quite simple and could easily be coded, although I think it should not be done.

Contributor:

OK, I've taken a look at the manpage to re-familiarise myself with the specifics of filtering.

Here's one example they cite:

   To show all fields emitted by a unit and about the unit, option
   -u/--unit= should be used.  journalctl -u name expands to a
   complex filter similar to

       _SYSTEMD_UNIT=name.service
         + UNIT=name.service _PID=1
         + OBJECT_SYSTEMD_UNIT=name.service _UID=0
         + COREDUMP_UNIT=name.service _UID=0 MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1

If a user were to try to write a filter expression like the above in a promtail config, might it look like this?

  ...
  matches:
    - _SYSTEMD_UNIT=name.service
    - UNIT=name.service _PID=1
    - OBJECT_SYSTEMD_UNIT=name.service _UID=0
    - COREDUMP_UNIT=name.service _UID=0 MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1
  ...

In the documentation we could specify that each matches item is a disjunction (logical OR), and the logical AND is combining multiple filters in a single entry separated by space.

It seems the library supports adding disjunctions, but I'm not familiar enough with the code to know how much effort this will be.

I understand that you've put in a lot of effort into this already, so I'm happy to merge this as-is, but in a perfect world I think we should follow the Principle of Least Surprise here to make the filtering work as a user would expect. If journal filtering allows for disjunctions, then we should support it, too.

If you don't have the time to add disjunction support, that's ok - we can always add it as an enhancement for later; I don't want to block this useful feature.
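
A minimal sketch of how the list form proposed above could be fed to a journal handle, assuming each list item is one AND-group and consecutive items are separated by a disjunction. The `matchAdder` interface and the `recorder` fake are hypothetical and exist only for this illustration; the method names mirror `AddMatch` and `AddDisjunction` on `sdjournal.Journal`, but the sketch is not wired to the real library and is not what this PR implements.

```go
package main

import (
	"fmt"
	"strings"
)

// matchAdder is a hypothetical interface covering the two calls this sketch
// needs. The method names mirror AddMatch and AddDisjunction on
// sdjournal.Journal, but nothing here is wired to the real library.
type matchAdder interface {
	AddMatch(match string) error
	AddDisjunction() error
}

// applyMatches treats every list item as an AND-group of space-separated
// FIELD=value terms and separates consecutive groups with a disjunction.
func applyMatches(j matchAdder, items []string) error {
	for i, item := range items {
		terms := strings.Fields(item)
		if len(terms) == 0 {
			return fmt.Errorf("matches entry %d is empty", i)
		}
		if i > 0 {
			if err := j.AddDisjunction(); err != nil {
				return err
			}
		}
		for _, term := range terms {
			if err := j.AddMatch(term); err != nil {
				return err
			}
		}
	}
	return nil
}

// recorder is a fake matchAdder that records the calls it receives.
type recorder struct{ calls []string }

func (r *recorder) AddMatch(m string) error { r.calls = append(r.calls, m); return nil }
func (r *recorder) AddDisjunction() error   { r.calls = append(r.calls, "+"); return nil }

func main() {
	r := &recorder{}
	err := applyMatches(r, []string{
		"_SYSTEMD_UNIT=name.service",
		"UNIT=name.service _PID=1",
	})
	if err != nil {
		panic(err)
	}
	// Prints the recorded call sequence, with "+" marking each disjunction.
	fmt.Println(strings.Join(r.calls, " "))
}
```

Keeping the call sequence behind a small interface lets the ordering be checked without cgo or libsystemd. Note that within one group the journal still treats several terms on the same field as alternatives.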

Contributor Author:

> In the documentation we could specify that each matches item is a disjunction (logical OR), and the logical AND is combining multiple filters in a single entry separated by space.

Absolutely. Logical AND takes precedence and is evaluated first, so IMO that's the way to express the filters. Or, going further in the "YAML way", a two-level nested list:

  ...
  matches:
    - _SYSTEMD_UNIT=name.service
    - - UNIT=name.service
      - _PID=1
    - - OBJECT_SYSTEMD_UNIT=name.service
      - _UID=0
    - - COREDUMP_UNIT=name.service
      - _UID=0 
      - MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1
  ...

In the short term, in order to use the current go-systemd functionality, there should be only one list item:

  ...
  matches:
    - COREDUMP_UNIT=name.service _UID=0 MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1
  ...

or

  ...
  matches:
    - - COREDUMP_UNIT=name.service
      - _UID=0 
      - MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1
  ...

which is a bit weird, and we would have to explain it in the docs. So we have a trade-off: either be ready to support future functionality at the cost of a weird (IMO) configuration, or have a simpler configuration that is not compatible with future functionality.

> It seems the library supports adding disjunctions, but I'm not familiar enough with the code to know how much effort this will be.

Yes, it is ready to use disjunctions, but the reader does not use them at all. IMO the library should mimic journalctl's add_matches.

My main concern here is that proposing a modification to the coreos/go-systemd reader to support disjunctions would need a different JournalReaderConfig struct, maybe replacing the match list with a two-level nested list of matches, and this would break compatibility for anyone using it, like promtail.

I will reach out to you on your public Slack so we can talk about next steps.

Contributor:

@carlospeon heads up, I'm on PTO from tomorrow and back on the 22nd Aug.
I'm happy for this to be merged as-is once all the comments are addressed (except for this one)

Contributor:

For now, I think given the complexity of this we can stick with what we have.
Let's leave this comment unresolved so we can reference it later.
Thank you for the great discussion!

 }

 // SyslogTargetConfig describes a scrape config that listens for log lines over syslog.
20 changes: 18 additions & 2 deletions clients/pkg/promtail/targets/journal/journaltarget.go
@@ -174,12 +174,26 @@ func journalTargetWithReader(
 		return nil, errors.Wrap(err, "parsing journal reader 'max_age' config value")
 	}

-	cfg := t.generateJournalConfig(journalConfigBuilder{
+	cb := journalConfigBuilder{
 		JournalPath: targetConfig.Path,
 		Position:    position,
 		MaxAge:      maxAge,
 		EntryFunc:   entryFunc,
-	})
+	}
+
+	matches := strings.Fields(targetConfig.Matches)
+	for _, m := range matches {
+		fv := strings.Split(m, "=")
+		if len(fv) != 2 {
+			return nil, errors.New("Error parsing journal reader 'matches' config value")
+		}
+		cb.Matches = append(cb.Matches, sdjournal.Match{
+			Field: fv[0],
+			Value: fv[1],
+		})
+	}
+
+	cfg := t.generateJournalConfig(cb)
 	t.r, err = readerFunc(cfg)
 	if err != nil {
 		return nil, errors.Wrap(err, "creating journal reader")
@@ -208,6 +222,7 @@ func journalTargetWithReader(
 type journalConfigBuilder struct {
 	JournalPath string
 	Position    string
+	Matches     []sdjournal.Match
 	MaxAge      time.Duration
 	EntryFunc   journalEntryFunc
 }
@@ -221,6 +236,7 @@ func (t *JournalTarget) generateJournalConfig(

 	cfg := sdjournal.JournalReaderConfig{
 		Path:      cb.JournalPath,
+		Matches:   cb.Matches,
 		Formatter: t.formatter,
 	}
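
The parsing loop added to `journalTargetWithReader` above splits each term on every `=`, so a term such as `MESSAGE=a=b` yields three parts and is rejected. A minimal sketch of a more tolerant variant, assuming only the first `=` should separate field from value. The `match` type and `parseMatches` are hypothetical names standing in for `sdjournal.Match` and the inline loop; this is not the code that was merged.

```go
package main

import (
	"fmt"
	"strings"
)

// match is a local stand-in for sdjournal.Match so the sketch compiles
// without cgo or libsystemd.
type match struct {
	Field string
	Value string
}

// parseMatches splits a space-separated list of FIELD=value terms. Only the
// first '=' separates field from value, so values may contain '='.
func parseMatches(s string) ([]match, error) {
	var out []match
	for _, term := range strings.Fields(s) {
		field, value, ok := strings.Cut(term, "=")
		if !ok || field == "" {
			return nil, fmt.Errorf("parsing journal reader 'matches' config value: invalid match %q", term)
		}
		out = append(out, match{Field: field, Value: value})
	}
	return out, nil
}

func main() {
	ms, err := parseMatches("UNIT=foo.service MESSAGE=a=b")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ms)
}
```

`strings.Cut` requires Go 1.18 or later. Values containing whitespace are still not expressible, because the input is tokenised with `strings.Fields`.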

35 changes: 35 additions & 0 deletions clients/pkg/promtail/targets/journal/journaltarget_test.go
@@ -385,3 +385,38 @@ func Test_MakeJournalFields(t *testing.T) {
 	}
 	assert.Equal(t, expectedFields, receivedFields)
 }
+
+func TestJournalTarget_Matches(t *testing.T) {
+	w := log.NewSyncWriter(os.Stderr)
+	logger := log.NewLogfmtLogger(w)
+
+	testutils.InitRandom()
+	dirName := "/tmp/" + testutils.RandName()
+	positionsFileName := dirName + "/positions.yml"
+
+	// Set the sync period to a really long value, to guarantee the sync timer
+	// never runs, this way we know everything saved was done through channel
+	// notifications when target.stop() was called.
+	ps, err := positions.New(logger, positions.Config{
+		SyncPeriod:    10 * time.Second,
+		PositionsFile: positionsFileName,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	client := fake.New(func() {})
+
+	cfg := scrapeconfig.JournalTargetConfig{
+		Matches: "UNIT=foo.service PRIORITY=1",
+	}
+
+	jt, err := journalTargetWithReader(NewMetrics(prometheus.NewRegistry()), logger, client, ps, "test", nil,
+		&cfg, newMockJournalReader, newMockJournalEntry(nil))
+	require.NoError(t, err)
+
+	r := jt.r.(*mockJournalReader)
+	matches := []sdjournal.Match{{Field: "UNIT", Value: "foo.service"}, {Field: "PRIORITY", Value: "1"}}
+	require.Equal(t, r.config.Matches, matches)
+	client.Stop()
+}
5 changes: 4 additions & 1 deletion docs/sources/clients/promtail/scraping.md
@@ -97,6 +97,7 @@ scrape_configs:
       json: false
       max_age: 12h
       path: /var/log/journal
+      matches: _TRANSPORT=kernel
       labels:
         job: systemd-journal
     relabel_configs:
@@ -109,7 +110,9 @@ here for reference. The `max_age` field ensures that no older entry than the
 time specified will be sent to Loki; this circumvents "entry too old" errors.
 The `path` field tells Promtail where to read journal entries from. The labels
 map defines a constant list of labels to add to every journal entry that Promtail
-reads.
+reads. The `matches` field adds journal filters. If multiple filters are specified
+matching different fields, the log entries are filtered by both, if two filters
+apply to the same field, then they are automatically matched as alternatives.

 When the `json` field is set to `true`, messages from the journal will be
 passed through the pipeline as JSON, keeping all of the original fields from the
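
The semantics described in the added `matches` documentation above (filters on different fields are combined, filters on the same field are alternatives) can be illustrated with a small model. This is only a sketch under simplifying assumptions: `term` and `entryMatches` are hypothetical names, each entry is assumed to carry a single value per field, and the real filtering is done by the journal itself, not by Promtail code like this.

```go
package main

import "fmt"

// term is a hypothetical FIELD=value pair used only for this illustration.
type term struct {
	Field string
	Value string
}

// entryMatches reports whether a journal entry (field -> value) passes the
// filter: terms on the same field are alternatives, and every field that
// appears in the filter must be satisfied.
func entryMatches(entry map[string]string, terms []term) bool {
	allowed := make(map[string]map[string]bool)
	for _, t := range terms {
		if allowed[t.Field] == nil {
			allowed[t.Field] = make(map[string]bool)
		}
		allowed[t.Field][t.Value] = true
	}
	for field, values := range allowed {
		v, ok := entry[field]
		if !ok || !values[v] {
			return false
		}
	}
	return true
}

func main() {
	// Models "_TRANSPORT=kernel PRIORITY=3 PRIORITY=4".
	filter := []term{
		{"_TRANSPORT", "kernel"},
		{"PRIORITY", "3"},
		{"PRIORITY", "4"},
	}
	entry := map[string]string{"_TRANSPORT": "kernel", "PRIORITY": "4"}
	fmt.Println(entryMatches(entry, filter)) // prints "true"
}
```

Under this model, `matches: _TRANSPORT=kernel PRIORITY=3 PRIORITY=4` would keep kernel entries whose priority is either 3 or 4.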