[filebeat] First unifiedlogs implementation (#41791) (#42005)
* First unifiedlogs implementation

* Fix date check and accept multiple predicates

* Fix cursor and date walking

* Log stderr on error

* Add 1s tick

* Refactor to do automatic backfill

* Add docs and fix resuming from interrupted backfilling

* Fix doc config example

* Add first unit tests

* wip tests

* Add stream test

* Extract test and make input stable

* Improve docs

---------

Co-authored-by: r-ung <[email protected]>
(cherry picked from commit f9a9b32)

Co-authored-by: Marc Guasch <[email protected]>
mergify[bot] and marc-gr authored Dec 12, 2024
1 parent 80486c8 commit 63df576
Showing 10 changed files with 1,117 additions and 1 deletion.
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -341,6 +341,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Add ability to remove request trace logs from entityanalytics input. {pull}40004[40004]
- Refactor & cleanup with updates to default values and documentation. {pull}41834[41834]
- Update CEL mito extensions to v1.16.0. {pull}41727[41727]
- Add `unifiedlogs` input for macOS. {pull}41791[41791]
- Add evaluation state dump debugging option to CEL input. {pull}41335[41335]
- Added support for retry configuration in GCS input. {issue}11580[11580] {pull}41862[41862]
- Improve S3 polling mode states registry when using list prefix option. {pull}41869[41869]
3 changes: 3 additions & 0 deletions filebeat/docs/filebeat-options.asciidoc
@@ -95,6 +95,7 @@ You can configure {beatname_uc} to use the following inputs:
* <<{beatname_lc}-input-syslog>>
* <<{beatname_lc}-input-tcp>>
* <<{beatname_lc}-input-udp>>
* <<{beatname_lc}-input-unifiedlogs>>
* <<{beatname_lc}-input-unix>>
* <<{beatname_lc}-input-winlog>>

@@ -158,6 +159,8 @@ include::inputs/input-tcp.asciidoc[]

include::inputs/input-udp.asciidoc[]

include::../../x-pack/filebeat/docs/inputs/input-unifiedlogs.asciidoc[]

include::inputs/input-unix.asciidoc[]

include::inputs/input-winlog.asciidoc[]
180 changes: 180 additions & 0 deletions x-pack/filebeat/docs/inputs/input-unifiedlogs.asciidoc
@@ -0,0 +1,180 @@
[role="xpack"]

:type: unifiedlogs

[id="{beatname_lc}-input-{type}"]
=== Unified Logs input

++++
<titleabbrev>Unified Logs</titleabbrev>
++++

NOTE: Only available for macOS.

The unified logging system provides a comprehensive and performant API to capture
telemetry across all levels of the system. This system centralizes the storage of
log data in memory and on disk, rather than writing that data to a text-based log file.

The input interacts with the `log` command-line tool to provide access to the events.

The input starts streaming events from the current point in time unless a start date or
the `backfill` option is set. When restarted, it continues where it left off.

Alternatively, it can perform one-off operations, such as:

- Stream events contained in a `.logarchive` file.
- Stream events contained in a `.tracev3` file.
- Stream events in a specific time span, by providing a specific end date.

After one of these one-off operations completes, the input stops.

Other configuration options can be specified to filter what events to process.

NOTE: The input can cause some duplicated events when backfilling and/or
restarting. This is caused by how the underlying fetching method works and
should be taken into account when using the input.

Example configuration:

Process all old and new logs:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: unifiedlogs
  id: unifiedlogs-id
  enabled: true
  backfill: true
----

Process logs with predicate filters:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: unifiedlogs
  id: unifiedlogs-id
  enabled: true
  predicate:
    # Captures keychain.db unlock events
    - 'process == "loginwindow" && sender == "Security"'
    # Captures user login events
    - 'process == "logind"'
    # Captures command line activity run with elevated privileges
    - 'process == "sudo"'
----

==== Configuration options

The `unifiedlogs` input supports the following configuration options plus the
<<{beatname_lc}-input-{type}-common-options>> described later.

[float]
==== `archive_file`

Display events stored in the given archive.
The archive must be a valid log archive bundle with the suffix `.logarchive`.

[float]
==== `trace_file`

Display events stored in the given `.tracev3` file.
To be decoded, the file must be contained within a valid `.logarchive` bundle.

[float]
==== `start`

Shows content starting from the provided date.
The following date/time formats are accepted:
`YYYY-MM-DD`, `YYYY-MM-DD HH:MM:SS`, `YYYY-MM-DD HH:MM:SSZZZZZ`.

[float]
==== `end`

Shows content up to the provided date.
The following date/time formats are accepted:
`YYYY-MM-DD`, `YYYY-MM-DD HH:MM:SS`, `YYYY-MM-DD HH:MM:SSZZZZZ`.

[float]
==== `predicate`

Filters messages using the provided predicate based on NSPredicate.
A compound predicate or multiple predicates can be provided as a list.

For detailed information on the use of predicate based filtering,
please refer to the https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/Predicates/Articles/pSyntax.html[Predicate Programming Guide].

[float]
==== `process`

A list of processes on which to operate. Each entry can be a PID or a process name.

[float]
==== `source`

Include symbol names and source line numbers for messages, if available.
Default: `false`.

[float]
==== `info`

Disable or enable info level messages.
Default: `false`.

[float]
==== `debug`

Disable or enable debug level messages.
Default: `false`.

[float]
==== `backtrace`

Disable or enable display of backtraces.
Default: `false`.

[float]
==== `signpost`

Disable or enable display of signposts.
Default: `false`.

[float]
==== `unreliable`

Annotate events with whether the log was emitted unreliably.
Default: `false`.

[float]
==== `mach_continuous_time`

Use mach continuous time timestamps rather than walltime.
Default: `false`.

[float]
==== `backfill`

If set to `true`, the input processes all available logs, starting from the
oldest one, the first time it starts.
Default: `false`.


[id="{beatname_lc}-input-{type}-common-options"]
include::../../../../filebeat/docs/inputs/input-common-options.asciidoc[]

[float]
=== Metrics

This input exposes metrics under the <<http-endpoint, HTTP monitoring endpoint>>.
These metrics are exposed under the `/inputs/` path. They can be used to
observe the activity of the input.

You must assign a unique `id` to the input to expose metrics.

[options="header"]
|=======
| Metric | Description
| `errors_total` | Total number of errors.
|=======

:type!:
54 changes: 54 additions & 0 deletions x-pack/filebeat/input/default-inputs/inputs_darwin.go
@@ -0,0 +1,54 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

//go:build darwin

package inputs

import (
	"github.com/elastic/beats/v7/filebeat/beater"
	v2 "github.com/elastic/beats/v7/filebeat/input/v2"
	"github.com/elastic/beats/v7/libbeat/beat"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/awscloudwatch"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/awss3"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/azureblobstorage"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/azureeventhub"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/benchmark"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/cel"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/cloudfoundry"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/entityanalytics"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/gcs"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/http_endpoint"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/httpjson"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/lumberjack"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/netflow"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/o365audit"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/salesforce"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/streaming"
	"github.com/elastic/beats/v7/x-pack/filebeat/input/unifiedlogs"
	"github.com/elastic/elastic-agent-libs/logp"
)

func xpackInputs(info beat.Info, log *logp.Logger, store beater.StateStore) []v2.Plugin {
	return []v2.Plugin{
		azureblobstorage.Plugin(log, store),
		azureeventhub.Plugin(log),
		cel.Plugin(log, store),
		cloudfoundry.Plugin(),
		entityanalytics.Plugin(log),
		gcs.Plugin(log, store),
		http_endpoint.Plugin(),
		httpjson.Plugin(log, store),
		o365audit.Plugin(log, store),
		awss3.Plugin(store),
		awscloudwatch.Plugin(),
		lumberjack.Plugin(),
		salesforce.Plugin(log, store),
		streaming.Plugin(log, store),
		streaming.PluginWebsocketAlias(log, store),
		netflow.Plugin(log),
		benchmark.Plugin(),
		unifiedlogs.Plugin(log, store),
	}
}
2 changes: 1 addition & 1 deletion x-pack/filebeat/input/default-inputs/inputs_other.go
@@ -2,7 +2,7 @@
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

-//go:build !aix && !windows
+//go:build !aix && !darwin && !windows

package inputs
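
The constraint change in this file adds `!darwin`, so that on macOS only `inputs_darwin.go` is compiled into the `inputs` package. Assuming `inputs_other.go` also defines `xpackInputs` (that part of the file is not shown in this diff), leaving the old constraint in place would yield two definitions on darwin. The sketch below evaluates the three constraint lines with the standard `go/build/constraint` package; the `matches` helper is hypothetical and simplified, treating the GOOS value as the only build tag that is set.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// matches reports whether a //go:build line is satisfied when the only
// build tag set is the given GOOS value. Real builds also set GOARCH,
// compiler, and release tags; this simplified sketch ignores them.
func matches(line, goos string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	return expr.Eval(func(tag string) bool { return tag == goos }), nil
}

func main() {
	lines := []string{
		"//go:build darwin",                      // inputs_darwin.go
		"//go:build !aix && !windows",            // inputs_other.go, before
		"//go:build !aix && !darwin && !windows", // inputs_other.go, after
	}
	for _, goos := range []string{"darwin", "linux"} {
		for _, line := range lines {
			ok, err := matches(line, goos)
			if err != nil {
				fmt.Println("parse error:", err)
				continue
			}
			fmt.Printf("GOOS=%-6s %-40s included=%t\n", goos, line, ok)
		}
	}
}
```

For `darwin`, the old constraint and the darwin-only constraint both evaluate to true, while the new one evaluates to false, which leaves a single file selected per platform.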

75 changes: 75 additions & 0 deletions x-pack/filebeat/input/unifiedlogs/config.go
@@ -0,0 +1,75 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

//go:build darwin

package unifiedlogs

import (
	"fmt"
	"strings"
	"time"
)

type config struct {
	showConfig
	commonConfig
	Backfill bool `config:"backfill"`
}

type showConfig struct {
	ArchiveFile string `config:"archive_file"`
	TraceFile   string `config:"trace_file"`
	Start       string `config:"start"`
	End         string `config:"end"`
}

type commonConfig struct {
	Predicate          []string `config:"predicate"`
	Process            []string `config:"process"`
	Source             bool     `config:"source"`
	Info               bool     `config:"info"`
	Debug              bool     `config:"debug"`
	Backtrace          bool     `config:"backtrace"`
	Signpost           bool     `config:"signpost"`
	Unreliable         bool     `config:"unreliable"`
	MachContinuousTime bool     `config:"mach_continuous_time"`
}

func (c config) Validate() error {
	if err := checkDateFormat(c.Start); err != nil {
		return fmt.Errorf("start date is not valid: %w", err)
	}
	if err := checkDateFormat(c.End); err != nil {
		return fmt.Errorf("end date is not valid: %w", err)
	}
	if c.ArchiveFile != "" && !strings.HasSuffix(c.ArchiveFile, ".logarchive") {
		return fmt.Errorf("archive_file %v has the wrong extension", c.ArchiveFile)
	}
	if c.TraceFile != "" && !strings.HasSuffix(c.TraceFile, ".tracev3") {
		return fmt.Errorf("trace_file %v has the wrong extension", c.TraceFile)
	}
	return nil
}

func defaultConfig() config {
	return config{}
}

func checkDateFormat(date string) error {
	if date == "" {
		return nil
	}
	acceptedLayouts := []string{
		"2006-01-02",
		"2006-01-02 15:04:05",
		"2006-01-02 15:04:05-0700",
	}
	for _, layout := range acceptedLayouts {
		if _, err := time.Parse(layout, date); err == nil {
			return nil
		}
	}
	return fmt.Errorf("not a valid date, accepted layouts are: %v", acceptedLayouts)
}
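
To make the accepted `start` and `end` formats concrete, the sketch below copies `checkDateFormat` from `config.go` into a standalone program and runs it over a few sample values. The samples are illustrative and not taken from the commit. Because only the three listed layouts are tried, other common forms such as RFC 3339 (`2024-12-12T10:30:00Z`) are rejected, while an empty string is accepted as "not set".

```go
package main

import (
	"fmt"
	"time"
)

// checkDateFormat is copied from config.go so the example is self-contained.
func checkDateFormat(date string) error {
	if date == "" {
		return nil
	}
	acceptedLayouts := []string{
		"2006-01-02",
		"2006-01-02 15:04:05",
		"2006-01-02 15:04:05-0700",
	}
	for _, layout := range acceptedLayouts {
		if _, err := time.Parse(layout, date); err == nil {
			return nil
		}
	}
	return fmt.Errorf("not a valid date, accepted layouts are: %v", acceptedLayouts)
}

func main() {
	// Illustrative inputs: the first four match an accepted layout (or are
	// empty), the last two do not.
	samples := []string{
		"",
		"2024-12-12",
		"2024-12-12 10:30:00",
		"2024-12-12 10:30:00+0100",
		"2024-12-12T10:30:00Z",
		"12/12/2024",
	}
	for _, s := range samples {
		fmt.Printf("%q accepted=%t\n", s, checkDateFormat(s) == nil)
	}
}
```

Note that the check only validates the shape of the value. The parsed time is discarded and the original string is what the configuration keeps.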