[processor/attributes] Add support to filter on log body #8996

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@
- Add `make crosslink` target to ensure replace statements are included in `go.mod` for all transitive dependencies within repository (#8822)
- `filestorageextension`: Change bbolt DB settings for better performance (#9004)
- `jaegerremotesamplingextension`: Add local and remote sampling stores (#8818)
- `attributesprocessor`: Add support to filter on log body (#8996)

### 🛑 Breaking changes 🛑

12 changes: 10 additions & 2 deletions internal/coreinternal/processor/filterconfig/config.go
@@ -95,6 +95,10 @@ type MatchProperties struct {
// Deprecated: the Name field is removed from the log data model.
LogNames []string `mapstructure:"log_names"`

// LogBodies is a list of strings that the LogRecord's body field must match
// against.
LogBodies []string `mapstructure:"log_bodies"`

// MetricNames is a list of strings to match metric name against.
// A match occurs if metric name matches at least one item in the list.
// This field is optional.
@@ -123,6 +127,10 @@ func (mp *MatchProperties) ValidateForSpans() error {
return errors.New("log_names should not be specified for trace spans")
}

if len(mp.LogBodies) > 0 {
return errors.New("log_bodies should not be specified for trace spans")
}

if len(mp.Services) == 0 && len(mp.SpanNames) == 0 && len(mp.Attributes) == 0 &&
len(mp.Libraries) == 0 && len(mp.Resources) == 0 {
return errors.New(`at least one of "services", "span_names", "attributes", "libraries" or "resources" field must be specified`)
@@ -137,8 +145,8 @@ func (mp *MatchProperties) ValidateForLogs() error {
return errors.New("neither services nor span_names should be specified for log records")
}

if len(mp.Attributes) == 0 && len(mp.Libraries) == 0 && len(mp.Resources) == 0 {
return errors.New(`at least one of "attributes", "libraries" or "resources" field must be specified`)
if len(mp.Attributes) == 0 && len(mp.Libraries) == 0 && len(mp.Resources) == 0 && len(mp.LogBodies) == 0 {
return errors.New(`at least one of "attributes", "libraries", "resources" or "log_bodies" field must be specified`)
}

return nil
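The two validation changes above can be sketched in isolation. This is a minimal sketch, not the collector's code: `matchProperties` is a hypothetical, trimmed-down stand-in for `filterconfig.MatchProperties`, `Attributes` is simplified to a string slice, and the condition guarding the "neither services nor span_names" error is an assumption, since the diff does not show it.

```go
package main

import (
	"errors"
	"fmt"
)

// matchProperties is a hypothetical, simplified stand-in for
// filterconfig.MatchProperties. Only the fields needed here are kept.
type matchProperties struct {
	Services   []string
	SpanNames  []string
	LogBodies  []string
	Attributes []string // simplified; the real field is not a string slice
	Libraries  []string
	Resources  []string
}

// validateForSpans rejects log_bodies, as the diff does for trace spans.
func (mp *matchProperties) validateForSpans() error {
	if len(mp.LogBodies) > 0 {
		return errors.New("log_bodies should not be specified for trace spans")
	}
	if len(mp.Services) == 0 && len(mp.SpanNames) == 0 && len(mp.Attributes) == 0 &&
		len(mp.Libraries) == 0 && len(mp.Resources) == 0 {
		return errors.New(`at least one of "services", "span_names", "attributes", "libraries" or "resources" field must be specified`)
	}
	return nil
}

// validateForLogs accepts log_bodies as one of the fields that can satisfy
// the "at least one" requirement.
func (mp *matchProperties) validateForLogs() error {
	// Assumed guard: the diff shows only the error, not the condition.
	if len(mp.Services) > 0 || len(mp.SpanNames) > 0 {
		return errors.New("neither services nor span_names should be specified for log records")
	}
	if len(mp.Attributes) == 0 && len(mp.Libraries) == 0 &&
		len(mp.Resources) == 0 && len(mp.LogBodies) == 0 {
		return errors.New(`at least one of "attributes", "libraries", "resources" or "log_bodies" field must be specified`)
	}
	return nil
}

func main() {
	bodyOnly := &matchProperties{LogBodies: []string{"AUTH.*"}}
	fmt.Println(bodyOnly.validateForLogs())  // log_bodies alone is enough for logs
	fmt.Println(bodyOnly.validateForSpans()) // log_bodies is rejected for spans
}
```

An empty `LogBodies` slice counts as unspecified, which is what the `empty_log_names_and_attributes` test case in this PR relies on.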
16 changes: 16 additions & 0 deletions internal/coreinternal/processor/filterlog/filterlog.go
@@ -38,6 +38,9 @@ type propertiesMatcher struct {

// log names to compare to.
nameFilters filterset.FilterSet

// log bodies to compare to.
bodyFilters filterset.FilterSet
}

// NewMatcher creates a LogRecord Matcher that matches based on the given MatchProperties.
@@ -62,20 +65,33 @@ func NewMatcher(mp *filterconfig.MatchProperties) (Matcher, error) {
return nil, fmt.Errorf("error creating log record name filters: %v", err)
}
}
var bodyFS filterset.FilterSet
if len(mp.LogBodies) > 0 {
bodyFS, err = filterset.CreateFilterSet(mp.LogBodies, &mp.Config)
if err != nil {
return nil, fmt.Errorf("error creating log record body filters: %v", err)
}
}

return &propertiesMatcher{
PropertiesMatcher: rm,
nameFilters: nameFS,
bodyFilters: bodyFS,
}, nil
}

// MatchLogRecord matches a log record to a set of properties.
// The log record body is checked first, if body filters are specified: a
// string body that matches at least one filter is a match on its own.
// Otherwise the attributes, resource and library are checked, if specified.
func (mp *propertiesMatcher) MatchLogRecord(lr pdata.LogRecord, resource pdata.Resource, library pdata.InstrumentationScope) bool {
if lr.Body().Type() == pdata.ValueTypeString && mp.bodyFilters != nil && mp.bodyFilters.Matches(lr.Body().StringVal()) {
return true
}

return mp.PropertiesMatcher.Match(lr.Attributes(), resource, library)
}
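The control flow of `MatchLogRecord` as written in this diff can be illustrated with the standard library alone. This is a rough sketch under stated assumptions: `bodyMatcher`, `newBodyMatcher` and `matchRest` are hypothetical names, `regexp` stands in for `filterset.FilterSet` (whose exact anchoring behaviour is not reproduced), and the body is modelled as an `interface{}` rather than a `pdata.Value`.

```go
package main

import (
	"fmt"
	"regexp"
)

// bodyMatcher is a hypothetical stand-in for propertiesMatcher.
type bodyMatcher struct {
	// bodyFilters plays the role of the bodyFilters FilterSet.
	bodyFilters []*regexp.Regexp
	// matchRest stands in for the embedded PropertiesMatcher.Match call.
	matchRest func(attrs map[string]string) bool
}

// newBodyMatcher compiles every pattern up front and fails on the first
// invalid one, similar in spirit to the filter set creation in NewMatcher.
func newBodyMatcher(patterns []string, matchRest func(map[string]string) bool) (*bodyMatcher, error) {
	m := &bodyMatcher{matchRest: matchRest}
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("error creating log record body filters: %w", err)
		}
		m.bodyFilters = append(m.bodyFilters, re)
	}
	return m, nil
}

// matchLogRecord returns true as soon as a string body matches any filter.
// Non-string bodies and non-matching bodies fall through to matchRest.
func (m *bodyMatcher) matchLogRecord(body interface{}, attrs map[string]string) bool {
	if s, ok := body.(string); ok {
		for _, re := range m.bodyFilters {
			if re.MatchString(s) {
				return true
			}
		}
	}
	return m.matchRest(attrs)
}

func main() {
	// No attribute filters configured in this example, so the fallback never matches.
	noAttrs := func(map[string]string) bool { return false }
	m, err := newBodyMatcher([]string{"AUTH.*"}, noAttrs)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.matchLogRecord("AUTHENTICATION FAILED", nil)) // true
	fmt.Println(m.matchLogRecord(42, nil))                      // false: not a string body
}
```

Note that in this shape a body match short-circuits the attribute check, so the body and attribute conditions combine as OR rather than AND.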
15 changes: 12 additions & 3 deletions internal/coreinternal/processor/filterlog/filterlog_test.go
@@ -40,14 +40,15 @@ func TestLogRecord_validateMatchesConfiguration_InvalidConfig(t *testing.T) {
{
name: "empty_property",
property: filterconfig.MatchProperties{},
errorString: "at least one of \"attributes\", \"libraries\" or \"resources\" field must be specified",
errorString: `at least one of "attributes", "libraries", "resources" or "log_bodies" field must be specified`,
},
{
name: "empty_log_names_and_attributes",
property: filterconfig.MatchProperties{
LogNames: []string{},
LogNames: []string{},
LogBodies: []string{},
},
errorString: "at least one of \"attributes\", \"libraries\" or \"resources\" field must be specified",
errorString: `at least one of "attributes", "libraries", "resources" or "log_bodies" field must be specified`,
},
{
name: "span_properties",
@@ -149,10 +150,18 @@ func TestLogRecord_Matching_True(t *testing.T) {
},
},
},
{
name: "log_body_regexp_match",
properties: &filterconfig.MatchProperties{
Config: *createConfig(filterset.Regexp),
LogBodies: []string{"AUTH.*"},
},
},
}

lr := pdata.NewLogRecord()
lr.Attributes().InsertString("abc", "def")
lr.Body().SetStringVal("AUTHENTICATION FAILED")

for _, tc := range testcases {
t.Run(tc.name, func(t *testing.T) {
11 changes: 8 additions & 3 deletions processor/attributesprocessor/README.md
@@ -166,13 +166,13 @@ if the input data should be included or excluded from the processor. To configur
this option, under `include` and/or `exclude` at least `match_type` and one of the following
is required:
- For spans, one of `services`, `span_names`, `attributes`, `resources`, or `libraries` must be specified
with a non-empty value for a valid configuration. The `log_names`, `expressions`, `resource_attributes` and
with a non-empty value for a valid configuration. The `log_names`, `log_bodies`, `expressions`, `resource_attributes` and
`metric_names` fields are invalid.
- For logs, one of `log_names`, `attributes`, `resources`, or `libraries` must be specified with a
- For logs, one of `log_names`, `log_bodies`, `attributes`, `resources`, or `libraries` must be specified with a
non-empty value for a valid configuration. The `span_names`, `metric_names`, `expressions`, `resource_attributes`,
and `services` fields are invalid.
- For metrics, one of `metric_names`, `resources` must be specified
with a valid non-empty value for a valid configuration. The `span_names`, `log_names`, and
with a valid non-empty value for a valid configuration. The `span_names`, `log_names`, `log_bodies` and
`services` fields are invalid.


@@ -218,6 +218,11 @@ attributes:
# This is an optional field.
log_names: [<item1>, ..., <itemN>]

# The log body must match at least one of the items.
# Currently only string body types are supported.
# This is an optional field.
log_bodies: [<item1>, ..., <itemN>]
Review comment (Contributor): I think it's worth mentioning that only String bodies are matched against here


# The metric name must match at least one of the items.
# This is an optional field.
metric_names: [<item1>, ..., <itemN>]
18 changes: 18 additions & 0 deletions processor/attributesprocessor/testdata/config.yaml
@@ -307,6 +307,24 @@ processors:
action: update
value: "SELECT * FROM USERS [obfuscated]"


# The following demonstrates how to process logs whose body matches regexp
# patterns. This processor will remove the "token" attribute and will obfuscate the
# "password" attribute in log records whose body matches "AUTH.*".
attributes/log_body_regexp:
# Specifies the log record properties that must exist for the processor to be applied.
include:
# match_type defines that "log_bodies" is an array of regexp-es.
match_type: regexp
# The log body must match the "AUTH.*" pattern.
log_bodies: ["AUTH.*"]
actions:
- key: password
action: update
value: "obfuscated"
- key: token
action: delete

receivers:
nop:
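
The effect of the `attributes/log_body_regexp` test configuration on a single log record can be sketched in plain Go. This is an illustration only, assuming a string body and a flat string attribute map: `applyLogBodyRegexp` and `authBody` are hypothetical names, and the real processor works on pdata types rather than Go maps.

```go
package main

import (
	"fmt"
	"regexp"
)

// authBody corresponds to log_bodies: ["AUTH.*"] with match_type: regexp.
var authBody = regexp.MustCompile("AUTH.*")

// applyLogBodyRegexp applies the two configured actions to attrs when the
// log body matches. Records with a non-matching body are left untouched.
func applyLogBodyRegexp(body string, attrs map[string]string) {
	if !authBody.MatchString(body) {
		return
	}
	// action: update only changes a key that already exists.
	if _, ok := attrs["password"]; ok {
		attrs["password"] = "obfuscated"
	}
	// action: delete removes the key if present.
	delete(attrs, "token")
}

func main() {
	attrs := map[string]string{"password": "hunter2", "token": "abc123", "user": "alice"}
	applyLogBodyRegexp("AUTHENTICATION FAILED", attrs)
	_, hasToken := attrs["token"]
	fmt.Println(attrs["password"], hasToken, attrs["user"]) // obfuscated false alice
}
```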
