Improve documentation
srebhan committed Sep 21, 2023
1 parent 39f7bb3 commit a2b18d3
Showing 3 changed files with 191 additions and 55 deletions.
243 changes: 189 additions & 54 deletions plugins/processors/regex/README.md
@@ -1,19 +1,12 @@
# Regex Processor Plugin

The `regex` plugin transforms tag and field values with regex pattern. If
`result_key` parameter is present, it can produce new tags and fields from
existing ones.
This plugin allows transforming tag and field _values_ as well as renaming
tags, fields and metrics using regex patterns. Tag and field _values_ can be
transformed using named groups in a batch fashion.

The regex processor **only operates on string fields**. It will not work on
any other data types, like an integer or float.

For tags transforms, if `append` is set to `true`, it will append the
transformation to the existing tag value, instead of overwriting it.

For metrics transforms, `key` denotes the element that should be
transformed. Furthermore, `result_key` allows control over the behavior applied
in case the resulting `tag` or `field` name already exists.

## Global configuration options <!-- @/docs/includes/plugin_config.md -->

In addition to the plugin-specific configuration settings, plugins support
@@ -30,74 +23,216 @@ See the [CONFIGURATION.md][CONFIGURATION.md] for more details.
[[processors.regex]]
namepass = ["nginx_requests"]

# Tag and field conversions defined in a separate sub-tables
## Tag value conversion(s). Multiple instances are allowed.
[[processors.regex.tags]]
## Tag to change, "*" will change every tag
## Tag(s) to process with optional glob expressions such as '*'.
key = "resp_code"
## Regular expression to match on a tag value
## Regular expression to match on the tag value. If the value doesn't
## match the tag is ignored.
pattern = "^(\\d)\\d\\d$"
## Matches of the pattern will be replaced with this string. Use ${1}
## notation to use the text of the first submatch.
## Replacement expression defining the value of the target tag. You can
## use numbered or named groups, e.g. ${1} references the first group.
replacement = "${1}xx"

## Name of the target tag. Defaults to the 'key' tag or, if 'key' contains
## wildcards, to the currently processed tag.
# result_key = "method"
## Append the replacement to the target tag instead of overwriting it.
# append = false

## Field value conversion(s). Multiple instances are allowed.
[[processors.regex.fields]]
## Field to change
## Field(s) to process with optional glob expressions such as '*'.
key = "request"
## All the power of the Go regular expressions available here
## For example, named subgroups
## Regular expression to match on the field value. If the value doesn't
## match or the field doesn't contain a string, the field is ignored.
pattern = "^/api(?P<method>/[\\w/]+)\\S*"
## Replacement expression defining the value of the target field. You can
## use numbered or named groups, e.g. ${method} references the group
## named "method".
replacement = "${method}"
## If result_key is present, a new field will be created
## instead of changing existing field
result_key = "method"

# Multiple conversions may be applied for one field sequentially
# Let's extract one more value
[[processors.regex.fields]]
key = "request"
pattern = ".*category=(\\w+).*"
replacement = "${1}"
result_key = "search_category"
## Name of the target field. Defaults to the 'key' field or, if 'key'
## contains wildcards, to the currently processed field.
# result_key = "method"

# Rename metric fields
## Rename metric fields
[[processors.regex.field_rename]]
## Regular expression to match on a field name
## Regular expression to match on the field name
pattern = "^search_(\\w+)d$"
## Matches of the pattern will be replaced with this string. Use ${1}
## notation to use the text of the first submatch.
## Replacement expression defining the name of the new field
replacement = "${1}"
## If the new field name already exists, you can either "overwrite" the
## existing one with the value of the renamed field OR you can "keep"
## both the existing and source field.
# result_key = "keep"

# Rename metric tags
# [[processors.regex.tag_rename]]
# ## Regular expression to match on a tag name
# pattern = "^search_(\\w+)d$"
# ## Matches of the pattern will be replaced with this string. Use ${1}
# ## notation to use the text of the first submatch.
# replacement = "${1}"
# ## If the new tag name already exists, you can either "overwrite" the
# ## existing one with the value of the renamed tag OR you can "keep"
# ## both the existing and source tag.
# # result_key = "keep"

# Rename metrics
# [[processors.regex.metric_rename]]
# ## Regular expression to match on an metric name
# pattern = "^search_(\\w+)d$"
# ## Matches of the pattern will be replaced with this string. Use ${1}
# ## notation to use the text of the first submatch.
# replacement = "${1}"
## Rename metric tags
[[processors.regex.tag_rename]]
## Regular expression to match on a tag name
pattern = "^search_(\\w+)d$"
## Replacement expression defining the name of the new tag
replacement = "${1}"
## If the new tag name already exists, you can either "overwrite" the
## existing one with the value of the renamed tag OR you can "keep"
## both the existing and source tag.
# result_key = "keep"

## Rename metrics
[[processors.regex.metric_rename]]
## Regular expression to match on a metric name
pattern = "^search_(\\w+)d$"
## Replacement expression defining the new name of the metric
replacement = "${1}"
```

### Tag and field _value_ conversions

Values of tags and fields can be processed using the corresponding section.
Multiple `[[processors.regex.tags]]` and/or `[[processors.regex.fields]]`
sections can be specified.
A conversion is only applied if the tag/field _name_ matches the `key`, which
can contain glob expressions such as `*` (asterisk), _and_ the `pattern` matches
the tag/field _value_. For fields, the value additionally has to be of type
`string`. If any of these criteria is not met, the conversion is not applied to
the metric.

The `replacement` option specifies the value of the resulting tag or field. It
can reference capturing groups by index (e.g. `${1}` being the first group) or
by name (e.g. `${mygroup}` being the group named `mygroup`).
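
The group references described above follow the template syntax of Go's
standard `regexp` package. The following is a minimal sketch of that behavior,
not the plugin's actual code; the helper `replaceValue` is a hypothetical name
used for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceValue is a hypothetical helper (not part of the plugin). It applies
// the replacement only if the pattern matches; otherwise the value is
// returned unchanged together with false.
func replaceValue(pattern, replacement, value string) (string, bool) {
	re := regexp.MustCompile(pattern)
	if !re.MatchString(value) {
		return value, false
	}
	return re.ReplaceAllString(value, replacement), true
}

func main() {
	// Reference by index: ${1} is the first capture group.
	code, _ := replaceValue(`^(\d)\d\d$`, "${1}xx", "200")
	fmt.Println(code) // 2xx

	// Reference by name: ${method} is the group named "method".
	method, _ := replaceValue(`^/api(?P<method>/[\w/]+)\S*`, "${method}",
		"/api/search/?category=plugins&q=regex&sort=asc")
	fmt.Println(method) // /search/
}
```

Writing the reference with braces (`${1}xx`) keeps the group index separate
from the literal text that follows it.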

By default, the currently processed tag or field is overwritten by the
`replacement`. To create a new tag or field you can additionally specify the
`result_key` option containing the new target tag or field name. In case the
given tag or field already exists, its value is overwritten. For `tags` you
might use the `append` flag to append the `replacement` value to an existing
tag.

### Batch processing using named groups

In `tags` and `fields` sections it is possible to use named groups to create
multiple new tags or fields respectively. To do so, _all_ capture groups have
to be named in the `pattern`. Additional non-capturing groups or other
expressions are allowed. Furthermore, neither `replacement` nor `result_key`
can be set, as the resulting tag/field name is the name of the group and the
value corresponds to the group's content.

### Tag and field _name_ conversions

You can batch-rename tags and fields using the `tag_rename` and `field_rename`
sections. Again, multiple sections can be specified. Contrary to the `tags` and
`fields` sections, the rename operates on the tag or field _name_, not its
_value_.

A tag or field is renamed if the given `pattern` matches the name. The new name
is specified via the `replacement` option. Optionally, the `result_key` can be
set to either `overwrite` (default) or `keep` to control the behavior in case
the target tag/field already exists. With `overwrite`, the target tag/field is
replaced by the source tag/field and the source is removed in any case. With
`keep`, both the target and the source tag/field are left unchanged and no
renaming takes place.

### Metric _name_ conversions

Similar to the tag and field renaming, `metric_rename` section(s) can be used
to rename metrics matching the given `pattern`. The resulting metric name is
given via the `replacement` option. If the metric name matches the `pattern`,
the conversion is always applied. The `result_key` option has no effect on
metric renaming and should not be specified.
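
As a rough illustration of how such a pattern behaves on metric names, the
sketch below uses Go's `regexp` package directly. It is not the plugin's
code; `renameMetric` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// renameMetric is a hypothetical helper (not part of the plugin). It returns
// the new metric name, or the original name if the pattern does not match.
func renameMetric(name, pattern, replacement string) string {
	re := regexp.MustCompile(pattern)
	if !re.MatchString(name) {
		return name
	}
	return re.ReplaceAllString(name, replacement)
}

func main() {
	fmt.Println(renameMetric("nginx_requests", `^(\w+)_.*$`, "${1}")) // nginx

	// \w also matches underscores and + is greedy, so with several
	// underscores the group extends up to the last one.
	fmt.Println(renameMetric("a_b_c", `^(\w+)_.*$`, "${1}")) // a_b

	// No underscore means no match, so the name stays as it is.
	fmt.Println(renameMetric("cpu", `^(\w+)_.*$`, "${1}")) // cpu
}
```

If only the part before the first underscore is wanted, a narrower character
class such as `[^_]+` instead of `\w+` is one option.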

## Tags

No tags are applied by this processor.

## Example

In the following examples we are using this metric:

```text
nginx_requests,verb=GET,resp_code=200 request="/api/search/?category=plugins&q=regex&sort=asc",referrer="-",ident="-",http_version=1.1,agent="UserAgent",client_ip="127.0.0.1",auth="-",resp_bytes=270i 1519652321000000000
```

### Explicit specification

```toml
[[processors.regex]]
namepass = ["nginx_requests"]

[[processors.regex.tags]]
key = "resp_code"
pattern = "^(\\d)\\d\\d$"
replacement = "${1}xx"

[[processors.regex.fields]]
key = "request"
pattern = "^/api(?P<method>/[\\w/]+)\\S*"
replacement = "${method}"
result_key = "method"

[[processors.regex.fields]]
key = "request"
pattern = ".*category=(\\w+).*"
replacement = "${1}"
result_key = "search_category"

[[processors.regex.field_rename]]
pattern = "^client_(\\w+)$"
replacement = "${1}"
```

will result in

```text
nginx_requests,verb=GET,resp_code=2xx request="/api/search/?category=plugins&q=regex&sort=asc",method="/search/",search_category="plugins",referrer="-",ident="-",http_version=1.1,agent="UserAgent",ip="127.0.0.1",auth="-",resp_bytes=270i 1519652321000000000
```

### Appending

```toml
[[processors.regex]]
namepass = ["nginx_requests"]

[[processors.regex.tags]]
key = "resp_code"
pattern = '^2\d\d$'
replacement = " OK"
result_key = "verb"
append = true
```

will result in

```text
nginx_requests,verb=GET\ OK,resp_code=200 request="/api/search/?category=plugins&q=regex&sort=asc",referrer="-",ident="-",http_version=1.1,agent="UserAgent",client_ip="127.0.0.1",auth="-",resp_bytes=270i 1519652321000000000
```

### Named groups

```toml
[[processors.regex]]
namepass = ["nginx_requests"]

[[processors.regex.fields]]
key = "request"
pattern = '^/api/(?P<method>\w+)[/?].*category=(?P<category>\w+)&(?:.*)'
```

will result in

```text
nginx_requests,verb=GET,resp_code=200 request="/api/search/?category=plugins&q=regex&sort=asc",method="search",category="plugins",referrer="-",ident="-",http_version=1.1,agent="UserAgent",client_ip="127.0.0.1",auth="-",resp_bytes=270i 1519652321000000000
```

### Metric renaming

```toml
[[processors.regex]]
[[processors.regex.metric_rename]]
pattern = '^(\w+)_.*$'
replacement = "${1}"
```

will result in

```text
nginx_requests,verb=GET,resp_code=2xx request="/api/search/?category=plugins&q=regex&sort=asc",method="/search/",category="plugins",referrer="-",ident="-",http_version=1.1,agent="UserAgent",client_ip="127.0.0.1",auth="-",resp_bytes=270i 1519652321000000000
nginx,verb=GET,resp_code=200 request="/api/search/?category=plugins&q=regex&sort=asc",referrer="-",ident="-",http_version=1.1,agent="UserAgent",client_ip="127.0.0.1",auth="-",resp_bytes=270i 1519652321000000000
```
1 change: 1 addition & 0 deletions plugins/processors/regex/converter.go
@@ -139,6 +139,7 @@ func (c *converter) applyFields(m telegraf.Metric) {
continue
}

// Handle explicit replacements
newKey := field.Key
if c.ResultKey != "" {
newKey = c.ResultKey
2 changes: 1 addition & 1 deletion plugins/processors/regex/regex_test.go
@@ -776,7 +776,7 @@ func TestNamedGroups(t *testing.T) {
Fields: []converter{
{
Key: "request",
Pattern: `^/api/(?P<method>[\w/]+)/.*category=(?P<search_category>\w+).*`,
Pattern: `^/api/(?P<method>\w+)[/?].*category=(?P<search_category>\w+)&(?:.*)`,
},
},
Log: testutil.Logger{},