Feature/3484 - set metric name with 'measurement' modifier for grok parser #4433

Merged: 56 commits, Aug 17, 2018
Showing changes from 48 commits
e12eced
input plugin that reads files each interval
maxunt Jun 21, 2018
08a11d7
change config file
maxunt Jun 21, 2018
9c4b522
tweak metric output
maxunt Jun 21, 2018
4e24a1b
add grok as a top level parser
maxunt Jun 21, 2018
ec7f131
add more test files
maxunt Jun 21, 2018
504d978
clean up some test cases
maxunt Jun 21, 2018
542c030
knock more errors from test files
maxunt Jun 21, 2018
554b960
add setparser to reader
maxunt Jun 25, 2018
36a23ea
Merge branch 'master' into plugin/reader
maxunt Jun 25, 2018
f40371e
add init function to reader
maxunt Jun 25, 2018
9c84595
add grok as a top level parser, still need README
maxunt Jun 25, 2018
cc40629
allow for import from plugins/all
maxunt Jun 25, 2018
79d9ea4
add docker-image spin up for reader
maxunt Jun 26, 2018
bbd68b3
docker will spin up
maxunt Jun 26, 2018
bf7220d
add test file to docker spin up
maxunt Jun 26, 2018
a931eb1
update DATA_FORMATS_INPUT.MD to include grok
maxunt Jun 26, 2018
e450b26
remove comments
maxunt Jun 26, 2018
001658a
condense telegraf.conf
maxunt Jun 26, 2018
7fa27f4
more condensing
maxunt Jun 26, 2018
1be2a8e
Formatting and revert Makefile
glinton Jun 26, 2018
aa750ec
add reader README.md
maxunt Jun 27, 2018
892c95a
update readmes
maxunt Jun 27, 2018
04f09d6
grok parser func unexported
maxunt Jun 28, 2018
8063b38
address some of Daniel's comments
maxunt Jul 3, 2018
bfc13a7
incomplete changes to logparser plugin
maxunt Jul 3, 2018
67db143
still unfinished logparser changes
maxunt Jul 3, 2018
8a9da28
logparser is linked to grok parser
maxunt Jul 6, 2018
cafa95e
logparser no longer uses seperate grok
maxunt Jul 6, 2018
c6087ab
add more unit tests to grok parser
maxunt Jul 6, 2018
e4b6f23
fix unit tests for grok parser
maxunt Jul 6, 2018
d224673
change logparser unit tests
maxunt Jul 9, 2018
f52ceeb
test files added for logparser
maxunt Jul 9, 2018
285cf0b
Merge branch 'master' into plugin/reader
maxunt Jul 12, 2018
0c3ac29
addresses daniel's comments
maxunt Jul 12, 2018
74900ed
change parser config names
maxunt Jul 12, 2018
d0f5389
allow for original config and functionality of logparser
maxunt Jul 12, 2018
b10f592
unfinished playing w grok parser
maxunt Jul 13, 2018
441bc41
add modifier for setting metric name for grok parser
maxunt Jul 16, 2018
d1e0c7c
Merge branch 'master' into feature/3484
maxunt Jul 16, 2018
b7ed886
unfinished config changes
maxunt Jul 17, 2018
903a977
additional test cases and README updated
maxunt Jul 17, 2018
0040530
address greg's comments
maxunt Jul 27, 2018
054c20e
fix a unit test
maxunt Jul 27, 2018
0e5e115
whips...
maxunt Jul 27, 2018
e3d9ca0
Merge branch 'master' into feature/3484
maxunt Jul 27, 2018
1b8ce4a
addresses comments and merges with master
maxunt Jul 27, 2018
797f9bd
remove reader directory
maxunt Jul 27, 2018
255e596
remove reader from all.go
maxunt Jul 27, 2018
6d49188
Merge branch 'master' into feature/3484 and change readme
maxunt Aug 16, 2018
34075e3
readme changes
maxunt Aug 16, 2018
a246c11
breaking stuff
maxunt Aug 16, 2018
4ae64bd
get rid of static measurement names, only dynamic
maxunt Aug 17, 2018
c019cfa
no longer accepts no semantic name
maxunt Aug 17, 2018
7e20044
Fix documentation
danielnelson Aug 17, 2018
0be9d85
remove more support for no semantic name
maxunt Aug 17, 2018
5b1bbbd
small fix
maxunt Aug 17, 2018
23 changes: 23 additions & 0 deletions docs/DATA_FORMATS_INPUT.md
@@ -663,6 +663,28 @@ For more information about the dropwizard json format see
#### Grok
Parse logstash-style "grok" patterns. Patterns can be added to patterns, or custom patterns read from custom_pattern_files.

Modifiers can be appended to the end of a grok field to specify how that field should be handled.
There are also timestamp modifiers, which can be used to specify the format of time data.
Available modifiers can be found below.

The 'measurement' modifier has two separate use cases: one for static measurement names and one for
dynamic measurement names.

To set a static measurement name, apply the 'measurement' modifier to a single pattern in the
'patterns' field. If grok matches the pattern, the measurement name is changed to the specified name.
For example, the config `patterns = ["%{TEST:test_name:measurement}"]` outputs a metric named "test_name" if grok
matches the pattern. Specify only one pattern per element of the 'patterns' array when using this modifier;
otherwise an error is returned.
For example, the config `patterns = ["%{TEST:test_name:measurement}|%{TEST2:test2_name:measurement}"]` must be changed
to `patterns = ["%{TEST:test_name:measurement}","%{TEST2:test2_name:measurement}"]`.

To set a dynamic measurement name, apply the 'measurement' modifier to a value in a custom pattern.
If the pattern is matched, the measurement name is set to the value of the field the modifier was applied to.
Each pattern should have only one 'measurement' modifier. The modifier should be applied only to fields
that capture a single value, not to another grok pattern. The name of the field carrying the measurement modifier
is ignored, so these two formats are equivalent: `custom_patterns = {"TEST %{WORD:ignored_name:measurement}"}` and
`custom_patterns = {"TEST %{WORD::measurement}"}`
Contributor

I think the patterns should behave the same regardless of if it is in the pattern or custom_pattern. Can we just say %{WORD::measurement} will use the matched text and %{TEST:test_name:measurement} will use the static value test_name?

Contributor Author

Yea I think that should be made clear. The issue with that would be if people use the dynamic name on an entire pattern, ie not a single field, it would add the text of the whole pattern as a metric name.
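
For reference, the one-pattern-per-element rule in the documentation under review corresponds to two string checks in `Compile()` in `plugins/parsers/grok/parser.go` in this diff. The following is a minimal sketch of those checks in isolation, not the parser itself. `classify` is a hypothetical helper name; the two conditions and the error text are copied from the diff.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// classify is a hypothetical helper that reproduces the two conditions used
// in Compile() in this diff. It reports whether a `patterns` entry is treated
// as a single field carrying modifiers, and returns the error raised when a
// measurement modifier appears inside a composite entry.
func classify(pattern string) (singleWithModifier bool, err error) {
	pattern = strings.TrimSpace(pattern)
	// Exactly one "%" and two ":" is how the diff detects %{NAME:semantic:modifier}.
	if strings.Count(pattern, "%") == 1 && strings.Count(pattern, ":") == 2 {
		return true, nil
	}
	if strings.Contains(pattern, ":measurement}") {
		return false, errors.New("pattern with measurement modifier must have own 'pattern' field")
	}
	return false, nil
}

func main() {
	for _, p := range []string{
		"%{TEST:test_name:measurement}",
		"%{TEST}",
		"%{TEST:test_name:measurement}|%{TEST2:test2_name:measurement}",
	} {
		ok, err := classify(p)
		fmt.Printf("%-62s single=%v err=%v\n", p, ok, err)
	}
}
```

Only the first entry is treated as a single field with modifiers. The third entry is rejected, which matches the expectation in `TestMeasurementErrors` in this pull request.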


# View logstash grok pattern docs here:
# https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
# All default logstash patterns are supported, these can be viewed here:
@@ -675,6 +697,7 @@ Parse logstash-style "grok" patterns. Patterns can be added to patterns, or cust
# duration (ie, 5.23ms gets converted to int nanoseconds)
# tag (converts the field into a tag)
# drop (drops the field completely)
# measurement (sets the metric name to designated field)
# Timestamp modifiers:
# ts-ansic ("Mon Jan _2 15:04:05 2006")
# ts-unix ("Mon Jan _2 15:04:05 MST 2006")
2 changes: 1 addition & 1 deletion plugins/inputs/file/README.md
@@ -14,7 +14,7 @@ use the [tail input plugin](/plugins/inputs/tail) instead.
## ** as a "super asterisk". ie:
## /var/log/**.log -> recursively find all .log files in /var/log
## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
- ## /var/log/apache.log -> only tail the apache log file
+ ## /var/log/apache.log -> only read the apache log file
files = ["/var/log/apache/access.log"]

## Data format to consume.
2 changes: 1 addition & 1 deletion plugins/inputs/file/dev/docker-compose.yml
@@ -6,7 +6,7 @@ services:
volumes:
- ./telegraf.conf:/telegraf.conf
- ../../../../telegraf:/telegraf
- - ./json_a.log:/var/log/test.log
+ - ./dev/json_a.log:/var/log/test.log
entrypoint:
- /telegraf
- --config
14 changes: 0 additions & 14 deletions plugins/inputs/file/dev/json_a.log

This file was deleted.

9 changes: 4 additions & 5 deletions plugins/inputs/file/file.go
@@ -11,9 +11,8 @@ import (
)

type File struct {
- Files []string `toml:"files"`
- FromBeginning bool
- parser parsers.Parser
+ Files []string `toml:"files"`
+ parser parsers.Parser

filenames []string
}
@@ -24,7 +23,7 @@ const sampleConfig = `
## ** as a "super asterisk". ie:
## /var/log/**.log -> recursively find all .log files in /var/log
## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
- ## /var/log/apache.log -> only tail the apache log file
+ ## /var/log/apache.log -> only read the apache log file
files = ["/var/log/apache/access.log"]
## The dataformat to be read from files
@@ -40,7 +39,7 @@ func (f *File) SampleConfig() string {
}

func (f *File) Description() string {
- return "reload and gather from file[s] on telegraf's interval"
+ return "Reload and gather from file[s] on telegraf's interval."
}

func (f *File) Gather(acc telegraf.Accumulator) error {
12 changes: 6 additions & 6 deletions plugins/inputs/file/file_test.go
@@ -14,26 +14,26 @@ import (
func TestRefreshFilePaths(t *testing.T) {
wd, err := os.Getwd()
r := File{
- Files: []string{filepath.Join(wd, "testfiles/**.log")},
+ Files: []string{filepath.Join(wd, "dev/testfiles/**.log")},
}

err = r.refreshFilePaths()
require.NoError(t, err)
- assert.Equal(t, len(r.filenames), 2)
+ assert.Equal(t, 2, len(r.filenames))
}
func TestJSONParserCompile(t *testing.T) {
var acc testutil.Accumulator
wd, _ := os.Getwd()
r := File{
- Files: []string{filepath.Join(wd, "testfiles/json_a.log")},
+ Files: []string{filepath.Join(wd, "dev/testfiles/json_a.log")},
}
parserConfig := parsers.Config{
DataFormat: "json",
TagKeys: []string{"parent_ignored_child"},
}
nParser, err := parsers.NewParser(&parserConfig)
- r.parser = nParser
assert.NoError(t, err)
+ r.parser = nParser

r.Gather(&acc)
assert.Equal(t, map[string]string{"parent_ignored_child": "hi"}, acc.Metrics[0].Tags)
@@ -44,7 +44,7 @@ func TestGrokParser(t *testing.T) {
wd, _ := os.Getwd()
var acc testutil.Accumulator
r := File{
- Files: []string{filepath.Join(wd, "testfiles/grok_a.log")},
+ Files: []string{filepath.Join(wd, "dev/testfiles/grok_a.log")},
}

parserConfig := parsers.Config{
@@ -57,5 +57,5 @@ func TestGrokParser(t *testing.T) {
assert.NoError(t, err)

err = r.Gather(&acc)
- assert.Equal(t, 2, len(acc.Metrics))
+ assert.Equal(t, len(acc.Metrics), 2)
}
2 changes: 1 addition & 1 deletion plugins/inputs/logparser/logparser.go
@@ -22,12 +22,12 @@ const (

// LogParser is the primary interface for the plugin
type GrokConfig struct {
- MeasurementName string `toml:"measurement"`
Patterns []string
NamedPatterns []string
CustomPatterns string
CustomPatternFiles []string
TimeZone string
+ MeasurementName string `toml:"measurement"`
Contributor

Revert change so there is less churn in the git history and to keep the real changes clear.

}

type logEntry struct {
2 changes: 1 addition & 1 deletion plugins/inputs/logparser/logparser_test.go
@@ -44,9 +44,9 @@ func TestGrokParseLogFiles(t *testing.T) {

logparser := &LogParserPlugin{
GrokConfig: GrokConfig{
- MeasurementName: "logparser_grok",
Patterns: []string{"%{TEST_LOG_A}", "%{TEST_LOG_B}"},
CustomPatternFiles: []string{thisdir + "grok/testdata/test-patterns"},
+ MeasurementName: "logparser_grok",
Contributor

Revert change

},
FromBeginning: true,
Files: []string{thisdir + "grok/testdata/*.log"},
59 changes: 51 additions & 8 deletions plugins/parsers/grok/parser.go
@@ -37,6 +37,7 @@ var timeLayouts = map[string]string{
}

const (
MEASUREMENT = "measurement"
INT = "int"
TAG = "tag"
FLOAT = "float"
@@ -56,7 +57,7 @@ var (
// %{IPORHOST:clientip:tag}
// %{HTTPDATE:ts1:ts-http}
// %{HTTPDATE:ts2:ts-"02 Jan 06 15:04"}
- modifierRe = regexp.MustCompile(`%{\w+:(\w+):(ts-".+"|t?s?-?\w+)}`)
+ modifierRe = regexp.MustCompile(`%{\w+:(\w+|):(ts-".+"|t?s?-?\w+)}`)
Contributor

Let's just require a name that isn't used, similar to how timestamp works. This will make the code even simpler and we won't need to explain the new form.
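
To see what the added empty alternative changes, the sketch below runs both versions of `modifierRe` from this hunk against a named and an unnamed field. The two expressions are copied from the diff; the variable names `oldModifierRe` and `newModifierRe` and the helper `captures` are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both expressions are copied from the diff; only the variable names are new.
var (
	oldModifierRe = regexp.MustCompile(`%{\w+:(\w+):(ts-".+"|t?s?-?\w+)}`)
	newModifierRe = regexp.MustCompile(`%{\w+:(\w+|):(ts-".+"|t?s?-?\w+)}`)
)

// captures is a hypothetical helper: it returns the semantic name and the
// modifier captured from a grok field, and ok=false when nothing matches.
func captures(re *regexp.Regexp, field string) (name, modifier string, ok bool) {
	m := re.FindStringSubmatch(field)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	for _, field := range []string{"%{WORD:var3:measurement}", "%{WORD::measurement}"} {
		_, _, oldOK := captures(oldModifierRe, field)
		name, mod, newOK := captures(newModifierRe, field)
		fmt.Printf("%-26s old=%v new=%v name=%q modifier=%q\n", field, oldOK, newOK, name, mod)
	}
}
```

Only the new expression matches `%{WORD::measurement}`, and it captures an empty name. Requiring a name, as suggested in the comment, would make the original expression sufficient.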

// matches a plain pattern name. ie, %{NUMBER}
patternOnlyRe = regexp.MustCompile(`%{(\w+)}`)
)
@@ -74,6 +75,9 @@ type Parser struct {
Measurement string
DefaultTags map[string]string

//holds any modifiers set on named user patterns
patternModifiers map[string][]string

// Timezone is an optional component to help render log dates to
// your chosen zone.
// Default: "" which renders UTC
@@ -126,6 +130,7 @@ func (p *Parser) Compile() error {
p.tsMap = make(map[string]map[string]string)
p.patterns = make(map[string]string)
p.tsModder = &tsModder{}
p.patternModifiers = make(map[string][]string)
var err error
p.g, err = grok.NewWithConfig(&grok.Config{NamedCapturesOnly: true})
if err != nil {
@@ -137,12 +142,32 @@ func (p *Parser) Compile() error {
p.NamedPatterns = make([]string, 0, len(p.Patterns))
for i, pattern := range p.Patterns {
pattern = strings.TrimSpace(pattern)
- if pattern == "" {
- continue

//checks that there is only one named field in pattern and 2 ':' indicating a modifier
//then extract any modifiers off pattern
if strings.Count(pattern, "%") == 1 && strings.Count(pattern, ":") == 2 {
pattern = strings.Trim(pattern, "%{}")
Contributor

This will leave a } (strings.Trim("%{doggy}:dog:perro", "%{}") returns doggy}:dog:perro), move this after SplitN and change it to trim splitPattern[0]

Contributor

Actually, i would just remove this .Trim and not re-add the %{ and } characters on line 158.

Contributor Author

so i was under the impression that this form: %{doggy}:dog:perro is not supported by the grok parser at all. It is necessary to trim the %{} characters because if not, the tag will be added as tag}.
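
The behaviour described in this thread can be checked directly with the standard library. Below is a small sketch that uses only `strings.Trim` and `strings.SplitN` as they appear in the hunk; `splitModifiers` is a hypothetical name, not a function in the parser.

```go
package main

import (
	"fmt"
	"strings"
)

// splitModifiers mirrors the two calls from the diff: strip any of the
// characters '%', '{' and '}' from both ends of the pattern, then split it
// into at most three ":"-separated parts. The function name is hypothetical.
func splitModifiers(pattern string) []string {
	trimmed := strings.Trim(pattern, "%{}")
	return strings.SplitN(trimmed, ":", 3)
}

func main() {
	// Modifiers inside the braces: both ends are trimmed cleanly.
	fmt.Printf("%q\n", splitModifiers("%{doggy:dog:perro}")) // ["doggy" "dog" "perro"]

	// Form from the review comment: the "}" is not at either end, so it stays.
	fmt.Printf("%q\n", splitModifiers("%{doggy}:dog:perro")) // ["doggy}" "dog" "perro"]
}
```

`strings.Trim` treats its second argument as a set of characters to remove from the ends only, which is why the interior `}` survives in the second case.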

splitPattern := strings.SplitN(pattern, ":", 3)
if pattern == "" {
continue
}
name := fmt.Sprintf("GROK_INTERNAL_PATTERN_%d", i)

//map pattern modifiers by name
p.patternModifiers["%{"+name+"}"] = splitPattern[1:3]
p.CustomPatterns += "\n" + name + " " + "%{" + splitPattern[0] + "}" + "\n"
p.NamedPatterns = append(p.NamedPatterns, "%{"+name+"}")
} else {
if strings.Contains(pattern, ":measurement}") {
return fmt.Errorf("pattern with measurement modifier must have own 'pattern' field")
}
if pattern == "" {
continue
}
name := fmt.Sprintf("GROK_INTERNAL_PATTERN_%d", i)
p.CustomPatterns += "\n" + name + " " + pattern + "\n"
p.NamedPatterns = append(p.NamedPatterns, "%{"+name+"}")
}
- name := fmt.Sprintf("GROK_INTERNAL_PATTERN_%d", i)
- p.CustomPatterns += "\n" + name + " " + pattern + "\n"
- p.NamedPatterns = append(p.NamedPatterns, "%{"+name+"}")
}

if len(p.NamedPatterns) == 0 {
@@ -213,7 +238,8 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {

timestamp := time.Now()
for k, v := range values {
- if k == "" || v == "" {
Contributor

still want this

+ if (k == "" || v == "") && p.typeMap[patternName][k] != "measurement" {
Contributor

remove

+ log.Printf("D! skipping key: %v", k)
continue
}

@@ -238,6 +264,8 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
}

switch t {
case MEASUREMENT:
p.Measurement = v
case INT:
iv, err := strconv.ParseInt(v, 10, 64)
if err != nil {
@@ -348,8 +376,17 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
}
}

//check the modifiers on the pattern
modifiers, ok := p.patternModifiers[patternName]
if ok && modifiers[1] == "measurement" {
if p.patternModifiers[patternName][0] == "" {
return nil, fmt.Errorf("pattern: %v must be named to use 'measurement' modifier", patternName)
}
p.Measurement = p.patternModifiers[patternName][0]
}

if len(fields) == 0 {
- return nil, fmt.Errorf("logparser_grok: must have one or more fields")
+ return nil, fmt.Errorf("grok: must have one or more fields")
}

return metric.New(p.Measurement, tags, fields, p.tsModder.tsMod(timestamp))
@@ -453,6 +490,12 @@ func (p *Parser) parseTypedCaptures(name, pattern string) (string, error) {
}
hasTimestamp = true
} else {
//for handling measurement tag with no name
Contributor

remove

if match[1] == "" && match[2] == "measurement" {
match[1] = "measurement_name"
//add "measurement_name" to pattern so it is valid grok
pattern = strings.Replace(pattern, "::measurement", ":measurement_name:measurement", 1)
}
p.typeMap[patternName][match[1]] = match[2]
}
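
The rewrite performed in this hunk for unnamed measurement fields is a plain string substitution. Below is a minimal sketch of just that step, with `nameUnnamedMeasurement` as a hypothetical wrapper around the `strings.Replace` call from the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// nameUnnamedMeasurement mirrors the strings.Replace call in the diff: the
// first "::measurement" is given the placeholder name "measurement_name" so
// that the field becomes a valid named grok capture.
// The function name is hypothetical.
func nameUnnamedMeasurement(pattern string) string {
	return strings.Replace(pattern, "::measurement", ":measurement_name:measurement", 1)
}

func main() {
	// Unnamed field: the placeholder name is inserted.
	fmt.Println(nameUnnamedMeasurement("%{NUMBER:var1:tag} %{WORD::measurement}"))
	// prints: %{NUMBER:var1:tag} %{WORD:measurement_name:measurement}

	// Already-named field: there is no "::" to replace, so it is unchanged.
	fmt.Println(nameUnnamedMeasurement("%{WORD:var3:measurement}"))
	// prints: %{WORD:var3:measurement}
}
```

Because the final argument is 1, only the first unnamed occurrence is rewritten, which is consistent with the documented limit of one 'measurement' modifier per pattern.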

73 changes: 73 additions & 0 deletions plugins/parsers/grok/parser_test.go
@@ -959,3 +959,76 @@ func TestReplaceTimestampComma(t *testing.T) {
// Convert nanoseconds to milliseconds for comparison
require.Equal(t, 555, m.Time().Nanosecond()/1000000)
}

func TestDynamicMeasurementModifier(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST}"},
CustomPatterns: "TEST %{NUMBER:var1:tag} %{NUMBER:var2:float} %{WORD:var3:measurement}",
}

require.NoError(t, p.Compile())
m, err := p.ParseLine("4 5 hello")
require.NoError(t, err)
require.Equal(t, m.Name(), "hello")
}

func TestStaticMeasurementModifier(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST:test_name:measurement}"},
CustomPatterns: "TEST %{NUMBER:var1:tag} %{NUMBER:var2:float} %{WORD:var3:string}",
}

require.NoError(t, p.Compile())
m, err := p.ParseLine("4 5 hello")
require.NoError(t, err)
require.Equal(t, m.Name(), "test_name")
}

func TestStaticAndDynamicMeasurementModifier(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST:test_name:measurement}"},
CustomPatterns: "TEST %{NUMBER:var1:tag} %{NUMBER:var2:float} %{WORD:var3:measurement}",
}

require.NoError(t, p.Compile())
m, err := p.ParseLine("4 5 hello")
require.NoError(t, err)
require.Equal(t, m.Name(), "test_name")
}

func TestMultipleMeasurementModifier(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST:test_name:measurement}", "%{TEST2:test2_name:measurement}"},
CustomPatterns: `TEST %{NUMBER:var1:tag} %{NUMBER:var2:float} %{WORD:var_string:string}
TEST2 %{WORD:stringer1:tag} %{NUMBER:var2:float} %{NUMBER:var3:float}`,
}

require.NoError(t, p.Compile())
m, err := p.ParseLine("4 5 hello")
// check the first parse error before err is reassigned below
require.NoError(t, err)
m2, err := p.ParseLine("mystr 5 9.5")
require.NoError(t, err)
require.Equal(t, m.Name(), "test_name")
require.Equal(t, m2.Name(), "test2_name")
}

func TestMeasurementModifierNoName(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST}"},
CustomPatterns: "TEST %{NUMBER:var1:tag} %{NUMBER:var2:float} %{WORD::measurement}",
}

require.NoError(t, p.Compile())
m, err := p.ParseLine("4 5 hello")
require.NoError(t, err)
require.Equal(t, m.Name(), "hello")
}

func TestMeasurementErrors(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST:test_name:measurement}|%{TEST2:test2_name}"},
CustomPatterns: `TEST %{NUMBER:var1:tag} %{NUMBER:var2:float} %{WORD:var_string:string}
TEST2 %{WORD:stringer1:tag} %{NUMBER:var2:float} %{NUMBER:var3:float}`,
}
err := p.Compile()
require.Error(t, err)
}