Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include_fields processor does not work with dotted keys #5916

Closed
wants to merge 3 commits into from

Conversation

ruflin
Copy link
Contributor

@ruflin ruflin commented Dec 19, 2017

The include_fields processor does not work in case the key contains a dot. See tests added here for this case.

There are different solutions here:

  • Modify GetValue in common.MapStr to also look for keys with dots inside instead of directly walking the tree
  • Modifying jolokia fetcher to convert dot keys to objects

The first change would be possible with adding

--- a/libbeat/common/mapstr.go
+++ b/libbeat/common/mapstr.go
@@ -296,6 +296,10 @@ func walkMap(key string, data MapStr, op mapStrOperation) (interface{}, error) {
        var err error
        keyParts := strings.Split(key, ".")

+       if v, exists := data[key]; exists {
+               return v, nil
+       }
+

But that does not fully solve the problem. As in the processor the following code is used, the fields which are added after filtering to the filtered fields, will have a nesting without the docs (see Put method)

	for _, field := range f.Fields {
		v, err := event.GetValue(field)
		if err == nil {
			_, err = filtered.Put(field, v)
		}

		// Ignore ErrKeyNotFound errors
		if err != nil && errors.Cause(err) != common.ErrKeyNotFound {
			errs = append(errs, err.Error())
		}
	}

This Put would also cause other issues as we see in the event below. If it's convert to an object classes and the namespaces classes.loaded would conflict because one is an object and the other one isn't. I would have expected that this already causes an issue on the Elasticsearch side before filtering.

"jolokia":{
   "metrics":{
      "Metrics":{
         "atomikos.nbTransactions":0.000000,
         "classes":18857.000000,
         "classes.loaded":19127.000000,
         "classes.unloaded":270.000000,
         "counter.servo.discoveryclient-httpclient_createnew":13542.000000,
         "counter.servo.discoveryclient-httpclient_delete":13534.000000,
         "uptime":406037194.000000
      },
      "java":{
         "Threading":{
            "DaemonThreadCount":37.000000,
            "ThreadCount":57.000000
         }
      }
   }
}

An alternative solution to the above is replacing the . in the names with _ as we do for example for docker labels and other places with a similar issue. The downside of this is that filtering on fields is not straight forward as the names change between fetching from jolokia and in ES.

See https://discuss.elastic.co/t/include-fields-problem/109809

@ruflin ruflin added :Processors in progress Pull request is currently in progress. libbeat labels Dec 19, 2017
@andrewkroh
Copy link
Member

I would have preferred "modifying jolokia fetcher to convert dot keys to objects" if it weren't for the data being problematic as you have shown.

We could write separate functions to Get/Put that work on keys containing dots. The Get would be more expensive. Get(key string) could try to find a key by successively concatenating keys. Put(key []string, v interface{} would need to explicitly specify the keys.

For example, with Get("jolokia.metrics.Metrics.atomikos.nbTransactions"):

// try map["atomikos"] and if not found then:
// try map["atomikos.nbTransactions"]

@ruflin
Copy link
Contributor Author

ruflin commented Dec 20, 2017

I did some testing on the ES side an the following even with dots will no index in newer versions of Elasticsearch:

         "classes":18857.000000,
         "classes.loaded":19127.000000,
         "classes.unloaded":270.000000,

The reason is that Elasticsearch does the "dedotting" and converstion to an object on its side. That means if someone wants to index the above it is required on our side to replace . with _ or use the trick we do in Metricbeat that it would be classes.value if it is both. But I feel like that adds too much magic.

@ruflin
Copy link
Contributor Author

ruflin commented Dec 26, 2017

Here is a more general issue to discuss this: #5942

@ruflin
Copy link
Contributor Author

ruflin commented Dec 26, 2017

I opened the PR here which dedots the keys and replaces them with _: #5957

@ruflin
Copy link
Contributor Author

ruflin commented Jan 4, 2018

@andrewkroh I'm undecided on what we should do with this PR. The trade off is that the initial document and the created document are not 100% the same as Put is used. But that should be ok.

ruflin added a commit to ruflin/beats that referenced this pull request Jan 4, 2018
This should prevent name collisions and allow filtering on these fields. The feature is enabled by default and cannot be disable as otherwise ingestion could stop as Elasticsearch is returning an error. Currently not config option is provided as this should not be disabled.

* Introduces DeDot function in common package
* Adds tests to jmx for new functionality
* Use new function also for docker labels

Related to elastic#5942
Closes elastic#5916
andrewkroh pushed a commit that referenced this pull request Jan 5, 2018
This should prevent name collisions and allow filtering on these fields. The feature is enabled by default and cannot be disable as otherwise ingestion could stop as Elasticsearch is returning an error. Currently not config option is provided as this should not be disabled.

* Introduces DeDot function in common package
* Adds tests to jmx for new functionality
* Use new function also for docker labels

Related to #5942
Closes #5916
@ruflin ruflin reopened this Jan 9, 2018
@ruflin
Copy link
Contributor Author

ruflin commented Jan 9, 2018

Reopening this PR as I think the automatically closing was unintentional.

@andrewkroh
Copy link
Member

Related: #2921

@andrewkroh
Copy link
Member

Maybe should have a way to escape the dots? Or could offer a way to specify keys similar to Logstash like [metrics][atomikos.nbTransactions]?

@ruflin
Copy link
Contributor Author

ruflin commented Jan 12, 2018

Also looking at #2921 the solution in this PR would work for these cases. The only issue with this solution is that the document looks different before and after the processing. But as from a ES perspective a.b: {2} is the same as a: { b: {2}} anyways, I don't think this is an issue.

If we go with the above change, dot notation can be used in all processors as the walkMap function is changed.

The include_fields processor does not work in case the key contains a dot. See tests added here for this case.

There are different solutions here:

* Modify `GetValue` in common.MapStr to also look for keys with dots inside instead of directly walking the tree
* Modifying jolokia fetcher to convert dot keys to objects

The first change would be possible with adding

```
--- a/libbeat/common/mapstr.go
+++ b/libbeat/common/mapstr.go
@@ -296,6 +296,10 @@ func walkMap(key string, data MapStr, op mapStrOperation) (interface{}, error) {
        var err error
        keyParts := strings.Split(key, ".")

+       if v, exists := data[key]; exists {
+               return v, nil
+       }
+
```

But that does not fully solve the problem. As in the processor the following code is used, the fields which are added after filtering to the filtered fields, will have a nesting without the docs (see Put method)

```
	for _, field := range f.Fields {
		v, err := event.GetValue(field)
		if err == nil {
			_, err = filtered.Put(field, v)
		}

		// Ignore ErrKeyNotFound errors
		if err != nil && errors.Cause(err) != common.ErrKeyNotFound {
			errs = append(errs, err.Error())
		}
	}
```

This Put would also cause other issues as we see in the event below. If it's convert to an object `classes` and the namespaces `classes.loaded` would conflict because one is an object and the other one isn't. I would have expected that this already causes an issue on the Elasticsearch side before filtering.

```
"jolokia":{
   "metrics":{
      "Metrics":{
         "atomikos.nbTransactions":0.000000,
         "classes":18857.000000,
         "classes.loaded":19127.000000,
         "classes.unloaded":270.000000,
         "counter.servo.discoveryclient-httpclient_createnew":13542.000000,
         "counter.servo.discoveryclient-httpclient_delete":13534.000000,
         "uptime":406037194.000000
      },
      "java":{
         "Threading":{
            "DaemonThreadCount":37.000000,
            "ThreadCount":57.000000
         }
      }
   }
}
```

An alternative solution to the above is replacing the `.` in the names with `_` as we do for example for docker labels and other places with a similar issue. The downside of this is that filtering on fields is not straight forward as the names change between fetching from jolokia and in ES.

See https://discuss.elastic.co/t/include-fields-problem/109809
@ruflin
Copy link
Contributor Author

ruflin commented Feb 12, 2018

@andrewkroh Working on #6292 I realised this feature is also needed there. Ok to get this change in?

@ruflin ruflin added review and removed in progress Pull request is currently in progress. labels Feb 12, 2018
@ruflin ruflin mentioned this pull request Feb 18, 2018
Benchmark old to new walkMap

Recursive WalkMap
```
BenchmarkWalkMap/Get-4            500000              3032 ns/op
BenchmarkWalkMap/Put-4            300000              4131 ns/op
BenchmarkWalkMap/HasKey-4         300000              4080 ns/op
BenchmarkWalkMap/Delete-4         300000              4353 ns/op
```

Non recursive WalkMap
```
BenchmarkWalkMap/Get-4            500000              2890 ns/op
BenchmarkWalkMap/Put-4            300000              4426 ns/op
BenchmarkWalkMap/HasKey-4         300000              4025 ns/op
BenchmarkWalkMap/Delete-4         300000              4568 ns/op
```

I did multiple runs and there was no clear winner. They seem to be very similar in execution time.
@ruflin
Copy link
Contributor Author

ruflin commented Feb 19, 2018

I'm closing this in favor of #6406

@ruflin ruflin closed this Feb 19, 2018
ruflin added a commit to ruflin/beats that referenced this pull request Feb 19, 2018
Currently if a dotted key is passed to walkmap it is split up in it's part and the map is traversed for each part. In case an event has a dotted key inside walkMap does not return the expected value. For the Event `{"a.b": 1}` the method `GetValue("a.b")` has previously returned `ErrKeyNotExists` but now returns 0.

With this change all processors which rely on Get/Put functionalities of MapStr can now also work with events containing dotted keys.

**Changes**

* The walkMap function was rewritten to a recursive function. This made adding the new functionality easier.
* Add tests for GetValue to make possible usages more clear
* Add Benchmark tests for walkMap use cases

**Limitations**

When using `walkMap` with a Key, either the full key has to be dotted or the last part. `GetValue("a.b.c")` on an event `{"a.b":{"c":17}}` does not work as only the first part of the key or the full key are checked against the map.

Benchmark old to new walkMap:

```
go test  -bench=BenchmarkWalkMap -run=BenchmarkWalkMap
```

Recursive:
```
BenchmarkWalkMap/Get-4           1000000              1274 ns/op
BenchmarkWalkMap/Put-4           1000000              1277 ns/op
BenchmarkWalkMap/HasKey-4        2000000               880 ns/op
BenchmarkWalkMap/HasKeyFirst-4  20000000               111 ns/op
BenchmarkWalkMap/Delete-4        1000000              1313 ns/op
```

Iterative:
```
BenchmarkWalkMap/Get-4            500000              2608 ns/op
BenchmarkWalkMap/Put-4           1000000              1140 ns/op
BenchmarkWalkMap/HasKey-4        3000000               411 ns/op
BenchmarkWalkMap/HasKeyFirst-4  20000000                85.9 ns/op
BenchmarkWalkMap/Delete-4        1000000              1008 ns/op
```

The benchmark show a difference in the code before and after. The `GetValue` function has become quite a bit more efficient. The `HasKey` is now less efficient. I still need to investigate in detail where the differences come from but the assumption is it comes from the additional functionality.

This change was initially triggered by elastic#5916 but is split out into a separate PR because it has usage in multiple places.
andrewkroh pushed a commit that referenced this pull request Feb 20, 2018
Currently if a dotted key is passed to walkmap it is split up in it's part and the map is traversed for each part. In case an event has a dotted key inside walkMap does not return the expected value. For the Event `{"a.b": 1}` the method `GetValue("a.b")` has previously returned `ErrKeyNotExists` but now returns 0.

With this change all processors which rely on Get/Put functionalities of MapStr can now also work with events containing dotted keys.

**Changes**

* The walkMap function was rewritten to a recursive function. This made adding the new functionality easier.
* Add tests for GetValue to make possible usages more clear
* Add Benchmark tests for walkMap use cases

**Limitations**

When using `walkMap` with a Key, either the full key has to be dotted or the last part. `GetValue("a.b.c")` on an event `{"a.b":{"c":17}}` does not work as only the first part of the key or the full key are checked against the map.

Benchmark old to new walkMap:

```
go test  -bench=BenchmarkWalkMap -run=BenchmarkWalkMap
```

Recursive
```
BenchmarkWalkMap/Get-4           1000000              1334 ns/op             432 B/op          7 allocs/op
BenchmarkWalkMap/Put-4           1000000              1273 ns/op            1120 B/op         10 allocs/op
BenchmarkWalkMap/HasKey-4        2000000               690 ns/op             160 B/op          5 allocs/op
BenchmarkWalkMap/HasKeyFirst-4  50000000                28.8 ns/op             0 B/op          0 allocs/op
BenchmarkWalkMap/Delete-4        1000000              1431 ns/op            1120 B/op         10 allocs/op
```

Iterative
```
BenchmarkWalkMap/Get-4            500000              2577 ns/op             824 B/op         13 allocs/op
BenchmarkWalkMap/Put-4           1000000              1052 ns/op            1072 B/op          8 allocs/op
BenchmarkWalkMap/HasKey-4        5000000               381 ns/op              96 B/op          2 allocs/op
BenchmarkWalkMap/HasKeyFirst-4  20000000                89.6 ns/op            16 B/op          1 allocs/op
BenchmarkWalkMap/Delete-4        1000000              1083 ns/op            1072 B/op          8 allocs/op
```

The benchmark show a difference in the code before and after. The `GetValue` function has become quite a bit more efficient. The `HasKey` is now less efficient. I still need to investigate in detail where the differences come from but the assumption is it comes from the additional functionality.

This change was initially triggered by #5916 but is split out into a separate PR because it has usage in multiple places.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants