Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create (X-Pack Monitoring) stats metricset for Kibana module #7525

Merged
merged 66 commits into from
Jul 19, 2018

Conversation

ycombinator
Copy link
Contributor

@ycombinator ycombinator commented Jul 6, 2018

This PR takes the stats metricset of the kibana Metricbeat module and makes it ship documents to .monitoring-kibana-6-mb-%{YYYY.MM.DD} indices, while preserving the current format/mapping expected by docs in these indices. This will ensure that current consumers of the data in these indices, viz. the X-Pack Monitoring UI and the Telemetry shipping module in Kibana, will continue to work as-is.

To test this PR

The idea is that metricbeat (specifically the kibana/stats metricset with xpack.enabled: true) will create exactly the same documents in .monitoring-kibana-* indices as Kibana's bulk uploader does today.

  1. Start up Elasticsearch (with the default Basic license is okay)

  2. Start up Kibana (with the default Basic license is okay), checked out from master.

  3. Enable Monitoring in Elasticsearch (via the cluster setting xpack.monitoring.collection.enabled: true). You can do this via Elasticsearch's Cluster Update Settings API or by clicking the "Turn on Monitoring" button in the Monitoring UI in Kibana.

  4. Let Kibana run for ~20 seconds so a few documents are indexed into .monitoring-kibana-*.

  5. Stop Kibana

  6. From .monitoring-kibana-*, retrieve a document each for type = kibana_stats

  7. Delete the .monitoring-kibana-* indices.

  8. In kibana.yml, set xpack.monitoring.kibana.collection.enabled: false. This will ensure that Kibana is no longer directly shipping its monitoring metrics to Elasticsearch.

  9. Start up Kibana again, this time checked out from Stats API: implement the "kibana status" spec from the Monitoring data model for stats kibana#20577.

  10. Enable the kibana module in metricbeat: ./metricbeat modules enable kibana.

  11. In modules.d/kibana.yml, add the stats metricset and set xpack.enabled: true. Concretely, your modules.d/kibana.yml should look something like this:

    - module: kibana
       metricsets:
       - stats
       period: 10s
       hosts: ["localhost:5601"]
       #basepath: ""
       #username: "user"
       #password: "secret"
       xpack.enabled: true

    Additionally, if your Kibana URLs have a basepath prefix, say foo, make sure to un-comment the basepath setting in the kibana module YAML file and set it's value to "foo".

  12. Start metricbeat.

  13. Let metricbeat run for ~20 seconds so a few documents are indexed into .monitoring-kibana-*.

  14. Stop metricbeat

  15. From .monitoring-kibana-*, retrieve a document each for type = kibana_stats.

  16. Using a tool such as http://www.jsondiff.com, compare the documents indexed by Kibana's with those indexed by metricbeat. Verify that their structures are identical (same fields, not necessarily same values), except for these known and expected differences:

    1. Only Metricbeat-indexed documents are expected to contain the fields @timestamp, beat, host, and metricset. These are "standard" fields added by beats and metricbeat and don't have an adverse impact since they are additive.
    2. Only Kibana-indexed documents are expected to contain the field source_node. This field is used for debugging purposes only and not actually consumed by either the Monitoring UI or Telemetry feature in Kibana.
    3. Only Kibana-indexed documents are expected to contain the field requests.status_codes. This field is not currently being used by the Monitoring UI and is, in fact, not being indexed into Elasticsearch with the right mapping. So we're not going to index it going forward (via Metricbeat), until we need it in the UI.

Out of scope for this PR

These items are related to this PR but will be covered in follow up PRs:

  • This PR takes care of kibana_stats documents in .monitoring-kibana-6-* indices. Similarly, a future PR needs to take care of kibana_settings documents.
  • Once both kibana_stats and kibana_settings documents are being indexed by Metricbeat's kibana module (with xpack.enabled: true), we need to document how users can switch over from Kibana to Metricbeat for collecting and shipping Kibana monitoring metrics into ES.

@ycombinator ycombinator added in progress Pull request is currently in progress. module Metricbeat Metricbeat v7.0.0-alpha1 labels Jul 6, 2018
@ycombinator ycombinator self-assigned this Jul 6, 2018
@ycombinator ycombinator changed the title [WIP] Starting to create (X-Pack Monitoring) stats metricset for Kibana module [WIP] Create (X-Pack Monitoring) stats metricset for Kibana module Jul 6, 2018
}),
"uptime_in_millis": c.Int("uptime_ms"),
}),
"requests": RequestsDict),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing ',' in composite literal (and 10 more errors)

@ycombinator ycombinator force-pushed the metricbeat/monitoring-kibana branch from 00f764f to adddaf4 Compare July 11, 2018 00:10
schemaXPackMonitoring = s.Schema{
"concurrent_connections": c.Int("concurrent_connections"),
"os": c.Dict("os", s.Schema{
"load": c.Dict("cpu.load_average", s.Schema{
Copy link
Contributor Author

@ycombinator ycombinator Jul 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin @jsoriano The unit test for this file is failing because of the usage of cpu.load_average here. The failure message says:

1 error: key `os.cpu.load_average` not found

I think it doesn't like the fact that the key is nested (cpu.load_average).

Any ideas how I could achieve what I'm trying to over here? I want the resulting schema to look like this (because that's how X-Pack Monitoring documents look like today):

"os": {
  "load": {
    "1m": 12345,
    "5m": 23456,
    "15m": 34567
  }
}

But the corresponding source data looks like this:

"os": {
  "cpu": {
    "load_average": {
      "1m": 12345,
      "5m": 23456,
      "15m": 34567,
    }
  }
}

Note that the source data has an additional level of nesting than what I want in the resulting schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator try to use the nested notation in c.Float(), something like this:

"load": s.Object{
  "1m": c.Float("cpu.load_average.1m"),
  ... 
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jsoriano. I tried that but it didn't work either:

=== RUN   TestEventMappingXPack
--- FAIL: TestEventMappingXPack (0.00s)
        Error Trace:    data_xpack_test.go:43
        Error:          Received unexpected error:
                        3 errors: key `os.cpu.load_average.1m` not found; key `os.cpu.load_average.5m` not found; key `os.cpu.load_average.15m` not found
        Test:           TestEventMappingXPack
        Messages:       _meta/test/stats.700.json
        Error Trace:    data_xpack_test.go:44
        Error:          Should be true
        Test:           TestEventMappingXPack
        Messages:       _meta/test/stats.700.json
        Error Trace:    data_xpack_test.go:45
        Error:          Not equal:
                        expected: 0
                        actual  : 1
        Test:           TestEventMappingXPack
        Messages:       _meta/test/stats.700.json

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been taking a look to the behaviour of nested fields in schema and I have opened a PR for discussion (#7583).

The thing is that for some types (string and interface{}) the map is converted to common.MapStr, and then the value is obtained with GetValue(), what works with nested field paths. For the rest of types, as dicts and floats the map is used as a normal map and then it cannot work with nested values. I'm not sure if this is the intended behaviour... but we should probably unify behaviours.

@ycombinator ycombinator force-pushed the metricbeat/monitoring-kibana branch from 4bc6835 to 6f9c814 Compare July 13, 2018 16:51
@ycombinator ycombinator changed the title [WIP] Create (X-Pack Monitoring) stats metricset for Kibana module Create (X-Pack Monitoring) stats metricset for Kibana module Jul 16, 2018
@ycombinator ycombinator added review and removed in progress Pull request is currently in progress. labels Jul 16, 2018
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM. Could you generate a new data.json? This should make it easier to comment on the generated structure.

I wonder if we really should add all these metrics or start with a subset and check what we really need. It's easy to add metrics but hard to remove them.

Could you also add service.name: kibana to the non xpack event?

"event_loop_delay": s.Object{
"ms": c.Float("event_loop_delay"),
},
"memory": c.Dict("memory", s.Schema{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking long term: Do we need to report this info or is it up to the system module to report this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think system module. I can remove it from the non-xpack-monitoring event here. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

"uptime": s.Object{
"ms": c.Int("uptime_ms"),
},
}),
"requests": RequestsDict,
"response_times": c.Dict("response_times", s.Schema{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I would make this singular.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6357e1090766a02d6e5360c483b3128bb5804f04.

}

// RequestsDict defines how to convert the requests field
RequestsDict = c.Dict("requests", s.Schema{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, would make it singular as disconnects is already plural.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6357e1090766a02d6e5360c483b3128bb5804f04.

@@ -71,5 +71,21 @@ func (m *MetricSet) Fetch(r mb.ReporterV2) {
return
}

eventMapping(r, content)
config := struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be inside new and not part of fetch.

}

if config.XPackEnabled {
cfgwarn.Experimental("The experimental xpack.enabled flag in kibana/stats metricset is enabled.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logging should be part of New otherwise it's logged very time.

@ycombinator
Copy link
Contributor Author

ycombinator commented Jul 17, 2018

I wonder if we really should add all these metrics or start with a subset and check what we really need. It's easy to add metrics but hard to remove them.

I'm +1 to starting with a minimal set for the reason you mentioned. Question is how to decide what we really need?

If I had to take a guess at this, I'd say we remove the usage section altogether. That would leave things like process metrics (node.js heap, etc.), request counts, response times, and kibana instance metadata.

WDYT?

Could you also add service.name: kibana to the non xpack event?

Yes, will do. Thanks for catching.

[EDIT] Looks like this is already being done here: https://github.com/elastic/beats/pull/7525/files#diff-57317d7f333ce418ec60eea18ce13bd6L104

@ruflin
Copy link
Member

ruflin commented Jul 17, 2018

What to add and what not is the tricky question that bugs us since day one of Metricbeat. The good news is that by now with processors we allow the users to also drop fields they don't need.

For the metrics I would ask the following questions to answer if we should keep them or not:

  • If something is wrong with Kibana, will these metrics be valuable?
  • Are these metrics of interest for the users when everything is green?

There is a lot of gray area and I suggest just pick the ones that you think are best for now.

@ycombinator
Copy link
Contributor Author

For the metrics I would ask the following questions to answer if we should keep them or not:

  • If something is wrong with Kibana, will these metrics be valuable?
  • Are these metrics of interest for the users when everything is green?

Yeah, using similar guidelines I made my earlier suggestion:

I'd say we remove the usage section altogether. That would leave things like process metrics (node.js heap, etc.), request counts, response times, and kibana instance metadata.

I'll make the necessary changes to this PR. Thanks!

@ycombinator
Copy link
Contributor Author

ycombinator commented Jul 17, 2018

Could you generate a new data.json? This should make it easier to comment on the generated structure.

Done in 64903e7.

@ruflin I believe I have now addressed all the points from your review. Please take another 👀 whenever you get a chance. Thanks!

http, err := helper.NewHTTP(base)
if err != nil {
return nil, err
}

if config.BasePath != "" {
http.AddBasePath(config.BasePath)
Copy link
Contributor Author

@ycombinator ycombinator Jul 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin Since Kibana URIs can have an optional basepath, I'm allowing users of the kibana Metricbeat module to set it in their config YAML. If it is set, I insert it into the URI using http.AddBasePath as shown here. Please see my implementation of the AddBasePath method (elsewhere in this PR) as well.

WDYT - is this config setting the right approach? If yes, is the implementation of the AddBasePath method alright?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM. See my comment above if this is only for dev purpose and if yes, if the host config change could work.

@ycombinator ycombinator force-pushed the metricbeat/monitoring-kibana branch from 0e1cd0a to 2edc12d Compare July 18, 2018 00:16
@@ -351,6 +351,7 @@ metricbeat.modules:
metricsets: ["status"]
period: 10s
hosts: ["localhost:5601"]
basepath: ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the basepath mainly used for development? Would it also work the specify the host as localhost:5601/abc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basepath gets used in production scenarios too, where users deploy Kibana and also other applications behind the same proxy and need a unique base path to route to each application.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I think a separate config makes sense. I assume including in the host would also work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume including in the host would also work?

You mean something like hosts: ["localhost:5601/foo"] in modules.d/kibana.yml, right?

Heh, I swear I tried that first and it didn't work at the time, so I went with a separate setting. However, I just tried it again, and it works! So now I think we should get rid of the extra basepath setting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, spoke too soon. It doesn't work when I add it to the hosts setting. 😞

What seems to happen is that the URL becomes http://localhost:5601/foo?extended=true instead of http://localhost:5601/foo/api/stats?extended=true.

So now I'm back to thinking we should give basepath its own setting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to see if we can change the behavior of the http parser such that if path is included in the hosts URL that it still does the right thing. It seems like that was the expected behavior. Fewer config options and an explicit URL sounds simpler to me, but I may not have the whole picture.

What seems to happen is that the URL becomes http://localhost:5601/foo?extended=true instead of http://localhost:5601/foo/api/stats?extended=true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense @andrewkroh. Thanks for weighing in. I'll give this a shot in a new PR this coming week.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator Could you set it a priority so we get it into 6.4?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I plan to have a PR up today!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR is up and I've requested a review from both of you: #7682

"event_loop_delay": 718.5184001922607,
"name": "kibana",
"concurrent_connections": 3,
"kibana": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we put this data up one level outside stats?

So it would be for example kibana.host.name, kibana.name, kibana.id etc.? I would assume in case we add other metricsets they would reuse this data.

It bugs me that we have kibana.*.kibana.

format: bytes
description: >
Memory usage of C++ objects bound to JavaScript objects managed by V8.
- name: pid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tempted to say we should keep the pid and put it under process.pid for correlation reasons.

Copy link
Contributor Author

@ycombinator ycombinator Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added back in bf8881a.

type: long
Average response time in milliseconds
- name: max.ms
type: scaled_float
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a long?

return err
}

process := data["process"].(map[string]interface{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should validate all these type conversions to not have a panic. Also 3 more below.

http, err := helper.NewHTTP(base)
if err != nil {
return nil, err
}

if config.BasePath != "" {
http.AddBasePath(config.BasePath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM. See my comment above if this is only for dev purpose and if yes, if the host config change could work.

@@ -3,5 +3,6 @@
# - status
period: 10s
hosts: ["localhost:5601"]
#basepath: ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a config that non dev users will use? If yes, is it common? If not, lets skip it in the short config.

@ycombinator ycombinator force-pushed the metricbeat/monitoring-kibana branch 2 times, most recently from 69f13d7 to 032d28b Compare July 18, 2018 18:59
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few last comments.

if !ok {
return mb.HostData{}, errors.Errorf("'%v' config for module %v is not a string", b.BasePathConfigKey, module.Name())
}
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this else is needed as the default for a string is already ""

basePath = strings.Trim(basePath, "/")

// Combine paths and normalize
fullPath := strings.Trim(basePath+"/"+path, "/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to use path.Join here? https://golang.org/pkg/path/#Join

@@ -77,8 +78,25 @@ func (b URLHostParserBuilder) Build() mb.HostParser {
} else {
path = b.DefaultPath
}
// Normalize path
path = strings.Trim(path, "/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm never sure if we should try to be smart about / prefix and post fixes or just leave them.

Copy link
Contributor Author

@ycombinator ycombinator Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this because, without it, some existing tests were failing. I didn't want to change the tests because they might represent expectations of current usage (i.e. the existing contract).

@@ -114,6 +114,7 @@ func TestURLHostParserBuilder(t *testing.T) {
{map[string]interface{}{}, URLHostParserBuilder{DefaultPath: "/default"}, "http://example.com/default"},
{map[string]interface{}{"username": "guest"}, URLHostParserBuilder{}, "http://[email protected]"},
{map[string]interface{}{"username": "guest", "password": "secret"}, URLHostParserBuilder{}, "http://guest:[email protected]"},
{map[string]interface{}{"basepath": "/foo"}, URLHostParserBuilder{BasePathConfigKey: "basepath", DefaultPath: "/default"}, "http://example.com/foo/default"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we strip the / before and after lets also add some tests here to see what happens with the different varians.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in a644e62.

// Config defines the structure for the Kibana module configuration options
type Config struct {
XPackEnabled bool `config:"xpack.enabled"`
BasePath string `config:"basepath"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BasePath here becomes obsolete if we have it in the parser I assume?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. Forgot to remove BasePath from this file.

Kibana instance name.
- name: uuid
Name of Kibana's internal index
- name: host
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this host.name to be consistent with ECS?

type: keyword
description: >
Kibana instance's health status
- name: cluster_uuid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future this should go under elasticsearch.cluster.id (not needed now)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can easily make the change now itself. Unless there is a reason to hold off until later?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to do it now.

return err
}

process, ok := data["process"].(map[string]interface{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsoriano I think this is an interesting example where we could potentially use "required" fields and skip this part if some required fields are missing. Not part of this PR.

return reportErrorForMissingField("process.memory", r)
}

rss, ok := memory["resident_set_size_bytes"].(float64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you use float64 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I tried to use int here but the type assertion failed. The value in the JSON response from the Kibana stats API looks like an integer to me but for some reason the JSON parser thinks its a float because perhaps it's too large?.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you should try int64? But I would be surprised if the value exceeds int TBH.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it didn't like int64 either :(.

FWIW, if I do a fmt.Println(memory["resident_set_size_bytes"]) (right before the type assertion), I get 6.2808064e+07.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Kibana does report it in this format. float and later converstion to int SGTM. Perhaps make it int64 there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in b3fbd5f.

@ycombinator ycombinator force-pushed the metricbeat/monitoring-kibana branch from 88800b6 to 8c54595 Compare July 19, 2018 17:23
@@ -73,6 +61,9 @@
"name": "stats",
"rtt": 115
},
"process": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@ruflin ruflin merged commit e5791d2 into elastic:master Jul 19, 2018
@ycombinator ycombinator deleted the metricbeat/monitoring-kibana branch July 19, 2018 19:45
tsg pushed a commit that referenced this pull request Jul 24, 2018
* Fix breaking change in monitoring data (#7563)

The prefix for the stats metrics was metrics but renamed to `stats` by accident as the name is now auto generated. This reverts this change.

Closes #7562

* Add http.request.mehod to Kibana log filset (#7607)

Take `http.request.method` from ECS and apply it to the Kibana fileset.

Additional logs are added to the example log files.

* Fix rename log message (#7614)

Instead of the from field the to field was logged.

* Add tests to verify template content (#7606)

We recently started to move fields.yml into the Golang binary to be used internally. To make sure the loading important and loading of all the data into the binary works as expected for Metricbeat, this adds some basic tests. Related to #7605.

* Basic support of ES GC metrics for jvm9 (#7628)

GC log format for JVM9 is more detailed than for JVM8.

Differences and possible improvements:
* To get cpu_times.* a corellation between log lines is required.
* Some GC metrics are available in jvm8 are not in jvm9
  (class_unload_time_sec, weak_refs_processing_time_sec, ...)
* heap.used_kb is empty, but it can be calculated as young_gen.used_kb +
  old_gen.size_kb
* GC phase times are logged in miliseconds vs seconds in jvm8

* Improve fields.yml generator of modules (#7533)

From now on when a user provides a type hint in an Ingest pipeline, it's added to the generated `fields.yml` instead of guessing.

Closes #7472

* Fix filebeat registry meta being nil vs empty (#7632)

Filebeat introduces a meta field to registry entries in 6.3.1. The meta field is used to distuingish different log streams in docker files. For other input types the meta field must be null. Unfortunately the input loader did initialize the meta field with an empty dictionary. This leads to failing matches of old and new registry entries. Due to the match failing, old entries will not be removed, and filebeat will handle all files as new files on startup (old logs are send again).

Users will observe duplicate entries in the reigstry file. One entry with "meta": null and one entry with "meta": {}. The entry with "meta": {} will be used by filebeat. The null-entry will not be used by filebeat, but is kept in the registry file, cause it has now active owner (yet).

Improvements provided by this PR:

* when matching states consider an empty map and a null-map to be equivalent
* update input loader to create a null map for old state -> registry entries will be compatible on upgrade
* Add checks in critical places replacing an empty map with a null-map
* Add support to fix registry entries on load. states from corrupted 6.3.1 files will be merged into one single state on load 
* introduce unit tests for loading different registry formats
* introduce system tests validating output and registry when upgrading filebeat from an older version

Closes: #7634

* Heartbeat Job Validation + addition of libbeat/mapval (#7587)

This commit seeks to establish a pattern for testing heartbeat jobs. It currently tests the HTTP and TCP jobs. It also required some minor refactors of those tasks for HTTP/TCP.

To do this, it made sense to validate event maps with a sort of schema library. I couldn't find one that did exactly what I wanted here, so I wrote one called mapval. That turned out to be a large undertaking, and is now the majority of this commit. Further tests need to be written, but this commit is large enough as is.

One of the nicest things about the heartbeat architecture is the dialer chain behavior. It should be the case that any validated protocol using TCP (e.g. HTTP, TCP, Redis, etc.) has the exact same tcp metadata.

To help make testing these properties easy mapval lets users compose portions of a schema into a bigger one. In other words, you can say "An HTTP response should be a TCP response, with the standard monitor data added in, and also the special HTTP fields". Even having only written a handful of tests this has uncovered some inconsistencies there, where TCP jobs have a hostname, but HTTP ones do not.

* Only fetch shard metrics from master node (#7635)

This PR makes it so that the `elasticsearch/shard` metricset only fetches information from the Elasticsearch node if that node is the master node.

* Create (X-Pack Monitoring) stats metricset for Kibana module (#7525)

This PR takes the `stats` metricset of the `kibana` Metricbeat module and makes it ship documents to `.monitoring-kibana-6-mb-%{YYYY.MM.DD}` indices, while preserving the current format/mapping expected by docs in these indices. This will ensure that current consumers of the data in these indices, viz. the X-Pack Monitoring UI and the Telemetry shipping module in Kibana, will continue to work as-is.

* Add kubernetes specs for auditbeat file integrity monitoring (#7642)

* Release the rename processor as GA

* Fix log message for Kibana beta state (#7631)

From copy paste Kafka was in the log message instead of Kibana.

* Clean up experimental and beta messages (#7659)

Sometimes the old logging mechanism was used. If all use the new one it is easier to find all the entries. In addition some messages were inconsistent.

* Release raid and socket metricset from system module as GA (#7658)

* Release raid and socket metricset from system module as GA

* remove raid metricset title

* Update geoip config docs (#7640)

* Document  breaking change in monitoring shcema

Situation:

* Edit breaking changes statement about monitoring schema changes (#7666)

* Marking Elasticsearch module and its metricsets as beta (#7662)

This PR marks the `elasticsearch` module and all its 8 existing metricsets all as `beta`. Previously only 
2 metricsets were marked as `beta` with the remaining 6 marked as `experimental`.

* Increase kafka version in tests to 1.1.1 (#7655)

* Add missing mongodb status fields (#7613)

Add `locks`, `global_locks`, `oplatencies` and `process` fields to `status` metricset of MongoDB module.

* Remove outdated vendor information. (#7676)

* Fix Filebeat tests with new region_iso_code field (#7678)

In elastic/elasticsearch#31669 the field `region_iso_code` was added to the geoip processor. Because of this test broke with the most recent release of Elasticsearch as the events contain an undocumented field.

* Fix duplicated module headers (#7650)

* Fix duplicated module headers

Closes #7643

* fix metricset titles for munin and kvm

* fix imssing kubernetes apiserver metricset doc

* remove headers from modules / metricset generator and clean up traefik title

* Release munin and traefik module as beta. (#7660)

* Release munin and treafik module as beta.

* fixes to munin module

* Report k8s pct metrics from enrichment process (#7677)

Instead of doing it from the `state_container`. Problem with the
previous approach is that `state_container` metricset is not run in all
nodes, but from a single point. Making performance metrics not available
in all cases.

With this new approach, the enriching process will also collect
performance metrics, so they should be available everywhere where the
module is run.

* Fix misspell in Beats repo (#7679)

Running `make misspell`.

* Update sarama (kafka client) to 1.17 (#7665)

- Update Sarama to 1.17. The Sarama testsuite tests kafka versions between 0.11 and 1.1.0.
- Update compatible versions in output docs
- Add compression_level setting for gzip compression

* Update github.com/OneOfOne/xxhash to fix mips

* Update boltdb to use github.com/coreos/bbolt fork

Closes #6052

* Generate fields.yml using Mage (#7670)

Make will now delegate to mage for generating fields.yml. Make will check if the mage command exists and go install it if not. The FIELDS_FILE_PATH make variable is not longer used because the path(s) are specified in magefile.go.

This allows fields.yml to be generated on Windows using Mage. The CI scripts for Windows have been updated so that fields.yml is generated for all Beats during testing.

This also adds a make.bat in each directory where building occurs to give Windows
users a starting point.

Some fixes were made to the generators because:
- rsync was excluding important source files contained in a directory
  named "build"
- the generated project needed to be `git init` before running certain
  magefile targets that detect project's root dir and import path.

* Update go-ucfg to 0.6.1 (#7599)

Update fixes config unpacking if users overwrite settings from CLI, with
missing values. When using `-E key=` (e.g. in scripts defining potential
empty defaults via env variables like `-E key=${MYVALUE}`), an untyped
`nil`-values was inserted into the config. This untyped value will make
Unpack fail for most typed settings.

* Docs: Add deprecation check for dashboard loading. (#7675)

For APM Server the recommended way of loading dashboards and Kibana index pattern will be through the Kibana UI from 6.4 on. Since the docs are based on the libbeat docs we need to add a deprecation flag for dashboard and index pattern related documentation.

relates to elastic/apm-server#1142

* Update expected filebeat module files for geoip change
ruflin pushed a commit that referenced this pull request Aug 9, 2018
Resolves #7621.

Depends on elastic/kibana#21100.

X-Pack Monitoring of Kibana requires two types of documents in the `.monitoring-kibana-*` indices: `kibana_stats` and `kibana_settings`. We made Metricbeat's `kibana/stats` metricset index `kibana_stats` documents into `.monitoring-kibana-*` in #7525. This PR makes the same metricset index `kibana_settings` documents into `.monitoring-kibana-*`.
ruflin pushed a commit that referenced this pull request Aug 13, 2018
Resolves #7621.

Depends on elastic/kibana#21100.

X-Pack Monitoring of Kibana requires two types of documents in the `.monitoring-kibana-*` indices: `kibana_stats` and `kibana_settings`. We made Metricbeat's `kibana/stats` metricset index `kibana_stats` documents into `.monitoring-kibana-*` in #7525. This PR makes the same metricset index `kibana_settings` documents into `.monitoring-kibana-*`.

(cherry picked from commit 2af5ab9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants