
Add Elasticsearch 5.x output #2332

Merged: 19 commits into influxdata:master on Mar 21, 2017

Conversation

@lpic10 (Contributor) commented Jan 27, 2017

This PR is based on #1875 with several changes:

  • It supports only ES 5.x
  • Management of shards/replicas removed
  • A list of hosts to connect to can now be used (similar to the influxdb config)
  • Allows a bit better template management
  • Index name per time frame is now working properly, based on the metric timestamp
  • Tags are now keyword fields inside a tag object
  • Metrics are now fields inside an object named as the input plugin
  • Basic HTTP auth support (not tested)
  • Uses ES bulk API for indexing

Comments/reviews are welcome

Required for all PRs:

  • CHANGELOG.md updated (we recommend not updating this until the PR has been approved by a maintainer)
  • Sign CLA (if not already signed)
  • README.md updated (if adding a new plugin)

# username = "telegraf"
# password = "mypassword"

# Index Config
Contributor

When is this evaluated? Only when telegraf starts? Or does each metric "batch" under the same timestamp get added to its own index? (Add that to the doc.)

Contributor Author

Well, the evaluation is currently done per metric inside the batch; there is a comment in the code about that.
But is it always the case that a batch of metrics all have the same timestamp? If so, I will update this to do the evaluation once per batch only, and add a note in the doc.

Contributor

nope, that is definitely not guaranteed, please leave it as-is!

Contributor

but also please document that the index is set per-metric in the README and sample config

Contributor Author

Ok, I will send later today

Contributor Author

I've added additional information, let me know if it is enough.

# %H - hour (00..23)
index_name = "telegraf-%Y.%m.%d" # required.

## Template Config
Contributor

can you provide more information on the template management? doesn't have to be here, maybe in the README would be sufficient.

links to elasticsearch documentation would be extra good.

Contributor Author

Yes, template management is about creating a proper template for the index name provided. I will add more info in the doc. These config option names are inspired by similar logstash config options and should be familiar to anyone who already uses ES.

```go
// warn about ES version
if i, err := strconv.Atoi(strings.Split(esVersion, ".")[0]); err == nil {
	if i < 5 {
		log.Println("W! Elasticsearch version not supported: " + esVersion)
	}
}
```
Contributor

should this be an error? should we continue with the plugin or just return here?

Contributor Author

Not sure here; I haven't tested enough to really know it won't work on older versions. Maybe someone wants to take the risk, hence the warning. But I agree with returning if you think it is safer.

Contributor

I'd prefer to return and not bother trying to support older versions of elastic
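A minimal sketch of the behavior requested here: refuse to start against Elasticsearch older than 5.x instead of only logging a warning. The function name is an assumption for illustration; the merged code may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// checkVersion returns an error for unparseable or pre-5.x
// Elasticsearch versions, so the plugin can bail out early
// rather than continue against an unsupported cluster.
func checkVersion(esVersion string) error {
	major, err := strconv.Atoi(strings.Split(esVersion, ".")[0])
	if err != nil {
		return fmt.Errorf("could not parse Elasticsearch version %q: %v", esVersion, err)
	}
	if major < 5 {
		return fmt.Errorf("Elasticsearch version not supported: %s", esVersion)
	}
	return nil
}

func main() {
	fmt.Println(checkVersion("5.2.2") == nil) // true
	fmt.Println(checkVersion("2.4.0") != nil) // true
}
```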

@sparrc (Contributor) commented Jan 28, 2017

@lpic10 Forgive an ES noob if this doesn't make sense, but should we be taking advantage of elasticsearch types? should we have a configurable _type field added to each metric?

@lpic10 (Contributor Author) commented Jan 28, 2017

I thought about keeping a different _type per input, but I don't really see an advantage.
In a similar way, metricbeat also does not use different _type fields in its metrics.

(update)
The _type field is not useful for having different mappings in the same index:
https://www.elastic.co/blog/index-vs-type

@berglh commented Jan 30, 2017

Just a couple of things I noticed here. In the documentation example you give a [tag] field. In elasticsearch, the [tags] field is an array used for adding tags. I think this is a bit confusing: I think of tags in elasticsearch as an array of strings, while here you are providing key/value pairs. Perhaps a better name for disambiguation? Perhaps putting it in an object called [telegraf]? (See the elasticsearch docs on arrays.)

Another convention in elasticsearch outputs for log shippers is the ability to add arbitrary fields in the config, so the user can specify [type] => "flying_goats_counter", [environment] => "Production", [host] => "telegraf_hostname". In filebeat, a lightweight shipper written in Go, adding fields helps reduce complexity further down the line. Another consideration is that some people might want to send their metrics via logstash for further enrichment, but that is perhaps the topic of another output plugin. (See the filebeat docs on fields.)

Great work.

@lpic10 (Contributor Author) commented Jan 30, 2017

Hi @berglh , thanks for the interest, hope you find this plugin useful.

In elasticsearch, the [tags] field is an array for adding tags.

Actually, the tags array convention comes from logstash. I don't see this being an issue because logstash and telegraf will use different indexes.

As you mentioned, telegraf tags model is a key value structure.

Perhaps a better name for disambiguity? Perhaps putting it in an object called "[telegraf]"?

I considered other names and chose tag for its simplicity. It already says what is inside (a tag) and it is short enough not to be annoying when creating a query, e.g. a "host" template in grafana:
{"find": "terms", "field": "tag.host"}

If I set some global tags on telegraf, let's say "cluster": "ha-cluster1" and "dc": "west", I think a query like tag.cluster: ha-cluster1 AND tag.dc: west makes sense. If you replace tag with something else, I think it gets more confusing.

I can make the name of this tag object configurable if needed, but I don't think it would be a good idea because it would make it more difficult to share grafana dashboards, as the queries would differ.

Another convention in any elasticsearch output for log shippers is to have the ability to add arbitrary fields in the config. So the user can specify [type] => "flying_goats_counter", [environment] => "Production", [host] => "telegraf_hostname".

In telegraf you can add tags in the [global_tags] config declaration. In the example above, this would be:

```toml
[global_tags]
  dc = "west"
  cluster = "ha-cluster1"
```

I guess another consideration is that some people might want to actually send their metrics via logstash for further enrichment

Probably this is already possible, as logstash supports a graylog input: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-gelf.html, or it can read from a telegraf file output.

@berglh commented Jan 30, 2017 via email

@sparrc (Contributor) left a comment:

questions about the template

```go
if (a.OverwriteTemplate) || (!templateExists) {
	// Create or update the template
	tmpl := fmt.Sprintf(`
{ "template":"%s*",
```
Contributor

should you include in the documentation what this template is? how do you know that this template will work for everyone? should the template be configurable?

Contributor Author

The template is kind of optional: the plugin will work without it, and ES will detect most of the types correctly; even if not optimal, it could still work for querying and graphing the data.

One of the drawbacks of not using this template is having the tags analyzed by ES. In that case, with the default mapping from ES, instead of using tag.host one would have to use tag.host.keyword to access the non-analyzed field.

The idea is that the provided template has sane defaults/types. Everything can be overridden by another template if the ES admin wants, or this template can be updated manually.

Maybe I can add in the doc that it is possible to check the template created by issuing curl http://localhost:9200/_template/telegraf?pretty (assuming the template name is telegraf).

BTW I need to check a few additional things about the mapping of the number fields. I saw today some fields being mapped as float and others as long, and I need to check whether this is correct and/or causes any trouble.

Contributor Author

I've made a few updates to the template and added more information in the README.md file.

```json
{
  "@timestamp": "2017-01-01T00:00:00+00:00",
  "input_plugin": "cpu",
```
Contributor

I'm not super familiar with ES best practices, but this seems a bit redundant to me. Why put "input_plugin": "cpu" in the top-level of the metric if you already have the name of the plugin in "cpu": ...?

Contributor Author

That's to make it possible/easier to query/filter for metrics from a particular input. It is not so easy/convenient to query for field names in ES. It can be done by issuing a terms query on _field_names, but I don't know how to do this in grafana/kibana, for example (or even if it is possible).

Contributor

In that case, shouldn't we put a "measurement_name" field in the ES metric? seems like this is more useful than the plugin name. There are many plugins that write more than one measurement name.

Contributor Author

Yes, makes sense. Actually, I think the current "input_plugin" is already the measurement name; it comes from metric.Name(). I will change the name of the field.

@ajaybhatnagar

Is this branch ready for testing? Elasticsearch is not visible in the docs and the output plugins list. What is the correct link to download and build from source?


@lpic10 (Contributor Author) commented Feb 9, 2017

Hi @ajaybhatnagar, this is working; feel free to test it by either cloning and building from the branch I'm pulling from, or by downloading this PR as a patch and applying it on top of master.

But be aware things may break as changes will happen until this is merged.

@lpic10 (Contributor Author) commented Feb 15, 2017

I have some optimizations pending for the dynamic template, a bugfix for the index name, and a bit more information about the bulk request when there are indexing errors. I should update this PR soon.

@fabMrc commented Feb 21, 2017

Is there another way to inject data into Elasticsearch for older versions? (I am using ES 2.4, as the current Spring Boot Data version supports it.)

@lpic10 (Contributor Author) commented Feb 21, 2017

@fabMrc I have no plans to work on ES 2.x support, possibly there are ways to inject telegraf data into Elasticsearch using logstash or something like that.

@lpic10 (Contributor Author) commented Feb 28, 2017

Oh, sorry, my mistake. Yes, it works if I manually set the field mapping to "double".

Still, the dynamic mapping does not work, so for a newly created index I need to insert a document with a smaller value first for ES to create the index mapping with that field set to "double".

@sparrc (Contributor) commented Feb 28, 2017

@lpic10 what do you think is the best solution to solve dynamic mapping of large values?

It's such an edge-case that I think it's OK to truncate the values, but I'll defer to what you think is best.

@ajaybhatnagar

I think these links should be helpful in deciding on an approach using dynamic template custom rules.
See the numeric detection part in the links:
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html

@lpic10 (Contributor Author) commented Mar 2, 2017

For now, I decided to keep the mappings with "float" instead of "double" and truncate values that are too big or too small before sending them to ES.

```go
switch x := v.(type) {
// Truncate values too big/small for Elasticsearch to not complain when creating a dynamic field mapping
case float64:
	if (x > 0 && x > math.MaxFloat32) || (x < 0 && x < math.SmallestNonzeroFloat32) {
```
Contributor

the 0 comparison is redundant, and MaxFloat32 is larger than the maximum "long" value in ES. You need to be comparing against math.MaxInt64 and math.MinInt64.

and you should also be checking float32 fields.

Contributor Author

I added the 0 comparison because I saw truncation occurring when values were 0 (or -0, not sure); I will double-check this later today.

Isn't float32 in Go the same as float in ES? It seems ES can handle those fine. I will check again whether ES dynamic field mapping blows up with MaxFloat32 and SmallestNonzeroFloat32 (and maybe add a new test).

Contributor

this part of the error message suggests that int64 is the limit?:

"reason": "Numeric value (-9223372036854776000) out of range of long (-9223372036854775808 - 9223372036854775807)

@lpic10 (Contributor Author) Mar 2, 2017

That happened when I mistakenly changed the template to map all fields as "long".

I just noticed that comparison is complete nonsense; I will try to sort this out later.

@lpic10 (Contributor Author) commented Mar 3, 2017

I couldn't yet figure out exactly what the issue is with this dynamic field mapping, but it seems to happen only when sending large int values, probably because of the way these are serialized to JSON. For example, a MaxFloat64 can be sent to ES without issues, possibly because it is sent in scientific notation in the JSON.

On the ES side, the dynamic field detection seems to be done by the Jackson JSON parser, and AFAICT it relies on Java native objects for that, but I couldn't go much further on it.

@sparrc previously requested changes Mar 3, 2017

```go
log.Printf("W! Elasticsearch output metric %s truncated (value %v is too big or too small, truncating to %v)", k, x, v)
}
switch {
case v.(float64) > math.MaxInt64:
```
Contributor

does this work if v is a float32?

Not sure I understand how this compiles, since it's comparing a float64 to an int64?

```go
case v.(float64) < math.MinInt64:
	v = math.MinInt64
default:
	v = float32(v.(float64))
```
Contributor

casting everything to float32?

@sparrc dismissed their stale review Mar 3, 2017 09:12: reviewed changed code

@sparrc (Contributor) commented Mar 3, 2017

I decided to keep for now the mappings with "float" instead of "double" and truncate too big or too small values before sending to ES.

So are we not doing this anymore?

@lpic10 (Contributor Author) commented Mar 3, 2017

So are we not doing this anymore?

Yes, I want to, but I will have to better understand what is causing the mapping issue in the first place. From what I saw, it is not exactly the size (in bytes) of the field sent, but maybe its representation in the JSON.

@sparrc (Contributor) commented Mar 8, 2017

@lpic10 what's the final conclusion on handling large numbers?

should we leave as-is and make a note of it? should we filter out fields with numbers that are too small or too large?

@lpic10 (Contributor Author) commented Mar 20, 2017

@sparrc for this issue with large integers, I've added a note in the README. It is still not clear how/if it should be fixed.

It is a combination of issues between Go encoding, Jackson on the Java side, and ES trying to figure out a data type.

@ajaybhatnagar

I am possibly seeing another mapping-related issue: system load data needs to be stored as float, but it seems to be mapped as long and cannot be converted. Should I create another ticket for it?

```
[2017-03-20T15:27:00,628][DEBUG][o.e.a.b.TransportShardBulkAction] [esdbmonm01] [es5dbmon_200317][0] failed to execute bulk item (index) index {[es5dbmon_200317][metrics][AVrsVB8zSoM5CnqaP7FL], source[{"@timestamp":"2017-03-20T15:27:00Z","measurement_name":"system","system":{"load1":0.08,"load15":0.02,"load5":0.07,"n_cpus":8,"n_users":0},"tag":{"host":"esdbmonm04"}}]}
java.lang.IllegalArgumentException: mapper [system.load15] cannot be changed from type [long] to [float]
at org.elasticsearch.index.mapper.MappedFieldType.checkTypeName(MappedFieldType.java:147) ~[elasticsearch-5.2.2.jar:5.2.2]
```

@lpic10 (Contributor Author) commented Mar 20, 2017

@ajaybhatnagar do you have system.load5 fields mapped to long in your index mapping?

It is possible that you are using an old version of the template that could cause this issue. You can try setting overwrite_template = true on telegraf to update your template, but that will affect only new indexes.

@ajaybhatnagar

Thanks. Template overwrite fixed it.

@sparrc (Contributor) commented Mar 20, 2017

I think this PR looks more or less good to go, @danielnelson do you want to give it a final review?

@danielnelson danielnelson merged commit bb28fb2 into influxdata:master Mar 21, 2017
ssorathia pushed a commit to ssorathia/telegraf that referenced this pull request Mar 25, 2017
calerogers pushed a commit to calerogers/telegraf that referenced this pull request Apr 5, 2017
@fabMrc commented May 12, 2017

Is it released ?

@sparrc (Contributor) commented May 12, 2017

@fabMrc commented May 12, 2017

Ok nice ! Soon or later ?

@danielnelson (Contributor)

@fabMrc Soon, you can try the latest release candidate, links are over here #2733

vlamug pushed a commit to vlamug/telegraf that referenced this pull request May 30, 2017
@lpic10 lpic10 deleted the es_support branch November 23, 2017 09:48
maxunt pushed a commit that referenced this pull request Jun 26, 2018