elasticsearch plugin: add a tag for node role #2158
Comments
If you set [...] This plugin doesn't tag metrics by role. Unfortunately, Elasticsearch doesn't make the node role available via its cluster API. From what I can tell, we might be able to get this info via a "cat nodes" query.
Thank you for looking into this. Actually, the cluster/node API does expose the role. See the example below (look for roles). BTW, would turning on the local flag use a different API or the same one?
@animageofmine, that is just a blob of JSON... Where did it come from? Which API? Can you provide a full request/response example?
@sparrc Sure. Here is the information.
Query: curl localhost:9200/_nodes/_local
Let me know if you need more info. BTW, I can't seem to find a metric that reports cluster health status (green, yellow, red). Any idea?
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "elasticsearch",
"nodes": {
"ZOwb1f4DTVCQbuQpVu1jrw": {
"name": "elk4node01",
"transport_address": "10.2.240.172:9300",
"host": "10.2.240.172",
"ip": "10.2.240.172",
"version": "5.0.1",
"build_hash": "080bb47",
"total_indexing_buffer": 426010214,
"roles": [
"master",
"data",
"ingest"
],
"settings": {
"pidfile": "/var/run/elasticsearch/elasticsearch.pid",
"cluster": {
"name": "elasticsearch"
},
"node": {
"name": "elk4node01"
},
"path": {
"conf": "/etc/elasticsearch",
"data": [
"/var/lib/elasticsearch"
],
"logs": "/var/log/elasticsearch",
"home": "/usr/share/elasticsearch"
},
"client": {
"type": "node"
},
"http": {
"type": {
"default": "netty4"
}
},
"transport": {
"type": {
"default": "netty4"
}
},
"network": {
"host": "0.0.0.0",
"publish_host": "10.2.240.172"
}
},
"os": {
"refresh_interval_in_millis": 1000,
"name": "Linux",
"arch": "amd64",
"version": "4.4.27-moby",
"available_processors": 4,
"allocated_processors": 4
},
"process": {
"refresh_interval_in_millis": 1000,
"id": 45,
"mlockall": false
},
"jvm": {
"pid": 45,
"version": "1.8.0_111",
"vm_name": "OpenJDK 64-Bit Server VM",
"vm_version": "25.111-b14",
"vm_vendor": "Oracle Corporation",
"start_time_in_millis": 1481701191724,
"mem": {
"heap_init_in_bytes": 4294967296,
"heap_max_in_bytes": 4260102144,
"non_heap_init_in_bytes": 2555904,
"non_heap_max_in_bytes": 0,
"direct_max_in_bytes": 4260102144
},
"gc_collectors": [
"ParNew",
"ConcurrentMarkSweep"
],
"memory_pools": [
"Code Cache",
"Metaspace",
"Compressed Class Space",
"Par Eden Space",
"Par Survivor Space",
"CMS Old Gen"
],
"using_compressed_ordinary_object_pointers": "true"
},
"thread_pool": {
"force_merge": {
"type": "fixed",
"min": 1,
"max": 1,
"queue_size": -1
},
"fetch_shard_started": {
"type": "scaling",
"min": 1,
"max": 8,
"keep_alive": "5m",
"queue_size": -1
},
"listener": {
"type": "fixed",
"min": 2,
"max": 2,
"queue_size": -1
},
"index": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": 200
},
"refresh": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"generic": {
"type": "scaling",
"min": 4,
"max": 128,
"keep_alive": "30s",
"queue_size": -1
},
"warmer": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"search": {
"type": "fixed",
"min": 7,
"max": 7,
"queue_size": 1000
},
"flush": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"fetch_shard_store": {
"type": "scaling",
"min": 1,
"max": 8,
"keep_alive": "5m",
"queue_size": -1
},
"management": {
"type": "scaling",
"min": 1,
"max": 5,
"keep_alive": "5m",
"queue_size": -1
},
"get": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": 1000
},
"bulk": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": 50
},
"snapshot": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
}
},
"transport": {
"bound_address": [
"[::]:9300"
],
"publish_address": "10.2.240.172:9300",
"profiles": {}
},
"http": {
"bound_address": [
"[::]:9200"
],
"publish_address": "10.2.240.172:9200",
"max_content_length_in_bytes": 104857600
},
"plugins": [
{
"name": "repository-s3",
"version": "5.0.1",
"description": "The S3 repository plugin adds S3 repositories",
"classname": "org.elasticsearch.plugin.repository.s3.S3RepositoryPlugin"
}
],
"modules": [
{
"name": "aggs-matrix-stats",
"version": "5.0.1",
"description": "Adds aggregations whose input are a list of numeric fields and output includes a matrix.",
"classname": "org.elasticsearch.search.aggregations.matrix.MatrixAggregationPlugin"
},
{
"name": "ingest-common",
"version": "5.0.1",
"description": "Module for ingest processors that do not require additional security permissions or have large dependencies and resources",
"classname": "org.elasticsearch.ingest.common.IngestCommonPlugin"
},
{
"name": "lang-expression",
"version": "5.0.1",
"description": "Lucene expressions integration for Elasticsearch",
"classname": "org.elasticsearch.script.expression.ExpressionPlugin"
},
{
"name": "lang-groovy",
"version": "5.0.1",
"description": "Groovy scripting integration for Elasticsearch",
"classname": "org.elasticsearch.script.groovy.GroovyPlugin"
},
{
"name": "lang-mustache",
"version": "5.0.1",
"description": "Mustache scripting integration for Elasticsearch",
"classname": "org.elasticsearch.script.mustache.MustachePlugin"
},
{
"name": "lang-painless",
"version": "5.0.1",
"description": "An easy, safe and fast scripting language for Elasticsearch",
"classname": "org.elasticsearch.painless.PainlessPlugin"
},
{
"name": "percolator",
"version": "5.0.1",
"description": "Percolator module adds capability to index queries and query these queries by specifying documents",
"classname": "org.elasticsearch.percolator.PercolatorPlugin"
},
{
"name": "reindex",
"version": "5.0.1",
"description": "The Reindex module adds APIs to reindex from one index to another or update documents in place.",
"classname": "org.elasticsearch.index.reindex.ReindexPlugin"
},
{
"name": "transport-netty3",
"version": "5.0.1",
"description": "Netty 3 based transport implementation",
"classname": "org.elasticsearch.transport.Netty3Plugin"
},
{
"name": "transport-netty4",
"version": "5.0.1",
"description": "Netty 4 based transport implementation",
"classname": "org.elasticsearch.transport.Netty4Plugin"
}
],
"ingest": {
"processors": [
{
"type": "append"
},
{
"type": "convert"
},
{
"type": "date"
},
{
"type": "date_index_name"
},
{
"type": "dot_expander"
},
{
"type": "fail"
},
{
"type": "foreach"
},
{
"type": "grok"
},
{
"type": "gsub"
},
{
"type": "join"
},
{
"type": "json"
},
{
"type": "lowercase"
},
{
"type": "remove"
},
{
"type": "rename"
},
{
"type": "script"
},
{
"type": "set"
},
{
"type": "sort"
},
{
"type": "split"
},
{
"type": "trim"
},
{
"type": "uppercase"
}
]
}
}
}
}
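To make the point of the response above concrete, here is a minimal sketch of reading the `roles` array out of a `_nodes/_local` payload. The sample is a trimmed copy of the response above, and `roles_by_node` is a hypothetical helper name used only for illustration; it is not part of Telegraf or Elasticsearch.

```python
import json

# Trimmed copy of the `_nodes/_local` response shown above; only the fields
# this sketch reads are kept.
SAMPLE_RESPONSE = """
{
  "cluster_name": "elasticsearch",
  "nodes": {
    "ZOwb1f4DTVCQbuQpVu1jrw": {
      "name": "elk4node01",
      "host": "10.2.240.172",
      "version": "5.0.1",
      "roles": ["master", "data", "ingest"]
    }
  }
}
"""


def roles_by_node(response_text):
    """Map each node name to its list of roles (hypothetical helper)."""
    payload = json.loads(response_text)
    result = {}
    for node_id, node in payload.get("nodes", {}).items():
        # Fall back to the node id if "name" is missing, and to an empty
        # list if the payload carries no "roles" key for this node.
        result[node.get("name", node_id)] = list(node.get("roles", []))
    return result


if __name__ == "__main__":
    print(roles_by_node(SAMPLE_RESPONSE))
```

With the sample above, the helper returns one entry for `elk4node01` carrying the three roles in the order the API listed them.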
+1
The /_nodes/stats endpoint has the roles as well. It gets the information for all of the nodes in the cluster.
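As a rough illustration of why roles in `/_nodes/stats` would help, the sketch below filters heap usage down to nodes holding a given role. It assumes the payload has already been parsed into a dict with per-node `roles` and `jvm.mem.heap_used_in_bytes`; the three-node sample and the helper name `heap_used_for_role` are made up for this example.

```python
def heap_used_for_role(stats, role):
    """Return {node name: heap_used_in_bytes} for nodes that hold `role`."""
    selected = {}
    for node_id, node in stats.get("nodes", {}).items():
        if role not in node.get("roles", []):
            continue
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_in_bytes")
        if heap is not None:
            selected[node.get("name", node_id)] = heap
    return selected


# Hypothetical, heavily trimmed payload shaped like a /_nodes/stats response.
SAMPLE_STATS = {
    "nodes": {
        "id-1": {"name": "data-1", "roles": ["data"],
                 "jvm": {"mem": {"heap_used_in_bytes": 1200}}},
        "id-2": {"name": "master-1", "roles": ["master"],
                 "jvm": {"mem": {"heap_used_in_bytes": 300}}},
        "id-3": {"name": "data-2", "roles": ["data", "ingest"],
                 "jvm": {"mem": {"heap_used_in_bytes": 900}}},
    }
}

if __name__ == "__main__":
    print(heap_used_for_role(SAMPLE_STATS, "data"))
```

A node with several roles shows up under each of them, which is the same grouping a role tag would allow on the dashboard side.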
@sparrc Part of the problem is that the parser was set up to ignore strings and only process numeric data as metrics. In my recent merge that you accepted, I added the capability for the plugin to get the string data too (I needed it in the new API calls I was making). But I kept the node stats unchanged to avoid changing the behavior for anyone else. Maybe another plugin option could be used to control this behavior?
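The behavior described here can be pictured with a small flattening sketch. This is not Telegraf's actual parser; `flatten` and its `include_strings` switch are hypothetical names chosen to mirror the numeric-only default and the opt-in string handling mentioned above.

```python
def flatten(obj, prefix="", include_strings=False):
    """Flatten nested dicts into underscore-joined keys.

    Numeric leaves are always kept; string leaves only when
    include_strings is True. Booleans, lists and nulls are dropped.
    """
    fields = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            name = f"{prefix}_{key}" if prefix else key
            fields.update(flatten(value, name, include_strings))
    elif isinstance(obj, bool):
        # bool is a subclass of int in Python; skip it explicitly so that
        # true/false values are not reported as 1/0.
        pass
    elif isinstance(obj, (int, float)):
        fields[prefix] = obj
    elif isinstance(obj, str) and include_strings:
        fields[prefix] = obj
    return fields


# Hypothetical node entry, trimmed to a few representative value types.
SAMPLE_NODE = {
    "name": "elk4node01",
    "roles": ["master", "data", "ingest"],
    "process": {"mlockall": False},
    "jvm": {"mem": {"heap_used_in_bytes": 42}},
}

if __name__ == "__main__":
    print(flatten(SAMPLE_NODE))
    print(flatten(SAMPLE_NODE, include_strings=True))
```

Note that in this sketch the `roles` list is dropped in both modes, so even a string-aware flattener would need separate handling to turn roles into a tag.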
We don't need to add a config option to add a tag to the metrics.
Node role as a tag would be really useful. What is actually blocking it from being added?
@eesprit Yes, I think we just need a PR.
+1
This adds node_roles as a tag to the exported elasticsearch metrics. For example: node_roles=master\,data\
Pushed a possible fix in the commit above (tests not adjusted yet). Do we want an option to include/exclude the roles? And if so, what default do we use?
I don't like it, but I think this is our only/best option. It would be really nice if we could send multiple values for a tag. My only suggestion is to sort the roles in the list so they will be in a stable order.
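The sorting suggestion can be sketched as follows. This is only an illustration, not the code from the commit: `node_roles_tag` and `escape_tag_value` are hypothetical helpers, and the escaping step assumes the tag is serialized as InfluxDB line protocol, where commas, equals signs and spaces in tag values are backslash-escaped.

```python
def node_roles_tag(roles):
    """Join roles in sorted order so the same role set always gives the same tag."""
    return ",".join(sorted(roles))


def escape_tag_value(value):
    """Backslash-escape characters that are special in a line-protocol tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


if __name__ == "__main__":
    # The API may list roles in any order; sorting makes the tag stable.
    tag = node_roles_tag(["master", "data", "ingest"])
    print(tag)
    print(escape_tag_value(tag))
```

Without the sort, two nodes with the same roles listed in a different order would produce two distinct tag values and therefore two separate series.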
Bug report
We have an Elasticsearch cluster of 9 nodes: 5 data nodes, 3 master nodes and 1 client node. We use KairosDB for storing Telegraf data and Grafana for graphs. One of the problems we are facing is grouping metrics by role (master, client or data). However, it looks like each node in the Elasticsearch cluster returns the payload for the whole cluster. For example, if I want to monitor JVM heap (mem_heap_used_in_bytes) for only the data nodes, I can't find a way to do that, because each node returns the JVM heap for all nodes, including data, master and client nodes (because each node is cluster-aware via Zen Discovery).
I'm not sure whether I am doing something wrong here or my understanding is incorrect, but I wanted to check whether there is a way to deal with this problem (I really hope I am doing something silly). Please see telegraf.conf below.
Relevant telegraf.conf:
System info:
Linux elasticsearchNodeData1 3.10.0-327.36.1.el7.x86_64 #1 SMP Sun Sep 18 13:04:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Telegraf version: Telegraf - version 1.0.0-beta3
All the nodes are dockerized with a Debian-based build.
Steps to reproduce:
If we fetch some metric, say "mem_heap_used_in_bytes", this seems to fetch data from all the node types (data, client and master). I can't find a way to isolate stats for each role.
There is something in the config called "local=true"; I'm not sure what it is for.
Please let me know if you have any questions or need more info. Thank you.