[O11y][Hadoop] Resolve the conflicts in host.ip field (#7564)
* resolve host ip field conflict

* update pr link in changelog.yml

* update the compatibility section in readme

* update readme.md

* update readme.md
milan-elastic authored Sep 4, 2023
1 parent b037f8d commit 4c3ebff
Showing 9 changed files with 39 additions and 1 deletion.
9 changes: 9 additions & 0 deletions packages/hadoop/_dev/build/docs/README.md
@@ -10,6 +10,15 @@ This integration is used to collect [Hadoop](https://hadoop.apache.org/) metrics

This integration uses Resource Manager API and JMX API to collect above metrics.

## Compatibility

This integration has been tested against Hadoop version `3.3.6`.

### Troubleshooting

If host.ip is shown conflicted under ``logs-*`` data view, then this issue can be solved by [reindexing](https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#reindex-with-a-data-stream) the ``Application`` data stream's indices.
If host.ip is shown conflicted under ``metrics-*`` data view, then this issue can be solved by [reindexing](https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#reindex-with-a-data-stream) the ``Cluster``, ``Datanode``, ``Namenode`` and ``Node Manager`` data stream's indices.

## application

This data stream collects Application metrics.
5 changes: 5 additions & 0 deletions packages/hadoop/changelog.yml
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.9.1"
  changes:
    - description: Resolve host.ip field conflict.
      type: bugfix
      link: https://github.com/elastic/integrations/pull/7564
- version: "0.9.0"
  changes:
    - description: Add support for HTTP request trace logging in application data stream.
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/application/fields/ecs.yml
@@ -10,5 +10,7 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: tags
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/cluster/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/datanode/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/namenode/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/node_manager/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
14 changes: 14 additions & 0 deletions packages/hadoop/docs/README.md
@@ -10,6 +10,15 @@ This integration is used to collect [Hadoop](https://hadoop.apache.org/) metrics

This integration uses Resource Manager API and JMX API to collect above metrics.

## Compatibility

This integration has been tested against Hadoop version `3.3.6`.

### Troubleshooting

If host.ip is shown conflicted under ``logs-*`` data view, then this issue can be solved by [reindexing](https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#reindex-with-a-data-stream) the ``Application`` data stream's indices.
If host.ip is shown conflicted under ``metrics-*`` data view, then this issue can be solved by [reindexing](https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#reindex-with-a-data-stream) the ``Cluster``, ``Datanode``, ``Namenode`` and ``Node Manager`` data stream's indices.

## application

This data stream collects Application metrics.
@@ -100,6 +109,7 @@ An example event for `application` looks as following:
| hadoop.application.time.finished | Application finished time | date |
| hadoop.application.time.started | Application start time | date |
| hadoop.application.vcore_seconds | The amount of CPU resources the application has allocated | long |
| host.ip | Host ip addresses. | ip |
| input.type | Type of Filebeat input. | keyword |
| tags | User defined tags | keyword |

@@ -243,6 +253,7 @@ An example event for `cluster` looks as following:
| hadoop.cluster.virtual_cores.available | The number of available virtual cores | long |
| hadoop.cluster.virtual_cores.reserved | The number of reserved virtual cores | long |
| hadoop.cluster.virtual_cores.total | The total number of virtual cores | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
@@ -375,6 +386,7 @@ An example event for `datanode` looks as following:
| hadoop.datanode.estimated_capacity_lost_total | The estimated capacity lost in bytes | long |
| hadoop.datanode.last_volume_failure_date | The date/time of the last volume failure in milliseconds since epoch | date |
| hadoop.datanode.volumes.failed | Number of failed volumes | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
@@ -522,6 +534,7 @@ An example event for `namenode` looks as following:
| hadoop.namenode.stale_data_nodes | Current number of DataNodes marked stale due to delayed heartbeat | long |
| hadoop.namenode.total_load | Current number of connections | long |
| hadoop.namenode.volume_failures_total | Total number of volume failures across all Datanodes | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
@@ -638,6 +651,7 @@ An example event for `node_manager` looks as following:
| hadoop.node_manager.containers.killed | Containers Killed | long |
| hadoop.node_manager.containers.launched | Containers Launched | long |
| hadoop.node_manager.containers.running | Containers Running | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
2 changes: 1 addition & 1 deletion packages/hadoop/manifest.yml
@@ -1,7 +1,7 @@
format_version: 1.0.0
name: hadoop
title: Hadoop
-version: "0.9.0"
+version: "0.9.1"
license: basic
description: Collect metrics from Apache Hadoop with Elastic Agent.
type: integration
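
The Troubleshooting text added to both README files recommends reindexing when `host.ip` shows as conflicted. A minimal Python sketch of that workflow follows; it is an illustration, not part of this commit. It assumes the conflict is detected from an Elasticsearch `_field_caps` response (a field mapped with more than one type lists one entry per type) and that the request body is sent to `POST _reindex` with `op_type: create`, as the linked data stream guide describes. The index and data stream names, the sample response, and both helper functions are hypothetical.

```python
from typing import Any


def conflicting_types(field_caps: dict[str, Any], field: str) -> list[str]:
    """Return the mapping types reported for `field` if there is more than one."""
    types = sorted(field_caps.get("fields", {}).get(field, {}))
    return types if len(types) > 1 else []


def reindex_body(source_index: str, dest_data_stream: str) -> dict[str, Any]:
    """Build a body for POST _reindex that writes into a data stream."""
    return {
        "source": {"index": source_index},
        # Data streams are append-only, so the destination uses op_type "create".
        "dest": {"index": dest_data_stream, "op_type": "create"},
    }


# Hypothetical, trimmed _field_caps response for GET logs-*/_field_caps?fields=host.ip
sample = {
    "fields": {
        "host.ip": {
            "ip": {"type": "ip", "indices": ["new-backing-index"]},
            "keyword": {"type": "keyword", "indices": ["old-backing-index"]},
        }
    }
}

if conflicting_types(sample, "host.ip"):
    # Assumed names: an old backing index and the Application data stream.
    body = reindex_body("old-backing-index", "logs-hadoop.application-default")
    print(body)
```

The same check would apply to the `metrics-*` data view, with one reindex request per affected data stream (`Cluster`, `Datanode`, `Namenode` and `Node Manager`).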