[O11y][Hadoop] Resolve the conflicts in host.ip field #7564

Merged
Changes from 3 commits
100 changes: 100 additions & 0 deletions packages/hadoop/_dev/build/docs/README.md
@@ -10,6 +10,106 @@ This integration is used to collect [Hadoop](https://hadoop.apache.org/) metrics

This integration uses Resource Manager API and JMX API to collect above metrics.

## Compatibility

This integration has been tested against Hadoop version `3.3.6`.

### Troubleshooting

If `host.ip` is shown as conflicted under the ``logs-*`` data view, the conflict can be resolved by reindexing the indices of the ``Application`` data stream.

If `host.ip` is shown as conflicted under the ``metrics-*`` data view, the conflict can be resolved by reindexing the indices of the ``Cluster``, ``Datanode``, ``Namenode`` and ``Node Manager`` data streams.

To reindex the data, perform the following steps.

1. Stop the data stream: go to `Integrations -> Hadoop -> Integration policies`, open the Hadoop integration configuration, disable the `Collect Hadoop metrics` toggle (when reindexing a metrics data stream), and save the integration.

2. Copy the data into a temporary index, then delete the existing data stream and index template by running the following requests in Dev Tools.

```
POST _reindex
{
  "source": {
    "index": "<index_name>"
  },
  "dest": {
    "index": "temp_index"
  }
}
```
Example:
```
POST _reindex
{
  "source": {
    "index": "metrics-hadoop.cluster-default"
  },
  "dest": {
    "index": "temp_index"
  }
}
```

```
DELETE /_data_stream/<data_stream>
```
Example:
```
DELETE /_data_stream/metrics-hadoop.cluster-default
```

```
DELETE _index_template/<index_template>
```
Example:
```
DELETE _index_template/metrics-hadoop.cluster
```
3. Go to `Integrations -> Hadoop -> Settings` and click `Reinstall Hadoop`.

4. Copy the data from the temporary index to the new index by running the following request in Dev Tools.

```
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "temp_index"
  },
  "dest": {
    "index": "<index_name>",
    "op_type": "create"
  }
}
```
Example:
```
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "temp_index"
  },
  "dest": {
    "index": "metrics-hadoop.cluster-default",
    "op_type": "create"
  }
}
```

5. Verify that the data has been reindexed completely.

6. Restart the data stream: go to `Integrations -> Hadoop -> Integration policies`, open the Hadoop integration configuration, enable the `Collect Hadoop metrics` toggle, and save the integration.

7. Delete the temporary index by running the following request in Dev Tools.

```
DELETE temp_index
```
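
Because the procedure has to be repeated for every affected data stream, the requests for steps 2, 4 and 7 above can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the integration. `requests_for` and the other helpers are hypothetical names, and the sketch assumes that the index template name is the data stream name without its namespace suffix, as in the `metrics-hadoop.cluster-default` example.

```python
import json


def template_name(data_stream: str) -> str:
    """Drop the trailing namespace, e.g. '-default' (assumed naming scheme)."""
    return data_stream.rsplit("-", 1)[0]


def reindex_body(source: str, dest: str, create_only: bool = False) -> str:
    """Build the JSON body for POST _reindex."""
    body = {"source": {"index": source}, "dest": {"index": dest}}
    if create_only:
        # Reindexing into a data stream requires op_type "create";
        # "conflicts": "proceed" keeps going past version conflicts.
        body = {"conflicts": "proceed", **body}
        body["dest"]["op_type"] = "create"
    return json.dumps(body, indent=2)


def requests_for(data_stream: str, temp_index: str = "temp_index") -> dict:
    """Return the Dev Tools requests for steps 2, 4 and 7, keyed by step."""
    return {
        "step 2": [
            "POST _reindex\n" + reindex_body(data_stream, temp_index),
            f"DELETE /_data_stream/{data_stream}",
            f"DELETE _index_template/{template_name(data_stream)}",
        ],
        "step 4": ["POST _reindex\n" + reindex_body(temp_index, data_stream, True)],
        "step 7": [f"DELETE {temp_index}"],
    }


if __name__ == "__main__":
    for step, reqs in requests_for("metrics-hadoop.cluster-default").items():
        print(f"# {step}")
        print("\n\n".join(reqs))
```

The printed requests can be pasted into Dev Tools one step at a time, with step 3 (reinstalling the integration) performed in the UI between steps 2 and 4.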

For more details about reindexing, see the [Reindex API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html).
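
For step 5, one way to check completeness is to inspect the response that `POST _reindex` returns in step 4. The following is a minimal Python sketch that relies on the `total`, `created`, `version_conflicts`, `timed_out` and `failures` fields of the reindex response. `reindex_complete` is a hypothetical helper, and the sample response values are illustrative, not real output.

```python
def reindex_complete(response: dict) -> bool:
    """True if every source document was created in the destination."""
    return (
        not response.get("timed_out", False)
        and not response.get("failures")
        and response.get("version_conflicts", 0) == 0
        and response.get("created", 0) == response.get("total", -1)
    )


# Illustrative response; the numbers are made up.
sample = {"took": 147, "timed_out": False, "total": 120, "created": 120,
          "version_conflicts": 0, "failures": []}
print(reindex_complete(sample))
```

Because step 4 uses `"conflicts": "proceed"`, skipped documents do not abort the request, so a non-zero `version_conflicts` value is worth checking explicitly.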


## application

This data stream collects Application metrics.
5 changes: 5 additions & 0 deletions packages/hadoop/changelog.yml
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.9.1"
  changes:
    - description: Resolve host.ip field conflict.
      type: bugfix
      link: https://github.com/elastic/integrations/pull/7564
- version: "0.9.0"
  changes:
    - description: Add support for HTTP request trace logging in application data stream.
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/application/fields/ecs.yml
@@ -10,5 +10,7 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: tags
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/cluster/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/datanode/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/namenode/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
2 changes: 2 additions & 0 deletions packages/hadoop/data_stream/node_manager/fields/ecs.yml
@@ -10,6 +10,8 @@
  name: event.module
- external: ecs
  name: event.type
- external: ecs
  name: host.ip
- external: ecs
  name: service.address
- external: ecs
105 changes: 105 additions & 0 deletions packages/hadoop/docs/README.md
@@ -10,6 +10,106 @@ This integration is used to collect [Hadoop](https://hadoop.apache.org/) metrics

This integration uses Resource Manager API and JMX API to collect above metrics.

## Compatibility

This integration has been tested against Hadoop version `3.3.6`.

### Troubleshooting

If `host.ip` is shown as conflicted under the ``logs-*`` data view, the conflict can be resolved by reindexing the indices of the ``Application`` data stream.

If `host.ip` is shown as conflicted under the ``metrics-*`` data view, the conflict can be resolved by reindexing the indices of the ``Cluster``, ``Datanode``, ``Namenode`` and ``Node Manager`` data streams.

To reindex the data, perform the following steps.

1. Stop the data stream: go to `Integrations -> Hadoop -> Integration policies`, open the Hadoop integration configuration, disable the `Collect Hadoop metrics` toggle (when reindexing a metrics data stream), and save the integration.

2. Copy the data into a temporary index, then delete the existing data stream and index template by running the following requests in Dev Tools.

```
POST _reindex
{
  "source": {
    "index": "<index_name>"
  },
  "dest": {
    "index": "temp_index"
  }
}
```
Example:
```
POST _reindex
{
  "source": {
    "index": "metrics-hadoop.cluster-default"
  },
  "dest": {
    "index": "temp_index"
  }
}
```

```
DELETE /_data_stream/<data_stream>
```
Example:
```
DELETE /_data_stream/metrics-hadoop.cluster-default
```

```
DELETE _index_template/<index_template>
```
Example:
```
DELETE _index_template/metrics-hadoop.cluster
```
3. Go to `Integrations -> Hadoop -> Settings` and click `Reinstall Hadoop`.

4. Copy the data from the temporary index to the new index by running the following request in Dev Tools.

```
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "temp_index"
  },
  "dest": {
    "index": "<index_name>",
    "op_type": "create"
  }
}
```
Example:
```
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "temp_index"
  },
  "dest": {
    "index": "metrics-hadoop.cluster-default",
    "op_type": "create"
  }
}
```

5. Verify that the data has been reindexed completely.

6. Restart the data stream: go to `Integrations -> Hadoop -> Integration policies`, open the Hadoop integration configuration, enable the `Collect Hadoop metrics` toggle, and save the integration.

7. Delete the temporary index by running the following request in Dev Tools.

```
DELETE temp_index
```
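
Before reindexing, it can help to confirm which indices disagree on the `host.ip` mapping. The field capabilities API (for example `GET metrics-*/_field_caps?fields=host.ip`) lists the indices per mapping type when a field has more than one type. The Python sketch below only parses such a response. `conflicting_indices` is a hypothetical helper, and the sample response, including the backing index names, is invented for illustration.

```python
def conflicting_indices(field_caps: dict, field: str) -> dict:
    """Map each mapping type of `field` to its indices, or {} if no conflict."""
    types = field_caps.get("fields", {}).get(field, {})
    if len(types) < 2:
        return {}  # zero or one type: nothing to resolve
    return {t: caps.get("indices", []) for t, caps in types.items()}


# Invented sample shaped like a _field_caps response.
sample = {
    "indices": [".ds-metrics-hadoop.cluster-default-000001",
                ".ds-metrics-system.cpu-default-000001"],
    "fields": {
        "host.ip": {
            "keyword": {"type": "keyword", "searchable": True, "aggregatable": True,
                        "indices": [".ds-metrics-hadoop.cluster-default-000001"]},
            "ip": {"type": "ip", "searchable": True, "aggregatable": True,
                   "indices": [".ds-metrics-system.cpu-default-000001"]},
        }
    },
}
for mapping_type, indices in conflicting_indices(sample, "host.ip").items():
    print(mapping_type, indices)
```

Any Hadoop data stream whose backing indices appear under a type other than `ip` is a candidate for the reindex procedure above.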

For more details about reindexing, see the [Reindex API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html).
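
One way to carry out step 5 is to compare the document counts of the temporary index and the new data stream before data collection is re-enabled in step 6, for example with `GET temp_index/_count` and `GET metrics-hadoop.cluster-default/_count`. The Python sketch below only compares two such responses. `missing_documents` is a hypothetical helper, and the sample counts are made up.

```python
def missing_documents(source_count: dict, dest_count: dict) -> int:
    """Number of source documents not (yet) present in the destination."""
    return max(source_count["count"] - dest_count["count"], 0)


# Made-up responses shaped like the output of GET <index>/_count.
temp = {"count": 120, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}
new = {"count": 118, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}
print(missing_documents(temp, new))
```

A result of zero suggests that the temporary index can be deleted safely in step 7.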


## application

This data stream collects Application metrics.
@@ -100,6 +200,7 @@ An example event for `application` looks as following:
| hadoop.application.time.finished | Application finished time | date |
| hadoop.application.time.started | Application start time | date |
| hadoop.application.vcore_seconds | The amount of CPU resources the application has allocated | long |
| host.ip | Host ip addresses. | ip |
| input.type | Type of Filebeat input. | keyword |
| tags | User defined tags | keyword |

@@ -243,6 +344,7 @@ An example event for `cluster` looks as following:
| hadoop.cluster.virtual_cores.available | The number of available virtual cores | long |
| hadoop.cluster.virtual_cores.reserved | The number of reserved virtual cores | long |
| hadoop.cluster.virtual_cores.total | The total number of virtual cores | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
@@ -375,6 +477,7 @@ An example event for `datanode` looks as following:
| hadoop.datanode.estimated_capacity_lost_total | The estimated capacity lost in bytes | long |
| hadoop.datanode.last_volume_failure_date | The date/time of the last volume failure in milliseconds since epoch | date |
| hadoop.datanode.volumes.failed | Number of failed volumes | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
@@ -522,6 +625,7 @@ An example event for `namenode` looks as following:
| hadoop.namenode.stale_data_nodes | Current number of DataNodes marked stale due to delayed heartbeat | long |
| hadoop.namenode.total_load | Current number of connections | long |
| hadoop.namenode.volume_failures_total | Total number of volume failures across all Datanodes | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
@@ -638,6 +742,7 @@ An example event for `node_manager` looks as following:
| hadoop.node_manager.containers.killed | Containers Killed | long |
| hadoop.node_manager.containers.launched | Containers Launched | long |
| hadoop.node_manager.containers.running | Containers Running | long |
| host.ip | Host ip addresses. | ip |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| tags | List of keywords used to tag each event. | keyword |
2 changes: 1 addition & 1 deletion packages/hadoop/manifest.yml
@@ -1,7 +1,7 @@
format_version: 1.0.0
name: hadoop
title: Hadoop
-version: "0.9.0"
+version: "0.9.1"
license: basic
description: Collect metrics from Apache Hadoop with Elastic Agent.
type: integration