Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add telemetry for data tiers #63031

Merged
merged 6 commits into from
Oct 1, 2020
Merged

Conversation

dakrone
Copy link
Member

@dakrone dakrone commented Sep 29, 2020

This commit adds telemetry for our data tier formalization. This telemetry helps determine the
topology of the cluster with regard to the content, hot, warm, & cold tiers/roles.

An example of the telemetry looks like:

GET /_xpack/usage?human
{
  ...
  "data_tiers" : {
    "available" : true,
    "enabled" : true,
    "data_warm" : {
      ...
    },
    "data_cold" : {
      ...
    },
    "data_content" : {
      "node_count" : 1,
      "index_count" : 6,
      "total_shard_count" : 6,
      "primary_shard_count" : 6,
      "doc_count" : 71,
      "total_size" : "59.6kb",
      "total_size_bytes" : 61110,
      "primary_size" : "59.6kb",
      "primary_size_bytes" : 61110,
      "primary_shard_size_avg" : "9.9kb",
      "primary_shard_size_avg_bytes" : 10185,
      "primary_shard_size_median" : "8kb",
      "primary_shard_size_median_bytes" : 8254,
      "primary_shard_size_mad" : "7.2kb",
      "primary_shard_size_mad_bytes" : 7391
    },
    "data_hot" : {
       ...
    }
  }
}

The fields are as follows:

  • node_count :: number of nodes with this tier/role
  • index_count :: number of indices on this tier
  • total_shard_count :: total number of shards for all nodes in this tier
  • primary_shard_count :: number of primary shards for all nodes in this tier
  • doc_count :: number of documents for all nodes in this tier
  • total_size_bytes :: total number of bytes for all shards for all nodes in this tier
  • primary_size_bytes :: number of bytes for all primary shards on all nodes in this tier
  • primary_shard_size_avg_bytes :: average shard size for primary shard in this tier
  • primary_shard_size_median_bytes :: median shard size for primary shard in this tier
  • primary_shard_size_mad_bytes :: median absolute deviation of shard size for primary shard in this tier

Relates to #60848

This commit adds telemetry for our data tier formalization. This telemetry helps determine the
topology of the cluster with regard to the content, data, hot, warm, & cold tiers/roles.

An example of the telemetry looks like:

```
GET /_xpack/usage?human
{
  ...
  "data_tiers" : {
    "available" : true,
    "enabled" : true,
    "data_warm" : {
      ...
    },
    "data" : {
      ...
    },
    "data_cold" : {
      ...
    },
    "data_content" : {
      "node_count" : 1,
      "index_count" : 6,
      "total_shard_count" : 6,
      "primary_shard_count" : 6,
      "doc_count" : 71,
      "total_size" : "59.6kb",
      "total_size_bytes" : 61110,
      "primary_size" : "59.6kb",
      "primary_size_bytes" : 61110,
      "primary_shard_size_avg" : "9.9kb",
      "primary_shard_size_avg_bytes" : 10185,
      "primary_shard_size_median_bytes" : "8kb",
      "primary_shard_size_median_bytes" : 8254,
      "primary_shard_size_mad_bytes" : "7.2kb",
      "primary_shard_size_mad_bytes" : 7391
    },
    "data_hot" : {
       ...
    }
  }
}
```

The fields are as follows:

- node_count :: number of nodes with this tier/role
- index_count :: number of indices on this tier
- total_shard_count :: total number of shards for all nodes in this tier
- primary_shard_count :: number of primary shards for all nodes in this tier
- doc_count :: number of documents for all nodes in this tier
- total_size_bytes :: total number of bytes for all shards for all nodes in this tier
- primary_size_bytes :: number of bytes for all primary shards on all nodes in this tier
- primary_shard_size_avg_bytes :: average shard size for primary shard in this tier
- primary_shard_size_median_bytes :: median shard size for primary shard in this tier
- primary_shard_size_mad_bytes :: [median absolute deviation](https://en.wikipedia.org/wiki/Median_absolute_deviation) of shard size for primary shard in this tier

Relates to elastic#60848
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Features)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Sep 29, 2020
Copy link
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this Lee. I think this generally looks great, left only a few minor suggestions

@dakrone dakrone requested a review from andreidan September 30, 2020 20:43
Copy link
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Lee

@dakrone dakrone merged commit 5fca68a into elastic:master Oct 1, 2020
@dakrone dakrone deleted the dt-add-telemetry branch October 1, 2020 14:35
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Oct 1, 2020
This commit adds telemetry for our data tier formalization. This telemetry helps determine the
topology of the cluster with regard to the content, hot, warm, & cold tiers/roles.

An example of the telemetry looks like:

```
GET /_xpack/usage?human
{
  ...
  "data_tiers" : {
    "available" : true,
    "enabled" : true,
    "data_warm" : {
      ...
    },
    "data_cold" : {
      ...
    },
    "data_content" : {
      "node_count" : 1,
      "index_count" : 6,
      "total_shard_count" : 6,
      "primary_shard_count" : 6,
      "doc_count" : 71,
      "total_size" : "59.6kb",
      "total_size_bytes" : 61110,
      "primary_size" : "59.6kb",
      "primary_size_bytes" : 61110,
      "primary_shard_size_avg" : "9.9kb",
      "primary_shard_size_avg_bytes" : 10185,
      "primary_shard_size_median" : "8kb",
      "primary_shard_size_median_bytes" : 8254,
      "primary_shard_size_mad" : "7.2kb",
      "primary_shard_size_mad_bytes" : 7391
    },
    "data_hot" : {
       ...
    }
  }
}
```

The fields are as follows:

- node_count :: number of nodes with this tier/role
- index_count :: number of indices on this tier
- total_shard_count :: total number of shards for all nodes in this tier
- primary_shard_count :: number of primary shards for all nodes in this tier
- doc_count :: number of documents for all nodes in this tier
- total_size_bytes :: total number of bytes for all shards for all nodes in this tier
- primary_size_bytes :: number of bytes for all primary shards on all nodes in this tier
- primary_shard_size_avg_bytes :: average shard size for primary shard in this tier
- primary_shard_size_median_bytes :: median shard size for primary shard in this tier
- primary_shard_size_mad_bytes :: [median absolute deviation](https://en.wikipedia.org/wiki/Median_absolute_deviation) of shard size for primary shard in this tier

Relates to elastic#60848
dakrone added a commit that referenced this pull request Oct 1, 2020
Backports the following commits to 7.x:

    Add telemetry for data tiers (#63031)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Data Management Meta label for data/management team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants