[agent_metrics] Update README and metadata (#10793)
Co-authored-by: Sarina Bloodgood <[email protected]>
olivielpeau and sarina-dd authored Dec 8, 2021
1 parent 03a3d14 commit 34f8dee
Showing 4 changed files with 25 additions and 97 deletions.
39 changes: 16 additions & 23 deletions agent_metrics/README.md
@@ -2,57 +2,50 @@

## Overview

Get metrics from the Agent Metrics service in real time to:
Get internal metrics from the Datadog Agent in real time to visualize and monitor
the Datadog Agent's internal metrics.

- Visualize and monitor `agent_metrics` states.
- Be notified about `agent_metrics` failovers and events.

**NOTE**: The Agent Metrics check has been rewritten in Go for Agent v6 to take advantage of the new internal architecture. Hence it is still maintained but **only works with Agents prior to major version 6**.

To collect Agent metrics for Agent v6+, use the [Go-expvar check][1] with [the `agent_stats.yaml` configuration file][2] packaged with the Agent.
Note: The list of metrics collected by this integration may change between minor Agent versions.
Such changes may not be mentioned in the Agent's changelog.

## Setup

### Installation

The Agent Metrics check is included in the [Datadog Agent][3] package, so you don't need to install anything else on your servers.
The Agent Metrics integration, based on the [go_expvar][1] check, is included in the [Datadog Agent][2] package, so you don't need to install anything else on your servers.

### Configuration

1. Edit the `agent_metrics.d/conf.yaml` file, in the `conf.d/` folder at the root of your [Agent's configuration directory][4], to point to your server and port, set the masters to monitor. See the [sample agent_metrics.d/conf.yaml][5] for all available configuration options.
1. Rename the [`go_expvar.d/agent_stats.yaml.example`][3] file, in the `conf.d/` folder at the root of your [Agent's configuration directory][4], to `go_expvar.d/agent_stats.yaml`.

2. [Restart the Agent][6].
2. [Restart the Agent][5].
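
The configuration step above can be scripted. The following is a minimal sketch, not part of the integration: the helper name `enable_agent_stats` is hypothetical, and it copies the example file rather than renaming it so that the packaged example stays in place. Pass the path to your own `conf.d/` folder, since its location depends on the platform.

```python
import shutil
from pathlib import Path


def enable_agent_stats(conf_dir: Path) -> Path:
    """Create go_expvar.d/agent_stats.yaml from the packaged example file.

    conf_dir is the Agent's conf.d/ folder (hypothetical helper; the
    location of that folder depends on the platform).
    """
    example = conf_dir / "go_expvar.d" / "agent_stats.yaml.example"
    target = example.with_name("agent_stats.yaml")
    if not example.is_file():
        raise FileNotFoundError(f"example config not found: {example}")
    if not target.exists():
        # Copy instead of renaming so the packaged example is preserved.
        shutil.copyfile(example, target)
    return target
```

For example, `enable_agent_stats(Path("/etc/datadog-agent/conf.d"))` would apply on a host where the configuration directory is `/etc/datadog-agent` (an assumption; check the configuration directory documentation for your platform). The Agent still needs a restart afterwards.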

### Validation

[Run the Agent's status subcommand][7] and look for `agent_metrics` under the Checks section.
[Run the Agent's status subcommand][6] and look for `go_expvar` under the Checks section.

## Data Collected

All data collected are only available for Agent v5.

### Metrics

See [metadata.csv][8] for a list of metrics provided by this integration.
The Agent Metrics integration collects the metrics defined in [`agent_stats.yaml.example`][3].

### Events

The Agent Metrics check does not include any events.
The Agent Metrics integration does not include any events.

### Service Checks

The Agent Metrics check does not include any service checks.
The Agent Metrics integration does not include any service checks.

## Troubleshooting

Need help? Contact [Datadog support][9].
Need help? Contact [Datadog support][7].

[1]: https://docs.datadoghq.com/integrations/go_expvar/
[2]: https://github.com/DataDog/datadog-agent/blob/master/cmd/agent/dist/conf.d/go_expvar.d/agent_stats.yaml.example
[3]: https://app.datadoghq.com/account/settings#agent
[3]: https://github.com/DataDog/datadog-agent/blob/master/cmd/agent/dist/conf.d/go_expvar.d/agent_stats.yaml.example
[4]: https://docs.datadoghq.com/agent/guide/agent-configuration-files/#agent-configuration-directory
[5]: https://github.com/DataDog/integrations-core/blob/agent-v5/agent_metrics/datadog_checks/agent_metrics/data/conf.yaml.default
[6]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
[7]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
[8]: https://github.com/DataDog/integrations-core/blob/master/agent_metrics/metadata.csv
[9]: https://docs.datadoghq.com/help/
[5]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
[6]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
[7]: https://docs.datadoghq.com/help/
2 changes: 1 addition & 1 deletion agent_metrics/manifest.json
@@ -9,7 +9,7 @@
"maintainer": "[email protected]",
"manifest_version": "1.0.0",
"metric_prefix": "datadog.agent.",
"metric_to_check": "datadog.agent.collector.cpu.used",
"metric_to_check": "",
"name": "agent_metrics",
"public_title": "Datadog-Agent Metrics Integration",
"short_description": "agent_metrics description.",
74 changes: 2 additions & 72 deletions agent_metrics/metadata.csv
@@ -1,74 +1,4 @@
metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name
datadog.agent.aggregator.checks_metric_sample,gauge,,,,,0,agent_metrics,agg check sample
datadog.agent.aggregator.dogstatsd_metric_sample,gauge,,,,,0,agent_metrics,agg dogstatsd sample
datadog.agent.aggregator.event,gauge,,,,,0,agent_metrics,agg event
datadog.agent.aggregator.events_flushed,gauge,,,,,0,agent_metrics,agg events flushed
datadog.agent.aggregator.flush.checks_metric_sample_flush_time.last_flush,gauge,,,,,0,agent_metrics,agg flush metric sample time last
datadog.agent.aggregator.flush.event_flush_time.last_flush,gauge,,,,,0,agent_metrics,agg flush event time last
datadog.agent.aggregator.flush.main_flush_time.last_flush,gauge,,,,,0,agent_metrics,agg flush main time last
datadog.agent.aggregator.flush.metric_sketch_flush_time.last_flush,gauge,,,,,0,agent_metrics,agg flush metric sketch time last
datadog.agent.aggregator.flush.service_check_flush_time.last_flush,gauge,,,,,0,agent_metrics,agg flush service check time last
datadog.agent.aggregator.flush_count.events.last_flush,gauge,,,,,0,agent_metrics,agg flush count events last
datadog.agent.aggregator.flush_count.series.last_flush,gauge,,,,,0,agent_metrics,agg flush count series last
datadog.agent.aggregator.flush_count.service_checks.last_flush,gauge,,,,,0,agent_metrics,agg flush count service checks last
datadog.agent.aggregator.flush_count.sketches.last_flush,gauge,,,,,0,agent_metrics,agg flush count sketches last
datadog.agent.aggregator.hostname_update,gauge,,,,,0,agent_metrics,agg hostname update
datadog.agent.aggregator.number_of_flush,gauge,,,,,0,agent_metrics,agg number flush
datadog.agent.aggregator.series_flushed,gauge,,,,,0,agent_metrics,agg series flushed
datadog.agent.aggregator.service_check,gauge,,,,,0,agent_metrics,agg service check
datadog.agent.aggregator.service_check_flushed,gauge,,,,,0,agent_metrics,agg service check flushed
datadog.agent.aggregator.check_ready,gauge,,,,,0,agent_metrics,agg check ready
datadog.agent.collector.cpu.used,gauge,,,,,0,agent_metrics,coll cpu used
datadog.agent.dogstatsd.event_packets,count,,,,,0,agent_metrics,dogstatsd event packet
datadog.agent.dogstatsd.event_parse_errors,gauge,,,,,0,agent_metrics,dogstatsd event err
datadog.agent.dogstatsd.metric_packets,count,,,,,0,agent_metrics,dogstatsd metric
datadog.agent.dogstatsd.metric_parse_errors,gauge,,,,,0,agent_metrics,dogstatsd metric err
datadog.agent.dogstatsd_udp.packet_reading_errors,gauge,,,,,0,agent_metrics,dogstatsd udp pkt err
datadog.agent.dogstatsd_upd.packets,count,,,,,0,agent_metrics,dogstatsd udp pkt
datadog.agent.dogstatsd_uds.origin_detection_errors,gauge,,,,,0,agent_metrics,dogstatsd uds origin err
datadog.agent.dogstatsd_uds.packet_reading_errors,gauge,,,,,0,agent_metrics,dogstatsd uds pkt err
datadog.agent.emitter.emit.time,gauge,,,,,0,agent_metrics,emitter emit time
datadog.agent.forwarder.transactions.check_runs_v1,gauge,,,,,0,agent_metrics,fwd trans check run
datadog.agent.forwarder.transactions.dropped_on_input,gauge,,,,,0,agent_metrics,fwd trans dropped
datadog.agent.forwarder.transactions.errors,gauge,,,,,0,agent_metrics,fwd trans err
datadog.agent.forwarder.transactions.events,gauge,,,,,0,agent_metrics,fwd trans event
datadog.agent.forwarder.transactions.host_metadata,gauge,,,,,0,agent_metrics,fwd trans host metadata
datadog.agent.forwarder.transactions.intake_v1,gauge,,,,,0,agent_metrics,fwd trans intake
datadog.agent.forwarder.transactions.metadata,gauge,,,,,0,agent_metrics,fwd trans metadata
datadog.agent.forwarder.transactions.retry_queue_size,gauge,,,,,0,agent_metrics,fwd trans retry queue
datadog.agent.forwarder.transactions.series,gauge,,,,,0,agent_metrics,fwd trans series
datadog.agent.forwarder.transactions.service_checks,gauge,,,,,0,agent_metrics,fwd trans service checks
datadog.agent.forwarder.transactions.success,gauge,,,,,0,agent_metrics,fwd trans success
datadog.agent.forwarder.transactions.timeseries_v1,gauge,,,,,0,agent_metrics,fwd trans timeseries
datadog.agent.logs_agent.destination_errors,gauge,,,,,0,agent_metrics,logs destination err
datadog.agent.logs_agent.is_running,gauge,,,,,0,agent_metrics,logs running
datadog.agent.logs_agent.logs_decoded,gauge,,,,,0,agent_metrics,logs decoded
datadog.agent.logs_agent.logs_processed,gauge,,,,,0,agent_metrics,logs processed
datadog.agent.logs_agent.logs_sent,gauge,,,,,0,agent_metrics,logs sent
datadog.agent.memstats.alloc,gauge,,,,,0,agent_metrics,mem alloc
datadog.agent.memstats.free,gauge,,,,,0,agent_metrics,mem free
datadog.agent.memstats.heap_alloc,gauge,,,,,0,agent_metrics,mem heap alloc
datadog.agent.memstats.heap_idle,gauge,,,,,0,agent_metrics,mem heap idle
datadog.agent.memstats.heap_inuse,gauge,,,,,0,agent_metrics,mem heap inuse
datadog.agent.memstats.heap_objects,gauge,,,,,0,agent_metrics,mem heap objects
datadog.agent.memstats.heap_released,gauge,,,,,0,agent_metrics,mem heap released
datadog.agent.memstats.heap_sys,gauge,,,,,0,agent_metrics,mem heap sys
datadog.agent.memstats.lookups,gauge,,,,,0,agent_metrics,mem lookup
datadog.agent.memstats.mallocs,gauge,,,,,0,agent_metrics,mem malloc
datadog.agent.memstats.num_gc,gauge,,,,,0,agent_metrics,mem num gc
datadog.agent.memstats.pause_ns.95percentile,gauge,,,,,0,agent_metrics,mem pause 95 percentile
datadog.agent.memstats.pause_ns.avg,rate,,,,,0,agent_metrics,mem pause avg
datadog.agent.memstats.pause_ns.count,gauge,,,,,0,agent_metrics,mem pause count
datadog.agent.memstats.pause_ns.max,gauge,,,,,0,agent_metrics,mem pause max
datadog.agent.memstats.pause_ns.median,gauge,,,,,0,agent_metrics,mem pause median
datadog.agent.memstats.pause_total_ns,gauge,,,,,0,agent_metrics,mem pause total
datadog.agent.memstats.total_alloc,gauge,,,,,0,agent_metrics,mem total
datadog.agent.python.version,gauge,,,,,0,agent_metrics,py version
datadog.agent.running,gauge,,,,,0,agent_metrics,running
datadog.agent.scheduler.checks_entered,gauge,,,,,0,agent_metrics,scheduler check
datadog.agent.scheduler.queues_count,gauge,,,,,0,agent_metrics,scheduler queues
datadog.agent.splitter.not_too_big,gauge,,,,,0,agent_metrics,splitter not too big
datadog.agent.splitter.payload_drops,gauge,,,,,0,agent_metrics,splitter payload drops
datadog.agent.splitter.too_big,gauge,,,,,0,agent_metrics,splitter too big
datadog.agent.splitter.total_loops,gauge,,,,,0,agent_metrics,splitter total loops
datadog.agent.started,count,,,,,0,agent_metrics,started
datadog.agent.running,gauge,,,,,0,agent_metrics,running
datadog.agent.python.version,gauge,,,,,0,agent_metrics,py version
@@ -127,6 +127,11 @@ def validate(self, check_name, decoded, fix):


class MetricToCheckValidator(BaseManifestValidator):
    CHECKS_EXCLUDE_LIST = {
        'agent_metrics',  # this (agent-internal) check doesn't guarantee a list of stable metrics for now
        'moogsoft',
        'snmp',
    }
    METRIC_TO_CHECK_EXCLUDE_LIST = {
        'openstack.controller',  # "Artificial" metric, shouldn't be listed in metadata file.
        'riakcs.bucket_list_pool.workers',  # RiakCS 2.1 metric, but metadata.csv lists RiakCS 2.0 metrics only.
@@ -136,7 +141,7 @@ class MetricToCheckValidator(BaseManifestValidator):
    PRICING_PATH = {V1: "/pricing", V2: "/pricing"}

    def validate(self, check_name, decoded, _):
        if not self.should_validate() or check_name == 'snmp' or check_name == 'moogsoft':
        if not self.should_validate() or check_name in self.CHECKS_EXCLUDE_LIST:
            return

        metadata_path = self.METADATA_PATH[self.version]
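
The effect of the new `CHECKS_EXCLUDE_LIST` can be illustrated with a standalone sketch. This is not the real validator: the function `metric_to_check_errors`, its error messages, and the rule that an empty `metric_to_check` is an error for non-excluded checks are all assumptions made for illustration. Only the exclude-list membership test mirrors the diff.

```python
import csv
import io

# Checks that skip validation entirely, mirroring CHECKS_EXCLUDE_LIST in the diff.
CHECKS_EXCLUDE_LIST = {"agent_metrics", "moogsoft", "snmp"}


def metric_to_check_errors(check_name, manifest, metadata_csv):
    """Return a list of error strings (hypothetical helper, not the real validator)."""
    if check_name in CHECKS_EXCLUDE_LIST:
        # Excluded checks may leave metric_to_check empty, as agent_metrics now does.
        return []
    metric = manifest.get("metric_to_check", "")
    if not metric:
        # Assumed rule for this sketch: other checks must name a metric.
        return [f"{check_name}: metric_to_check is empty"]
    rows = csv.DictReader(io.StringIO(metadata_csv))
    known = {row["metric_name"] for row in rows}
    if metric not in known:
        return [f"{check_name}: {metric} not in metadata.csv"]
    return []
```

Keeping the exclusions in one set rather than in chained `==` comparisons means that adding a check such as `agent_metrics` is a one-line change, which is the refactoring this commit makes.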