-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add container metrics fields from ECS #87
Conversation
8a10c6f
to
3dd54a3
Compare
995ab45
to
1076e36
Compare
Signed-off-by: ChrsMark <[email protected]>
1076e36
to
d781478
Compare
Signed-off-by: ChrsMark <[email protected]>
|
||
## HTTP Server | ||
|
||
### Metric: `metric.container.cpu.usage` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name should just be container.cpu.usage
The metric.
classifier is for our YAML database, not the metric name. (Sorry for confusion)
Same comment for all other metrics.
groups: | ||
- id: metric.container.cpu.usage | ||
type: metric | ||
metric_name: container.cpu.usage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this needed separate from system.cpu.*
?
(there was some related discussion in open-telemetry/opentelemetry-specification#2388)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank's @trask! That's interesting one and it's quite similar to open-telemetry/opentelemetry-specification#2388 (comment).
I will try to share my view on this :).
So, from infrastructure's observability point of view, container metrics would be collected from outside the containers themselves (as a best practice).
For example by using the Docker runtime's APIs or Kubelet's API's.
So the point here is that it does not make a lot of sense to treat a container as a host and collect its metrics by running a collector inside the container.
So for metrics collected through the runtime/orchastrator APIs I think we need to be specific and have a resource specific namespace like container.cpu
, pod.cpu
etc.
In many cases the CPU/Memory resources are limited as well:
- https://docs.docker.com/config/containers/resource_constraints/#cpu
- https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit
In very specific usecases, if someone wants to treat a container as a host (kind
uses containers' as hosts) then the collector should be running on the container/host directly and report metrics under the system.*
namespace. In that case the resource is a "host" not a container from the observation/collection point of view.
Also related to #226 (comment).
on all network interfaces | ||
by the container since | ||
the last metric collection. | ||
instrument: gauge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a counter instead (since values are additive)?
on all network interfaces | ||
by the container since | ||
the last metric collection. | ||
instrument: gauge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be a counter?
The total number of bytes written | ||
successfully (aggregated from all disks) | ||
since the last metric collection | ||
instrument: gauge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The total number of bytes read | ||
successfully (aggregated from all disks) | ||
since the last metric collection. | ||
instrument: gauge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here: it's a counter
Closing in honor of #282. After the project restructuring it was hard to rebase and recover this branch due to some weird permission errors. Starting fresh was faster. |
This PR adds container related metrics fields as part of #72.
cc: @AlexanderWert @kaiyan-sheng @mlunadia