Implement Prometheus instrumentation #99

Merged: 4 commits on Nov 6, 2019

Collaborator

@stefanprodan stefanprodan commented Nov 6, 2019

This PR adds instrumentation for the controller App Mesh API calls and exports Prometheus metrics for meshes, virtual nodes and virtual services.

Metrics:

  • appmesh_mesh_state{name} gauge
  • appmesh_virtual_node_state{name, mesh} gauge
  • appmesh_virtual_service_state{name, mesh} gauge
  • appmesh_api_request_duration_seconds{kind, object, operation} histogram

The controller instrumentation records mesh, virtual node and virtual service operations as gauges. For each object, the gauge value represents the current state: 1 means that the object is active, while 0 means that the object has been deleted.
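
As a rough illustration of the 1/0 state semantics described above, here is a minimal sketch that uses only the Go standard library. The `stateGauge` type and its methods are hypothetical stand-ins for a labelled Prometheus gauge, and the object names are made up; this is not the controller's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// stateGauge is a hypothetical in-memory stand-in for a gauge with
// {name, mesh} labels. It is an assumption for illustration only.
type stateGauge struct {
	mu     sync.Mutex
	values map[[2]string]float64 // key: {name, mesh}
}

func newStateGauge() *stateGauge {
	return &stateGauge{values: make(map[[2]string]float64)}
}

// SetActive records 1 for an object that currently exists.
func (g *stateGauge) SetActive(name, mesh string) { g.set(name, mesh, 1) }

// SetDeleted records 0 for an object that has been deleted.
func (g *stateGauge) SetDeleted(name, mesh string) { g.set(name, mesh, 0) }

func (g *stateGauge) set(name, mesh string, v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values[[2]string{name, mesh}] = v
}

// Value returns the recorded state and whether the series exists at all.
func (g *stateGauge) Value(name, mesh string) (float64, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	v, ok := g.values[[2]string{name, mesh}]
	return v, ok
}

func main() {
	nodes := newStateGauge()
	nodes.SetActive("checkout", "prod")  // virtual node created
	nodes.SetDeleted("checkout", "prod") // virtual node deleted later
	v, ok := nodes.Value("checkout", "prod")
	fmt.Println(v, ok)
}
```

Note that in this sketch a deleted object keeps its series with value 0 rather than disappearing, which is what lets a dashboard distinguish "deleted" from "never seen".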

The API client instrumentation records the duration of App Mesh API calls, labelled by object kind, object name (the `object` label) and operation type. The operation type can be get, create, update or delete. The object kind can be mesh, virtual node, virtual route, virtual router or virtual service.

The appmesh_api_request_duration_seconds histogram helps track issues like #96.

Fix: #98

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Collaborator

@kiranmeduri kiranmeduri left a comment

Awesome! Thanks for bringing this in and improving the operational readiness of the controller.

@@ -231,6 +246,11 @@ func (v *VirtualNode) BackendsSet() set.Set {

// GetVirtualNode calls describe virtual node.
func (c *Cloud) GetVirtualNode(ctx context.Context, name string, meshName string) (*VirtualNode, error) {
	begin := time.Now()
	defer func() {
		c.stats.SetRequestDuration("virtual_node", name, "get", time.Since(begin))
Collaborator

Would it be better to normalize the name as <meshName>.<virtualNodeName>?

Collaborator Author

@stefanprodan stefanprodan Nov 6, 2019

I would not use dots in Prometheus metrics; we could add the mesh as a separate label if you want.

Contributor

Agreed with @stefanprodan, let's add as a separate label.

Collaborator Author

I would like to do this and add more instrumentation in a follow-up PR. The metrics as they are now are sufficient for the Grafana dashboard I made in aws/eks-charts#30.
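
For readers following this thread, the separate-label idea could look roughly like the sketch below. It is not part of this PR. The `formatSeries` helper is hypothetical and only renders a label set in a simplified form of the Prometheus text notation (values are quoted with Go's `%q`, which is close to but not identical to the exposition-format escaping); the label values are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatSeries renders a metric name and its labels as
// name{label="value",...} with labels sorted by name.
// Hypothetical helper for illustration only.
func formatSeries(metric string, labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for n := range labels {
		names = append(names, n)
	}
	sort.Strings(names)

	pairs := make([]string, 0, len(names))
	for _, n := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%q", n, labels[n]))
	}
	return metric + "{" + strings.Join(pairs, ",") + "}"
}

func main() {
	// Mesh carried as its own label instead of a dotted object name.
	fmt.Println(formatSeries("appmesh_api_request_duration_seconds", map[string]string{
		"kind":      "virtual_node",
		"mesh":      "prod",
		"object":    "checkout",
		"operation": "get",
	}))
}
```

With the mesh in its own label, a query can select or aggregate by mesh directly instead of pattern-matching on a combined `<meshName>.<virtualNodeName>` string.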

@nckturner
Contributor

Can you rebase?

- appmesh_mesh_state gauge
- appmesh_virtual_node_state gauge
- appmesh_virtual_service_state gauge
- appmesh_api_request_duration_seconds histogram

Signed-off-by: stefanprodan <[email protected]>
Records the duration of App Mesh API calls based on object kind, name and operation type. The operation type can be get, create, update or delete. The object kind can be mesh, virtual node, virtual route, virtual router or virtual service.

Signed-off-by: stefanprodan <[email protected]>
Record mesh, virtual node and virtual service operations as gauges. For each object, the gauge value represents the current state: 1 means that the object is active, while 0 means that the object has been deleted.

Signed-off-by: stefanprodan <[email protected]>
@stefanprodan
Collaborator Author

@nckturner rebase done

@nckturner nckturner merged commit 533b0e6 into aws:master Nov 6, 2019
Development

Successfully merging this pull request may close these issues.

Prometheus instrumentation
3 participants