Adding Kubernetes API server requests metrics #637

Merged: 2 commits, Mar 8, 2019

Collaborator

@aLekSer aLekSer commented Mar 5, 2019

New dashboard with a dropdown selector for Agones CRDs.
It contains 4 graphs of apiserver_request_count by verb and error code, as proposed in the ticket, as well as graphs of 3 quantiles (0.5, 0.9 and 0.99) of apiserver_request_latencies_summary.

Closes #546.
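
For readers unfamiliar with Grafana template variables, the dropdown selector described above could be defined roughly as follows. This is a minimal sketch, not the PR's actual dashboard JSON: the resource list is an assumption (the PR does not enumerate it), and `as_regex` only approximates how Grafana joins a multi-value selection for a `=~` matcher, ignoring escaping.

```python
# Hypothetical list of Agones resource names; the PR does not enumerate them.
AGONES_RESOURCES = ["fleets", "fleetautoscalers", "gameservers", "gameserversets"]


def crd_variable(name, resources):
    """Build a Grafana 'custom' template variable with one option per resource."""
    return {
        "name": name,
        "type": "custom",
        "multi": True,
        "includeAll": True,
        "query": ",".join(resources),
    }


def as_regex(selected):
    """Approximate how a multi-value selection is rendered inside a regex matcher."""
    if len(selected) == 1:
        return selected[0]
    return "(" + "|".join(selected) + ")"


variable = crd_variable("CustomResourceDefinition", AGONES_RESOURCES)
print(variable["query"])
print(as_regex(["fleets", "gameservers"]))
```

A query can then reference the variable as `resource=~"[[CustomResourceDefinition]]"`, which is the form used later in this conversation.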

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 67fd0b78-ed97-4f12-adf8-0d1a50744eca

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/637/head:pr_637 && git checkout pr_637
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=0.9.0-159947e

@jkowalski
Contributor

I tried the change. It looks promising, but we should make it easier to read:

  • add legend "as table" and "to the right" and expose "current"
  • set label units to requests/sec for throughput and milliseconds or microseconds for latency
  • have separate graphs for 50th, 90th, 95th and 99th percentile latencies
  • perhaps exclude dimensions with zero value (just add " != 0" to metric)
  • perhaps use something like "{{ verb }} {{ resource }} {{ subresource }}"
  • perhaps have separate graphs for each subresource (main resource, scale, status, could be a dropdown)
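
Several of these suggestions map onto fields of a panel definition. The sketch below shows one way they might be expressed; it assumes the classic Grafana graph panel JSON layout (`legend.alignAsTable`, `legend.rightSide`, `legend.current`, `yaxes[].format`) and the `reqps` unit id, and the helper name `graph_panel` is hypothetical. It is not taken from the PR's dashboard file.

```python
def graph_panel(title, expr, legend_format, unit):
    """Return a minimal graph-panel dict applying the review suggestions."""
    return {
        "title": title,
        "type": "graph",
        # Legend as a table, to the right, exposing the current value.
        "legend": {
            "show": True,
            "alignAsTable": True,
            "rightSide": True,
            "values": True,
            "current": True,
        },
        # Unit set through the axis format rather than a free-text label.
        "yaxes": [{"format": unit, "show": True}, {"format": "short", "show": False}],
        # Appending "!= 0" filters out series whose value is zero.
        "targets": [{"expr": expr + " != 0", "legendFormat": legend_format}],
    }


panel = graph_panel(
    "Request Count Per Second",
    'sum(rate(apiserver_request_count{resource=~"[[CustomResourceDefinition]]"}[5m]))'
    " by (verb, resource, subresource)",
    "{{ verb }} {{ resource }} {{ subresource }}",
    "reqps",
)
print(panel["targets"][0]["expr"])
```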

@aLekSer
Collaborator Author

aLekSer commented Mar 5, 2019

Hello @jkowalski,
Thanks for the review, I will apply your comments.
All these graphs should be on one dashboard and fit on one screen, am I right?
For some reason the 0.9 quantile does not show a plot; I need to understand why.

@aLekSer aLekSer force-pushed the grafana-apiserver-metrics branch from 159947e to da76ff6 Compare March 6, 2019 15:36
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: aca59d45-a927-42b9-b7b9-4a28381a7e33

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/637/head:pr_637 && git checkout pr_637
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=0.9.0-da76ff6

@jkowalski
Contributor

Took a look again.

  • Can you swap "Main Resource Request Count Per Second" and "Subresource Request Count Per Second"?
  • The metric in Main Resource Request Count Per Second needs to be sum(rate(apiserver_request_count{resource=~"[[CustomResourceDefinition]]",verb!~"WATCH|LIST",subresource=""}[5m])) by (resource,verb) with legend format {{verb}} {{resource}}
  • Can you change the legend format for Request Error Rate to {{resource}} {{subresource}} {{ code }}?
  • TIP: When setting up axes units you can use Units dropdown instead of Label - this makes the axis able to use smarter labels
  • Can you change the title Request Latency - 0.5 quantile, milliseconds to simply Request Latency and drop the other latency graph? Alternatively we can have 4 hard-coded graphs for all quantiles and remove the quantile dropdown completely. It's actually quite useful to see all quantiles at once even if it requires scrolling (you can make them as wide as the page so they align nicely)
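
To make the difference between the two request-count panels concrete, here is a small sketch that assembles both queries. The main-resource form reproduces the expression quoted above; the subresource variant (`subresource!=""`, grouped additionally by `subresource`) is an assumption about how the second panel could be written, and `request_rate_query` is a hypothetical helper.

```python
def request_rate_query(variable, subresources=False):
    """Build the per-second request rate query for main resources or subresources."""
    # An empty subresource label selects requests against the main resource.
    matcher = 'subresource!=""' if subresources else 'subresource=""'
    group = "resource,subresource,verb" if subresources else "resource,verb"
    return (
        "sum(rate(apiserver_request_count{"
        'resource=~"[[%s]]",verb!~"WATCH|LIST",%s}[5m])) by (%s)'
        % (variable, matcher, group)
    )


main_query = request_rate_query("CustomResourceDefinition")
sub_query = request_rate_query("CustomResourceDefinition", subresources=True)
print(main_query)
print(sub_query)
```

Excluding WATCH and LIST keeps long-lived and bulk read requests from dominating the rate, which appears to be the intent of the `verb!~"WATCH|LIST"` matcher.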

@aLekSer aLekSer force-pushed the grafana-apiserver-metrics branch 2 times, most recently from daad257 to a5a5612 Compare March 6, 2019 20:56
@aLekSer
Collaborator Author

aLekSer commented Mar 6, 2019

@jkowalski I have applied all your comments. The only thing I cannot get to work is the 0.95 quantile: its graph shows no data. When I switch the value in the query to 0.9, the graph is drawn.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 3d778aa8-e7e1-4bb0-a58f-03ef4c12bd2d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@aLekSer
Collaborator Author

aLekSer commented Mar 6, 2019

Current look of the graphs (the default time period is 1 hour):
[Screenshot 2019-03-07 at 00:01:46]
[Screenshot 2019-03-07 at 00:02:00]

@aLekSer aLekSer force-pushed the grafana-apiserver-metrics branch from a5a5612 to 3494041 Compare March 6, 2019 21:07
@agones-bot
Collaborator

Build Failed 😱

Build Id: 860c647d-756e-409c-ac92-2383f3dfb3f9

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 4459c7ad-ccc9-413b-ba70-a684ffc404c8

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/637/head:pr_637 && git checkout pr_637
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=0.9.0-3494041

@aLekSer aLekSer force-pushed the grafana-apiserver-metrics branch from 3494041 to bbc8ddd Compare March 7, 2019 09:40
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: c2e40b8e-f420-4e27-804d-1eac19260683

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/637/head:pr_637 && git checkout pr_637
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=0.9.0-bbc8ddd

New dashboard with dropdown Agones CRD selector. Added docs.
Contains 4 graphs as proposed in the ticket.
@aLekSer aLekSer force-pushed the grafana-apiserver-metrics branch from bbc8ddd to 907d70e Compare March 7, 2019 10:03
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: d4ac739b-28a6-4117-b04d-63798a9f88e1

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/637/head:pr_637 && git checkout pr_637
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=0.9.0-907d70e

@aLekSer
Collaborator Author

aLekSer commented Mar 7, 2019

I checked that only 3 quantiles are available (0.5, 0.9 and 0.99), so I updated the resulting dashboard file. It now contains 7 plots.
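
A summary metric exposes only the quantiles it was configured with, so a selector for a `quantile` label value that is not exported (such as 0.95 here) matches no series. The sketch below generates one latency panel per available quantile and checks the panel count; the title format, the helper name, and the assumption that the four request-count graphs from the description are kept are all illustrative.

```python
# Quantiles reported as available in this conversation.
AVAILABLE_QUANTILES = ["0.5", "0.9", "0.99"]
# Number of non-latency graphs, taken from the PR description (assumed unchanged).
REQUEST_COUNT_PANELS = 4


def latency_panels(variable, quantiles):
    """Build one hard-coded latency panel per exported quantile."""
    panels = []
    for q in quantiles:
        expr = (
            'apiserver_request_latencies_summary{resource=~"[[%s]]",quantile="%s"}'
            % (variable, q)
        )
        panels.append({"title": "Request Latency - %s quantile" % q, "expr": expr})
    return panels


panels = latency_panels("CustomResourceDefinition", AVAILABLE_QUANTILES)
total_panels = REQUEST_COUNT_PANELS + len(panels)
print(total_panels)
```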

@jkowalski
Contributor

LGTM!

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: a9d07f91-075c-41bc-9bf5-40ba619c97b4

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/637/head:pr_637 && git checkout pr_637
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=0.9.0-7311e3e

@jkowalski jkowalski merged commit fc2ab01 into googleforgames:master Mar 8, 2019
@markmandel markmandel added this to the 0.9.0 milestone Mar 14, 2019
@markmandel markmandel added the area/operations Installation, updating, metrics etc label Mar 26, 2019