Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[telemetry] Fix telemetry causing high DB CPU load #8638

Closed
geropl opened this issue Mar 7, 2022 · 2 comments · Fixed by #8858
Closed

[telemetry] Fix telemetry causing high DB CPU load #8638

geropl opened this issue Mar 7, 2022 · 2 comments · Fixed by #8858
Assignees
Labels
team: delivery Issue belongs to the self-hosted team type: bug Something isn't working

Comments

@geropl
Copy link
Member

geropl commented Mar 7, 2022

Currently we're executing an expensive query more often than we expect. This caused 3 incidents over the course of the last 2,5 weeks. We still are not sure why it's triggered this often (there are no traces/logs), but still we can improve the situation by:

  1. DB:
  2. improve the query itself: our hypothesis is that the ORM generates a query like this SELECT COUNT(1) FROM (SELECT ... JOIN...) where the subquery is the reason for the slowness (MySQL tries to materialize the table). Try writing direct SQL ala `SELECT COUNT(1) FROM d_b_workspace_instance AS wsi JOIN d_b_workspace AS ws ON ws.id = wsi.workspaceId WHERE ws.type = 'regular'. Test this against a failover prod DB.
  3. double-check we have an index on workspace.type in both prod DBs
  4. API: use different API/HTTP calls/requests for "config" and "telemetry data": e.g., don't execute the queries in case we are not sending the result anyway
  5. better observability: add tracing to the HTTP endpoint

/cc @corneliusludmann

@geropl geropl added type: bug Something isn't working team: delivery Issue belongs to the self-hosted team labels Mar 7, 2022
@corneliusludmann
Copy link
Contributor

@geropl There is no index for workspace.type at all. A simple SELECT count(*) FROM d_b_workspace WHERE type = 'regular'; takes ages. Would it be okay to add such an index? What is the proper way?

@geropl
Copy link
Member Author

geropl commented Mar 7, 2022

@corneliusludmann Here's an example of a migration that adds an secondary index: https://github.com/gitpod-io/gitpod/blob/4d9d49d81c7cecb29b906563c8e90094e2ec19b8/components/gitpod-db/src/typeorm/migration/1643589063084-IndexProjectCloneUrl.ts.

Note. please add the typeorm-specification (incl. full name) as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team: delivery Issue belongs to the self-hosted team type: bug Something isn't working
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants