Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CF] Convert lf_rollover_health query to clickhouse #5819

Merged
merged 3 commits into from
Oct 25, 2024

Conversation

ZainRizvi
Copy link
Contributor

@ZainRizvi ZainRizvi commented Oct 25, 2024

Converts the lf_rollover_health query to use clickhouse.
It also:

  • Corrects a bug in the query about how it calculated the average duration by fixing the denominator.
  • Removes the temporary amz2023 label prefix handling, since those workflows are obsolete now anyways

Validation:

  • Ensured that the ClickHouse query showed the same results as the Rockset query (except for the duration comparison chart, which also shows matching results if I reintroduce the average duration bug)

Copy link

vercel bot commented Oct 25, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
torchci ✅ Ready (Inspect) Visit Preview 💬 Add feedback Oct 25, 2024 6:35pm

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 25, 2024
@ZainRizvi ZainRizvi requested review from clee2000 and a team October 25, 2024 03:23
WHERE 1=1
AND j.labels is not NULL
AND j._event_time > CURRENT_DATETIME() - DAYS(:days_ago)
workflow_job as j
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on offline conversation, if you are willing to eat the cost of not using final (maybe some incorrect rows, dup rows, esp for recent data), then you don't need final. A comment about not needing it would be nice tho

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, we'll leave this query without FINAL. Added comments to the query explaining the logic

torchci/clickhouse_queries/lf_rollover_health/query.sql Outdated Show resolved Hide resolved
granularity={"day"}
timeFieldName={"bucket"}
yAxisLabel="increase ratio"
yAxisFieldName={"success_duration_increase_ratio"}
yAxisRenderer={(value) => value}
groupByFieldName={"job_name"}
useClickHouse={useClickHouse}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized these 3 panels use the same values, is it worth it to query once initially and reuse the values? Or maybe the caching is good enough to handle it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some smarts that are already there under the hood, perhaps automated by React. If you look at the network requests only one request is actually made to the query, and it populates all three charts simultaneously

@ZainRizvi ZainRizvi merged commit 5b770fb into main Oct 25, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants