flake-stats: per day values not correct #2980

Open
dhiller opened this issue Sep 7, 2023 · 3 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@dhiller
Contributor

dhiller commented Sep 7, 2023

Problem

I stumbled over the flake-stats report from Sep 7th, 2023, which showed an increase in flakes to 70+, while the previous day showed fewer than 40.

The topmost flake says that it occurred twice on Wed Sep 6th, 2023.

However, neither testgrid for pull-kubevirt-e2e-k8s-1.28-sig-storage

[screenshot: testgrid for pull-kubevirt-1.28]

nor testgrid for pull-kubevirt-e2e-k8s-1.26-sig-storage

[screenshot: testgrid for pull-kubevirt-e2e-k8s-1.26-sig-storage]

shows any failures.

Looking at the flakefinder report for Sep 6th, I saw failures on the lanes in row 4, both from this PR. Those failures were actually quite old, from Aug 30th, 2023.

Reason

This led me to look at the per-day aggregation in flake-stats. There, instead of taking the date of the build in which a failure occurred, we just take the report date. That is effectively the day the PR got merged, but failures on that PR's tests can occur much earlier.
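The misattribution can be illustrated with a small sketch (simplified, hypothetical types, not the actual project-infra code): bucketing failures by report date piles them onto the merge day, while bucketing by build date puts them on the day each failing build actually ran.

```go
package main

import "fmt"

// failure is a hypothetical, simplified record of one failed test run.
type failure struct {
	testName   string
	buildDate  string // day the failing build actually ran
	reportDate string // day the flakefinder report was generated (merge day)
}

// countPerDay buckets failures by whatever day the key function extracts.
func countPerDay(failures []failure, key func(failure) string) map[string]int {
	counts := map[string]int{}
	for _, f := range failures {
		counts[key(f)]++
	}
	return counts
}

func main() {
	failures := []failure{
		// Two old failures from Aug 30th that only show up in the
		// Sep 6th report, when the PR merged.
		{"storage-test", "2023-08-30", "2023-09-06"},
		{"storage-test", "2023-08-30", "2023-09-06"},
	}

	byReport := countPerDay(failures, func(f failure) string { return f.reportDate })
	byBuild := countPerDay(failures, func(f failure) string { return f.buildDate })

	fmt.Println(byReport["2023-09-06"]) // prints 2 – wrongly shown as Sep 6th flakes
	fmt.Println(byBuild["2023-08-30"])  // prints 2 – the day the failures really happened
}
```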

Fix

To fix this, we need to extend the flakefinder output data and consume the right field in flake-stats.

flakefinder

We need to add the job finish date from here to the data we pass here into the Job data.

This needs to be transferred through several levels, since I believe the output JSON data from flakefinder that we are aggregating is indirectly generated from the JobResult data. However, what we need is the buildDate inside the Job data.

flake-stats

We need to use that date instead of the report date that is currently used.

@dhiller
Contributor Author

dhiller commented Sep 7, 2023

@brianmcarey FYI

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 6, 2023
@dhiller
Contributor Author

dhiller commented Dec 6, 2023

/remove-lifecycle stale
/lifecycle frozen

Need to keep this around so the documentation eventually happens.

@kubevirt-bot kubevirt-bot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 6, 2023