
Wrong Scorecards result in CodeReview check due to Dependabot/RenovateBot PR reviews #2450

Closed · inferno-chromium opened this issue Nov 10, 2022 · 7 comments · Fixed by #2542
Labels: kind/bug (Something isn't working) · priority/must-do (Upcoming release)

Comments

@inferno-chromium
Contributor

Describe the bug
A single-maintainer project returns an incorrect result on the CodeReview check if that single maintainer is reviewing Dependabot/RenovateBot PRs. This was reported by Sonatype folks who ran Scorecards on a large number of Maven projects.

Reproduction steps
Run Scorecards on a single-maintainer repo that uses Dependabot/RenovateBot.

Expected behavior
Ignore Dependabot and RenovateBot PRs completely when processing the CodeReview check, and don't count them as code reviews. CodeReview is usually intended to check for human reviews on human-created PRs.
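
A minimal sketch of what that filtering could look like, assuming a hypothetical changeset type and bot list (none of these identifiers are Scorecard's real API):

```go
package main

import (
	"fmt"
	"strings"
)

// knownBots lists author logins treated as automated dependency updaters.
// Illustrative only; the real check would need a maintained list.
var knownBots = []string{"dependabot[bot]", "renovate[bot]", "renovate-bot"}

type changeset struct {
	AuthorLogin string
	Reviewed    bool
}

func isBot(login string) bool {
	login = strings.ToLower(login)
	for _, b := range knownBots {
		if login == b {
			return true
		}
	}
	return false
}

// filterHumanChangesets drops bot-authored changesets so they count
// neither for nor against the Code-Review score.
func filterHumanChangesets(all []changeset) []changeset {
	var human []changeset
	for _, c := range all {
		if !isBot(c.AuthorLogin) {
			human = append(human, c)
		}
	}
	return human
}

func main() {
	commits := []changeset{
		{"dependabot[bot]", true},
		{"alice", true},
		{"renovate[bot]", false},
	}
	fmt.Printf("%d human changesets out of %d\n",
		len(filterHumanChangesets(commits)), len(commits))
}
```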


@inferno-chromium added the kind/bug (Something isn't working) and priority/must-do (Upcoming release) labels on Nov 10, 2022
@spencerschrock
Member

I can take a look later today. Any specific repositories would be helpful, but I can look for some that match the criteria too.

@spencerschrock spencerschrock removed their assignment Nov 10, 2022
@spencerschrock
Member

@raghavkaul will pick up #2311, which should address this.

@raghavkaul
Contributor

In the Scorecard biweekly meeting, the following proposal was discussed:

The Code Review check should move toward leveled, rather than proportional, scoring (see the sketch after this list):

  • Projects that review all code in the default branch should receive a 10/10.
  • For projects where any human commits are unreviewed, a flat 7 points are deducted from the score. For projects where any bot commits are unreviewed, a flat 3 points are deducted from the score. Previously, the Code Review score (out of 10) was the percentage of reviewed changesets out of the last 30 commits.
    • Bot commits should still be penalized because, as @david-a-wheeler points out, bots can propose incorrect changes or behave unpredictably.
  • If Scorecard sees that the project only has commits made by a single maintainer (with or without a bot), the Code Review category cannot be scored, and Scorecard should say there's Insufficient Data to proceed.
  • If Scorecard only sees automated/bot commits in the last 30 commits, the Code Review category cannot be scored, and Scorecard should say there's Insufficient Data to proceed.
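
A rough sketch of that leveled logic, under the assumptions above (illustrative names only, not Scorecard's actual implementation):

```go
package main

import "fmt"

// Illustrative type only; not Scorecard's real API.
type changeset struct {
	ByBot    bool
	Reviewed bool
}

// leveledScore returns a 0-10 score, or ok=false when the proposal says
// there is insufficient data (single maintainer, or bot-only history).
func leveledScore(recent []changeset, singleMaintainer bool) (score int, ok bool) {
	if singleMaintainer {
		return 0, false // only one human contributor: no one else to review
	}
	humanSeen := false
	unreviewedHuman, unreviewedBot := false, false
	for _, c := range recent {
		if !c.ByBot {
			humanSeen = true
		}
		if !c.Reviewed {
			if c.ByBot {
				unreviewedBot = true
			} else {
				unreviewedHuman = true
			}
		}
	}
	if !humanSeen {
		return 0, false // only automated commits in the window
	}
	score = 10
	if unreviewedHuman {
		score -= 7 // flat penalty: any unreviewed human commit
	}
	if unreviewedBot {
		score -= 3 // flat penalty: any unreviewed bot commit
	}
	return score, true
}

func main() {
	recent := []changeset{
		{ByBot: false, Reviewed: true},
		{ByBot: true, Reviewed: false},
	}
	if s, ok := leveledScore(recent, false); ok {
		fmt.Println("score:", s) // 7: one unreviewed bot commit costs 3 points
	}
}
```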

Assuming this is the path forward, the assignee of this bug should follow a process similar to #2462, where the Code-Review check is run against a set of the top 300-400 repos to compare the Code Review scores before and after the changes. This analysis should be part of the pull request implementing the check.

@david-a-wheeler
Contributor

That sounds like what we talked about. The idea is that we want to encourage review, where it's possible.

I don't recall the rule "If scorecard sees that the project only has commits made by a single maintainer (with or without a bot), the Code Review category cannot be scored, and scorecard should say there's Insufficient Data to proceed". We probably did discuss it; I just don't remember. Clearly it's better if there are multiple people, so that a separate review is possible, but on the other hand, when there's only one maintainer, that's typically impractical. I don't have an obviously better solution.

@azeemshaikh38
Contributor

azeemshaikh38 commented Nov 29, 2022

My 2 cents: I believe penalizing unreviewed bot commits is the right behavior and should not be changed. There have been reported incidents in the past where bots have been tricked into getting malicious PRs automatically committed (e.g. https://brew.sh/2021/04/21/security-incident-disclosure/). I don't see a strong reason why engineering effort needs to be spent here to change this behavior.

@david-a-wheeler
Contributor

The 0..10 score is intended to give an estimate of the "level of risk". Clearly a bot commit can be malicious, I completely agree, but a malicious commit from a bot already approved by the project seems less likely than one from a random human. If we agree that the risks are different, then the scores should be different. If the argument is that the risks are approximately equal, then they should be scored the same. Are you arguing that bot commits are just as likely to be malicious, even if the bot was approved by the project (which implies some acceptance of the bot)?

@azeemshaikh38
Contributor

Are you arguing that bot commits are just as likely to be malicious, even if the bot was approved by the project (which implies some acceptance of the bot)?

Yes. I agree that the probability of a malicious bot commit is lower than that of a malicious human commit. But the risk from merging an unreviewed commit should be treated the same.

Also, I don't see what use case the new proposal solves. The way Code-Review is scored today is: if Y out of the past X commits are reviewed, the score is Y/X. So if 1 of the last 10 commits is unreviewed bot code, that leads to a score of 9/10 under the current scoring scheme, but 7/10 under the new scheme.
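
A tiny worked comparison of the two schemes for that example (illustrative only):

```go
package main

import "fmt"

func main() {
	reviewed, total := 9, 10 // 1 unreviewed bot commit out of the last 10

	// Current proportional scheme: score is the reviewed fraction, out of 10.
	proportional := 10 * reviewed / total
	fmt.Println("proportional:", proportional) // 9

	// Proposed leveled scheme: any unreviewed bot commit costs a flat 3 points.
	leveled := 10 - 3
	fmt.Println("leveled:", leveled) // 7
}
```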
