Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Implement local repo client for local folders #1146

Merged
merged 33 commits into from
Oct 28, 2021

Conversation

laurentsimon
Copy link
Contributor

  • Please check if the PR fulfills these requirements
  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    feature

  • What is the current behavior? (You can also link to an open issue here)
    no support for local folders

  • What is the new behavior (if this is a feature change)?
    add support for local client repo from a folder

  • Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
    yes for the internal API

  • Other information:

@laurentsimon laurentsimon marked this pull request as draft October 20, 2021 00:57
@laurentsimon laurentsimon changed the title ✨ DRAFT: Implement local repo client for local folders ✨ DRAFT(do not review): Implement local repo client for local folders Oct 20, 2021
@laurentsimon laurentsimon force-pushed the feat/localint2 branch 9 times, most recently from 259d759 to f24bfc5 Compare October 21, 2021 22:08
@laurentsimon laurentsimon changed the title ✨ DRAFT(do not review): Implement local repo client for local folders ✨ Implement local repo client for local folders Oct 21, 2021
@laurentsimon laurentsimon marked this pull request as ready for review October 21, 2021 22:22
@laurentsimon
Copy link
Contributor Author

laurentsimon commented Oct 26, 2021

friendly ping

clients/repo.go Outdated Show resolved Hide resolved
Copy link
Contributor

@azeemshaikh38 azeemshaikh38 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Left comments to be fixed.

checks/security_policy.go Outdated Show resolved Hide resolved
Comment on lines 47 to 52
// Local clients do not have a repo's full name for the OSS-Fuzz lookup.
if c.RepoClient.IsLocal() {
e := sce.WithMessage(sce.ErrScorecardInternal, "not supported for local repos")
return checker.CreateRuntimeErrorResult(CheckFuzzing, e)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't check what the RepoClient implementation type is here. Its ok if it doesn't find anything here and fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll be providing an erroneous result if we fail. I think we need to tell users this check is not supported, like we do for other checks the local repo client does not support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, I don't think its erroneous or misleading. We are basically checking whether the repo does Fuzzing using OSS-Fuzz or has a CII badge and we report a failure since they don't. In future, if we expand our support for Fuzzing (e.g using ClusterFuzzLite or looking for local files for evidence of Fuzzing), even local repos will be able to have a non-fail score for Fuzzing.

Copy link
Contributor Author

@laurentsimon laurentsimon Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point regarding fuzzing: you're right there may be local files indicating fuzzing configuration in the repo itself. I don't think we'll ever be able to check for CII badge using local repos, though. @david-a-wheeler to comment on this.

@naveensrinivasan @oliverchang any opinion on this thread?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar point holds true for CII too - if we extend the local file implementation in the future to return a valid HTTP URL associated with the local source code, local repos can have a valid CII badge value.

In general, my contention is changing the checks behavior based on the implementation detail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @azeemshaikh38. It would be better to not encode this implementation detail/type check as part of the RepoClient interface as it's breaking the interface somewhat.

If we do need to make such a distinction for displaying better error messages, maybe could we do this at registerCheck time? I.e. add a supportsLocal bool in registerCheck and enable/disable it at runtime based on the type of repo we're checking.

We could also in theory support this Fuzzing check today, if we add support to the local repo client to take the git remote result and use that as the repo URL to search for.

Copy link
Contributor Author

@laurentsimon laurentsimon Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be part of registerCheck or the checks.yml?
Any thought on checks that use both source code and GH APIs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checks.yml also works and would be better for documentation!

Maybe for local only mode, we just don't support any any GH APIs calls? How many checks would that disable? I'm not sure it's worth the complexity to support.

Then we don't need to make much of a distinction for a hybrid local source code + GH API option. If the user provides a github token as well, we can just have a hybrid RepoClient which is treated in the same way as a regular remote RepoClient, assuming we can extract the remote repo URL.

Copy link
Contributor Author

@laurentsimon laurentsimon Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checks.yml also works and would be better for documentation!

yes, I'm on the same page.

Maybe for local only mode, we just don't support any any GH APIs calls? How many checks would that disable? I'm not sure it's worth the complexity to support.

yes, I like removing complexity, and disabling them would be nice. I initially use IsLocal(), but documentation checks.yaml is better.

I'll check how that would work and see how many would be disabled. @azeemsgoogle please chime in for additional comments.

Then we don't need to make much of a distinction for a hybrid local source code + GH API option. If the user provides a github token as well, we can just have a hybrid RepoClient which is treated in the same way as a regular remote RepoClient, assuming we can extract the remote repo URL.

doable, thanks for the suggestion, I'll keep this in mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

checks/cii_best_practices.go Outdated Show resolved Hide resolved
@david-a-wheeler
Copy link
Contributor

I don't fully see how CII would work on local files, but maybe you're right. It would need a signed blob from CII validation backend for us to verify.

The CII badge requirements mostly focus on the project. You could use git to find the relevant remote URL, then query the CII Best Practices badge for the information about that URL. You might check "origin" and then "upstream" remotes, that should be enough.

@laurentsimon
Copy link
Contributor Author

I don't fully see how CII would work on local files, but maybe you're right. It would need a signed blob from CII validation backend for us to verify.

The CII badge requirements mostly focus on the project. You could use git to find the relevant remote URL, then query the CII Best Practices badge for the information about that URL. You might check "origin" and then "upstream" remotes, that should be enough.

thanks for chiming in. We envision 3 different repo clients: online (GitHub, GitLab, etc), local folder (plain source code with no info), "hybrid" (source code with info, e.g. by git). it would work well for the 3rd option.

@laurentsimon
Copy link
Contributor Author

laurentsimon commented Oct 26, 2021

another thing I wanted to mention: requiring a GitHub token to run scorecard on a local folder feels weird.
We currently create GitHub repo interface to query OSV database (for example) in the Fuzzing check, and users will be prompted with an error if they don't provide a GH token. It's not a problem in this PR because I specifically check whether the repo interface is local IsLocal(), but it may become useful when/if we support "local git client", etc.
Should we change the code to use our own HTTP client for the request, or tweak the API CreateGithubRepoClient(.., noToken=yes), or something else?

@naveensrinivasan @oliverchang @azeemshaikh38

@david-a-wheeler
Copy link
Contributor

thanks for chiming in. We envision 3 different repo clients: online (GitHub, GitLab, etc), local folder (plain source code with no info), "hybrid" (source code with info, e.g. by git). it would work well for the 3rd option.

I agree, that'd work for third option. If you only have a local folder with source code and NO other info, you could establish some convention for recording its home repo URL, and then use that.

@laurentsimon
Copy link
Contributor Author

thanks for chiming in. We envision 3 different repo clients: online (GitHub, GitLab, etc), local folder (plain source code with no info), "hybrid" (source code with info, e.g. by git). it would work well for the 3rd option.

I agree, that'd work for third option. If you only have a local folder with source code and NO other info, you could establish some convention for recording its home repo URL, and then use that.

Do you mean using the path as name, or taking it from an existing file?

@laurentsimon
Copy link
Contributor Author

FYI @naveensrinivasan pre-submit failing with bin/bash: ko: command not found, see https://github.com/ossf/scorecard/runs/4016410307?check_suite_focus=true

@oliverchang
Copy link
Contributor

another thing I wanted to mention: requiring a GitHub token to run scorecard on a local folder feels weird.

+1! We should make it painlessly easy for someone to run this on their own local checkout.

thanks for chiming in. We envision 3 different repo clients: online (GitHub, GitLab, etc), local folder (plain source code with no info), "hybrid" (source code with info, e.g. by git). it would work well for the 3rd option.

I agree, that'd work for third option. If you only have a local folder with source code and NO other info, you could establish some convention for recording its home repo URL, and then use that.

Do you mean using the path as name, or taking it from an existing file?

How do we plan to support local repos using the CLI? If it's a separate option, maybe we can do

scorecard --local=/path/ --repo=repo-repo-url

But it's not clear we need this when we can just take the upstream by doing a git remote.

@laurentsimon
Copy link
Contributor Author

@azeemshaikh38 @oliverchang I've removed the IsLocal() and added TODO in the checks that need to be disabled by checks.yaml. I'll do that in a follow-up PR. The code is gated with and env variable, so nobody should be affected. Let me know if I you have further comments for this PR. Thanks again for the discussion and suggestions!

Copy link
Contributor

@azeemshaikh38 azeemshaikh38 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thnaks @laurentsimon !

@@ -111,7 +110,8 @@ func SecurityPolicy(c *checker.CheckRequest) checker.CheckResult {
if r {
return checker.CreateMaxScoreResult(CheckSecurityPolicy, "security policy file detected")
}
case !errors.Is(err, sce.ErrRepoUnreachable):
// err != nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment

Comment on lines +119 to +124
files := []string{}
var err error

once.Do(func() {
files, err = listFiles(client.path, predicate)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is working right.

  1. once should be defined and initiated inside localDirClient struct. Else, it'll be shared by all instances of localDirClient.
  2. The files and err variables should also be inside localDirClient. Else, they get re-initiated and will return empty values when ListFiles is called more than once.

}

// GetFileContent implements RepoClient.GetFileContent.
func (client *localDirClient) GetFileContent(filename string) ([]byte, error) {
// TODO
return nil, nil
return getFileContent(client.path, filename)
}

// ListMergedPRs implements RepoClient.ListMergedPRs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, Search API can also be implemented for local client. Maybe the next PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point! I think checks that are supported don't use it, so not needed in the short term. Thanks for flagging!

clients/localdir/client_test.go Show resolved Hide resolved
@laurentsimon laurentsimon enabled auto-merge (squash) October 28, 2021 17:54
@laurentsimon laurentsimon merged commit 4cca9b4 into ossf:main Oct 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants