detection of files without curation #9487

Draft
wants to merge 1 commit into
base: main
from

Conversation

kikofernandez

Based on #9435, it was stated that we could simply apply curations to files not listed in the license findings. The current PR is one such design but breaks backwards compatibility. I would like some feedback on how it would be preferred to deal with this.

Below I describe some of the approaches I have thought of. Any feedback is welcome.

PR Approach

  • Detects files without a license; these are not added to the output of the scanner (as requested in "Include unlicensed files in scanner results", #9435). As a result, the evaluator reports errors for files without a license. (breaks backwards compatibility)
  • If we add curations, they are applied to the unlicensed files (this simply follows from the curation mechanism).
  • unlicensed files use the default value NONE when created as part of a LicenseFinding.
  • Not complete, missing tests, looking for feedback

Design Option 1

Approach
I have looked for ways to change FindingCurationMatcher.kt (recommended by @sschuberth). The file has a number of methods, but none of them has access to the file list from the ScanResults. I do not mind "patching" all call sites that pass findings:

  • FindingCurationMatcher().matches(finding, curation)
  • FindingCurationMatcher().apply(finding, curation)
  • FindingCurationMatcher().applyAll(findings, curations)

and adding all files without licenses to the findings as LicenseFinding(license="NONE", location=...). This way, a curation with a glob pattern on a folder and detected_license: NONE can be applied.
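A minimal sketch of this idea, under stated assumptions: `Finding`, `Curation`, and all three functions below are simplified, hypothetical stand-ins and not ORT's actual classes or signatures, and the `java.nio` glob matcher only approximates whatever matching ORT really uses.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.Paths

// Hypothetical, simplified stand-ins; these are NOT ORT's actual model classes.
data class Finding(val license: String, val path: String)
data class Curation(val pathGlob: String, val detectedLicense: String, val concludedLicense: String)

// Add a synthetic NONE finding for every scanned file that has no finding at all.
fun withUnlicensedFiles(findings: List<Finding>, scannedFiles: List<String>): List<Finding> {
    val licensedPaths = findings.map { it.path }.toSet()
    return findings + scannedFiles.filter { it !in licensedPaths }.map { Finding("NONE", it) }
}

// A curation matches when the detected license is equal and the glob matches the path.
fun matches(finding: Finding, curation: Curation): Boolean {
    val matcher = FileSystems.getDefault().getPathMatcher("glob:${curation.pathGlob}")
    return finding.license == curation.detectedLicense && matcher.matches(Paths.get(finding.path))
}

// Apply the first matching curation, or keep the finding unchanged.
fun applyAll(findings: List<Finding>, curations: List<Curation>): List<Finding> =
    findings.map { f ->
        curations.firstOrNull { matches(f, it) }?.let { f.copy(license = it.concludedLicense) } ?: f
    }
```

Because the synthetic findings are built before matching, the matcher itself stays unchanged; only the call sites need access to the list of scanned files.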

Questions/Comments
We apply curations to files not listed in the scanner results.
To me, this may seem counter-intuitive, since I do not think ORT has ever dealt with files not listed in the scanner. The idea is to apply curations to unlicensed files, but I think it is more uniform if unlicensed files are listed in the scanner. Thoughts?

Design Option 2

Add an option to the .ort.yml (or to the CLI) to state that unlicensed files are part of the scanner results and should be shown there.

Use case

  • Opt in to curating unlicensed files. AFAIK, unlicensed files are not considered in the evaluator.
    From my point of view, we need to analyse Erlang/OTP and would like warnings/errors for any files without a license. Design option 2 makes this choice explicit, and the scanner shows the true result for the scanned files.
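As a rough illustration of the opt-in behaviour, assuming a made-up option name: `ScanOptions.includeUnlicensedFiles`, `FileFinding`, `Issue`, and both functions are hypothetical and only show how a flag could keep the default output backwards compatible.

```kotlin
// Hypothetical option and types, invented for this sketch only.
data class ScanOptions(val includeUnlicensedFiles: Boolean = false)
data class FileFinding(val license: String, val path: String)
data class Issue(val severity: String, val message: String)

// With the option off, the result is unchanged (backwards compatible);
// with it on, files without any finding are listed with license NONE.
fun scannerOutput(
    findings: List<FileFinding>,
    scannedFiles: List<String>,
    options: ScanOptions
): List<FileFinding> {
    if (!options.includeUnlicensedFiles) return findings
    val covered = findings.map { it.path }.toSet()
    return findings + scannedFiles.filter { it !in covered }.map { FileFinding("NONE", it) }
}

// Stand-in for an evaluator rule that reports every NONE finding as an error.
fun unlicensedFileIssues(findings: List<FileFinding>): List<Issue> =
    findings.filter { it.license == "NONE" }
        .map { Issue("ERROR", "No license found in ${it.path}") }
```

Projects that do not set the flag would see no change, while a project such as Erlang/OTP could enable it and get one issue per unlicensed file.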

detects files without curations without adding them to the scanner.
files without curation use the default value `NONE` when created as part
of a `LicenseFinding`.

Signed-off-by: Kiko Fernandez-Reyes <[email protected]>
@fviernau
Member

I believe we should first create a proposal for how this could be done and decide on a specific approach before working on an actual implementation.


codecov bot commented Nov 22, 2024

Codecov Report

Attention: Patch coverage is 44.44444% with 5 lines in your changes missing coverage. Please review.

Project coverage is 67.90%. Comparing base (eeba28e) to head (b27d33d).

Files with missing lines Patch % Lines
...main/kotlin/licenses/DefaultLicenseInfoProvider.kt 44.44% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #9487      +/-   ##
============================================
- Coverage     67.93%   67.90%   -0.03%     
- Complexity     1289     1290       +1     
============================================
  Files           249      249              
  Lines          8794     8802       +8     
  Branches        913      913              
============================================
+ Hits           5974     5977       +3     
- Misses         2434     2438       +4     
- Partials        386      387       +1     
Flag Coverage Δ
funTest-docker 64.82% <ø> (ø)
test 35.76% <44.44%> (+<0.01%) ⬆️

@kikofernandez
Author

Question1: Should I create a proposal as a comment inside of #9435?
Question2: how does one usually create a proposal? (Apart from this small PR with suggestions and a request for feedback on a new approach.)

In any case, the proposals that I have thought of are written in this PR:

  • Current PR proposal (breaks backwards compatibility)
  • Option 1
  • Option 2
  • Option 3? Different approach or more guidance?

Question3: Should I describe the proposals written here in more detail? (I am new to the project and do not yet understand all possible outcomes/impact of the approaches outlined in this PR.)

@sschuberth
Member

The idea is to apply curations to unlicensed files, but I think it is more uniform if unlicensed files are listed in the scanner. Thoughts?

I'd still have a preference for not including unlicensed files / files without findings in the scan results, to keep them smaller. Instead, as outlined elsewhere, I'd look into changing the license finding curation matcher logic to accept a detectedLicense of NONE if the path matches a file from the file list.

In a way, what you want is the opposite of "deleting" a finding by setting the concludedLicense to NONE: you want to "invent" findings.

Somehow these invented findings then need to make it into the license info resolver. I have not thought about how to do that yet.
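One way to picture "inventing" findings from the curation side is sketched below. It rests on several assumptions: `PathFinding`, `PathCuration`, and `inventFindings` are hypothetical and not part of ORT, the file list is passed in as plain paths, and the `java.nio` glob matcher stands in for ORT's own path matching.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.Paths

// Hypothetical, simplified types; not ORT's real LicenseFinding / LicenseFindingCuration.
data class PathFinding(val license: String, val path: String)
data class PathCuration(val pathGlob: String, val detectedLicense: String, val concludedLicense: String)

// For curations whose detected license is NONE, "invent" a finding for every file
// from the file list that matches the glob and has no real finding.
fun inventFindings(
    findings: List<PathFinding>,
    fileList: List<String>,
    curations: List<PathCuration>
): List<PathFinding> {
    val pathsWithFindings = findings.map { it.path }.toSet()
    val candidates = fileList.filter { it !in pathsWithFindings }

    return curations.filter { it.detectedLicense == "NONE" }.flatMap { curation ->
        val matcher = FileSystems.getDefault().getPathMatcher("glob:${curation.pathGlob}")
        candidates.filter { matcher.matches(Paths.get(it)) }
            .map { PathFinding(curation.concludedLicense, it) }
    }
}
```

Only files that are actually covered by a NONE curation produce a finding, so the scan results themselves stay small. Overlapping curations would yield duplicates in this sketch and would need a tie-breaking rule, and the open question of how the invented findings reach the license info resolver is not addressed here.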

@sschuberth
Member

Question2: how does one usually create a proposal?

A good way would be to join our weekly community meeting to discuss such ideas.

@kikofernandez
Author

Question2: how does one usually create a proposal?

A good way would be to join our weekly community meeting to discuss such ideas.

Oh, that would be great! I would like to see what the community is doing, the roadmap, and learn. I will drop by for sure next Thursday!


3 participants