Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Limit Binary Artifact file reads to first 1024 bytes #3923

Merged
merged 3 commits into from
Mar 6, 2024

Conversation

spencerschrock
Copy link
Member

What kind of change does this PR introduce?

bug fix

What is the current behavior?

The whole file is read, even though we only use 1024 bytes when determining if a file is text or binary

What is the new behavior (if this is a feature change)?**

Only up to the first 1024 bytes are read.

  • Tests for the changes have been added (for bug fixes/features)

Which issue(s) this PR fixes

Part 1 of 2 to address #3831

Special notes for your reviewer

Does this PR introduce a user-facing change?

For user-facing changes, please add a concise, human-readable release note to
the release-note

(In particular, describe what changes users might need to make in their
application as a result of this pull request.)

NONE

@spencerschrock spencerschrock requested a review from a team as a code owner March 5, 2024 21:30
@spencerschrock spencerschrock requested review from naveensrinivasan and laurentsimon and removed request for a team March 5, 2024 21:30
Copy link

codecov bot commented Mar 5, 2024

Codecov Report

Merging #3923 (5c9220f) into main (0a6b06a) will decrease coverage by 6.53%.
The diff coverage is 64.51%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3923      +/-   ##
==========================================
- Coverage   75.11%   68.58%   -6.53%     
==========================================
  Files         234      234              
  Lines       15859    15877      +18     
==========================================
- Hits        11912    10890    -1022     
- Misses       3187     4294    +1107     
+ Partials      760      693      -67     

@spencerschrock
Copy link
Member Author

/scdiff generate Binary-Artifact

Copy link

github-actions bot commented Mar 5, 2024

@pnacht
Copy link
Contributor

pnacht commented Mar 5, 2024

This is already a big improvement, but why 1024 bytes? From what I understood (described in #3831), h2non/filetype (the package Scorecard uses to determine the filetype) only needs the first 262 bytes.

@spencerschrock
Copy link
Member Author

This is already a big improvement, but why 1024 bytes

Getting the file type is just one part, we also check some of the file for false positives. You can read a bit more here: #2412.

@spencerschrock spencerschrock enabled auto-merge (squash) March 6, 2024 21:31
@spencerschrock spencerschrock merged commit c779392 into ossf:main Mar 6, 2024
37 of 38 checks passed
@spencerschrock spencerschrock deleted the binary-artifact-reader branch March 6, 2024 21:39
fhoeborn pushed a commit to fhoeborn/scorecard that referenced this pull request Apr 1, 2024
* add OnMatchingFileReaderDo

Signed-off-by: Spencer Schrock <[email protected]>

* switch binary artifact to using reader

Signed-off-by: Spencer Schrock <[email protected]>

---------

Signed-off-by: Spencer Schrock <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants