Add go binary h1 digest to SPDX #1265

Merged: 7 commits into anchore:main from feat/h1-digest-spdx, Oct 19, 2022

Contributor

@kzantow kzantow commented Oct 14, 2022

This PR adds support for outputting Go binary h1 digests in the SPDX checksum field.

Fixes #1261

@kzantow kzantow marked this pull request as draft October 14, 2022 20:18
Signed-off-by: Keith Zantow <[email protected]>
@kzantow kzantow marked this pull request as ready for review October 19, 2022 17:48
@kzantow kzantow requested a review from a team October 19, 2022 17:48
Contributor

@spiffcs spiffcs left a comment

Just a small comment about FilesAnalyzed

syft/formats/spdx22json/to_format_model.go (outdated, resolved)
@kzantow kzantow merged commit 78a0af2 into anchore:main Oct 19, 2022
@kzantow kzantow deleted the feat/h1-digest-spdx branch October 19, 2022 20:33
aiwantaozi pushed a commit to aiwantaozi/syft that referenced this pull request Oct 20, 2022
Contributor

deitch commented Oct 20, 2022

Thanks @spiffcs .

How do I find that output? I cloned the repo, ran make bootstrap and then ran the binary against a go project directory, using both spdx-json and spdx output. Where should I find the hashes?

Contributor Author

kzantow commented Oct 20, 2022

@deitch -- you can find the hashes in the checksums field, e.g.:

      "checksums": [
        {
          "algorithm": "SHA256",
          "checksumValue": "33d5c1b2232c970da56f66738250b44ce9013cae4d422d3f9e852b767a05c148"
        }
      ]

These have been converted from the h1:<base64> versions.

Contributor

deitch commented Oct 20, 2022

Huh, you are right, it is there. I guess I didn't find it on the larger one I ran. I had better figure out why. Likely my error, but if not, I will open an issue.

Contributor

deitch commented Oct 20, 2022

It looks like it is finding some but not all. Is that possible?

And I cloned the latest (checked that it has this commit) and ran go run cmd/syft/main.go -o spdx-json test > /tmp/test.json, but the output only has checksums for 2 of the packages. Everything is in this gist.

Am I doing something wrong?

Contributor Author

kzantow commented Oct 20, 2022

Ah -- this was implemented only for go binaries at the moment! Are you running against the source directory with go.mod/go.sum in it? We should probably add support for that too.

Contributor

deitch commented Oct 20, 2022

It didn't even occur to me that this was for compiled binaries. How funny!

So I should run it against an actual binary?

Also, curiously, what happens if I run it against an OCI image? Does it find the go binaries in there and scan them too? Or do I need to be explicit?

Contributor Author

kzantow commented Oct 20, 2022

@deitch Syft does find Go binaries during container scans, so you shouldn't need to do anything special to enable this 👍. Note, though, that only binaries built with Go modules will have this information.

If you go build that project, and scan the resulting test executable, you'll see checksum entries for the 2 dependencies.

Contributor

deitch commented Oct 20, 2022

> If you go build that project, and scan the resulting test executable, you'll see checksum entries for the 2 dependencies

Actually, I see 2 checksums when scanning both source and binary, so that is equivalent already.

What I don't see is all of the dependencies. Take a look at the gist.

  • go.mod: 3 dependencies:
    • github.com/spf13/cobra (direct)
    • github.com/inconshreveable/mousetrap (indirect)
    • github.com/spf13/pflag (indirect)
  • go.sum: 7 dependencies
  • spdx.json: checksums for 2 of the (3 or 7) dependencies: cobra and pflag

Contributor Author

kzantow commented Oct 20, 2022

@deitch this is a little nuanced -- the information included in the go binary is for everything included in the binary. What that means is: depending on the actual code paths the binary includes, it may or may not include all the dependencies in the go.mod/sum. Given how simple the test example is, I assume it actually isn't including a code path that causes mousetrap or the other dependencies to get included, and thus only pflag and cobra are included in the binary metadata. This is good -- it is a better indicator of what the binary actually contains!

As for scanning the source, this is a different question! I don't see any digests included in the source scan (did you scan the directory with the binary still present?). This is something we should probably add, but it explains the discrepancy between seeing only 2 dependencies with the binary and 3 in the go.mod. It looks like go.sum is currently not used in the cataloging process for Go source (but it could be, at least to get the h1 digests).

Contributor

deitch commented Oct 20, 2022

> did you scan the directory with the binary still present?

Huh, maybe I did. I just reran it and am getting no checksums. Go figure. PEBKAC.

> but it explains the discrepancy between only seeing 2 dependencies with the binary and 3 in the go.mod.

meaning that the actual binary only uses 2 even if go.mod has 3 and go.sum has 7?

> I assume it actually isn't including a code path that causes mousetrap or the other dependencies to get included, and thus only pflag and cobra are included in the binary metadata. This is good -- it is a better indicator of what the binary actually contains!

That is somewhat surprising. Wouldn't you expect the go mod tools to be more efficient about it? Or are you saying they are moderately efficient, while the final go build is more efficient?

Contributor

deitch commented Oct 20, 2022

btw, I just ran it on a complex binary, the results showed ✔ Cataloged packages [106 packages], and the output does indeed show 106 pkg:golang, but not a single one has checksums.

Contributor Author

kzantow commented Oct 20, 2022

> meaning that the actual binary only uses 2 even if go.mod has 3 and go.sum has 7?

Exactly.

Go modules use a minimal version selection algorithm to determine which dependencies should be included. go.sum contains additional information that feeds this algorithm, but a package being listed in go.sum doesn't mean it is actually used in the end. What you see in go.mod, including the // indirect entries, is supposed to be a reasonable representation of the dependencies used. I asked the team about this, and go.sum is a much worse representation of dependencies than go.mod, which is why we are using go.mod.

> I just ran it on a complex binary, the results showed ✔ Cataloged packages [106 packages], and the output does indeed show 106 pkg:golang, but not a single one has checksums.

This still depends on various things, such as the version of Go used to compile and the build-time flags. Basically, if the h1 digests are included in the Go binary's build info section, Syft should be picking those up now.

You can see if that information is included by running: go version -m <binary>. When running this on the test binary, it shows:

test: go1.18
	path	test
	mod	test	(devel)	
	dep	github.com/spf13/cobra	v1.6.0	h1:42a0n6jwCot1pUmomAp4T7DeMD+20LFv4Q54pxLf2LI=
	dep	github.com/spf13/pflag	v1.0.5	h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
	build	-compiler=gc
	...

Contributor

deitch commented Oct 20, 2022

How interesting. I ran it for my other binary. It shows go 1.18.6 and all of the dep entries, each with the semver or other module version from go.mod, but not a single one with the h1 hash. 🤷‍♂️

GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
Successfully merging this pull request may close these issues.

Include go binary h1 digests in SPDX
3 participants