fix: only generate PURL on empty string #1312

Merged
merged 2 commits into from
Nov 3, 2022

@spiffcs spiffcs commented Nov 3, 2022

Summary

PURL generation changed between v0.59.x and v0.60.x.

The following values for an Alpine package show the difference:

Before v0.60.x
pkg:alpine/[email protected]?arch=x86_64&upstream=busybox&distro=alpine-3.12.5

Included in v0.60.x
pkg:alpine/[email protected]
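To make the regression easy to spot, here is a minimal sketch that lists which qualifiers one PURL carries and another lacks. It uses only the standard library and treats the qualifier section as a plain query string; the helper names (missingQualifiers, qualifiers) and the "demo" package name are hypothetical and are not part of syft.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// qualifiers extracts the qualifier section of a PURL (the text between '?'
// and an optional '#subpath') and parses it as a query string.
// Hypothetical helper for illustration only.
func qualifiers(purl string) (url.Values, error) {
	_, query, _ := strings.Cut(purl, "?")
	query, _, _ = strings.Cut(query, "#")
	return url.ParseQuery(query)
}

// missingQualifiers returns, in sorted order, the qualifier keys present in
// oldPURL but absent from newPURL.
func missingQualifiers(oldPURL, newPURL string) ([]string, error) {
	oldQ, err := qualifiers(oldPURL)
	if err != nil {
		return nil, err
	}
	newQ, err := qualifiers(newPURL)
	if err != nil {
		return nil, err
	}
	var missing []string
	for key := range oldQ {
		if _, ok := newQ[key]; !ok {
			missing = append(missing, key)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// "demo" and "1.0.0" are placeholder values, not taken from the report above.
	before := "pkg:alpine/demo@1.0.0?arch=x86_64&upstream=demo&distro=alpine-3.12.5"
	after := "pkg:alpine/demo@1.0.0"
	missing, err := missingQualifiers(before, after)
	if err != nil {
		panic(err)
	}
	fmt.Println(missing) // prints "[arch distro upstream]"
}
```

A check like this could sit in a regression test so that a dropped qualifier is caught before release.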

Two places within syft can generate a PURL for a given package type, and they can produce different values.

Take the Alpine case above. Here is the first place where a PURL can be set:

// packageURL returns the PURL for the specific Alpine package (see https://github.com/package-url/purl-spec)
func packageURL(m pkg.ApkMetadata, distro *linux.Release) string {
	if distro == nil || distro.ID != "alpine" {
		// note: there is no namespace variation (like with debian ID_LIKE for ubuntu ID, for example)
		return ""
	}
	qualifiers := map[string]string{
		pkg.PURLQualifierArch: m.Architecture,
	}
	if m.OriginPackage != "" {
		qualifiers[pkg.PURLQualifierUpstream] = m.OriginPackage
	}
	return packageurl.NewPackageURL(
		// note: this is currently a candidate and not technically within spec
		// see https://github.com/package-url/purl-spec#other-candidate-types-to-define
		"alpine",
		"",
		m.Package,
		m.Version,
		pkg.PURLQualifiers(
			qualifiers,
			distro,
		),
		"",
	).ToString()
}

This runs when the "apkdb-cataloger" parses the apk DB.

Syft then makes a second pass at PURL generation in catalog.go, for all packages:

for _, p := range packages {
	// generate CPEs (note: this is excluded from package ID, so is safe to mutate)
	// we might have binary classified CPE already with the package so we want to append here
	p.CPEs = append(p.CPEs, cpe.Generate(p)...)
	// generate PURL (note: this is excluded from package ID, so is safe to mutate)
	p.PURL = pkg.URL(p, release)
	// if we were not able to identify the language we have an opportunity
	// to try and get this value from the PURL. Worst case we assert that
	// we could not identify the language at either stage and set UnknownLanguage
	if p.Language == "" {
		p.Language = pkg.LanguageFromPURL(p.PURL)
	}
	// create file-to-package relationships for files owned by the package
	owningRelationships, err := packageFileOwnershipRelationships(p, resolver)
	if err != nil {
		log.Warnf("unable to create any package-file relationships for package name=%q: %w", p.Name, err)
	} else {
		allRelationships = append(allRelationships, owningRelationships...)
	}
	// add to catalog
	catalog.Add(p)
}

After the catalogers have run, every package passes through func URL in package pkg:

syft/syft/pkg/url.go

Lines 28 to 60 in e0acfa9

func URL(p Package, release *linux.Release) string {
	if p.Metadata != nil {
		if i, ok := p.Metadata.(urlIdentifier); ok {
			return i.PackageURL(release)
		}
	}
	// the remaining cases are primarily reserved for packages without metadata struct instances
	var purlType = p.Type.PackageURLType()
	var name = p.Name
	var namespace = ""
	switch {
	case purlType == "":
		purlType = packageurl.TypeGeneric
	case p.Type == NpmPkg:
		fields := strings.SplitN(p.Name, "/", 2)
		if len(fields) > 1 {
			namespace = fields[0]
			name = fields[1]
		}
	}
	// generate a purl from the package data
	return packageurl.NewPackageURL(
		purlType,
		namespace,
		name,
		p.Version,
		nil,
		"",
	).ToString()
}

Previously, alpine packages were gated behind the urlIdentifier interface.
I believe moving to the generic cataloger removed the PackageURL method from certain package metadata types, so this check no longer short-circuits PURL generation:

	if p.Metadata != nil {
		if i, ok := p.Metadata.(urlIdentifier); ok {
			return i.PackageURL(release)
		}
	} 
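A minimal, self-contained sketch of that mechanism follows. The interface name mirrors the one quoted above, but its signature is simplified (a plain string instead of *linux.Release), and the metadata types, the purlFor function, and the "demo" values are hypothetical stand-ins rather than syft code.

```go
package main

import "fmt"

// urlIdentifier borrows the name of the interface quoted above; the method
// signature is simplified for this sketch.
type urlIdentifier interface {
	PackageURL(distroID string) string
}

// withMethod is a hypothetical metadata type that still implements PackageURL.
type withMethod struct{ name, version string }

func (m withMethod) PackageURL(distroID string) string {
	return fmt.Sprintf("pkg:alpine/%s@%s?distro=%s", m.name, m.version, distroID)
}

// withoutMethod is a hypothetical metadata type whose PackageURL method was removed.
type withoutMethod struct{ name, version string }

// purlFor imitates the shape of pkg.URL: the type assertion short-circuits
// only when the metadata satisfies urlIdentifier; otherwise the generic
// fallback builds a PURL without qualifiers.
func purlFor(metadata interface{}, name, version, distroID string) string {
	if i, ok := metadata.(urlIdentifier); ok {
		return i.PackageURL(distroID)
	}
	return fmt.Sprintf("pkg:alpine/%s@%s", name, version)
}

func main() {
	fmt.Println(purlFor(withMethod{"demo", "1.0.0"}, "demo", "1.0.0", "alpine-3.12.5"))
	// prints "pkg:alpine/demo@1.0.0?distro=alpine-3.12.5"
	fmt.Println(purlFor(withoutMethod{"demo", "1.0.0"}, "demo", "1.0.0", "alpine-3.12.5"))
	// prints "pkg:alpine/demo@1.0.0"
}
```

Removing the method compiles cleanly and fails silently: the assertion simply reports ok == false and the fallback runs, so nothing flags the lost qualifiers at build time.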

There is a fairly simple hack to fix this: only generate a PURL in catalog.go if one has not already been set (if p.PURL == "" { try again }). I wanted to run it by everyone first, though, since the earlier check on the interface was stricter than just covering for a blank string.
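As a rough sketch of the guard, assuming a trimmed-down Package struct and a stand-in for the generic fallback (both hypothetical, not the actual syft types or the exact diff in this PR):

```go
package main

import "fmt"

// Package is a hypothetical, trimmed-down stand-in for syft's pkg.Package.
type Package struct {
	Name, Version, PURL string
}

// genericPURL is a hypothetical stand-in for the fallback branch of pkg.URL.
func genericPURL(p Package) string {
	return fmt.Sprintf("pkg:generic/%s@%s", p.Name, p.Version)
}

// finalize imitates the second pass in catalog.go: a PURL already set by a
// cataloger is preserved, and one is generated only for the empty string.
func finalize(packages []Package) []Package {
	out := make([]Package, 0, len(packages))
	for _, p := range packages {
		if p.PURL == "" {
			p.PURL = genericPURL(p)
		}
		out = append(out, p)
	}
	return out
}

func main() {
	pkgs := finalize([]Package{
		{Name: "a", Version: "1", PURL: "pkg:alpine/a@1?arch=x86_64"}, // set by a cataloger
		{Name: "b", Version: "2"},                                     // nothing set yet
	})
	for _, p := range pkgs {
		fmt.Println(p.PURL)
	}
}
```

The trade-off is the one raised above: the guard trusts any non-empty string a cataloger sets, whereas the interface check tied PURL generation to the metadata type.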

Signed-off-by: Christopher Phillips [email protected]

Signed-off-by: Christopher Phillips <[email protected]>

github-actions bot commented Nov 3, 2022

Benchmark Test Results

Benchmark results from the latest changes vs base branch
name                                                       old time/op    new time/op    delta
ImagePackageCatalogers/alpmdb-cataloger-2                    15.8ms ±28%    14.5ms ± 2%    ~     (p=0.841 n=5+5)
ImagePackageCatalogers/ruby-gemspec-cataloger-2              1.63ms ± 2%    1.71ms ± 1%  +4.84%  (p=0.016 n=5+4)
ImagePackageCatalogers/python-package-cataloger-2            4.19ms ± 3%    4.23ms ± 3%    ~     (p=0.310 n=5+5)
ImagePackageCatalogers/php-composer-installed-cataloger-2    1.34ms ± 4%    1.36ms ± 2%    ~     (p=0.151 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2         965µs ± 3%     970µs ± 1%    ~     (p=1.000 n=5+5)
ImagePackageCatalogers/node-binary-cataloger-2               8.15µs ± 1%    8.03µs ± 2%    ~     (p=0.151 n=5+5)
ImagePackageCatalogers/dpkgdb-cataloger-2                    1.09ms ± 4%    1.15ms ± 1%  +5.54%  (p=0.008 n=5+5)
ImagePackageCatalogers/rpm-db-cataloger-2                    1.58ms ± 3%    1.63ms ± 2%  +3.27%  (p=0.032 n=5+5)
ImagePackageCatalogers/java-cataloger-2                      18.0ms ± 4%    17.9ms ± 1%    ~     (p=0.841 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                     1.58ms ± 2%    1.63ms ± 0%  +3.63%  (p=0.008 n=5+5)
ImagePackageCatalogers/go-module-binary-cataloger-2          8.17µs ± 1%    8.00µs ± 2%  -2.11%  (p=0.032 n=5+5)
ImagePackageCatalogers/dotnet-deps-cataloger-2               1.67ms ± 1%    1.77ms ± 2%  +5.95%  (p=0.008 n=5+5)
ImagePackageCatalogers/portage-cataloger-2                    894µs ± 5%     909µs ± 1%    ~     (p=0.310 n=5+5)

name                                                       old alloc/op   new alloc/op   delta
ImagePackageCatalogers/alpmdb-cataloger-2                    5.27MB ± 0%    5.27MB ± 0%    ~     (p=0.841 n=5+5)
ImagePackageCatalogers/ruby-gemspec-cataloger-2               204kB ± 0%     204kB ± 0%    ~     (p=0.310 n=5+5)
ImagePackageCatalogers/python-package-cataloger-2             960kB ± 0%     959kB ± 0%  -0.09%  (p=0.008 n=5+5)
ImagePackageCatalogers/php-composer-installed-cataloger-2     216kB ± 0%     216kB ± 0%  -0.11%  (p=0.016 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2         159kB ± 0%     159kB ± 0%  -0.21%  (p=0.008 n=5+5)
ImagePackageCatalogers/node-binary-cataloger-2               1.12kB ± 0%    1.12kB ± 0%    ~     (all equal)
ImagePackageCatalogers/dpkgdb-cataloger-2                     200kB ± 0%     200kB ± 0%  -0.12%  (p=0.008 n=5+5)
ImagePackageCatalogers/rpm-db-cataloger-2                     302kB ± 0%     301kB ± 0%  -0.21%  (p=0.008 n=5+5)
ImagePackageCatalogers/java-cataloger-2                      3.45MB ± 0%    3.46MB ± 0%  +0.09%  (p=0.032 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                     1.25MB ± 0%    1.25MB ± 0%    ~     (p=0.310 n=5+5)
ImagePackageCatalogers/go-module-binary-cataloger-2          1.12kB ± 0%    1.12kB ± 0%    ~     (all equal)
ImagePackageCatalogers/dotnet-deps-cataloger-2                379kB ± 0%     376kB ± 0%  -0.74%  (p=0.008 n=5+5)
ImagePackageCatalogers/portage-cataloger-2                    138kB ± 0%     137kB ± 0%  -0.13%  (p=0.008 n=5+5)

name                                                       old allocs/op  new allocs/op  delta
ImagePackageCatalogers/alpmdb-cataloger-2                     85.7k ± 0%     85.7k ± 0%    ~     (p=0.238 n=4+5)
ImagePackageCatalogers/ruby-gemspec-cataloger-2               4.24k ± 0%     4.24k ± 0%    ~     (p=0.444 n=5+5)
ImagePackageCatalogers/python-package-cataloger-2             16.5k ± 0%     16.5k ± 0%    ~     (p=0.627 n=5+5)
ImagePackageCatalogers/php-composer-installed-cataloger-2     5.51k ± 0%     5.51k ± 0%    ~     (p=0.484 n=5+5)
ImagePackageCatalogers/javascript-package-cataloger-2         3.34k ± 0%     3.33k ± 0%  -0.21%  (p=0.000 n=5+4)
ImagePackageCatalogers/node-binary-cataloger-2                 38.0 ± 0%      38.0 ± 0%    ~     (all equal)
ImagePackageCatalogers/dpkgdb-cataloger-2                     4.51k ± 0%     4.51k ± 0%    ~     (all equal)
ImagePackageCatalogers/rpm-db-cataloger-2                     8.11k ± 0%     8.11k ± 0%    ~     (all equal)
ImagePackageCatalogers/java-cataloger-2                       57.5k ± 0%     57.5k ± 0%    ~     (p=0.286 n=5+5)
ImagePackageCatalogers/apkdb-cataloger-2                      5.39k ± 0%     5.39k ± 0%    ~     (p=1.000 n=5+5)
ImagePackageCatalogers/go-module-binary-cataloger-2            38.0 ± 0%      38.0 ± 0%    ~     (all equal)
ImagePackageCatalogers/dotnet-deps-cataloger-2                7.33k ± 0%     7.24k ± 0%    ~     (p=0.079 n=4+5)
ImagePackageCatalogers/portage-cataloger-2                    3.58k ± 0%     3.58k ± 0%    ~     (all equal)

Signed-off-by: Christopher Phillips <[email protected]>
@spiffcs spiffcs marked this pull request as ready for review November 3, 2022 14:00
@spiffcs spiffcs merged commit 1046464 into main Nov 3, 2022
@spiffcs spiffcs deleted the PURL-selection branch November 3, 2022 14:00
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024