3122 valid license url characters #3449

Merged: 4 commits, Nov 19, 2024


@spiffcs spiffcs commented Nov 17, 2024

Description

This PR updates the license constructors to strip unwanted characters from URLs in license metadata and to ensure that all URLs conform to the RFC 3987 IRI-reference syntax.

Fix Validation

  • Download .jar file here
  • Run the following command using this branch:
    go run cmd/syft/main.go --output cyclonedx-json=file.json --verbose

The license URL listed for UserAgentUtils should no longer contain the special characters (tab, newline) reported in the issue.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

syft/pkg/license.go (outdated)
cleanedURL = strings.TrimSpace(cleanedURL)

// Step 3: Validate the cleaned URL
_, err := url.ParseRequestURI(cleanedURL)
Contributor


Should we leave these in the core model and filter them only when encoding SBOM formats that have specific requirements? That is, should we move this to the CycloneDX encoder? We could check whether SPDX has a similar requirement and do the same there.

Contributor Author


I think it's best to clean it as early as possible in the process and bring our own model's URLs in line with the RFC 3987 IRI-reference.

Signed-off-by: Christopher Phillips <[email protected]>
@spiffcs spiffcs enabled auto-merge (squash) November 19, 2024 15:26
@spiffcs spiffcs merged commit f4cad63 into main Nov 19, 2024
12 checks passed
@spiffcs spiffcs deleted the 3122-valid-license-url-characters branch November 19, 2024 15:35
Successfully merging this pull request may close these issues.

Special characters (tab, newline) in license URL
2 participants