Rust/Go packages license issues #1052

Open
isuruf opened this issue Apr 30, 2020 · 49 comments

Comments

@isuruf
Member

isuruf commented Apr 30, 2020

A typical Rust package uses dozens of packages that have different licenses and requirements. A Rust package and its dependencies are usually compiled into one library or executable.
For example, conda-forge/staged-recipes#11315 has a Rust package with 91 dependencies under various MIT/BSD-3-Clause/Apache-2.0 licenses, and maybe others.

This implies that the licenses and copyrights of the dependencies need to be distributed with the package. There are some tools to help do this like https://github.com/maghoff/cargo-license-hound, https://github.com/onur/cargo-license.

I'm opening this issue so that @conda-forge/staged-recipes and @conda-forge/core know about this when reviewing Rust recipes.

cc @andfoy, @mingwandroid

@isuruf changed the title from "Rust package license issues" to "Rust packages license issues" on Apr 30, 2020
@andfoy
Contributor

andfoy commented Apr 30, 2020

What I'm doing in particular is taking the JSON output produced by cargo-license, grabbing the repository URLs across GitHub, BitBucket and GitLab, and calling their respective APIs to locate and download all the licenses. However, some libraries still need a manual license download.
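
For readers who want to try the approach described above, here is a minimal sketch of the first half only: reading the JSON and deciding which crates can be looked up automatically. It is not @andfoy's script. The field names `name`, `version` and `repository` are assumptions about the cargo-license JSON output, the function names are hypothetical, and no API calls are made.

```python
import json
from urllib.parse import urlparse

# Hosts named in the comment above; anything else is left for manual lookup.
KNOWN_HOSTS = {"github.com", "gitlab.com", "bitbucket.org"}


def split_repository(url):
    """Return (host, owner, repo) for a repository URL, or None if unrecognised."""
    if not url:
        return None
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]
    if host not in KNOWN_HOSTS or len(parts) < 2:
        return None
    repo = parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return host, parts[0], repo


def plan_downloads(cargo_license_json):
    """Split crates into (automatic, manual) lists.

    Assumes the input is a JSON array of objects with "name", "version"
    and "repository" keys (an assumption about the tool's output).
    """
    automatic, manual = [], []
    for crate in json.loads(cargo_license_json):
        label = f"{crate['name']}@{crate['version']}"
        location = split_repository(crate.get("repository"))
        if location is None:
            manual.append(label)
        else:
            automatic.append((label, *location))
    return automatic, manual
```

The `manual` list corresponds to the libraries that still need a license downloaded by hand; the `automatic` list is what a per-host API lookup would iterate over.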

@nehaljwani
Member

Doesn't the same concern apply to go packages?

@dbast
Member

dbast commented May 9, 2020

To avoid reinventing the wheel here, how are other packaging ecosystems solving this, e.g. Linux distributions like Debian, or Homebrew?

@isuruf changed the title from "Rust packages license issues" to "Rust/Go packages license issues" on May 15, 2020
@isuruf
Member Author

isuruf commented May 15, 2020

Yes, the same concern applies to Go packages. See also https://github.com/google/go-licenses

I've no idea how others fix this.

@hadim
Member

hadim commented Jun 5, 2020

I am not sure how you want to address that, but it does not seem straightforward. We could use a script that goes over all the dependencies, parses out the licenses, and lists the licenses per dependency in the conda package?

@hadim
Member

hadim commented Jun 5, 2020

Also, at what level should this script be run? conda or conda-forge?

@isuruf
Member Author

isuruf commented Jun 5, 2020

@hadim, what @andfoy did for Rust was to use a script to download licenses and put them in the recipe (and manually add licenses for the packages on which the script failed). He also added a check in build.sh to verify that each dependency had a license file in the recipe. The same can be done for Go.
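
To illustrate the kind of check described above, here is a minimal sketch of a helper that a build script could call. It is not the check from the actual recipe: the directory layout (one sub-folder per dependency inside the recipe's license folder) and the function name are assumptions made for the example.

```python
from pathlib import Path


def missing_licenses(dependencies, license_dir):
    """Return the dependencies that have no license file under license_dir.

    Assumes a hypothetical layout of <license_dir>/<dependency>/<files>.
    """
    root = Path(license_dir)
    missing = []
    for dep in dependencies:
        dep_dir = root / dep
        # A dependency counts as covered only if its folder holds at least one file.
        has_file = dep_dir.is_dir() and any(p.is_file() for p in dep_dir.iterdir())
        if not has_file:
            missing.append(dep)
    return sorted(missing)
```

A build script would then fail the build (for instance by exiting with a non-zero status) whenever the returned list is not empty, so a newly added dependency cannot slip in without its license.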

@hadim
Member

hadim commented Jun 5, 2020

It makes sense.

That being said I probably don't have the bandwidth at the moment to do that for conda-forge/staged-recipes#11799

@isuruf
Member Author

isuruf commented Jun 5, 2020

@SylvainCorlay
Member

Quick thought, this also applies to C++ packages when you link statically with your dependencies.

@nehaljwani
Member

Should this be extended to header only dependencies as well? For example, if you use pybind11, boost, etc, do you need to package the license file used by them as well? Because that's as good as statically linking parts of them.

@chrisburr
Member

Perhaps there needs to be a licence_exports field in the conda build metadata.

@isuruf
Member Author

isuruf commented Jun 6, 2020

Should this be extended to header only dependencies as well?

Depends on the license.

For example, if you use pybind11, boost, etc, do you need to package the license file used by them as well?

pybind11: yes. boost: no.

@bollwyvl
Contributor

Thanks for the guidance here on this topic: texlab-feedstock is now using the same approach as pysyntect-feedstock, and "only" required manually hunting down 20 licenses (of 200+). Perhaps we should package cargo-license... seems to cost a couple minutes per build.

@bollwyvl
Contributor

bollwyvl commented Mar 1, 2021

As this has come up again for @conda-forge/cryptography:

I wonder if we should start curating a community package, e.g. conda-forge-rust-licenses and conda-forge-go-licenses (or just lump them together under conda-forge-license-library), which has some automation to at least allow centralizing the list of known/used <thing>/<version>/(UN)LICEN(S|CE(-.*)(.(txt|md))? (oh and don't forget COPYRIGHT.*). Then packages can demand said package during builds, copying the assets from a well-known location to wherever their license_file points... now that we can use folders, that's much easier. If a new crate/mod shows up, the build would fail, but might suggest...

Some wild crates and mods approach!

- <crate>@<version> <url>
- <mod>@<version> <url>

From inspection, I've found the below licenses. Please visit the upstream repos and verify, then 
make a pull request to https://github.com/conda-forge/conda-forge-license-library adding the lines:

### recipe/licenses/cargo.txt

<repo>@<tag>/LICENSE-MIT
<repo>@<tag>/LICENSE-APACHE

### recipe/licenses/go-mod.txt

<repo>@<tag>/LICENSE-ZLIB-WITH-FREAKY-SPEC

This would in turn update the recipe (once), so we actually have the licenses' sha256sums.
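
As a rough illustration of the file-name pattern mentioned in this comment, here is one possible reading of it as a Python regular expression. The informal pattern above is ambiguous as written, so this is an interpretation rather than a specification: it accepts UNLICENSE, LICENSE or LICENCE with an optional `-suffix` and an optional `.txt`/`.md` extension, plus anything starting with COPYRIGHT. The names `LICENSE_NAME` and `looks_like_license` are hypothetical, and matching is deliberately case-sensitive.

```python
import re

# One interpretation of the informal pattern from the comment above.
LICENSE_NAME = re.compile(
    r"^(?:(?:UN)?LICEN[SC]E(?:-.*)?(?:\.(?:txt|md))?|COPYRIGHT.*)$"
)


def looks_like_license(filename):
    """Return True if a bare file name matches the license naming pattern."""
    return LICENSE_NAME.match(filename) is not None


def license_candidates(filenames):
    """Keep only the file names that look like license or copyright files."""
    return [name for name in filenames if looks_like_license(name)]
```

Anything the pattern misses (lowercase names, COPYING, NOTICE and so on) would still need the manual review step the comment proposes.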

@bollwyvl
Contributor

bollwyvl commented Mar 2, 2021

So would a conda-incubator/* be the right path? I'm imagining a small (potentially single file) python package with a simple in-build CLI like cargo-licenses | dmv -o $SRC_DIR/third-party-licenses. The JSON/CSV file with, at the very least, the couple hundred licenses URLs/SHAs, would then live in the feedstock... but could contain the actual licenses texts themselves.

@sstadick

Hello! I've been working on a tool to hopefully mitigate this issue / make it less painful to publish rust tools on conda-forge. It can be found here.

In short, it crawls the package dependencies and searches out the license files that correspond to what is in the Cargo.toml. If a license isn't found or looks suspicious, it will write a warning message. It also provides a "check" flag that takes a previous version of a THIRDPARTY file and compares that against the new one, failing if they are different.

The idea is that the workflow would go as follows:

  1. Run cargo bundle licenses once, address all warnings by manually finding licenses where needed and copy-pasting them into the generated file. Check that file into version control and include it in your manifest.
  2. Include cargo bundle licenses --output CI-THIRDPARTY --previous THIRDPARTY --check-previous in your CI. This will carry forward any manually changed entries for you, then do a whole file check for sameness, so if a version changed it would fail and force you back to step 1.

Currently this tool supports three formats: yaml, json, and toml. See the above repo for an example yaml THIRDPARTY file.

In the view of conda-forge maintainers, would this satisfy the requirement that the licenses and copyrights of the dependencies be distributed with the package?
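
The carry-forward-then-compare idea in step 2 can be sketched as follows. This is not the tool's implementation and does not reflect its actual file format: the entry shape (dicts with `name`, `version` and `text`) and both function names are assumptions chosen to keep the example small.

```python
def _key(entry):
    """Identify an entry by package name and version."""
    return (entry["name"], entry["version"])


def carry_forward(previous, current):
    """Fill empty license texts in `current` from matching entries in `previous`.

    Only entries with the same name AND version are reused, so a version bump
    leaves the text empty and forces a fresh manual check.
    """
    known = {_key(e): e["text"] for e in previous}
    merged = []
    for entry in current:
        text = entry["text"] or known.get(_key(entry), "")
        merged.append({**entry, "text": text})
    return merged


def check_previous(previous, current):
    """Return True when the regenerated list matches the checked-in one."""
    merged = carry_forward(previous, current)
    return sorted(merged, key=_key) == sorted(previous, key=_key)
```

In CI, a `False` result would fail the job and send the maintainer back to step 1.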

@bollwyvl
Contributor

Looks good! Really anything that moves things forward sounds great to me... I'm wagering if:

  • the proposed tool (and/or cargo-licenses, if not superseded) is packaged (dogfooding itself) through staged-recipes
    • so that we can just add it to requirements/build
    • and/or test/requires, and call it, simply, in build.sh|bld.bat
  • its use is demonstrated on a PR to a "tent pole" package like ripgrep
    • so that we have something to point to on other staged-recipe PRs/a knowledge base text chunk

... I don't see what complaints there would/could be.

From a KISS perspective, and as I don't really want to hand edit this file, I'd see JSON being the preferable serialization format... to that end, now that SPDX 2.2.1 is ISO/IEC 5962:2021, I'd really hope we start seeing it adopted more broadly (and provided by upstream packagers) and can stop needing to re-implement clever stopgaps.

@BastianZim
Member

BastianZim commented Mar 21, 2022

This came up again recently for Go, and I was wondering if we shouldn't recommend the same approach here as with cargo-bundle-licenses for Rust.

As mentioned above, go-licenses is the recommended tool to collect these licenses so how about adding this to the build step and then adding the output to the license_file list?

Something like

build:
   number: 0
   script:
     - go-licenses save "github.com/google/trillian/server/trillian_log_server" --save_path="/trillian_log_server"

The only problem is that it produces folders not a single file but we can either zip that afterwards or ask upstream to provide a single output option.

Edit: license_file also supports folders

What's everybody's opinion?

@pkgw
Contributor

pkgw commented Mar 22, 2022

If there's a recommended tool, it definitely seems like we should try to integrate it into our best-practices workflows, yeah!

@BastianZim
Member

Do we have a go feedstock that is controlled by a member of core somewhere? I'd like to test this against a real feedstock before adding it to the docs but I don't have any go ones.

@xhochy
Member

xhochy commented Mar 23, 2022

Feel free to use https://github.com/conda-forge/go-sops-feedstock for this

@maresb
Contributor

maresb commented Jun 12, 2022

Regarding go-licenses, I don't know any Go myself, but I'm quite satisfied with the recipe I came up with for the Dasel feedstock. I hope it might be useful as a reference for others working on Go packages.

I'm especially satisfied about how it compiles on linux/osx/win without needing separate build scripts, which is an improvement over other recipes I've seen.

One peculiarity was needing to download the source to a subfolder (src/dasel) in order to avoid the error

$GOPATH/go.mod exists but should not

Another peculiarity was coming up with the particular syntax of

cd src/dasel
...
go-licenses save . --save_path=license-files

which works across platforms.

@BastianZim
Member

Oh that's great news, thanks! I ran into the same problem when testing this myself so that's awesome.

@conda-forge/go Do you think this is reproducible? Then I'll add this to the docs and we can close this issue.

@maresb
Contributor

maresb commented Jun 13, 2022

You might want to hang on for one moment, I'm looking at adding the osx-arm64 migration to Dasel, and I'm getting an error from cross-compilation due to $GOBIN being defined.

Maybe we should make one final push to fix conda-forge/dasel-feedstock#2 before settling on a standard? (BTW, please help, since I don't know Go! 😂)

@BastianZim
Member

That is probably the same as conda-forge/go-licenses-feedstock#9?
I only found this so far: https://stackoverflow.com/questions/55532868/how-to-build-install-cross-compiled-nested-packages-quickly and golang/go#11778 (But you know those probably already😄)
But messing that "deeply" with cross-compilation should probably be discussed with someone more in the topic. Maybe it can also be set/done conda-forge wide, or we can at least add something to the docs?

@maresb
Contributor

maresb commented Jun 13, 2022

Thanks @BastianZim! Those links were actually new to me. Good to know for instance that I'm not alone on Conda-Forge, and that gives me a few ideas.

But messing that "deeply" with cross-compilation should probably be discussed with someone more in the topic.

I'm not sure what you mean by "messing", but anything I do here related to Go should definitely be sanity-checked. 😄

@BastianZim
Member

Ahh ok great! :)

I was primarily talking about where to place the binaries, as discussed in the Stack Overflow post, and about the conda-forge-wide solution, not what you did already. But I'm no expert here either, so I have no idea how forgiving/strict conda-build is here. 😄

@maresb
Contributor

maresb commented Jun 14, 2022

For osx-arm, I couldn't find a cross-compilation command which is also Windows-compatible. I ended up using the # [arm64] selector to switch between go install for normal installation and go build for cross-compilation. But I still think it's relatively elegant.

@BastianZim
Member

Hmm interesting. How about going with go build for everything? Or is install better?

@maresb
Contributor

maresb commented Jun 16, 2022

How about going with go build for everything?

Excellent question! I attempted this, and the problem is that the path $PREFIX/bin/ given to the build command needs to be %PREFIX%\bin\ (or similar) on Windows.

In other words, my go install command is multi-platform (linux, osx, windows) because it requires no envvars, but is not compatible with cross-compilation.

In contrast, my go build command with the environment variable is compatible with cross-compilation but not multiplatform.

Since cross-compilation currently takes place only on Linux, I can get away with using $PREFIX/bin/.

@maresb
Contributor

maresb commented Jun 16, 2022

It should also work with something similar to

    - go build -o "${PREFIX}/bin/" -v -ldflags "-X github.com/tomwright/dasel/internal.Version=v{{ version }}" ./cmd/dasel  # [not win]
    - go build -o "%PREFIX%\bin\" -v -ldflags "-X github.com/tomwright/dasel/internal.Version=v{{ version }}" ./cmd/dasel  # [win]

If one assumes that everything should be built for osx-arm, then probably this form is actually more desirable.

(I was just thinking it was slick how the go install command is cross-platform and works with no selectors, but that unfortunately ignores the current necessity of osx-arm cross-compilation.)

@maresb
Contributor

maresb commented Jun 18, 2022

It should also work...

I was just playing around with this in conda-forge/dasel-feedstock#3 and failing. Debugging Windows via the CI is really slow and painful, so I give up. Here's the info in case anyone else wants to try. (I suspect this would be easy for anyone with a better understanding of conda-build and a Windows computer.)

Where I'm stuck is with go build on Windows. I'm not so sure what to set as the output directory, and conda-build isn't finding my binaries, so I always get number of files: 0.

@isuruf
Member Author

isuruf commented Aug 12, 2022

Haskell/Cabal is another programming language/package manager that runs into this issue.

cc @ocefpaf, @msarahan

@ocefpaf
Member

ocefpaf commented Aug 12, 2022

More info on this. Isuru mentioned that cabal-db can help get a list of the licenses for the dependencies in a Haskell project. However, cabal-db is pretty much impossible to install. I tried multiple cabal versions on different systems. I guess the project is abandoned. Are there other alternatives?

@ocefpaf
Member

ocefpaf commented Aug 12, 2022

cabal-plan seems more promising: https://hackage.haskell.org/package/cabal-plan#description
