Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Package subsetting support for vendored code #7058

Open
Leonidas-from-XIV opened this issue Feb 13, 2023 · 8 comments
Open

[RFC] Package subsetting support for vendored code #7058

Leonidas-from-XIV opened this issue Feb 13, 2023 · 8 comments
Labels
proposal RFC's that are awaiting discussion to be accepted or rejected

Comments

@Leonidas-from-XIV
Copy link
Collaborator

Desired Behavior

When other Dune projects are vendored, I would like to be able to define which packages are visible to Dune. This is currently supported using the -p flag to dune build when building/installing with OPAM: in that case Dune only builds (and OPAM installs) only the artifacts that are part of the packages specified via the -p flag.

This allows to use multiple third party libraries that contain the same package name (e.g. dune packages that dune vendors) or multiple versions of the same package (e.g. libfoo-1.0 and libbar-1.0.1 from the same dev-repo, sharing a dune-project but installed in different versions).

Unfortunately, the same is not possible when vendoring code, as in such case dune aborts because the same packages exist multiple times and it doesn't know which code to pick.

Example

Currently to designate a directory as vendored we can use

(vendored_dirs *)

In this RFC I would like to suggest an extension to the vendored_dirs stanza that would allow specifying additional metadata on the directories to be vendored:

(vendored_dirs 
  (foo (packages foo))
  (bar (packages bar))
  *)

Would in such case use the directory name foo and only build the packages that are foo, bar would do similarly, while the other directories would have all their packages evaluated.

Other solutions

I've attempted to write a tool to rename packages to random names (e.g. package bar in dir foo would become bar-ab43ec and foo in dir bar would become foo-e21fce, but unfortunately such a tool is not possible without running into a lot of problems:

  • A lot of dune files need to be rewritten to patch the project-internal references, despite ending up not actually building the renamed packages.
  • dune-project files need to be rewritten to rename the package stanzas
  • <project>.opam files need to be renamed

While these issues are solvable with a bit of elbow-grease, the thing that broke the camels back was

  • dune files need to have their (package) stanzas patched in rules
  • This is only possible in dune files that exist statically. If the rules are generated, e.g. using an ML file either as dune file or as generator that generates dune.inc-like files that need to contain references to packages, these generators would need to be patched which is not possible.

As such, my suggestion is to implement this extension as part of Dune, which has all the information to subselect packages in a dune project already (with -p) so the extension is most likely rather limited in scope.

@Leonidas-from-XIV
Copy link
Collaborator Author

In a way this is similar to #3734 as it arises from the same issue, dealing with large duniverses, but this suggestion seems to me like a somewhat simpler solution as it also allows multiple versions of the same source tree, but at the moment my knowledge of dune scopes is too shallow to say for sure, hence I hope for suggestions and ideas on how to solve this problem.

@rgrinberg
Copy link
Member

This proposal is a decent workaround, but I don't think I understand the semantics yet.

E.g. suppose that we have:

(library
 (name foo)
 (libraries private_baz)
 (public_name bar))
(library
 (name private_baz)
 (public_name baz))

And we try to remove the package baz. How will private_baz be resolved?

@Leonidas-from-XIV
Copy link
Collaborator Author

Isn't this the same problem as doing dune build -p bar[1] on this project, i.e. a misconfiguration/impossible request?

[1]:

File "dune", line 3, characters 12-23:
3 |  (libraries private_baz)
                ^^^^^^^^^^^
Error: Library "private_baz" not found.
-> required by library "bar" in _build/default
-> required by _build/default/META.bar
-> required by _build/install/default/lib/bar/META
-> required by _build/default/bar.install
-> required by alias install

@rgrinberg
Copy link
Member

i.e. a misconfiguration/impossible request?

Nope, the above is correct. When baz is installed, private_baz will resolve correctly to it

@Leonidas-from-XIV
Copy link
Collaborator Author

Yes, but I don't see how this this would be a different case?

If we assume a hypothetical configuration like this:

(vendored_dirs 
  (example-bar (packages bar))
  (example-baz (packages baz)))

Then the private_baz dependency of the example-bar folder will resolve to the private_baz library from the example-baz folder where we enable the baz package.

If you don't specify it then it fails, but that's the same as if depending on private_baz without installing baz (through e.g. OPAM). The only difference is that dune can't assume it is installed and fail if not, it needs to try to find and build it in the vendored_dirs.

@rgrinberg
Copy link
Member

That sounds fine. I'm just trying to understand the semantics. As far as I can see, this feature is basically like --only-packages but the fully qualified package path can be provided.

My original proposal can be used to fix other issues as well, so I'm not yet certain if this intermediate step is worth doing.

@Leonidas-from-XIV
Copy link
Collaborator Author

Yes indeed, this is more like --only-packages (more than it is -p since it is not about release mode at all). The reason I am suggesting this is that this allows us to vendor parts of source trees in a very similar way as OPAM allows installing parts of source trees.

It is for sure less comprehensive than #3734, but I fear just the solution proposed in #3734 is not enough to solve the problem we have in opam-monorepo, since dune needs to somehow know where to take a public library from. In the Menhir example there it is clear that fix should come from vendor/fix and not vendor/menhir/fix but it will still break down in this case:

vendor/
  foo-v1/
    foo.opam
    bar.opam
  foo-v2/
    foo.opam
    bar.opam

as dune won't know which vendored version of the same repo is supposed to provide the foo public library and which one is supposed to provide the bar public library.

Thus I don't think a fully inferred solution is possible. In the case of opam-monorepo we infer which package to use for which vendored source tree from the -p flag to dune build from the OPAM file that the solver has returned to us as a solution. Not necessarily elegant, but it works in practice because build fields in OPAM are of a pretty standard format.

@rgrinberg rgrinberg added the proposal RFC's that are awaiting discussion to be accepted or rejected label Feb 20, 2023
@rgrinberg
Copy link
Member

Yes, I agree that a fully inferred solution isn't possible. Anyway, this proposal sounds reasonable but we need a little more requisite work and a better idea on how it's going to work with other features that we're proposing before tackling this. I would suggest that we pause the discussion until we a prototype of the package management ready before we can be more confident about what to do about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal RFC's that are awaiting discussion to be accepted or rejected
Projects
None yet
Development

No branches or pull requests

2 participants