Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coq] Preliminary support for split native compilation #4750

Closed
wants to merge 3 commits into from

Conversation

ejgallego
Copy link
Collaborator

@ejgallego ejgallego commented Jun 17, 2021

Coq 8.14 can generate native files independently from .vo file
generation by using a new coqnative compiler, which will do the .vo -> .cmxs translation.

This allows Dune to have a simpler rule setup, as it is always
possible to rely on coqnative to compile native files, as opposed to
before where native file generation was controlled by a Coq-level
configure flag.

The new rule setup is enabled for (lang coq 0.4), and we provide a
couple of configuration options in the mode field via a couple of
optional fields:

(coq.theory
  ...
  (mode
   (native
     (package pkg)
     (profile p1 ... pn))))

The first field, (package ) allows to override the package under
which native files for the corresponding theory will be
installed. This is useful for packagers willing to split the native
compilation under a different package, given that it can be extremely
resource-intensive. The default for this setting is the same package
name.

The second field controls under which profiles Dune will setup the
native build rules; the default for this setting is release, so Dune
will skip Coq's native compilation when in dev mode.

(mode vo) is still accepted, in case developers would like to
completely disable native generation.

@ejgallego ejgallego added the coq label Jun 17, 2021
@ejgallego ejgallego requested a review from rgrinberg June 17, 2021 19:54
@ejgallego ejgallego marked this pull request as draft June 17, 2021 19:54
@ejgallego

This comment has been minimized.

@ejgallego ejgallego force-pushed the coq+coqnative branch 2 times, most recently from bb54f5d to d166e97 Compare June 17, 2021 21:03
@Zimmi48
Copy link
Contributor

Zimmi48 commented Jun 18, 2021

I don't think there is a very pressing need to have this in 2.9 if the next non-patch release (be it 2.10 or 3.0) has it and comes in just a few months.

@ejgallego ejgallego force-pushed the coq+coqnative branch 3 times, most recently from 8a6a3b2 to 404bfc5 Compare June 18, 2021 18:58
@ejgallego
Copy link
Collaborator Author

I don't think there is a very pressing need to have this in 2.9 if the next non-patch release (be it 2.10 or 3.0) has it and comes in just a few months.

Sounds good, thanks for the feedback, re-targetting 2.9 then.

The code, while not perfect, is in a reviewable state, thus un-drafting.

@ejgallego ejgallego marked this pull request as ready for review June 18, 2021 19:01
@ejgallego ejgallego force-pushed the coq+coqnative branch 2 times, most recently from 2b567b2 to 3746727 Compare June 18, 2021 19:23
ejgallego added a commit to ejgallego/dune that referenced this pull request Jun 18, 2021
ejgallego added a commit to ejgallego/dune that referenced this pull request Jun 18, 2021
… are used

Backported from ocaml#4750

Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
@ejgallego ejgallego added this to the 3.0 milestone Jun 18, 2021
@ejgallego
Copy link
Collaborator Author

ejgallego commented Jun 18, 2021

@proux01 this should be interesting for testing with coq-master, or coq-v8.14.dev once a small patch gets backported, in particular this should provide a nice speedup I assume if you got a few core on your machine.

Note that support for (mode (native (package ....))) is still missing, but coming soon; the rest of the code more or less make sense, so modulo a bit of cleanup I would expect it merged soon.

ejgallego added a commit to ejgallego/dune that referenced this pull request Jun 18, 2021
@ejgallego ejgallego force-pushed the coq+coqnative branch 2 times, most recently from f7eff0b to 2127386 Compare June 18, 2021 19:40
ejgallego added a commit that referenced this pull request Jun 18, 2021
… are used (#4753)

Backported from #4750

Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
@ejgallego ejgallego marked this pull request as draft June 18, 2021 21:31
ejgallego added a commit to ejgallego/dune that referenced this pull request Jun 19, 2021
@ejgallego ejgallego force-pushed the coq+coqnative branch 2 times, most recently from 87195a5 to 61ceeb5 Compare June 19, 2021 16:41
@ppedrot
Copy link

ppedrot commented Jun 22, 2021

@ejgallego yes, I am not surprised that for the stdlib we get a non-trivial overhead due to repeated requires, because all the compilation times are short and dominated by the chain of requires.

I think that in theory we shouldn't even need to transitively require dependencies for native compilation, it could probably be performed directly on the kernel term without knowing anything about the dependencies.

@ppedrot
Copy link

ppedrot commented Jun 23, 2021

@ejgallego could you retry with coq/coq#14540? I don't know how much linking takes time in the absolute, but removing it can only make things faster.

@ejgallego
Copy link
Collaborator Author

ejgallego commented Jun 23, 2021

@ejgallego could you retry with coq/coq#14540? I don't know how much linking takes time in the absolute, but removing it can only make things faster.

Thanks for the quick turnaround @ppedrot , timings added to the above post; improvements are significant, like 50% of the overhead seems gone which is good news; I'll keep testing stuff with the patch and let you know if I find any trouble [seems unlikekly]

@proux01
Copy link
Contributor

proux01 commented Jun 24, 2021

@ejgallego I tested this with multinomials on top of Coq master. I had to modify multinomials' dunes files but it worked fine (I was able to build CoqEAL on top of it), that's great!

However, I still fail to see the big picture here:

  1. currently (Coq 8.13), installing the coq-native OPAM package recompiles the entire switch with native-compiler, this is heavy but works for all packages compiled with coq-makefile (and even the remake ones) without requiring any action from their authors nor package maintainers
  2. with the advent of coqnative in 8.14, we could envision a split package approach as discussed in CEP 48 but this still requires some work on the opam repo side

I think that dune should work just as well as coq-makefile or remake with the current setting 1. This is unfortunately currently not the case: the simple fact that coq-multinomials switched to dune as a build system made it (hence CoqEAL and everything based on it) unbuildable with coq-native. I offered what I hope could be a reasonably simple short term solution in this comment #4722 (comment) do you have a better idea ?

Do you have a precise view of how the current PR will help with 2. ? (the details on the OPAM repo side remain unclear to me)

@erikmd you might be interested in those discussions

@ejgallego
Copy link
Collaborator Author

Hi @proux01 , indeed we can improve the situation, Dune 3.0 can actually query a particular configuration value from coqc [would be great if opam provided an interface for this] and setup the rules after that. That would amount to actually fixing your patch so the query happens thru the build system interface.

A thing to consider tho is that the Coq developers have been considering removing the old mode where coqc generates native files, so I dunno.

@proux01
Copy link
Contributor

proux01 commented Jun 25, 2021

Hi @proux01 , indeed we can improve the situation, Dune 3.0 can actually query a particular configuration value from coqc [would be great if opam provided an interface for this] and setup the rules after that. That would amount to actually fixing your patch so the query happens thru the build system interface.

That could indeed be a nice fix. Do we have an idea of the schedule for Dune 3.0 ?

A thing to consider tho is that the Coq developers have been considering removing the old mode where coqc generates native files, so I dunno.

That's certainly the way to go in the long term. But it is impossible to proceed until the split package approach gets implemented on the OPAM repo side (otherwise, that would mean going back to the pre 8.13 status where native-compiler was practically unusable, which would be pointlless). Unfortunately, no one seems to be working on it yet, or even have a precise idea of the technical details. Thus I still think that we need to fix dune in the current setting.

@ejgallego
Copy link
Collaborator Author

Do we have an idea of the schedule for Dune 3.0 ?

Likely early fall. But if we add this code it would be useful for many other things, like exposing a %{coq:version} variable, etc... In Dune < 3.0, rule generation was outside of the build monad itself, so it was hard to do the following "1. build Coq 2. query Coq for config values 3. setup rules based on that value" However, in Dune 3.0 this is easy to do, as I suggested I would start by adding a rule that generates a .coq-config file with the values [so this file is only rebuilt if coqc itself has been updated], then rule generation can read that file and generate rules based on some config value.

This reduces parallelism a bit, but not a big deal IMO.

But it is impossible to proceed until the split package approach gets implemented on the OPAM repo side (otherwise, that would mean going back to the pre 8.13 status where native-compiler was practically unusable, which would be pointlless).

Note that there is a migration path in this sense that doesn't require any action on OPAM path, with this patch, Dune can build the native objects regardless of how Coq was configured, thus you can have a Dunerized package foo always build the native objects, and indeed depend on coq-native so the deps are there.

This has the downside of being less granular, but on the other hand it works, and can be made granular in the following sense,

@proux01
Copy link
Contributor

proux01 commented Jun 25, 2021

Do we have an idea of the schedule for Dune 3.0 ?

Likely early fall.
[...]

Great, feel free to ask if I can be of any help.

Note that there is a migration path in this sense that doesn't require any action on OPAM path, with this patch, Dune can build the native objects regardless of how Coq was configured, thus you can have a Dunerized package foo always build the native objects, and indeed depend on coq-native so the deps are there.

This has the downside of being less granular, but on the other hand it works, and can be made granular in the following sense,

Good point, although I don't see how that would fix for instance the issue with multinomials pointed above (most users don't want mutlinomials to build native objects but the users of coq-native want it).

@ppedrot
Copy link

ppedrot commented Jun 25, 2021

But it is impossible to proceed until the split package approach gets implemented on the OPAM repo side

Yes and no. The coq_makefile build system does not rely on coqc producing the native binaries, while still using the configure flag to decide whether to compile them. So the removal is not so much tied to the split package approach than to the existence of packages that have their own build system and would need to be adapted (which indeed might result complex enough to prefer implementing split packaging and not do the work twice).

@ejgallego
Copy link
Collaborator Author

Note that opam 2.1 has a new opam option system, that could prove very convenient for non-split packages.

…ges.

In some cases, it is convenient for stanzas to produce objects that
will be installed under different packages.

For example, in Coq we may want to install the costly native files in
a `foo-native` package, or distributions such as Debian like to split
packages in `-doc`, `-dev`, and `-dbg` variants.

This PR modifies the code so this is possible.

This code is cherry-picked from ocaml#4750 (which would be the first user
of the feature), but I am submitting separately as I think the change
deserves more discussion.

For example, this implies that `Dune_file.stanza_package stanza` is not
canonical anymore. Likely, a better modelling would be to introduce
the notion of "sub-package".

The code changes themselves don't make me super-happy, consider them a
draft. If you folks are OK with the idea, I can improve it so the
common case of one stanza installing files only for one package is
better optimized.

Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
Coq 8.14 can generate native files independently from .vo file
generation by using a new `coqnative` compiler, which will do the `.vo
-> .cmxs` translation.

This allows Dune to have a simpler rule setup, as it is always
possible to rely on coqnative to compile native files, as opposed to
before where native file generation was controlled by a Coq-level
configure flag.

The new rule setup is enabled for `(lang coq 0.4)`, and we provide a
couple of configuration options in the `mode` field via a couple of
optional fields:

```lisp
(coq.theory
  ...
  (mode
   (native
     (package pkg)
     (profile p1 ... pn))))
```

The first field, `(package )` allows to override the package under
which native files for the corresponding theory will be
installed. This is useful for packagers willing to split the native
compilation under a different package, given that it can be extremely
resource-intensive. The default for this setting is the same package
name.

The second field controls under which profiles Dune will setup the
native build rules; the default for this setting is `release`, so Dune
will skip Coq's native compilation when in `dev` mode.

`(mode vo)` is still accepted, in case developers would like to
completely disable native generation.

TODO: take into account the `(package field)` for `native` , to be
implemented in the next commit.

Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
The code changes doesn't make me super-happy, tho I think that in
general it makes sense for stanzas to produce object files that do
belong to multiple packages, as for example it is common in some
upstream packagers to allow to install debug and doc files into a
different package.

However this implies that `Dune_file.stanza_package stanza` is not
canonical anymore.

Likely to model this better, we may need to define the notion of
sub-packages of a stanza / package, but awaiting for feedback.
ejgallego added a commit to ejgallego/dune that referenced this pull request Sep 9, 2021
…ges.

In some cases, it is convenient for stanzas to produce objects that
will be installed under different packages.

For example, in Coq we may want to install the costly native files in
a `foo-native` package, or distributions such as Debian like to split
packages in `-doc`, `-dev`, and `-dbg` variants.

This PR modifies the code so this is possible.

This code is cherry-picked from ocaml#4750 (which would be the first user
of the feature), but I am submitting separately as I think the change
deserves more discussion.

For example, this implies that `Dune_file.stanza_package stanza` is not
canonical anymore. Likely, a better modelling would be to introduce
the notion of "sub-package".

The code changes themselves don't make me super-happy, consider them a
draft. If you folks are OK with the idea, I can improve it so the
common case of one stanza installing files only for one package is
better optimized.

Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
@rgrinberg rgrinberg removed this from the 3.0 milestone Oct 21, 2021
@rgrinberg
Copy link
Member

I removed the 3.0 milestone as this PR isn't ready and we're being very strict with what's actually blocking 3.0

@rgrinberg
Copy link
Member

I looked over this PR, and the code looks good to me. I'm a bit skeptical of the end user experience proposed though. I only have a cursory understanding of coq users, so feel free to dismiss my objections.

If I understand correctly, there are essentially two groups of coq users:

  1. Those who care about native compilation
  2. And those who don't

Well I think the proposal here will irritate both groups. Those who don't care about native compilation will be grumbling about slower builds and resorting to inserting random metadata (in their eyes) to gain back some speed. Those who do care about native compilation will now need to track their dependencies and make sure they depend on x and x-native in their opam metadata. Given that there is now direct incentive to split your packages this way, I expect this will lead to a lot of breakage.

In the ocaml world, we had a similar problem with js_of_ocaml compilation. I think we solved this problem quite handily and I wonder if coq could learn a lesson or two.

In development mode, we rely on jsoo's "separate compilation" mode. This requires a .cma -> .js rules for every single library. Now clearly, it's undesirable to execute this rule when installing a package. Not everyone cares about compiling to js, and the jsoo compiler may not even be present. So what do we do? Well we just setup .cma -> .js rules for all external libraries in the background. If we do this naively, there are two disadvantages:

  • Setting up rules for every single external library can get expensive.
  • We are recompiling the same .cma files multiple times in different projects.

Thankfully, we are able to adequately address both of these concerns.

  • We setup the directory structure for the rules in a way that takes advantage of dune's lazy loading of rules. Users will pay the setup costs for only the rules they'll end up using in their build.
  • Enabling the dune cache shares all .js artifacts between projects and avoids re-running the jsoo compiler needlessly.

I realize that some people might feel uneasy about delaying building the artifacts post the installation of the package. But I think these are mostly bad ideas we inherited from inflexible package managers like apt and opam. There's no reason to be constrained by them now.

@proux01
Copy link
Contributor

proux01 commented Oct 22, 2021

Thanks for sharing this experience! It indeed seem that this feature requires further discussion.

If I dare a question: currently part of the Coq packages are built with CoqMakefile, another part with dune and a few other with other build systems and even if dune could improve its "market share", that situation will likely last for long, if not forever. How would the jsoo "delayed" solution work in this setting?

@ejgallego
Copy link
Collaborator Author

Thanks a lot for the feedback @rgrinberg , I will try to revisit this proposal; actually the lazy generation of native files was done by coqc itself not so long ago: if a library was not compiled with native, coqc will compile the "native" object on the fly.

Unfortunately this didn't work so well due to the lack of cache and other issues [maybe related to file lookup] , so this mode was removed in favor of a global setting to enable native.

I need to ponder how things would work with the Dune cache, but this PR is for now stuck due to this, and due to some concerns with the "subpackage" idea that require a bit of analysis.

@rgrinberg
Copy link
Member

currently part of the Coq packages are built with CoqMakefile, another part with dune and a few other with other build systems and even if dune could improve its "market share", that situation will likely last for long, if not forever. How would the jsoo "delayed" solution work in this setting?

In the jsoo world, it was never a problem because we a have simple convention that every library must have a cma archive file no matter what build system was used to generate it. If there's an analogous rule in coq (e.g. every .v file needs a .cmxs), I imagine there would also be little difficulty. The downside would be for non dune based coq projects using native artifacts. But perhaps it is acceptable to only provide this feature for newer coq projects that decide to use dune? There's always a limit to what you can do in make, so I doubt it will be the only limitation.

actually the lazy generation of native files was done by coqc itself not so long ago: if a library was not compiled with native, coqc will compile the "native" object on the fly.

I can definitely see why it didn't work well. Coqc just doesn't have enough context here. The build system is the right place to add all such optimizations IMO.

@ejgallego
Copy link
Collaborator Author

@rgrinberg the discussion above is related to your recent question

@ejgallego
Copy link
Collaborator Author

I'll comment here as I know some people has been interested in working on this setup (for example @Alizter )

The code in this PR is fine from the point of view of Coq's funcionality, and I'll be happy to resurrect it if there is interest.

Main reason progress on this got blocked was due to a pre-requisite for this PR: #4908

So indeed, if there is interest in doing this, we should first resolve how the install targets would work.

It still seems to me that a stanza could belong to different packages, in principle I see nothing essential about the design of such a setup that would not work; how the code is today in Dune, that's a different story.

@rgrinberg
Copy link
Member

Please re-open if this is still relevant

@rgrinberg rgrinberg closed this Jun 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants