Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate compilation #62

Open
wants to merge 30 commits into
base: master
Choose a base branch
from

Conversation

gmalecha
Copy link

@gmalecha gmalecha commented Feb 14, 2022

Gregory Malecha and others added 30 commits February 2, 2022 23:15
This rewrite preserves most of the old content, but I hope it flows better and
is more obviously compelling.
Yes, a reviewer asked specifically, but this caveat is awkward and the point
should be clearer from the rest of the document.
Also change the later example.
checking. Such a link-time would be effectively the same as `Require`ing all of
the modules in a library preferring their *implemenation* (`.v` file) rather
than their interface (`.vi` file). If the combined set of constraints is
satisfiable, then there is no problem.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with universes is not so much satisfiability than the fact you can get wildly different results depending on the link order, because this affects unification and the like. It is somewhat easy to craft a Definition foo : bool. whose value in separate compilation mode depends on the scheduling order.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because this affects unification and the like.

It is somewhat easy to craft a Definition foo : bool. whose value in separate compilation mode depends on the scheduling order.

Link-time checks happen on concrete proof terms, so what's the unification in question?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I mentioned happens for vos files, but a similar problem arises with separate compilation. Namely, there are cases where replacing the interface with the implementation will result in later files being compiled differently, breaking the abstraction boundary. For a simplified and contrived example, you can consider

Universe i j.

Lemma dummy : True.
Proof.
constr_eq Type@{i} Type@{j}.
constructor.
Qed.

Definition b : bool.
Proof.
(let _ := constr:(let _ : Type@{i} := Type@{j} in tt) in exact true) || exact false.
Defined.

where the value of b depends whether the proof script of dummy has been evaluated or not at the time b is computed. In separate compilation mode, it means that an interface where dummy is an axiom won't produce the same proof as one where the proof is explicitly given, despite being clearly compatible as per the module rules. This is not specific to Qed proofs, one can also trigger this kind of behaviours with explicit constraints hidden under module interfaces.

Copy link
Contributor

@SkySkimmer SkySkimmer Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the separate compilation model we don't expect to be able to remove the interface and still compile dependants, so it's not a problem IMO

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, “removing the interface breaks clients” is an intended feature. But the only existing way to fully compile this code in non-vos mode would leak the extra constraint to clients, despite opaque ascriptions; this was brought up in December and it’s concerning. I’ve seen no evidence this is a real problem (compared to the periodic universe inconsistencies we do see), but in principle it seems fixable (very invasively!) if you’re that serious about separate compilation: always defer checking of these constraints to the link-time check. But that’s independent of this proposal, and relevant to vos builds in general.

despite being clearly compatible as per the module rules

I agree that the subsumption rule would suggest ”dropping interfaces” should work, but subsumption on source-language modules doesn’t actually work (in Coq, OCaml and many other languages). Even ignoring hints etc., Import + width subtyping is enough to break subsumption: if you Import foo and foo adds a member bar, then foo changes to a more specific type, yet bar can shadow existing bar and break clients.

@ppedrot
Copy link
Member

ppedrot commented Feb 14, 2022

FTR, I am not very convinced by the fact that we really want a vi file separate from the v file in Coq, despite the argumentation from the abstract. In the ML realm, It perfectly makes sense because the interface is typically reduced to types, but this is not is not the case for Coq. A typical Coq interface exports a lot of stuff that you definitely don't want to copy-paste between files. Sometimes you can't even write it properly because it's autogenerated in some way, e.g. Equations and the like, and you care about the proof term but you don't want to copy-paste the horrendous output of Print with Set Printing All. Not even mentioning the fact that some terms are not even parsable in a faithful way.

I believe it would be better for usability to have a pair of public / private attributes to selectively export stuff. Otherwise we are at risk of introducing an unwieldy system that will not be used on a large scale.

@Blaisorblade
Copy link

A typical Coq interface exports a lot of stuff that you definitely don't want to copy-paste between files.

We agree that we'd want a mechanism to avoid the copy-paste ergonomically, even if that is not discussed in the above CEP (a colleague was working on this problem).

  • I'm certain that public/private would be more convenient in some cases, while .vi files would be better in some others where implementations aren't enlightening, and I can see why you'd think the first is more common. Moreover, in principle the two interfaces could be equivalent ways to access the same functionality: if you imagine you had a Coq parser, you can split a file with #[private] attributes into a .vi and .v file and viceversa by source-code transformation.
  • OTOH, would you have to extend each top-level command to support a #[private] attribute, both in code and in the semantics? Would #[private] Require be skipped when producing a .vio file? Reusing opaque ascription seems conceptually simpler.

I can also imagine some of the design might be shared between the two approaches, tho not sure how much.

@ejgallego
Copy link
Member

I am not very convinced by the fact that we really want a vi file separate from the v file in Coq

I am very convinced we don't want that in fact indeed, but thanks a lot for doing the proposal, hopefully we can all discuss on the WG these coming days.

@gmalecha
Copy link
Author

Thanks for the feedback. I will look forward to discussing. Coq is clearly a very complex system with a lot of pieces, adding a new piece to it should be carefully considered. In my mind, there are a few things at play here and they muddy the waters quite a bit.

The proposal itself adds nothing to the core language of Coq, but there are many well founded concerns about the way that it interoperates with existing features of Coq. To me, this suggests that there are concerns about the design of the core language. In this proposal, those concerns seem to be related to module types and functors, and my rough summary of them from the discussion is that they are clearly broken to the point that it is questionable (at least by some) whether they should be fixed at all or if instead Coq should go the Haskell route and abandon them entirely. That may be a discussion to consider very seriously, and if the choice is to abandon them entirely, then I might very well fall on the side of not accepting this proposal.

Regarding the statement that the interface is much larger in dependent type theory than in simpler languages such as Ocaml, I definitely agree. And I also agree that there are clearly modules where the implementation is (almost) the interface, e.g. Coq.Bool.Bool, etc. (I say "almost" because if the interface was truly the implementation, then there would be no Qed-d definitions in those files, but that is not the case.) But in many uses of Coq this is not so, and program verification is one of those cases but (I do not believe) the only one.

@gares
Copy link
Member

gares commented Feb 15, 2022

I like this proposal and I think it is indeed the opportunity to remedy the universe leakage mess. All the rest seems to me a improved .vio/.vos thing, which is kind of working already modulo universes. I mean, I don't see any blocking issues in providing the interface beforehand (as a separate files or as attributes, again it seems a detail to me). I personally don't consider the link-time-check for universes a good solution (even if it is the best I could suggest myself when I coded the .vio/-quick thing).

About universes we do have a few commands already to name universes and impose constraints on them. I wonder if it would be feasible to mandate such declarations in .vi files and also have a mode where Coq resolves the typical ambiguity of Type not to a Type_j for a fresh j, but to one of these named universes part of the public API. We did experiment with this feature a long ago in Matita where the universe graph has to be defined once and for all in the first file, and then any universe variable would have to be resolved to one existing node in the graph, as some point. We did not test it in the large, but I was wondering if the industrial-scale use case you have in mind could fit this frame or not at all.

The funny story is that math-comp compiles with 3 universes. It used to be true, maybe it is still. I would not be scared of declaring by hand a graph with, say, 5 nodes and have Coq assign all my Type_j to a named node before saving the .vo file.
I clearly see a class of users that benefits from full ambiguity and universe inference, but for the others I believe separate compilation is doable (in a sound way).

@ppedrot
Copy link
Member

ppedrot commented Feb 15, 2022

About universes, one feature I've been trying to implement without much success was to extend the private universe feature to monomorphic universes. That is, levels and constraints local to a Qed proof that do not affect the global graph should not be declared globally. Furthermore, if there are constraints that still affect the global graph (typically by transitivity) you get a warning except if you annotate the definition accordingly. In this model, only constraints coming from accessible data (types and defined bodies) are part of the public interface.

@gares
Copy link
Member

gares commented Feb 15, 2022

I naively thought that this is what "minimization" is supposed to do.

@ejgallego
Copy link
Member

@Blaisorblade
Copy link

We agree that we'd want a mechanism to avoid the copy-paste ergonomically, even if that is not discussed in the above CEP (a colleague was working on this problem).

Regarding copy-paste, the anonymous colleague @gstew5 now posted #63 :-)

Copy link
Member

@jfehrle jfehrle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts and questions

Comment on lines +11 to +12
To address these problems, and support information hiding and Cardelli-style separate
compilation, in this CEP we propose a notion of Coq interface files, which we
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be helpful to briefly spell out what "Cardelli-style" means. A reference would be nice, although I personally am not always able to access referenced papers.

to today's `vos` builds (initial builds can be more parallel, and incremental
builds need to recompile fewer files).

NOTE: For concreteness, in this CEP we use the `.vi` extension for for interface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NOTE: For concreteness, in this CEP we use the `.vi` extension for for interface
NOTE: For concreteness, in this CEP we use the `.vi` extension for interface

Export __lib.
```

## Only a `.v` File
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect that this will be the most common case for the long term?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I would hope that there are a lot of libraries that would have meaningful theory that could be put in a .vi file. @charguer mentioned some things like this in his standard library talk.

- only `lib.vi`;
- or only `lib.v`.

## Both a `.vi` and `.v` File
Copy link
Member

@jfehrle jfehrle Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any things that can't appear in a .vi file (or in practice wouldn't be included in a .vi file)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so.

specification. However, this is an approximate specification, because the exact
semantics cannot be expressed via source-to-source transformation.

As a key principle: `.vi` interfaces are meant to hide implementations and
Copy link
Member

@jfehrle jfehrle Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case for hiding "implementations" is primarily proofs (plus necessary definitions)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Qed is typically good enough to hide proofs, but you often want to hide more: both hints needed by the implementation, but also the bodies of definitions, etc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. same job as interfaces in java

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just, it can be useful to hide bodies of definitions that you do not want clients to think about though this does not work well if you want to compute on the definitions.

suggesting an implementation strategy, we sketch a semantics as a
source-to-source transformation, to be taken as an informal and _approximate_
specification. However, this is an approximate specification, because the exact
semantics cannot be expressed via source-to-source transformation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would much prefer a version where this can be fully implemented as a source-to-source translation (in part because that makes things easier for the bug minimizer). I think it also solves some design questions without making things significantly worse for the user.

I propose the following:

  1. Interfaces are interpreted as module types
  2. Compiling a .v file with a corresponding .vi acts as if the module type were ascribed to it (except possibly Requires, which may be considered to be above the module type ascription as far as global side effects go)
  3. At Require time there should be three flavors: a file can Require just the compiled .vi file, it can Require both the compiled .vi file and the compiled .v file, or it can Require just a compiled .v file. The last should be an error if a .vi file (compiled or source) exists alongside the .v/.vo file. Requiring both the .v(o) and compiled .vi should be like Requiring the compiled .vi and then attaching bodies to the constants for Print Assumptions, etc (in the same way it would happen for modules opaquely ascribed with a type), and also including the Requires of the .v file.

This pushes all the issues with side effects of require, universe constraints, ability to use Extraction, etc, onto the user side in a systematic way, by putting linking under the user's control (linking corresponds to requiring the compiled .v file).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except possibly Requires, which may be considered to be above the module type ascription as far as global side effects go

To actually achieve separate compilation, the Requires would have to be inside the module and be hidden by the opaque ascription. Today this gives a warning, but in the Coq call it wasn't clear why, but until that's resolved I consider a source-to-source translation impossible today :-).

Copy link
Member

@gares gares Feb 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I tried to explain (very badly) by saying that an interface should morally be

Require stuff.
Module Type this.
Import some stuff.
...
End this.

There are a few "actions" performed by require which don't play well with modules, which are usually seen as a linguistic construct to limit the scope of things.

Ideally, if one does not import a module, he does not get any effect. But require, today, leaks stuff, like global hints. So placing it inside a (sub) module does not do what you may think. Also, there is special code hoisting some actions at the top level, like loading ML plugins and co which is ugly and has no real use case.

Finally, it would be nice to have a header in each file which lets one compute the dependencies of the file easily. If requires are only allowed in there, then we kind of have this header. If we allow them to be anywhere, well... Similarly, require is slow, if they are all at the beginning of the file, a UI pays the cost once upfront, and then it is good to go. If they are interleaved with other sentences, well, you may need to backtrack over that slow line way more often.

To sum up, it is both an "hygienic" and would allow some code simplification if all requires were on top.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally (though this is a fairly purist point of view), I have been thinking of .v files as corresponding to functors from their Required dependencies. Currently, Coq does not provide a mechanism to ascribe a type to these modules, so there is no means to abstract. With this approach, you can change what the module type being abstracted over is which provides more flexibility to the implementer of the interface. If I follow this mental model, then the second flavor of Require that you describe above means that you can "abstract over" the "natural" module type (the one that does not hide anything, including the body of opaque definitions); however, it becomes a bit odd because to typecheck those definitions you might need the body of other constants which could introduce problems.

@maximedenes
Copy link
Member

Separate compilation is of course a (very) desirable feature.

Similarly to others above, my personal opinion is that separating implementation and interfaces in two files brings more disadvantages than benefits : duplication, cross-files errors (when an interface is not satisfied). To me, a solution based on (public/private) attributes would be superior, if it comes with proper tooling : IDEs being able to describe a (documented) interface, efficient separate compilation, etc.

Could we sketch an attribute-based solution?

@gares
Copy link
Member

gares commented Mar 3, 2022

To make the discussion more concrete, I crafted an example using what we have today.
I guess it is not satisfactory to you, but I have an hard time understanding why.
Here it is: https://github.com/gares/coq-defunctor/tree/master

@ejgallego
Copy link
Member

Could we sketch an attribute-based solution?

Indeed I have a feeling similar to @maximedenes , but I don't know enough yet about the problem.

I am in particular missing a good description of the problem in the CEP, which if I read it correctly, jumps directly to propose a solution. Do we have that somewhere?

[github interface and I don't get along sometimes too well]

@ejgallego
Copy link
Member

I did give to this some more thought, and I think that we should not use files to track leakage, but the document model should take care of it, as it is done for opaque or incomplete proofs already in the Flèche prototype (and I've got quite confident in that it works)

That is to say, when checking a document, a delta that only updates the implementation should not force re-checking of the parts not depending on it.

@Blaisorblade
Copy link

Just three high-level questions on https://github.com/gares/coq-defunctor, after @derkha's prodding:

  1. can we enable Link for individual definitions, beyond just modules?
  2. does Linking a constant A_impl with A create a definitional equality A_impl = A? The same question extends to modules with constants in an obvious way (intuitively, if you over-simplify modules as name -> option term functions, the extension is pointwise: do you get def. eq.s for each constant that's linked to a parameter?).
  3. users can write inconsistent Link statements — detecting them when loaded together will prevent proofs of False, but should they also be detected more eagerly?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants