Multiple public libraries in a package #4206
Comments
I think that this is a good idea, and |
Another noteworthy example besides
|
My opinion on this hasn't changed since last time. I'm still opposed to this change. This will massively impact tooling across the board for minimal gains. I've seen no reason why the situation is different now. |
@23Skidoo I updated the bottom of the ticket with the two major changes from the previous proposal. @snoyberg Let me take some of your comments from the previous time around and say how things have changed with this version of the proposal.
Our compatibility with new and old GHCs works due to the existing implementation of convenience libraries. The basic premise is that the library foo from package bar is munged into the package name
This proposal will require code from Hackage/Stackage to understand how to display multiple libraries from a single package. But this can come later; since many package maintainers want to maintain compatibility with old versions of Cabal, they will not want to migrate to this new syntax immediately. Once the window of supported Cabal versions is large enough and people start using it, Hackage and Stackage can learn how to support it.
Permissions management is, as before, per-package. No change here, no change necessary.
The new proposal does not use the backwards-compatible hyphens syntax, so this is not an issue anymore.
Dependency solving is actually unchanged, because you can always interpret Yes, this does mean that if you have a per-package build system, naively, you'll end up building every library inside In any case, whatever we do, I'm not suggesting we start having people use it, I just want to make sure Setup.hs understands this so that, if later the ecosystem catches up, we can more easily flip the switch... or we can just as easily remove it again and call the experiment dead. |
As the author of I currently don't have an issue developing or building multiple packages thanks to @ezyang Made the following statement about package usage:
While the above statement is true most of the time, regarding version ascription there are often small updates, bug-fixes, or release corrections that get pushed out as a minor version bump of the form Since there's quite a lot of information floating around here what I'd like distilled is exactly how a user of my packages benefits, and how this would affect presentation of the library in Haddock, on Hackage, etc. |
@brendanhay Definitely agreed about the need for individual version numbers. |
I think the expected mode of use for a multi-library package is that if you need to release a small bugfix for one library, you just do a new release for the entire package (with all the libraries). The biggest downside here is that when a user takes the update, a change to an edge package could cause the user to need to rebuild a core package; you'll do the same amount of rebuilding if you updated a core package, and these patch-level releases aren't supposed to be bounded against via the PVP, so dependency bounds aren't losing expressivity.
There's no patch here yet, but my thought is that https://hackage.haskell.org/package/amazonka will eventually have a table of contents for each supported library, containing the module listings for each library under a heading. Haddock shouldn't be much different than it is today. I think multi-library packages are primarily an improvement for maintainers. Probably the biggest improvement from the user end is that all of the available libraries are in one place (as opposed to having to go to the category page), and that when doing dependency provenance auditing, amazonka can be treated as a single dep, rather than 90 packages that need auditing: it's easier to tell that one package comes from a single source, than a hundred packages that come from a single source. |
It occurred to me today that #2832 is more or less essential for the wider adoption of this feature. |
... and per component solving, TBH. |
Is public / private enough? Or is a test scope needed as well? |
As a counterpoint to @snoyberg's views, I'd like to say that I'm heavily in favor of this change. Why? I've been using backpack heavily in the design of some of my recent projects. However, this currently necessitates, at least in one case, me splitting one library up into 2 dozen or more separate libraries:
I already personally have a dozen or so old project names "tombstoning" up the hackage namespace from earlier package consolidations and refactorings. The fine-grained model backpack drives us to would push this ratio up, by a great deal over time. |
Reading through this discussion, I'm not convinced. @ekmett I'm not convinced by some of your arguments.
I don't know what specifically this is about, but most systems use
The first part of this argument might or might not make sense - I don't know backpack well enough. The larger graph annotation issue should be fixed before optimizing a local issue which might be subsumed by a solution to the larger problem. The annotation layer should be on top of the packages. My point of view is that hackage should be calling out to a host of external services for this - a microservices based approach. Alternatively a standard graph representation, such as DOT-ish files could be loaded into hackage as layers. It is possible that some graph annotation should be collapsed into the cabal file, but at a minimum that means being able to talk about external packages in this graph language. @ezyang gives this argument:
I agree with this, but I think that just proves that the problem is elsewhere, namely that hackage has no annotation layer. Where is the I don't see the provenance auditing argument. You can trust the author, the homepage, or the repository, all which represent different views on source. What does this grouping mechanism bring to the table? Maybe I'm misunderstanding what provenance auditing entails? |
@alexanderkjeldaas: None of your points there address the central "churn" issue I raised above. Adding and deleting library components is a thing I can and frankly must do version by version because of how much backpack I'm using. Adding and deleting packages pollutes hackage forever. To make this concrete https://github.com/ekmett/coda/tree/66480f3e3ee4e6cd19bcb3a44b9c8cd698314901/lib provides a pile of packages, almost 80% of which no user will or should ever see. 20% of which they should. https://github.com/ekmett/coda/blob/62d0ca91778a7c0fa7a6cdce3a0ecc6f3ba30bfd/coda.cabal on the other hand manages to package all of these things in the original pre-backpack API by using multiple libraries internal to the package. This drops a couple hundred lines of configuration duplication. Once we get common blocks I can consolidate even further. I give it one version, I'd maintain it as one package. If I add or remove components, which I'm doing daily, and will likely have to do long into the future, they won't separately churn the hackage package namespace after the package is released. Unfortunately, it is useless. As I want users to be able to instantiate packages like the I don't want 20 packages. I want ~4 visible libraries, but to get them I have to spam hackage with 20 packages. The current situation leaves me hoist upon the horns of a dilemma:
At the moment I am opting out of the problem by not uploading anything, which serves nobody. |
@ekmett's arguments seem convincing to me, now we just need to find someone willing to take this on. Perhaps a potential HSoC project? |
@alexanderkjeldaas Are these theoretical concerns or do you have "war stories" to share? If not, @ekmett "wins" just by virtue of being the maintainer of more Hackage packages than any other single Haskeller alive. (I think that's about right.) On a much smaller scale I've felt a similar pain to ekmett's just maintaining a "generic" library across "in-memory-database", "postgres", etc. |
It took me a minute to try to extract the key argument from @ekmett's comment, so I'm going to try to restate it as I understand it: Backpack gives us modules but it does so by organizing things at the library level. Modules can be useful for parametrizing packages by other packages. But they can also be useful as a tool to organize abstractions in code, as an alternative to typeclasses with different tradeoffs and benefits. However, to take advantage of this second use-case, we need multiple libraries, since modules live at the library level. So posit a package that uses modules to organize code. Now, it can use internal libraries that are not visible to anyone else, and the final package is useful. However, those internal libraries then are not importable or usable by others. So fine -- you can now choose to expose those internal libraries so others can use them, by putting them each in their own package. But now, you've got a ton of packages, each taking up top level namespace. So the ease of refactoring you had when you only had one package with multiple internal libraries drops significantly. Adding or removing a typeclass between versions of a package shouldn't be a big deal. Adding or removing an internal library won't be a big deal. But adding or removing a top level package is a big deal. So a rough analogy (only rough, because I think it captures the wrong parts of what is shared and parameterized over, etc.) would be a world where we could define whatever typeclasses we wanted, but every time we wanted to share a typeclass with others, we needed to put it in a separate package. So the argument as I see it is that packages should be able to provide multiple modules. And because modules are handled at the library level, then packages need to provide multiple libraries. |
@gbaz is it a new package for each typeclass or a new package for each instance? |
@ivan-m One package for the "class", one for each "instance" along with another to actually use the instance, as you can't define the modules it depends on in the same package that mixes in the backpack package, and test suite to check that it matches the signature. This yields a footprint like 2n+1 libraries + n test suites for n "instances". |
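The footprint described above can be illustrated with a rough sketch for n = 1, written as internal libraries of a single package. It assumes Backpack's `signatures` and `mixins` fields; the package name `str-demo`, the library names, and the module names `Str`, `Str.Utils`, `Str.Text` and `Demo` are all hypothetical, and the per-instance test suite is left out.

```cabal
cabal-version: 2.2
name:          str-demo
version:       0.1.0.0
build-type:    Simple

-- The "class": a signature (Str.hsig) plus code written against it.
library str-indef
  signatures:       Str
  exposed-modules:  Str.Utils
  build-depends:    base
  hs-source-dirs:   str-indef
  default-language: Haskell2010

-- One "instance": a concrete module meant to match the Str signature.
library str-text
  exposed-modules:  Str.Text
  build-depends:    base, text
  hs-source-dirs:   str-text
  default-language: Haskell2010

-- The library that actually uses the instance: the Str hole of
-- str-indef is renamed to Str.Text and filled by str-text.
library
  exposed-modules:  Demo
  build-depends:    base, str-indef, str-text
  mixins:           str-indef requires (Str as Str.Text)
  hs-source-dirs:   src
  default-language: Haskell2010
```

That is three libraries for one instance, matching 2n+1. Inside one package this is a single .cabal file and a single version; the complaint in this thread is that making `str-indef` and `str-text` usable by other packages currently means uploading each as its own package.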
@BardurArantsson I'm fine with @ekmett winning by default, but the rationale for the decision will be this issue and if the arguments aren't easy to understand, that's too bad. @ekmett thanks, and with the help of @gbaz I now understand the churn issue better. I think @snoyberg has a point that this is quite a change for benefits that probably shouldn't be requiring incompatible changes. So, repeating the technical requirements in my words, it would be something like this:
IMO, if sub-libraries are allowed to have different versions, then the given proposal doesn't really handle the tombstone issue, as I cannot know whether
@ekmett does this cover the churn issue? |
@alexanderkjeldaas I think @ekmett wants to have same version number for all public sub-libraries. |
I'm not seeking to "win" by fiat, but rather by strength of argumentation. ;) @alexanderkjeldaas Your proposal leaves me uploading 20 separate packages, implementing doctest independently 20 times, unable to use common blocks to reduce boilerplate between them, and still doing a ton of manual maintenance.
Except now it's even worse, as the names are still being taken but aren't shown. My personal estimation is that continuing the current policy will result in backpack being considered an all-but-unusable curiosity, when it can directly address a bunch of practical performance and cut-and-paste coding issues in Haskell if we just make some changes to the ecosystem to better accommodate it.
Cabal sub-libraries are inside the same package, there is only one We already have a similar dependency story for build tools, where you can depend on |
@ekmett has pretty well summarized most of my thinking on the issue. I don't think it's practical to expect Backpack users to split and upload 20 packages, and I don't want this to happen. I am less bearish, in the sense that I think there are useful use-cases for Backpack even without multiple public libraries, but these are all "big scale" Backpack, and not the fine grained use that are easy for early adopters to play around with. Maybe this fine-grained use truly is the most useful thing; in that case I'd really like to see this proposal take wings :) I'll add one more bit of information; by far the part of implementing this which is most opaque to me is how to adjust Hackage and Stackage to accommodate this new model, since it's entirely new UI that has to be designed and iterated on. There are some structural problems with making this happen, but I am not on a tenure clock and I can take the time necessary to make it happen. |
As a nascent user of Backpack, I've already had to split one of my projects into ~6 different internal libraries, and Any non-trivial project will almost certainly see its library count explode by a few factors. This is unavoidable, as far as I can tell. So I am strongly in favor of this, even if migration time is slow. As Ed K mentioned, this is effectively much like extending And while the proposal explicitly states it stands on its own -- at this rate, no matter what way you slice it, I think this feature is going to be crucial for larger Backpack adoption, which is my primary motivation, anyway. Discussions about hackage UX or whatnot aren't as relevant to me at all, admittedly, because I don't use hackage for "discoverability", and even if I did -- even if the discoverability can be improved in 2 dozen ways -- it doesn't take away from the core complaint "I now have to upload and maintain ~15x as many .cabal files" when I use Backpack. I think discoverability should be better, I guess, and this could impact it, but it doesn't solve the same problems. You can ultimately pile a lot of stuff into Hackage to make this appear nice, but it'll be tough to convince me that's a better solution than handling it natively in the build tool. As it stands? I honestly can't consider myself taking the time to break up my experiments, all my signatures and whatnot, into mini-.cabal files, each exactly in sync, and play the upload-dance 6 times in a row every time I make minor changes. |
I support this change. Making it is crucial for using Backpack to its full potential (as a partial replacement for type classes). My experience mirrors that of @thoughtpolice — I have a small library (about 300 LOC) and I had to split it into 4 packages. |
I rest my (likely faulty) arguments.
|
@fgaz I need one clarification on this. When a single package exposes multiple public libraries, it should |
@piyush-kurur There will be a |
Create a new syntax for depending on any library of any package. The syntax is build-depends: pkgname:{pkgname, sublibname} -any where the second `pkgname` specifies a dependency on the main unnamed library. Closes haskell#4206.
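Btw I settled on the amazonka:{appstream,elb} >= 2.0 syntax. The main library can be added by using the same name as the package (pkg:{pkg,sublib1,sublib2}). |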
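A minimal sketch of the consumer side of the brace syntax from the commit message above. The package `my-app` is hypothetical, and `amazonka` with sublibraries `appstream` and `elb` is only the thread's running example, not the real layout of that package; `cabal-version: 3.0` is assumed as the first spec version that accepts this form.

```cabal
cabal-version: 3.0
name:          my-app
version:       0.1.0.0

executable my-app
  main-is:          Main.hs
  -- The first name inside the braces repeats the package name, which
  -- selects the main (unnamed) library; appstream and elb are
  -- sublibraries. One version range applies to the amazonka package.
  build-depends:    base,
                    amazonka:{amazonka, appstream, elb} >= 2.0
  default-language: Haskell2010
```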
One of the main reasons for moving into a backpack based interface was to allow advanced users, who know about their hardware, to plug appropriate implementations seamlessly into their cryptographic library. The cabal interface only allowed one public component to be exposed from a package, and backpack needs the ability to separate signatures and modules that use signatures into different components to avoid recursion. Thus the only solution was to split the raaz library into multiple packages. This had the following downsides:
1. Each package requires its own .cabal file, and hence one cannot use the common section across all these very similar packages.
2. Unnecessary pollution of the package name space.
Besides this and other reasons, cabal-install is now being modified to support multiple public components from the same package. In anticipation, we merge all the packages back into a single package.
References:
1. The original issue is described at <haskell/cabal#4206>.
2. The GSoC work is available at <haskell/cabal#5526>.
So
name: pkg
Library
...
Library pkg
...
Is invalid. Is it now already?
On 21 Sep 2018, at 10.09, Francesco Gazzetta ***@***.***> wrote:
Btw I settled on the amazonka:{appstream,elb} >= 2.0 syntax. The main library can be added by using the same name as the package (pkg:{pkg,sublib1,sublib2}).
|
@phadej Sorry, what do you mean? The stanza naming remains the same (ie empty library name for the main library) |
Oh now I understand... I guess it is invalid already because of how internal deps shadow packages but I'll have to check. The other option was to use a reserved word (lib) to represent the main library |
@phadej fwiw, naming the sub-library the same way as the package name has been invalid so far (i.e. cabal rejected it in the past already); and this is also documented in the user's guide accordingly |
Can we extend |
@Ericson2314 yes, definitely; I already talked to @fgaz about this So I'd expect Moreover, I expect that Consequently, the |
Ok. Perfect! |
@hvr What is the proposal to handle a combination |
@piyush-kurur yes, I'd expect that to be a contradiction and hence unsatisfiable; cabal should be able to point out such cases as suspicious warnings. I don't think we need the added complexity of supporting mixing sub-libs from different releases of the same package they belong to, do we? |
@hvr I too would say that cabal should not install the two. Things can get tricky if the version bounds give |
@piyush-kurur @hvr @Ericson2314 @hvr |
...but should you? :-) Is there a legitimate use-case which requires to mix executables provided by the same package from different releases? I'm suggesting to opt for the simpler design unless we can demonstrate that there's a legitimate use-case which justifies incurring the added complexity, both technical in the implementation as well as cognitive (since you'd have an additional degree of freedom to keep in mind when specifying your package dependency spec) Also note that if we go for the simpler scheme now, and in 1-2 years discover a relevant use-case, we can still change the semantics as we have the |
Also, for example, when |
@fgaz and @ezyang, raaz now has a complete backpackised design, so I am betting my money here. Is there something that I need to know to get the documentation to display properly? I have a candidate package for raaz, http://hackage.haskell.org/package/raaz-0.3.0/candidate, for which the documentation is almost non-existent, as almost all the haddock annotations have now moved to the signatures. |
@piyush-kurur cabal new-haddock does generate correct docs for sublibraries, but we don't have an appropriate index page linking to them yet. Haddock and Hackage will have to be modified accordingly, with a subsection or table row for each sublibrary |
@fgaz, it probably is slightly more complicated than that. Consider the case when you want to expose a
So I think there is some issue here that needs discussion. If this is not the right ticket for it I can open another one. |
#5526 is in! 🎉 , but please remember that in the final 3.0 release, sublibraries will NOT be public by default: you won't be able to depend on them from outside of the package unless you put |
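For illustration, a sketch of what opting a sublibrary in might look like on the providing side. It assumes the field is spelled `visibility: public`, which is how the Cabal user guide documents it for `cabal-version: 3.0` and later as far as I know; the proposal text in this issue calls the field `public` instead. The module names and the `internal-utils` library are hypothetical.

```cabal
cabal-version: 3.0
name:          amazonka
version:       2.0

-- Main (unnamed) library: visible to other packages as before.
library
  exposed-modules:  Network.AWS
  build-depends:    base
  default-language: Haskell2010

-- Sublibrary explicitly marked public, so other packages may
-- depend on it.
library appstream
  visibility:       public
  exposed-modules:  Network.AWS.AppStream
  build-depends:    base
  default-language: Haskell2010

-- Sublibrary left at the default: usable only inside this package.
library internal-utils
  exposed-modules:  Network.AWS.Internal
  build-depends:    base
  default-language: Haskell2010
```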
Motivation. A common pattern with large scale Haskell projects is to have a large number of tightly-coupled packages that are released in lockstep. One notable example is amazonka; as pointed out in #4155 (comment) every release involves the lockstep release of 89 packages. Here, the tension between the two uses of packages is clearly on display:
1. A package is a unit of code that can be built independently. amazonka is split into lots of small packages instead of one monolithic package so that end-users can pick and choose what code they actually depend on, rather than bringing one gigantic mega-library as a dependency of the library.
2. A package is the mechanism for distribution, something that is ascribed a version, author, etc. amazonka is a tightly coupled series of libraries with a common author, and so it makes sense that they want to be distributed together.
The concerns of (1) have overridden the concerns of (2): amazonka is split into small packages which is nice for end-users, but means that the package maintainer needs to upload 89 packages whenever they need to do a new version.
The way to solve this problem is to split apart (1) and (2) into different units. The package should remain the mechanism for distribution, but a package itself should contain multiple libraries, which are independent units of code that can be built separately.
In the Cabal 1.25 cycle, we've added two features which have steadily moved in the direction of making multiple public libraries possible:
Convenience libraries (Add support for convenience libraries #269) mean that we already have Cabal-file level syntax support for defining multiple libraries. Note that these libraries are only accessible inside the package, so to a certain extent, the only thing that would need to be changed is making it possible to refer to these libraries.
Per-component build (Per-component interface for Setup #3064) makes it easy to build each internal library of a package separately, without having to reconfigure a package and then rebuild. This means that, from the perspective of new-build, building multiple libraries from a package is "just as separate" as building multiple packages, and I imagine Stack would be interested in taking advantage of this Setup.hs feature.
So the time is ripe for multiple public libraries.
Proposal.
First off, I want to say that for the 2.0 release cycle, I do not think
we should add support for the feature below. However, what I do want to
do is make sure that the design for convenience libraries (which is new)
is forwards compatible with this (see also #4155)
We propose the following syntactic extensions to a Cabal file:
`build-depends` shall accept the form `pkgname:libname` wherever `pkgname` was previously accepted. Thus, the following syntax is now supported: `amazonka` refers to the "public" library of amazonka (the contents of the `library` stanza with no name), while `amazonka:appstream` refers to `library appstream` inside `amazonka`. A version range associated with a sub-library dependency is a version constraint on the package containing that dependency; e.g., `amazonka:appstream >= 2.0` will force us to pick a version of the `amazonka` package that is greater than or equal to 2.0.
A new field for `library` stanzas, `public`, which indicates whether or not the library is available to be depended upon. By default, sub-libraries are NOT public.
Next, we need the following modification to the `Setup.hs` interface:
The `--dependency` flag previously took `pkgname=componentid`; we now augment this to accept strings of the form `pkgname:libname=componentid`, specifying what component should be used for the `libname` of `pkgname`.
Explanation.
The primary problem is determining a syntax and semantics for dependencies on sub-libraries of a package. This is actually a bit of a tricky problem, because the `build-depends` field historically serves two purposes: (1) it specifies what libraries are brought into scope, and (2) it specifies version constraints on the packages that we want to bring in. The obvious syntax uses a colon separator between package name and library name, as in `amazonka:appstream`.
But we now have to consider: what is the semantics of a version range applied to one of these internal libraries? E.g., a `build-depends` might put `>= 2.0` on one sub-library of `amazonka` and `>= 3.0` on another.
Does having separate version ranges even make sense? Because the point of putting all libraries in the same package is to ensure that they are tightly coupled, it doesn't make sense to consider the version range on a library; only on a package. So such a `build-depends` should be considered as levying the combined constraint `>= 2.0 && >= 3.0` on the `amazonka` package as a whole.
This causes a "syntax" problem where, if you want to depend only on sub-libraries of a package, there is no obvious place to put the version bound on the entire package itself. One way to solve this problem is to add support for the following syntax: `amazonka:{appstream,elb} >= 2.0`; now there is an obvious place to put the version range.
Downsides. Prior to solving #3732, there will be some loss of expressivity if a number of packages are combined into a single package: cabal-install's dependency solver will solve for the dependencies of ALL the libraries (even if you're only actually interested in using some of them). This is because the solver always solves for all the components of a package, whereas it won't solve for the dependencies of a package that you don't depend on.
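The version-range semantics from the Explanation can be sketched with two alternative `build-depends` fragments, written in the syntax this proposal describes. The choice of `appstream` and `elb` as the two sublibraries is an assumption for illustration; the fragments are alternatives and are not meant to appear in the same stanza.

```cabal
-- Alternative 1: separate ranges on two sublibraries of one package.
-- Under this proposal these are not per-library versions; together
-- they constrain the amazonka package to >= 2.0 && >= 3.0.
build-depends: amazonka:appstream >= 2.0,
               amazonka:elb       >= 3.0

-- Alternative 2: the brace form gives the package-wide range one
-- obvious place to live.
build-depends: amazonka:{appstream, elb} >= 2.0
```

Writing the range once on the brace form avoids suggesting that sublibraries of the same package could be picked from different releases.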
Prior art. This is a redux of #2716. Here is what has changed since then:
The motivation has been substantially improved; last time the motivation involved some Backpack/internal code readability hand-waving; now we specifically identify some existing, lockstep packages which would benefit from this. The proposal has nothing to do with Backpack and stands alone.
The previous proposal suggested use of dashes for namespace separation; this proposal uses colons, and the fact that the package must be explicitly specified in `build-depends` means that it is easy to translate a new-style `build-depends` into an old-style list of `Dependency`, which means tooling keeps working.
The previous proposal attempted to be backwards compatible. This proposal is not: you'll need a sufficiently recent version of the Cabal library to work with it.
CC @mgsloan, @snoyberg, @hvr, @Ericson2314, @23Skidoo, @dcoutts, @edsko