Standard library reform: Split base #47
Conversation
I greatly appreciate the amount of work put into the proposal, and I'd love to fall in love with it. This is an awesome idea, but IMHO (and I hate to say it!) the timing is wrong: too late or too early. In our current state we don't have enough resources. Firstly, I'm not confident that the disruption on the way to the eventual goal is sustainable for the ecosystem. Indeed, we Haskellers tend to brush off such temporary concerns for the greater good of eternal shining. However, even if we accept such a school of thought, I claim that even skipping ahead several years of churn, we would not be able to maintain the split `base`. Currently we support only one major version of `base`. The thing is that we don't have the resources for any N > 1. So in practice, once a couple of burnt-out maintainers leave the frustrated community, we'll get back to N = 1 and the status quo. With regards to …
❤️ this is still the most positive feedback I have heard from you on this sort of thing, so yes, I will take it! Thank you.
Well, there are no actual breaking changes proposed here, …
OK, a few things to take apart.
So I too think front-loading the CLC bike-shedding is silly and will waste CLC time! But the feedback I got in haskell/core-libraries-committee#105 was repeatedly "let's not do anything till we have a design of where we want to end up". I would much rather do the … If there is consensus that it does make sense to explore the behind-the-scenes stuff first after all, I will gladly change this. Similarly, I would much rather just straight up commit to the idea that there should be an IO-free, 100% portable standard library that the IO-full ones build upon. If we can commit to that, I would also delay wrestling with what the Browser and WASI IO interfaces should look like until later --- in fact we can simply experiment with designs on top of the IO-free design and then have the CLC ratify something that is implemented, versus sketching something tentative out from a blank slate. If there is consensus that that too sounds good --- also, in the past I got feedback that having too many libraries would piss off users --- I will also gladly rearrange the steps to reflect this. Finally (and I should put this in the proposal), I know this proposal is a big lift --- not because it is technically challenging, but because it is administratively challenging --- but I think that can be a good thing. This can be the marquee project that the Haskell Foundation takes on: something super user-visible and impactful that can drive a lot of interest and fundraising. As the saying goes, "you have to spend money to make money": I think if the Haskell Foundation can rise to the occasion and pull this off, we'll grow our resources and administrative capacity to meet the needs, and with such momentum be in a better place to tackle whatever comes next.
Regarding the concern of CI complexity and testing: we have a full-time engineer working on that already, employed by the HF. AFAIU their scope currently covers fixing GHC tests and CI stability. However, this work seems very much in alignment with the work required by this proposal.
Very nice and thoughtful write-up, thanks a lot! I hope it'll happen, one way or another.
Thank you for this thoughtful writeup John. I appreciate it. It is not easy to navigate the best path given the differing needs of our users, and limited resources. But debating and (I hope) agreeing a North Star destination would be really helpful, even if it takes us a while to get there. So I'd argue for not getting enmeshed too quickly in "we can't afford it". (Having said which, "we can never afford it" is a reasonable argument. e.g. It's a waste of time to debate which particular exoplanet we want to colonise when we have no feasible way to get to any of them.)
I wonder if you could elaborate the proposal to explain why splitting base will help? After all, if we become 100% clear about what is …
@hasufell, thanks for the mention. :) Yes, I'm here, and in fact you could say my current mission is to increase the GHC team's bandwidth. I trust that will have far-reaching positive effects on topics such as this one. This is definitely an interesting topic and a cool development. I look forward to watching it progress. I echo Simon's request for more information about how splitting base will help. As a matter of fact, I do have a couple of guesses, but they are only guesses! It would be good to see the reasons fleshed out in the proposal.
Good point. Splitting base is done to address Problem 4 without the maintenance-burden explosion @Bodigrim warns of. However, this is indeed not yet described well. I will update the "New Goal: Split Base" section to make the connection between these three things (Problem 4, maintenance cost control, and split base) clear.
There is still …
I think that is a conversation better to have if and when we un-expose items --- not now. Any time something is moved to … The only material difference is when stuff in the closure of exposed things becomes internal (i.e. moved to …).
I don't agree. We're trying to hash out how future communication and collaboration are going to look. We're not making decisions on specific items or modules (that should all be postponed). I feel these concerns are being dismissed with "but that problem already exists", which I personally find insufficient. A further split will make collaboration more challenging. I think that should be absolutely clear to all parties. Postponing the discussion about these collaboration practices until someone writes a CLC proposal is not going to go well. Parties will be overinvested and it's going to blow up. I do expect that we have absolute clarity about the collaboration process going forward, even if it touches already-existing problems.
I say this not merely to punt, but also because it genuinely depends on what the change in question is. I would expect most things that are truly internal (not reexported in `base`) … Indeed, my experience with GHC itself is that micro-benchmarks that do not obviously relate to real-world concerns are incredibly annoying! On the flip side, the long-term goal is to eventually have … In that case, we should do performance tests, precisely because even though the implementation is GHC-specific, the interface is implementation-agnostic. It therefore makes sense to test it as a natural "narrow waist" between the wide expanse of miscellaneous GHC internals and the wide expanse of implementation-agnostic code in … It is also not a coincidence that many of these "input interfaces" from the Haskell implementation to …
That's what I proposed. Was that not clear? Imagine:
This now requires a performance regression test to be added for … This was my proposal after @Bodigrim raised further concerns. Is that unreasonable?
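For concreteness, a performance regression test of the kind being proposed could be sketched with the tasty-bench library. This is my hedged sketch, not code from the discussion: `reverseInter` is a hypothetical stand-in for whichever re-exported function is being guarded, and its body here is just a placeholder.

```haskell
-- Sketch of a performance regression test using tasty-bench.
-- `reverseInter` is hypothetical; the real function would live in the
-- internal package and be re-exported from base.
module Main (main) where

import Test.Tasty.Bench (bench, defaultMain, nf)

-- Placeholder implementation so the sketch is self-contained.
reverseInter :: [Int] -> [Int]
reverseInter = reverse

main :: IO ()
main = defaultMain
  [ bench "reverseInter/1000" $ nf reverseInter [1 .. 1000]
  ]
```

tasty-bench can record a baseline and fail the run when a benchmark regresses past a threshold (its `--baseline` and `--fail-if-slower` options), which is what would turn such a benchmark into a CI gate rather than a manual check.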
As a more-removed party from this discussion, I have to say I sympathise with e.g. @hasufell's concerns here. While it's true that performance changes might happen accidentally in base today (because there is no comprehensive CI testing perf) -- and that the … I don't have a good concrete proposal for how to close this gap. @hasufell's suggested added perf tests are a fine idea in theory, but I see how they could become a blocker in practice. But, lacking a good concrete proposal, let me put forth a bad abstract one. (It's bad because it's hard to action, not a bad idea.) Proposed: Work to increase trust. The key problem around which the struggle in this thread has hinged is what I perceive to be a low level of trust from the CLC toward the GHC developers. Furthermore, I think this low level of trust is based on concrete evidence, not on arbitrary prejudice. To my eyes, it boils down to a difference in goals and values: the CLC's remit (and the individuals on the committee) is about maintaining an excellent … I'm not suggesting that anyone is acting (at all!) in bad faith or that anyone is making poor decisions. Instead, I'm suggesting that different people have different levels of tolerance for breakage and different values around how to maintain a language and its library. Diversity on this point is essential for Haskell -- too much in any one direction would destroy what has made Haskell great. All that said, I think it's important to highlight this difference in approach so that everyone can work toward structures that can help to bridge this gap. @bgamari's automated check for API changes and @hasufell's suggested performance tests serve this goal very nicely. That is, they replace trust -- hard to achieve, hard to maintain, hard to transfer -- with verification. (Which is, I suppose, also hard to achieve and hard to maintain, but perhaps more in this group's wheelhouse.)
Perhaps another high-level point worth making here: no one is going to get everything they want. Which is probably also best for Haskell, given the forces tugging in different directions here.
OK. I am sorry, I did miss which item you intended to test. Testing … There is a larger point here: I think the overall social goal of free and open source software is to organize code based on what it is / what it does, not who controls it. In other words, it is to defeat Conway's law. I started writing this before he posted, but it actually dovetails with what @goldfirere says very nicely. It is precisely in low-trust situations, which is indeed what we have here, that the urge to organize code by ownership is strongest --- to hunker down in one's fortress, whether that is a repo, or a directory within a repo, or anything else. But this is not a good solution: firstly because the low-trust problem is in and of itself worth solving, as @goldfirere points out, but also because a bad architecture based on groups of people rather than on engineering concerns intrinsic to the task at hand is a cost not worth paying, and one that itself exacerbates trust issues ---- the boundaries between Conway's-law components rarely have simple, well-defined interfaces that actually enable good delegation! Going back to the …
Also, suppose we are agreeing to performance or other testing as a trust-building exercise. Per the original condition 1, based purely on code motion, … Per my adjusted condition 1, since …
Right, and so for any version of that to be workable, per the above example, I think it is crucial that we don't penalize reexported moves but allow testing them directly, and don't need to additionally test other stuff whose use of truly internal stuff merely "factors through" the reexported functions. Going broad again, we should always strive to make choices that get us closer to the "narrow waist" ideal, where we have a minimal, scoped, and standardized interface and contract between …
That's right. And I'm not proposing any change whatsoever there.
Can you elaborate on what would change from the status quo? We are already committed to consulting the CLC on user-visible changes to the `base` API. There are already many hidden modules in `base`. Likewise … All of this is the status quo. Perhaps you mean that members of the CLC might pro-actively monitor the GHC repository (in which … Does that make sense?
Fully in agreement.
My concerns are two-fold:
Hyrum's Law states that people will abuse … That's basically why I'm opposed to the original proposal. Re-exporting entire … Let me re-iterate: unless people are actively disincentivised from using … I keep saying that re-exports are bad. It's true that this is not a completely new problem, but that does not give us leeway to aggravate it recklessly.
Reading the HF TWG meeting notes, there seems to be a rush to ram the proposal through ASAP ("It will be worrying if we don't get this done by next meeting..."), which leaves me puzzled and bewildered, as I do not possess such a notion of urgency. Paraphrasing a repeated argument here about re-exports, "this is not a new problem". Unless there is a burning need to tick the box soon, I would strongly suggest spending as much time as needed to think at least a few steps ahead. It seems likely to me that a proper solution requires more control mechanisms than are available right now. It seems wise to design and develop them before, not after.
I am sympathetic to this argument. We could easily implement a new warning mechanism, …
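One plausible building block already exists in GHC: the module-level WARNING pragma, which makes the compiler emit a warning at every site that imports the module. The sketch below is mine, not part of the comment above, and the module name, message, and function are hypothetical.

```haskell
-- Hypothetical internal module. A module-level WARNING pragma makes GHC
-- warn at every import site, discouraging casual use of internals.
module GHC.Internal.Unstable
  {-# WARNING "This is an internal module; its API may change without notice." #-}
  ( bumpInternal
  ) where

-- A stand-in definition so the sketch is self-contained.
bumpInternal :: Int -> Int
bumpInternal = (+ 1)
```

More recent GHC versions additionally support named warning categories on WARNING pragmas, which would let users silence or escalate exactly this class of warning without touching unrelated ones.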
This sounds to me like a technical problem in need of a technical solution. I can think of a few possible approaches:
This is certainly true in principle, but I do not believe that it bears on the proposal at hand. In particular, GHC developers have no interest in adding new instances for exposed types without CLC approval. Most of …
As Simon points out above, there are currently several pieces of work, some of which are slated to ship with the upcoming GHC 9.8 release, which are blocked on this issue. At this point I am quite worried that, without some sense of urgency, we will need to further delay one or more of these items, as the fork date is quickly approaching.
Hyrum's-Law-style reasoning I think is good, and is indeed why splitting base, as opposed to merely documenting modules, is the only thing that could possibly work --- users will use the entirety of any library they depend on, but there is a chance of getting them to not use certain libraries entirely. So I remain convinced that splitting … Perhaps you raise some good points that it is not sufficient, but that's OK! Between now and "a couple of years' time", we can implement things, and as @bgamari says, many of these constituent problems sound like they are amenable to excellent technical solutions:
New tooling / language-feature work to allow us to improve the architecture and stability of our ecosystem is extremely high value, both for …
In which sense are they "slated to ship with the upcoming GHC 9.8" if they are not approved? Or if they are fully approved by all relevant parties, what blocks them? Are you referring to exception backtraces? The PR is still labelled as "needs revision" and was not deemed urgent for three years, during which the GHC Steering Committee has been enjoying its freedom to bikeshed it. I don't believe it suddenly became urgent overnight.
This is indeed the best example.
As far as I am aware, that label does not reflect the current state of the proposal. I have expressed to the committee that, from my perspective, the proposal is finished, and I have not yet heard any objections. I think that the time we took to refine the proposal was well spent; the proposal is much better than it was two years ago. However, it has now been converged for two weeks, and I do hope that we can close this chapter soon.
The issue addressed by this proposal has been a major thorn in the side of industrial users for as long as Haskell has had commercial adoption. Every few months we see yet another call to paper over this issue with …
I see what I see:
Why you would claim it being "slated for GHC 9.8" is beyond my understanding, and it makes a weak argument for requiring urgent changes to …
It really sounds like, at the end of the day, as long as there are tools to prevent internals code from leaking into base non-explicitly, or at least a commitment to building such tools, most of the concerns are resolved, right? I believe the reason it’s a blocker is that basic support for new GHC features is historically exposed in the base library, and this proposal provides a path to not having it be so from the get-go?
Would it be feasible to prohibit Hackage uploads for packages depending on …?
@Bodigrim, with every release, the GHC project has a list of items which we strive to finish and ship. The implementation of the backtrace exception proposal is one of the items which we hope to be a headline feature of GHC 9.8.
I cannot speak for the whole CLC, but this is true about myself. I have little trust in GHC developers to manage changes, and this distrust is not hypothetical or speculative. Our skirmish with @bgamari: "there is a huge change, which the GHC SC has not yet approved after three years, but it's urgent, and it never crossed our minds to ask the CLC about it until the last two weeks, but it's URGENT, so let's split …" That said, I'm afraid I've overinvested in this discussion and have no more bandwidth. Happy to continue the discussion at Zurihac and later.
I agree that we should not make bad decisions through rushing, but I am very keen to use our current momentum to move towards decisions, rather than for us to get worn out and discouraged and end up doing nothing and doing it all again in a year. We have made a lot of progress. We agree about the broad goals (stability of … To that, here are three further thoughts.

ghc-experimental

@Bodigrim argues that we should split …
Other things being equal, two is better than three, but @Bodigrim makes good arguments that the extra overheads are worth it. Specifically:
In short I'm persuaded. Let's have `ghc-experimental`.

Re-exports

My hope is that we can resolve the concerns about re-exports.
I agree with your diagnosis, but I'd like to suggest that the proposed split above makes these things much better, not worse. As you say, the current situation is problematic. But with the above split, the answer for tooling (e.g. HLS), and for users becomes simple:
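For concreteness, the wholesale re-export mechanism being debated can be sketched as follows. This is a hedged sketch of mine, not code from the proposal, and both module names are hypothetical.

```haskell
-- Hypothetical sketch: a public module in base that re-exports an
-- implementation module from the internal package wholesale, adding no
-- code of its own. Users only ever import the stable public name.
module Data.List.Stable
  ( module GHC.Internal.Data.List
  ) where

import GHC.Internal.Data.List
```

The design point is that the public module is the only supported surface: the internal module can move or change as long as the re-exported API (including performance) is preserved.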
This is a good point. But to me, the instances are part of the API. If … On the other hand, if …

CI support

At the moment we have very little support for checking the stability of the `base` API.
These are substantial improvements over the status quo which will, I hope, improve the stability of the `base` API.
@Bodigrim, for better or worse, there is a long history of GHC proposals touching `base`. I am not opposed to the CLC weighing in on such matters; far from it. However, whether this change requires a separate CLC proposal is, in light of history, far from obvious. Going forward, it would be useful to explicitly lay out the expected interaction between the GHC and CLC proposal processes to prevent this sort of ambiguity in the future.
@Bodigrim Do you have any evidence for this? The only package I know of that fully consists of internals no one should use is …
@tek while I agree that we should discourage users from using GHC internals, I think requiring CLC approval for such packages is wrong. Why should the CLC care what shenanigans you want to use …?
I think that’s a dangerous position to take. If someone is determined enough to write the correct CPP macros / conditional compilation to have stable code make use of some internals, that’s great. If it’s not written that way, they can go pound sand when their code breaks.
This discussion is really about making sure base doesn’t silently export unstable stuff, right? We aren’t talking about dictating how developers choose to use stable vs unstable APIs, right? Just about making base robustly stable while still allowing GHC to evolve?
Wrt performance

I seem to not have done a good job at explaining this. My point was maybe a bit finicky/fine-grained.
GHC-internal functions that get unexposed from base are no longer under CLC purview, but may still affect the performance of the public base API. The CLC will lose an (indeed) accidental advantage, so to speak. This will change the status quo, no matter whether this problem already exists elsewhere. And this is really just a signal of a deeper issue that underlies the CLC charter:
Here it says very clearly that GHC developers cannot maintain things freely (even internals) if they affect performance. The only proper way to know whether performance is affected is to have performance regression tests. Exercising common sense is not as good as a test suite, especially if there are already reservations about the way changes are executed. What I'd like to figure out is whether we can approach this problem gradually as well. I also want to note that I think the charter needs adjustment, because here it only talks about …

Touch points

So, what you're proposing wrt CI already seems quite nice to me. We need to automate as many of those concerns of the charter as possible, to minimize the use of "common sense" or "manual evaluation". This is in both parties' interests, because it will reduce the required explicit communication (which is the most costly). E.g. things of interest may be:
Much of this could be statically checked/generated. It could allow both explicit monitoring by CLC, as well as aid GHC developers in deciding when to explicitly consult CLC. You asked:
Yeah, absolutely not (for me). I have neither the bandwidth nor the expertise. The CLC shall be involved when it's relevant, but the interpretation of "relevant" should be very strict and easy to validate. The remaining question might be how the CLC can actively validate that GHC HQ has not accidentally missed something (other than by monitoring every single MR). I think this wouldn't be hard as part of a pre-release effort that lays out the data --- executed tests, API diffs, generated code-closure information, etc. --- and allows the CLC some time to review. If you ask me who's going to do all that work (e.g. wrt performance tests)... I really don't know. We will have to figure out whether this can be done gradually as well. But really, we seem to agree that the idea is to reduce explicit communication and reliance on manual evaluation, and instead rely on data, static checks, executed tests, etc. Does this sound grandiose? I don't know... to me this sounds pretty Haskell! NB: Knowing when to involve someone is really hard... I know this from the tooling side, because things are extremely interconnected, and sometimes almost invisibly so.
I have had a lot of interaction with some members of Team GHC over the past 12 months or so, and I am confident that the issue of stability has firmly registered on their radar as something that is critical for the success of Haskell. That said, I can understand why others would remain yet to be convinced. What would convincing look like in these circumstances? Doubtless it would involve "walking the walk" rather than just "talking the talk". But which walk exactly? What initial steps should we take here to build trust?
Thanks @hasufell, that is really helpful. We seem to agree about a lot.
In the plan I suggest above (point 3), no functions whatsoever will be un-exposed from `base`. Moreover, as things stand, even for exposed functions a CLC proposal is only needed if there is a material change (including performance); and that will remain unchanged. So I'm not sure what advantage (however accidental) the CLC will lose. I don't want the CLC to lose any advantages! I still think the status quo holds pretty much unchanged.
We agree! Even with 100% trust, loads of common sense, and huge expertise, there is no way to be sure by guesswork that there won't be performance regressions. The only thing that will work is an agreed set of performance tests. I would welcome that, and (another thing we agree on) they should apply to changes in … But can we agree that …
I don't think your plan (call it "CI-for-base") is grandiose. There is indeed Real Work to do here. But some bits of it are under way already; and I think we could invite the Haskell Foundation to consider helping resource some of the rest, so that it doesn't just stall. Meanwhile, I plead that we don't stall our …
This is very much the kind of thing that we'd like to support, and also the kind of thing that we actually have this proposal process to hash out. Let's talk with @chreekat at Zurihac, see what bits are missing, and see if we can get a plan together to support it.
This is getting tangential, and I must apologise for broad strokes.
There is a difference between "having it on a radar" and "sharing a value". Indeed, I was under the impression that repeated calls for stability had not gone unheard, but the run-up to the GHC 9.6 release demonstrated significant deficiencies of the control environment in almost every area. And yet we are not having a heated discussion here (or better: decisive action) to improve the control environment; instead GHC developers are enthusiastic to innovate even faster. There is no merit in proclaiming stability as your value if, in every choice between stability and innovation, you prioritise innovation. And also, there is absolutely nothing wrong with having different values, but it is harmful to pretend to have the same.
Clearly, working on CI-for-base also "won't make anything worse, and will make some important things better". But Simon demonstrates his system of values, preferring to spend the limited resources of our very small community on innovation, not on stability.
We can start with a talk: a public statement from GHC developers about the place of stability in their system of values would be helpful. If GHC developers not just have stability on their radar as a nuisance to innovation, but share it as a value, it would be helpful to exercise it more often by prioritising work on stability at the expense of other directions. Back to the topic: @simonpj I'm content with #47 (comment), and I'm truly grateful to you for leading the discussion.
Thanks @Bodigrim. It sounds as if we are close to having a plan that we can all support -- not just reluctantly but genuinely. Perhaps I'll try to redraft some of the above posts so we have the plan in one place to review. We expect to meet at Zurihac, which will help.
I am a GHC developer, and I definitely share the goal of `base` stability. I also want GHC to be able to innovate. Not just for selfish reasons (although those too, of course) -- I think innovation is a foundational part of Haskell's attractiveness and culture. Finally, I believe that innovation in GHC is entirely compatible with stability of `base`. I accept that actions speak louder than words, but we have to start with words. And while I can only speak for myself, I believe that other members of the GHC team agree with, and will support in practical ways, the statements I make above. Resourcing is always an issue. Many of us (GHC devs, CLC members) are volunteers with other day jobs. That places limitations on what we can achieve: none of us can write a blank cheque. But I don't see resourcing as the primary constraint here; we can get a long way just by engaging in constructive dialogue and deepening trust -- as I believe we have been doing in the last few weeks.
Issues with the standard library are holding back the Haskell ecosystem. The problems and solutions are multifaceted, and so the Haskell Foundation in its "umbrella organization" capacity is uniquely suited to coordinate the fixing of them.
Proposed here is a first step in this process: splitting base.
Future work addressing a larger set of goals is also included to situate splitting base in context.
Rendered