Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split parser & types out into different packages #328

Closed
Profpatsch opened this issue Jun 29, 2018 · 36 comments
Closed

Split parser & types out into different packages #328

Profpatsch opened this issue Jun 29, 2018 · 36 comments
Labels
enhancement Betterness, without new explicit feature possibilities

Comments

@Profpatsch
Copy link
Contributor

Profpatsch commented Jun 29, 2018

With the implementation of an evaluator and various utilities picking up steam, the dependency closure has increased substantially, which means the possibility of the package building e.g. on nixpkgs master has decreased considerably.

Growth of direct dependencies:

0.2.0: ansi-wl-pprint, base (>=4.6 && <5), containers, data-fix, hnix, parsec, parsers (>=0.10), text, transformers, trifecta, unordered-containers

0.3.2: ansi-wl-pprint, base (>=4.6 && <5), containers, data-fix, deepseq, hnix, parsec, parsers (>=0.10), semigroups (==0.18.*), text, transformers, trifecta, unordered-containers

0.4.0: ansi-wl-pprint, base (>=4.9 && <5), containers, data-fix, deepseq, deriving-compat (==0.3.), hnix, parsec, parsers (>=0.10), regex-tdfa, regex-tdfa-text, semigroups (==0.18.), text, transformers, trifecta, unordered-containers

0.5.0: aeson, ansi-wl-pprint, array (>=0.4 && <0.6), base (>=4.9 && <5), base16-bytestring, binary, bytestring, containers, cryptohash-md5, cryptohash-sha1, cryptohash-sha256, cryptohash-sha512, data-fix, deepseq (>=1.4.2 && <1.5), deriving-compat (>=0.3 && <0.5), directory, exceptions, filepath, hashable (>=1.2.4 && <1.3), hashing, haskeline, hnix, http-client, http-client-tls, http-types, lens-family, lens-family-core, lens-family-th, logict, megaparsec (>=6.0 && <6.6), monadlist, mtl, optparse-applicative, pretty-show, process, regex-tdfa, regex-tdfa-text, repline, scientific, semigroups (==0.18.*), serialise, split, syb, template-haskell, text, these, time, transformers, unix, unordered-containers (>=0.2.9 && <0.3), vector, xml

I suggest splitting the parser and maybe the types out into their own packages, while keeping all three here in one git repository. This way tools that depend only on the parser and don’t use the evaluator still work if an exclusive evaluator dependency breaks.

@jwiegley
Copy link
Member

We will very likely perform such a split in the near future, after nixpkgs evaluates. If we do it too early, before there will be any real "customers", then we'd just be slowing down development for no real gain.

@jwiegley jwiegley added enhancement Betterness, without new explicit feature possibilities blocked Waiting on other work to be completed labels Sep 8, 2018
@jwiegley
Copy link
Member

jwiegley commented Sep 8, 2018

Marking this as blocked until we've completed initial development (i.e., we have a full, functional replacement for nix-instantiate). Then development can slow and we can look at architectural restructuring.

@jwiegley jwiegley removed the blocked Waiting on other work to be completed label Sep 13, 2018
@Ericson2314
Copy link
Member

Ericson2314 commented Oct 6, 2018

CC @taktoa who also wrote a Nix parser using megaparsec. If we can fold that and @peti's stuff used by cabal2nix into one surface syntax library, I think would be a good demonstration of code reuse.

@Ericson2314
Copy link
Member

@Ericson2314
Copy link
Member

https://github.com/peti/language-nix is @peti's surface syntax stuff, for the record.

@jwiegley
Copy link
Member

jwiegley commented Oct 7, 2018

We have a complete parser now, with a round-tripping pretty printer. Why would we need to fold in another parser?

@Ericson2314
Copy link
Member

Ericson2314 commented Oct 7, 2018

There isn't. I'm thinking more language-nix deprecated, becomes a shim, and/or has a major version bump and becomes the name under which hnix's surface syntax stuff is published.

(I suppose I am contuously bugged but the amount of duplication on hackage and would like to reverse the trend anywhere there is no design beef. And cabal2nix would be a fine initial "client" of hnix's work given it's wide use and status.)

@jmackie
Copy link

jmackie commented May 5, 2019

Not sure what the status of this is, but I would really appreciate the expression types and parser living in a separate (lightweight) package. There have been a few occasions when I've needed only these things, but couldn't justify pulling in the entirety of hnix in order to get them. 👍

@jwiegley
Copy link
Member

At this point, I think things are mature enough to consider this split.

@jmackie When you say the expression types, do you mean just the types and not the evaluator or the implementation of builtins?

@Ericson2314
Copy link
Member

I would split things up, and also combine repos with hnix-store. Things are more mature, and there are more ways to fake loading multiple packages into ghci than before. (Beside stack's tricks, Obelisk, ghcide also do it, there's also progress on proper support in GHC.)

@sorki
Copy link
Member

sorki commented Apr 25, 2020

I'm now using https://github.com/sorki/hnix-overlay for integration work between hnix-store and hnix - it's pretty convenient so far and I'm not sure if it's actually worth merging these two as there are two more repositories needed - nix-derivation and nix-narinfo.

Splitting hnix would be nice thought, similar to hnix-store multiproject repository.

I might take a shot at this after the integration as I'm slowly becoming familiar with hnix codebase.

@Ericson2314
Copy link
Member

I would just merge in those other repos too? To me the critical thing is I can easily imagine making changes that effect many or all the libraries; it's much easier to synchronously commit to all of them with a monorepo.

We can re-split later if things become super-stabilized.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 16, 2020

Yes, the current evolve process comparison to super-stabilized is the key. Indeed it is easier to evolve, migrate & coordinate closely intertwined projects under one roof and then split them back once the APIs indeed stabilize into proper final form.

@Ericson2314
Copy link
Member

The GSOC I'm mentoring this summer is putting us on the path to true multi-package ghci/ghcide, and going great (props to @fendor!) so I am quite optimistic about splitting things into small packages without harming developer efficiency too.

@jwiegley
Copy link
Member

I'm happy with breaking up hnix into multiple repositories at this point.

@sjakobi
Copy link
Member

sjakobi commented Jun 18, 2020

A split could also be useful for dhall-nix, which only uses the modules Nix.Atoms, Nix.Expr and Nix.Pretty (for prettyNix). So if Nix.Pretty could end up in the core package, that would be nice. I see that Nix.Pretty currently depends on Nix.Normal for dethunk – maybe that could be untangled?!

For now, here's a module graph, produced with

$ find src/ -name '*.hs' | xargs graphmod -q -r Repl -r Main | dot -Tpng -Gdpi=600 > hnix.png

hnix

It could probably be tweaked further, but I'm a graphmod and dot noob. :/


EDIT: BTW, here's somewhat similar discussion about the dhall package: dhall-lang/dhall-haskell#1102

@sjakobi
Copy link
Member

sjakobi commented Jun 18, 2020

I'm happy with breaking up hnix into multiple repositories at this point.

Just a question for clarification: We're discussing splitting hnix into multiple packages that would still live in the same repo, right?

@Ericson2314
Copy link
Member

Yes, and furthermore I hope the hnix-store packages in the same repos.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 18, 2020

The solution to the closure size - is to do clean-up and refactoring.

For example, the CI of our project should check for unused dependencies and imports - that is all, really.

The closure size argument is contrary to the Haskell language, to the Haskell ecosystem cornerstones of modularity, and to the UNIX philosophy.


Main argument oh the header post:

" the dependency closure has increased substantially, which means the possibility of the package building e.g. on nixpkgs master has decreased considerably."

Is not true currently.

HNix in Nixpkgs, it builds fine.


Nixpkgs needs to have better package options management - that would decrease all closures.


I would be happy to have consolidated effort and merge-in the HNix-store for maintenance and refactoring of both projects at the same time. We need refactoring before splitting.

After John wrote the package there were no real refactoring of it.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 18, 2020

#328 (comment)

Yes. I am for refactoring/modularization of source files first.

Source files sometimes are >1000 lines long, modules can be splitted-up into smaller modules hierarchy, and reexports if we wish so.

I prefer to make module hierarchy more granular and have refactoring of the basic data types before any splitting into separate packages, it is easier.

@sjakobi
Copy link
Member

sjakobi commented Jun 18, 2020

Yes. I am for refactoring/modularization of source files first.

Source files sometimes are >1000 lines long, modules can be splitted-up into smaller modules hierarchy, and reexports if we wish so.

I prefer to make module hierarchy more granular and have refactoring of the basic data types before any splitting into separate packages, it is easier.

What's the motivation for splitting the modules into even more modules? I think that could also make it harder to navigate the project.

My understanding of the package-split idea is that it would be easier for users to use the parts of hnix they're interested in, without pulling in dependencies for other parts. That seems like a nice value proposition to me.

@Ericson2314
Copy link
Member

Ericson2314 commented Jun 18, 2020

@Anton-Latukha I don't think closure size is bad per se.

The real thing is people should only pay for what they use. You can thinking of a packages of horn clauses: dep-interface-0 -> dep-interface-1 .... -> dep-interface-n -> new-interface. they are composed into the whole dependency graph. A good package is one where an item in new-interface "typically" depends on as much of dep-interface-0...dep-interface-n---i.e. i.e. whatever sub-interface of the package you actually use, you don't end up paying for dependencies you don't need. If you think of dependencies as capabilities or privlages, this relates also relates to the "principle of least privilege".

This is a nice design principle because it can be applied locally and concurrently to individual packages with the result being a a nice overall-ecosystem---following it constitutes distributed algorithm for ecosystem curation.

@sjakobi
Copy link
Member

sjakobi commented Jun 18, 2020

FWIW, some module splitting might be necessary to help untangle the module graph a bit. It's just not necessarily a good idea per se, IMHO.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 18, 2020

From my point of view the refactoring needed, since:

  1. I very agree with majority of Suggested improvements to NExpr #377
  2. Also it is too obvious NAtom is not fully formed.
  3. We did not look into the current Nix type system to draw experience for us from it.
  4. We not updated the Nix submodule for a long time, and Nix internals changed quite drastically, we can not afford to rely on old Nix verison.
  5. And we should break free from Nix submodule in the first place.

Guys, like, basic data model should be cleaned-up, refactored and consolidated first, we should not depend on the Nix. There refactoring and some code needed. When data model would form - yes, lets split project into submodules and then into packages.

Lets do the documentation and structure of it through the Haddock, then it would be way easier to split in into more modules using literate structural editing, we would clean-up imports in that process, the project would become documented and more understandable to everyone in that one movement.

Then when data types stabilize - module structure can be split into as many packages as needed, even every module can be a package if people want so.

@sjakobi
Copy link
Member

sjakobi commented Jun 18, 2020

@Anton-Latukha Apparently neither #377 nor this issue have seen much progress in recent months. I also don't see why one or the other ought to wait for the other one.

So if anyone wants to start implementing this one, I don't see a reason to wait.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 18, 2020

reduced

1 similar comment
@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 18, 2020

reduced

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jun 20, 2020

Ok.

TL;RD

  1. REPL (executable):
    1.1. is not working,
    1.2. needs MVP features.
  2. From after the creation - there was no refactoring of data type system. And for recursive schemes programming, it is logical that it needs post bottleneck refactoring to settle things down to leverage the paradigm fully. And it was already proven that refactoring is possible and needed. Refactoring of datatypes means interface breakages.
  3. There is a lot to clean-up and maintain in the project.
  4. For me for the project community growth is important, so it naturality leverages free evolution, that would exponentially increase quality and impact in the long run. For community growth - so simple onboarding, especially since the project already.

I am all hands up for splitting, but it is not logical at the moment, until types and executable would be ready to be independent packages.

Of course if John would move the agenda - I would not stand in the way. I just see it as impractical at the current moment, strategically.

@Ericson2314
Copy link
Member

I am busy with C++ Nix at the moment due to https://discourse.nixos.org/t/obsidian-systems-is-excited-to-bring-ipfs-support-to-nix/7375, so I won't be pushing this at the moment before it is convenient to you. (I do hope that will prepare me to return to hnix and contribute better than before after!)

@jwiegley
Copy link
Member

@Anton-Latukha I have no agenda at the moment, so feel free to postpone this.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jul 9, 2020

Well, REPL could handle only one line expression.

With last work being done,

AFAIK Sorki made REPL accept multi-line expression. We would make a release soon.

When I test HNix REPL and the basics would work for people - we indeed can form an executable separately, my main argument from above is that executable and REPL should be MVP for basic needs before we ship it as a package, then overhead of additional package and interface can be reasonable.

But this month Peti is on vacation and he did not stated who to address, so because our default.nix ≠ HNix Nixpkgs - I prefer to keep package/Nixpkgs moves to a minimum this month or we can get into a situation when HNix broken in Nixpkgs until he returns, which would be a shameful state for me at least.

@sorki
Copy link
Member

sorki commented Jul 13, 2020

Regarding hnix-store merge neither me nor @shlevy thinks it is necessary and advantageous - we would prefer to keep the current state for now.

@Anton-Latukha
Copy link
Collaborator

I'd reformulated the question.

Since splitting the Types from the Parser seems on the face a good idea, but in reality - splitting out the data typeslways craves to drag/duplicate a bunch of logic code to take with it, and trying to splitting the types from logic all the time creates cyclic dependencies in Haskell.

And the type system and parser are very interdependent.

I'd reformulated the question as: "Split the executable from the library". People who want the library are not interested in executable, executable is pretty small and has haskeline and repline specific deps. BTW:

head post

... This way tools that depend only on the parser and don’t use the evaluator still work if an exclusive evaluator dependency breaks.

So seems that Profpatsch also meant that.


I worked/working on the CLI, and after derivationStrict roundtrip quest resolves - would settle on working on the CLI.

Looking at what HNix-Store-{Core,Remote}. Reality showed that Remote already struggled/struggles much with the HNix-Store infrastructure and had problems working with dev Store, making working releases.

What HNix-Store-{Core,Remote} has, and then adding HNix-{Core,Exec} infrastructure/coupling to it - the system is interdependent in development and in released code, which means that the projects need to have and distinguish the development line and release line of work&tests, CI for patches should run the integration test of the dev branches, and also test patches using published dep releases. And people tend to not like the partially green/red CI responses. If there was a solid way of railing the prerelease procedure and a pure way to separate the dev testing from the release testing.

But overall, splitting the executable while it has pretty minimal code, still needs a lot of work, and has minimal dep difference (only haskeline & repline) - that would not solve the dependencies very much, but would complicate the internal infrastructure maintenance and dev, currently, for me.

@Anton-Latukha
Copy link
Collaborator

All of the deps of the HNix seem logical to use.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Dec 23, 2020

Well, looking at all logic in HNix - a lot of it really goes to the execution.

And from the map of imports, it looks like Expr is abstracted enough to be formed into a separate package after all.

@Anton-Latukha
Copy link
Collaborator

Anton-Latukha commented Jan 14, 2021

Used useful stuff, like module map - to the https://github.com/haskell-nix/hnix/wiki/Design-of-the-HNix-code-base,

And since splitting is very much an irreversible process, we must do all that is need to be done to prepare the spliting:

Since this thread become epic and went all over the place.

Maybe splitting would happen this year, realistically - first the private/public API as it is transparent, then module hierarchy change, it is already a big migration both here and downstream, which I would try to soften and automate. To simplify the discussion, reopening the thread anew.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Betterness, without new explicit feature possibilities
Projects
None yet
Development

No branches or pull requests

7 participants