hsinspect requirements #75

Open
tseenshe opened this issue Aug 9, 2019 · 6 comments


tseenshe commented Aug 9, 2019

Hello!

I would like to make use of cabal-helper to solve some problems that I am encountering when writing https://gitlab.com/tseenshe/hsinspect. I hope you don't mind me creating a thread here for discussion (this is not a feature request or bug report as such).

  1. I would like to have a launcher that chooses which of several binaries to launch based on the current ghc version. For example, say the user (or more likely the text editor behind the scenes) installs hsinspect-ghc-8.4.4 and hsinspect-ghc-8.6.5, and their current project uses ghc-8.4.4; then I would like to invoke hsinspect-ghc-8.4.4. I can do this with a hack, running cabal v2-exec ghc and parsing the output string, but that is tedious and I am interested in alternatives. Note that actually installing applications this way is not easy because --program-suffix doesn't work during v2-install 🤷‍♂️

  2. Inside my program I would like to be able to provide a FilePath (the source file I am inspecting) and receive back a DynFlags that is populated with anything relevant from the .cabal and cabal.project / cabal.project.local. The most important are: default language, language extensions, compiler options. i.e. inferDynFlags :: FilePath -> IO DynFlags. If you don't want to depend on ghc then I would be ok with Strings and LangExt.Extensions and similar.

  3. I would like to be able to generate a packagedb that matches the most recent v2-build (or implied build from a v2-run or v2-install) and set the correct environment variable, so that when I use the ghc api it is set appropriately. Note: I want to include the current package -inplace even if it is not compilable. Ideally this would take the package and specific phase (application / library / test) into account, which would be far superior to the generated cabal .ghc.env.* files. i.e. setupGhcEnv :: FilePath -> IO ().

  4. Without adding any more dependencies 😄 It is very important to me that I do not have a large dependency graph. I had hoped to be zero dependency (only installed things) to simplify and speed up the installation process (I was considering having my app be a single file that is compiled manually by directly invoking ghc!). So adding a dependency on cabal-helper is already a big step for me.

  5. hie-bios integration. Ideally I would like to be able to abstract these things across all build tools. I understand that hie-bios plans to do that but I'd like to see it working reliably for cabal-install first. If hie-bios is going to have lots of big dependencies (e.g. on hpack or stack / yaml libraries) then that's a show stopper but perhaps I could put it behind a flag. // @mpickering
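For point 1, the "parse the output string" step could look roughly like the following. This is only a minimal sketch: `hsinspectExeFor` is a hypothetical name, and it assumes the launcher has already captured the output of something like `ghc --numeric-version` run in the project's context. The `hsinspect-ghc-<version>` naming follows the example in point 1.

```haskell
import Data.Char (isDigit, isSpace)
import Data.List (dropWhileEnd)

-- | Hypothetical helper: turn the raw output of @ghc --numeric-version@
--   (e.g. "8.4.4\n") into the name of the matching per-ghc executable.
--   Returns Nothing when the output does not look like a version number.
hsinspectExeFor :: String -> Maybe String
hsinspectExeFor raw
  | looksLikeVersion = Just ("hsinspect-ghc-" ++ version)
  | otherwise        = Nothing
  where
    -- strip the trailing newline and any surrounding whitespace
    version = dropWhileEnd isSpace (dropWhile isSpace raw)
    looksLikeVersion = case version of
      []      -> False
      (c : _) -> isDigit c
                   && isDigit (last version)
                   && all (\x -> isDigit x || x == '.') version
```

The launcher would then look the resulting name up on the PATH and fail fast if it is not installed.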

Owner

DanielG commented Aug 9, 2019 via email

Author

tseenshe commented Aug 9, 2019

I have written up more thoughts in the hsinspect README. I'll copy them below.

TL;DR and to use your words, "it involves compiling the source of arbitrary user modules" and "having access to all in-scope packages of a given cabal component". But I would expect it to fail fast if there is no build plan.

If you are going to have big dependencies like aeson, I think a more sensible thing to do would be to have the ability to launch applications under cabal-helper: it would set the GHC_ENVIRONMENT and inject all ghc parameters. This seems generally useful to all kinds of tools from formatters to code analysis and refactoring.

One of the problems people cite with using HIE is the huge dependency chain, it must be kept under control for end-using tooling. I don't want my tool to depend on anything more than ghc itself, since it needs to be cross compiled per ghc version. Putting all the dependencies into the launcher makes it a one-time cost, and then tools themselves can be much slimmer.

There is also the work on the build-info to be aware of haskell/cabal#5954. If this could be packaged up as a launcher that would be good.


Firstly, hsinspect only works if all the dependencies of the file under
inspection have been compiled AND are visible to ghc via the GHC_ENVIRONMENT
(or .ghc.environment.) files that set up the packagedb. It does not require
that the file under inspection is compilable (except that the following can be
parsed: pragmas, module definition, imports).

stack does not create env files, so hsinspect cannot be used with stack.

Environment files can be created in one of two ways with cabal-install:

  1. by default 2.4.x will create .ghc.environment files for every run of
    v2-build, putting them in the base of the project. This was very
    contentious and 3.x will not generate them by default.

  2. cabal v2-exec will create a temporary environment file and make it
    available via GHC_ENVIRONMENT. Unfortunately, the parameters to v2-exec
    must match the exact v2-build command, which is unworkable as a way to
    launch hsinspect.

There are three additional problems:

  1. if the previous compile failed then the inplace package is not included,
    which means the file being inspected will not be able to see dependency
    modules that live in the same package (even if they compiled successfully).

  2. ghc-options specified in .cabal / cabal.project /
    cabal.project.local files are not visible.

  3. the produced env file does not include references to non-library
    configurations, which means that executables and tests that depend on other
    modules in the same directory are not visible.

We work around 1 by manually performing a successful compile,
caching the env file, and pointing to it when invoking hsinspect.

We work around 2 by requiring the user to provide language extensions manually
when invoking hsinspect.

We cannot work around 3.

Owner

DanielG commented Aug 9, 2019

If you are going to have big dependencies like aeson, I think a more sensible thing to do would be to have the ability to launch applications under cabal-helper: it would set the GHC_ENVIRONMENT and inject all ghc parameters. This seems generally useful to all kinds of tools from formatters to code analysis and refactoring.

Honestly I don't get at all how that would make things any better. The user still has to compile c-h and all its dependencies whether or not it is an executable or a library...

One of the problems people cite with using HIE is the huge dependency chain, it must be kept under control for end-using tooling.

You know I just don't buy it, the problem with Haskell tooling right now is that it either doesn't work or doesn't exist. Optimizing build times is something we can invest time in once we have something that works. This is what's known as premature optimization my friend ;)

I don't want my tool to depend on anything more than ghc itself, since it needs to be cross compiled per ghc version. Putting all the dependencies into the launcher makes it a one-time cost, and then tools themselves can be much slimmer.

Ah now that makes more sense, O(1) vs O(n). You know you can just have your launcher use lib:cabal-helper and not your per-ghc executables. I don't see why c-h should need to provide something like that itself though.

There is also the work on the build-info to be aware of haskell/cabal#5954. If this could be packaged up as a launcher that would be good.

I know, I'm in contact with the guy working on it, but this is really just something to make cabal-helper more robust (to me anyways).

Firstly, hsinspect only works if all the dependencies of the file under
inspection have been compiled AND are visible to ghc via the GHC_ENVIRONMENT
(or .ghc.environment.) files that set up the packagedb. It does not require
that the file under inspection is compilable (except that the following can be
parsed: pragmas, module definition, imports).

stack does not create env files, so hsinspect cannot be used with stack.

I don't understand why you're so hell-bent on using env files. If all you need is the package environment then use cabal-helper to get the ghc options in your launcher, send those over to the per-ghc side, parse the flags using the GHC API and filter away anything you don't need/want such as inplace dependencies.

That way you can even support stack, no problem.

Admittedly cabal-helper will probably need some patches to work just how you want it to, but that shouldn't be a major blocker.
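As a rough illustration of the "filter away anything you don't need" step, here is a sketch that works on the raw option list instead of going through the GHC API flag parser suggested above. It is a stand-in, not the recommended implementation. The function name `dropInplace` is hypothetical, and it assumes inplace dependencies arrive as a `-package-id` flag followed by a unit id containing `-inplace`.

```haskell
import Data.List (isInfixOf)

-- | Hypothetical filter: drop every "-package-id <unit-id>" pair whose
--   unit id contains "-inplace" (assumed naming for local, uninstalled
--   components). All other options pass through unchanged.
dropInplace :: [String] -> [String]
dropInplace ("-package-id" : unitId : rest)
  | "-inplace" `isInfixOf` unitId = dropInplace rest
  | otherwise                     = "-package-id" : unitId : dropInplace rest
dropInplace (flag : rest) = flag : dropInplace rest
dropInplace []            = []
```

A version built on the GHC API would parse the options into DynFlags first and then edit the package flags there, which avoids guessing at the textual form of each flag.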

There are three additional problems:

  1. if the previous compile failed then the inplace package is not included,
    which means the file being inspected will not be able to see dependency
    modules that live in the same package (even if they compiled successfully).
  2. ghc-options specified in .cabal / cabal.project /
    cabal.project.local files are not visible.
  3. the produced env file does not include references to non-library
    configurations, which means that executables and tests that depend on other
    modules in the same directory are not visible.

We work around 1 by manually performing a successful compile,
caching the env file, and pointing to it when invoking hsinspect.

We work around 2 by requiring the user to provide language extensions manually
when invoking hsinspect.

We cannot work around 3.

Pretty much all of these are fixed by using cabal-helper, just FYI.

Author

tseenshe commented Aug 9, 2019

You know you can just have your launcher use lib:cabal-helper and not your per-ghc executables. I don't see why c-h should need to provide something like that itself though.

Fair! I would be happy to do that. But I'm not entirely sure how to do it. Would it be too much to ask you to put together a proof of concept? I can take care of writing and publishing an executable if you don't think it fits into the scope of cabal-helper.

O(1) vs O(n)

👍

I don't understand why you're so hell-bent on using env files.

If the full packagedb can be provided explicitly as ghc parameters (along with everything else) then the GHC_ENVIRONMENT can be set to - 😄

My main reason for preferring an env file instead of explicit parameters is Windows. I have experienced the "maximum character length" problem so many times that it is an automatic reflex. If a project has a few hundred dependencies, that's going to hit the limit pretty quickly.

My launcher will probably need to implement the same hack as stack and delete / hide the .ghc.environment files (including those in ~/.ghc/). If you've looked at the hsinspect source code you'll see that I don't know how to parse those parameters into a DynFlags yet but I'm sure it's possible.

Author

tseenshe commented Aug 10, 2019

In light of the last few messages, I will refine the ideal requirements, this time also considering how a launcher could cache results for maximum performance:

-- | For any file or directory that is contained in a source directory of a component, 
--   return the exact ghc parameters that the build would use to invoke `ghc`,
--   including the packagedb information. A switch allows the caller to decide if
--   the inplace package for the current project should be included even if it is not
--   compilable (only for inspection tools that do not write output files).
ghcOptions :: FilePath -> Bool -> IO [String]

-- | For any file or directory that is part of a project, return a project summary.
--   This is useful for caching and discovery.
projectSummary :: FilePath -> IO Summary
data Summary = Summary {
  buildFiles :: [FilePath]
  -- ^ all non-generated files that define the project (e.g. `*.cabal`, `cabal.project`,
  --   `cabal.project.{local, freeze}`, `stack.yaml`, `package.yaml`). Files may not
  --   exist yet.
, srcDirs :: [FilePath]
  -- ^ all source directory roots. To allow users to detect if a local source file
  --   is a member of the project.
}

I'd also be happy if ghcOptions returned IO ([String], StringBuffer) where the latter is the contents for an envfile that may be used instead of parameters (to workaround Windows problems with large parameter lists).
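To make the envfile alternative concrete, here is a minimal sketch of rendering package information in GHC's package environment file syntax (`clear-package-db`, `global-package-db`, `package-db`, `package-id` lines). `renderEnvFile` is a hypothetical name, and it returns a plain String rather than a StringBuffer so that the sketch does not depend on the ghc library.

```haskell
-- | Hypothetical renderer: given extra package databases and the unit ids
--   to expose, produce the contents of a package environment file.
--   The package db stack is reset first so that only the listed
--   databases (plus the global one) are visible.
renderEnvFile :: [FilePath] -> [String] -> String
renderEnvFile packageDbs unitIds = unlines $
  ["clear-package-db", "global-package-db"]
    ++ map ("package-db " ++) packageDbs
    ++ map ("package-id " ++) unitIds
```

The launcher could write this to a temporary file and point GHC_ENVIRONMENT at it, which keeps the command line short regardless of how many dependencies the project has.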

Of course if you want to do the caching inside of cabal-helper itself, then there is no need to provide buildFiles. And srcDirs becomes entirely informational.
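The membership check that srcDirs is meant to enable could be sketched as below. `isUnderSrcDir` is a hypothetical helper, and it assumes both the source directories and the file are given relative to the same base (for example, all absolute paths). It compares whole path components, so `src` does not match `srcgen`.

```haskell
import Data.List (isPrefixOf)
import System.FilePath (normalise, splitDirectories)

-- | Hypothetical check: is the file located under any of the given
--   source directory roots? Paths are compared component by component.
isUnderSrcDir :: [FilePath] -> FilePath -> Bool
isUnderSrcDir srcDirs file = any covers srcDirs
  where
    parts      = splitDirectories . normalise
    covers dir = parts dir `isPrefixOf` parts file
```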

It's probably also possible to think of a bunch of other stuff to go into the Summary.

Owner

DanielG commented Aug 18, 2019

I don't understand why you're so hell-bent on using env files.

My main reason for preferring an env file instead of explicit parameters is Windows. I have experienced the "maximum character length" problem so many times that it is an automatic reflex. If a project has a few hundred dependencies, that's going to hit the limit pretty quickly.

Ok indeed I can understand that problem. I'm surprised GHC doesn't support @ option files like most GNU tools and even cabal do.

If you're not aware of this feature the idea is that instead of providing arguments on the commandline you get the option of writing command @./some-file where ./some-file contains the arguments you'd like to pass to command, one per line I think.

Might be worth adding support for that to GHC, shouldn't be very hard if you restrict it to not have other options on the commandline. I'd be happy to point you in the right direction and review such a patch on GHC's Gitlab if you're interested.
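A launcher-side sketch of the option file idea, under the one-argument-per-line reading described above. The names `parseResponseFile` and `expandArgs` are hypothetical. Real GNU-style option files also split on whitespace and honour quoting, which this sketch deliberately ignores.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Hypothetical parser: one argument per line, surrounding whitespace
--   (including a stray carriage return) stripped, blank lines skipped.
parseResponseFile :: String -> [String]
parseResponseFile = filter (not . null) . map trim . lines
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Replace every "@path" argument with the arguments read from that
--   file; all other arguments are kept as they are.
expandArgs :: [String] -> IO [String]
expandArgs = fmap concat . mapM expand
  where
    expand ('@' : path) = parseResponseFile <$> readFile path
    expand arg          = pure [arg]
```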

That being said I think now I understand the disconnect between the API you're asking for and what cabal-helper provides.

You see I assume you're going to call the GHC API, not the ghc executable with the stuff c-h provides. The reason for that is simple. In order just to be able to refer to Haskell modules via source path we would need to re-implement a largeish part of GHC's compilation pipeline. Namely preprocessing source files, parsing the module header and dependency analysis (AKA 5+kloc or so in GHC at least).

This is quite an unreasonable amount of code duplication IMO, so you should depend on the GHC library instead. At that point you might as well use it for all of your functionality instead of essentially having two copies of the ghc library inlined into two separate executables you use for no good reason, i.e. the ghc driver program and your own program which just wraps exe:ghc.

Anyways, currently the API you want doesn't exist yet, outside of ghc-mod at least, which we're kind of trying to deprecate at the moment. Using it in the way you seem to be aiming for is inefficient, so I think you shouldn't unless you have a good reason to do things that way?

If you're happy restricting your users to GHC 8.8+, all the functionality you need to go from the info cabal-helper gives you to the API you want should be available natively in GHC without any need to hack around, but I still haven't gotten around to testing this out myself because I'm working on finalizing the cabal-helper API at the moment.
