Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GHCI fails to load in latest rc (2.5.0.1) #5407

Closed
aschmois opened this issue Sep 29, 2020 · 14 comments
Closed

GHCI fails to load in latest rc (2.5.0.1) #5407

aschmois opened this issue Sep 29, 2020 · 14 comments

Comments

@aschmois
Copy link
Contributor

aschmois commented Sep 29, 2020

General summary/comments (optional)

On the newest rc ghci can no longer be run for multiple components with autogen files.

Steps to reproduce

I have prepared a small repo to reproduce:

$ git clone [email protected]:aschmois/project.git
$ stack build
$ stack ghci project:lib project:test

Expected

In version 2.3.3 ghci loaded fine.

Actual

In version 2.5.0.1 rc ghci failed to load with:

$ stack ghci project:lib project:test
Using main module: 1. Package `project' component project:test:test with main-is file: /home/andres/project/test/Main.hs
project> configure (lib + test)
Configuring project-1...
project> initial-build-steps (lib + test)
project> Test running disabled by --no-run-tests flag.
Completed 2 action(s).
Configuring GHCi with the following packages: project

* * * * * * * *

Error: Multiple files use the same module name:
       * Paths_project found at the following paths
         * /home/andres/project/.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.0.1.0/build/autogen/Paths_project.hs (project:lib)
         * /home/andres/project/.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.0.1.0/build/test/autogen/Paths_project.hs (project:test:test)
* * * * * * * *

Not attempting to start ghci due to these duplicate modules.
Use --no-load to try to start it anyway, without loading any modules (but these are still likely to cause errors)

Stack version

$ stack --version
Version 2.5.0.1, Git revision 3cbd37ab548a34cf3fc0a8277079c4cef27c1118 RELEASE-CANDIDATE x86_64 hpack-0.33.0

Method of installation

  • Official binary, downloaded from github releases
@aschmois
Copy link
Contributor Author

aschmois commented Sep 29, 2020

After skimming through stack commits this commit seems suspect f8b7bfc from this PR #5393

@aschmois
Copy link
Contributor Author

aschmois commented Oct 1, 2020

Quoting @tfausak from sol/hpack#303 (comment) in case someone runs into the same problem and needs a quick fix:

Fortunately adding this YAML snippet to those components does the trick:

when:
- condition: false
 other-modules: Paths_pkg

I am interested to see if there may be a better or different solution since technically we're seeing a breaking change across stack versions (even if it does fix a different bug). Anyways feel free to close if this is the expected solution.

@tfausak
Copy link
Contributor

tfausak commented Oct 1, 2020

For what it's worth, I'm not sure Stack is doing anything wrong here. If I have a package with two components that define the same module and then I try to load that package in GHCi, what should happen?

Here's proof that you can run into the same problem with manually generated modules:

> stack --version
Version 2.3.3, Git revision cb44d51bed48b723a5deb08c3348c0b3ccfc437e x86_64 hpack-0.33.0
> cat stack.yaml
resolver: ghc-8.10.2
> cat gh5407.cabal 
name: gh5407
version: 0
build-type: Simple
cabal-version: >= 1.2

library
  build-depends: base
  hs-source-dirs: lib
  other-modules: Problem

executable gh5407
  build-depends: base
  main-is: Main.hs
  hs-source-dirs: exe
  other-modules: Problem
> cat exe/Main.hs
main = pure ()
> cat exe/Problem.hs
module Problem where
> cat lib/Problem.hs
module Problem where
> stack ghci
Using main module: 1. Package `gh5407' component gh5407:exe:gh5407 with main-is file: /tmp/tmp.TjAmpcgNF3/exe/Main.hs
Configuring GHCi with the following packages: gh5407

* * * * * * * *

Error: Multiple files use the same module name:
       * Problem found at the following paths
         * /tmp/tmp.TjAmpcgNF3/exe/Problem.hs (gh5407:exe:gh5407)
         * /tmp/tmp.TjAmpcgNF3/lib/Problem.hs (gh5407:lib)
* * * * * * * *

Not attempting to start ghci due to these duplicate modules.
Use --no-load to try to start it anyway, without loading any modules (but these are still likely to cause errors)

Even though the files happen to be identical, GHCi refuses to load both of them at the same time.

In this particular situation you can work around the problem by putting the shared module in a common place that both components can use. That way GHCi isn't trying to load the same module twice from different paths.

That suggests that perhaps Stack should be generating at most one Paths_* module per package and putting it in a common place, but I don't know enough about Stack's internals to say if that actually makes sense or not.

@uncle-betty
Copy link
Contributor

uncle-betty commented Oct 2, 2020

Interesting. Here's how this happens, I think:

  • stack repl leans on Cabal (as a library, not as an executable) to find out (a) which modules each component consists of, and (b) in which directories to look for Haskell source code.

  • Cabal includes the Paths_* module that it auto-generated for a component, when reporting back the modules for each component. Across all components, this potentially leads to more than one Paths_* module being reported back.

  • stack repl throws out all the modules for which it cannot find a Haskell source code file. Previously, stack repl did not look in the directories that contain auto-generated module source code. Hence, auto-generated modules were generally thrown out, because their source code files weren't found. With Also resolve auto-generated component files. #5393 in place, auto-generated modules are now kept, because their source code files are found.

  • stack repl runs a duplicate check, i.e., it looks for module names for which it found more than one source code file. Duplicates can now happen, when stack repl is run for multiple components that contain more than one Paths_* module between them. The Paths_* modules in the components all share the same module name but all have different source code files, so this leads to multiple source codes files for a single module name. If it happens, then we see the above error, and that's it.

  • Otherwise, i.e., if there aren't any duplicates, then stack repl runs GHCi, passing the collected modules to it via :add directives in a generated GHCi script in a temporary file.

  • As @tfausak pointed out, there has always been a duplicate check. It just didn't cover auto-generated modules.

My question at this point: Why does stack repl bother to do the duplicate check at all before passing modules to GHCi?

I modified @tfausak's example a little by adding problem functions to the two Problem modules, each evaluating to a different string.

tlo@carol:~/test$ cat exe/Problem.hs 
module Problem where
problem = "exe"
tlo@carol:~/test$ cat lib/Problem.hs 
module Problem where
problem = "lib"
tlo@carol:~/test$ 

Now let's run GHCi directly, thus bypassing stack repl's duplicate check. We tell GHCi about the two directories. Note that the exe directory goes first. Once we're in GHCi, we :add the ambiguous module, evaluate the problem function, and get "exe".

tlo@carol:~/test$ ghci -iexe -ilib
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Prelude> :add Problem
[1 of 1] Compiling Problem          ( exe/Problem.hs, interpreted )
Ok, modules loaded: Problem.
*Problem> problem
"exe"
*Problem>
Leaving GHCi.
tlo@carol:~/test$ 

Now let's swap the two directories, exe and lib, and run GHCi again. Note how problem now evaluates to "lib".

tlo@carol:~/test$ ghci -ilib -iexe
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Prelude> :add Problem
[1 of 1] Compiling Problem          ( lib/Problem.hs, interpreted )
Ok, modules loaded: Problem.
*Problem> problem
"lib"
*Problem> 
Leaving GHCi.
tlo@carol:~/test$ 

So, GHCi seems to be perfectly able to deal with ambiguous modules. It seems to disambiguate via the order in which the directories are given on the command line.

That's why I'm wondering why stack repl makes the effort to check for duplicates. Is this just a courtesy to the user?

If it's not strictly needed, then removing the check should fix @aschmois's issue, i.e., it should fix things for people who have to run GHCi with multiple components that provide more than one Paths_* module between them. It would also make @tfausak's example work, which exploits the fact that there's always been a duplicate check, just not for auto-generated modules.

Or maybe just turn a failed duplicate check into a warning instead of completely getting rid of it?

@snoyberg
Copy link
Contributor

@uncle-betty I'm not familiar with this code path. If you think disabling this check will address this problem, can you open a PR? The alternative will be backing out #5393 for the 2.5 release.

@uncle-betty
Copy link
Contributor

Thanks, @snoyberg, for the feedback. I'll try to learn a little more and then come back with a proposal.

@mgsloan - quick question: You seem to have contributed the duplicate module check for stack repl in dd952bf, which was later made a little less strict (#3779) by accepting duplicate modules, if they share the same source code file. Do you happen to remember why this check was introduced in the first place? (I know, it's been a while. Sorry to ask about code that's five years old.)

Background: For a package Foo with multiple components, Cabal auto-generates multiple Paths_Foo.hs files - one per component - in different directories. This makes the duplicate check fail in the current stack release candidate, as we have multiple Paths_Foo modules, each with a different source code file.

This wasn't an issue in earlier stack releases, because stack repl didn't use to include auto-generated modules in the set of modules to be loaded by GHCi. Hence, they weren't covered by the duplicate check.

I recently submitted PR #5393 to change this behavior to improve stack's IDE support (ghcide, haskell-language-server) to match Cabal's. These IDEs determine the modules of a component by running stack repl and observing the module names that are passed to GHCi. My pull request made stack repl include auto-generated modules, so that they'd become visible to IDEs. This fails the duplicate check for multi-component packages with auto-generated modules. Which is what led @aschmois to open this ticket, #5407.

I think that I am now facing two options:

As shown in my previous comment, it seems that GHCi doesn't mind duplicate modules. It seems to disambiguate by simply visiting the directories given by -i options in the order in which they are specified on the command line.

@mgsloan
Copy link
Contributor

mgsloan commented Oct 12, 2020

@uncle-betty The duplicate module check is because ghci cannot load modules with the same name, at least as far as I recall. Most of the checks in the implementation of stack ghci are to be able to load multiple packages, despite ghci itself not having explicit support for loading code from multiple packages. I just looked around, and it looks like there is a recent effort to solve this - https://gitlab.haskell.org/ghc/ghc/-/wikis/Multi-Session-GHC-API - looks great!

@uncle-betty
Copy link
Contributor

@mgsloan, thanks for looking into this and for sharing your insights. Yes, it seems that GHCi cannot load multiple different modules of the same name. As far as I can see, you'll always end up with the first matching module of that name, where "first" is defined by the order in which -i command line options are passed to GHCi to indicate source code directories.

So, in your opinion, would it be correct to say that the duplicate check is there to prevent users from accidentally ending up in confusion situations where they are in GHCi, :add a module, but surprisingly don't get the one that they'd expect, because there's another module of the same name that happens to be the first matching module? (Just trying to understand what a world without this check would look like.)

@mgsloan
Copy link
Contributor

mgsloan commented Oct 12, 2020

@uncle-betty I don't recall the particular behavior of multiple modules. But, if it does not cause an error, then it will certainly cause confusing, inconsistent behavior. If code you are loading depends on both modules that share the same name, then one will be used instead, hopefully causing a type error. Worst case, there is some compatibility between the interfaces of the two modules, and you actually get wrong behavior from your program. Behavioral differences and type errors that don't happen in a regular build are quite bad, definitely not the kind of property we want from a tool.

I haven't read this issue's discussion, but I saw it has something to do with Paths_ modules and cabal components. If the redundant Paths_ modules do not vary in content between components, then I suggest picking one via some heuristic such as "most recently updated file".

@tfausak
Copy link
Contributor

tfausak commented Oct 12, 2020

Would it be reasonable to change the duplicate module check from an error into a warning? There's a similar check for components having different default-extensions.

@snoyberg
Copy link
Contributor

That sounds like a great solution @tfausak. @uncle-betty would you be interested in sending a PR? It would be great to get this in sooner rather than later, since this is the last issue holding up the 2.5 release.

@tfausak
Copy link
Contributor

tfausak commented Oct 13, 2020

For what it's worth it looks like GHC will soon be able to support multiple home units, which might address this problem: https://mpickering.github.io/ide/posts/2020-10-12-multiple-home-units.html

@snoyberg
Copy link
Contributor

Turns out that was a lot simpler than I thought. PR here if others can test it: #5413

@uncle-betty
Copy link
Contributor

Great solution! I just tried it and it works well for my use case. Thanks, @tfausak and @snoyberg, for making this happen.

snoyberg added a commit that referenced this issue Oct 15, 2020
…odules

Warn, not error, on duplicate modules (fixes #5407)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants