Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement a new-exec command #4304

Closed
wants to merge 12 commits into from
Closed

implement a new-exec command #4304

wants to merge 12 commits into from

Conversation

dmwit
Copy link
Collaborator

@dmwit dmwit commented Feb 8, 2017

This patch set implements a new-exec command for running compiler tools and dependency executables in an environment set up to match the set of packages cabal-install would choose for its builds. This is useful for debugging broken builds and during development for running one-off tests and build tasks by hand.

The implementation here has some oddities worth calling out:

  • It relies on the GHC_ENVIRONMENT variable to choose an appropriate subset of packages from the store. This only works with GHC 8.0.2 and later. It also doesn't interact very cleanly with ghc-pkg, but I'm having a hard time succinctly describing the exact failure mode. To be sure cabal new-exec ghc-pkg list does not show all and only the packages in the current dependency tree. Perhaps this could be considered a GHC bug, though, and not an issue to solve within cabal-install itself.
  • It uses the same strategy that cabal exec does for rewriting commands, namely, going via a ProgramDb. While testing I discovered that this leads to some strange behavior. For example, cabal new-configure -w ghc-8.0.2 followed by cabal new-exec ghc -- --version will correctly report version 8.0.2 even if another ghc is on the PATH; but cabal new-exec ghci -- --version will not. (A similar caveat applies to cabal configure followed by cabal exec ghci.) Additionally, scripts which execute ghc will see whatever is on the PATH and not the correct compiler.

I am open to suggestions for mitigating these issues.

@mention-bot
Copy link

@dmwit, thanks for your PR! By analyzing the history of the files in this pull request, we identified @ezyang, @kosmikus and @dcoutts to be potential reviewers.

@23Skidoo
Copy link
Member

23Skidoo commented Feb 8, 2017

Nice, thanks for working on this!

@ezyang
Copy link
Contributor

ezyang commented Feb 12, 2017

Thank you. This needs docs though!

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this is excellent. I don't mind if you go ahead and commit if you promise to fix bugs and add tests :) (We really do need tests...) I do have some substantive comments in the reviews, take a look.

(some of my comments are marked outdated but they are still applicable.)

-> NoCallStackIO FilePath
writeGhcEnvironmentFile directory platform ghcversion entries = do
writeFileAtomic envfile . BS.pack . renderGhcEnvironmentFile $ entries
return envfile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I wanted to be picky, I'd create a pure function to compute the envfile name, and then have writeGhcEnvironmentFile take the final filename to write it out to. You don't have to do this.

}

execAction :: ExecFlags -> [String] -> GlobalFlags -> IO ()
execAction execFlags extraArgs globalFlags = do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, one of the conventions that the other new-* commands has taken, is that they unconditionally accept (ConfigFlags, ConfigExFlags, InstallFlags, HaddockFlags). This is even the case for new-build, which is a change in behavior from build (which didn't accept certain flags.) It seems more uniform to me to follow this convention; is there a particular reason you don't here? You have a TODO below about the settings but it doesn't explain to me why you didn't just alos grab those flags.

{ hookPrePlanning = \_ _ _ -> return ()
, hookSelectPlanSubset = \_ -> return
}
buildStatus <- updatePostBuildProjectStatus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth an explanation why you call out to update status here with a mempty build outcome. I believe it's because you want to find out if some files have been modified, causing things to be out of date? (Or maybe it's just because you need the buildStatus variable?)

-- TODO: discuss PATH munging with #hackage

case extraArgs of
exe:args -> runProgramInvocation
Copy link
Contributor

@ezyang ezyang Feb 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this pass stdin correctly to the created process? (does ghci work?)

@@ -1802,6 +1791,39 @@ elaborateInstallPlan verbosity platform compiler compilerprogdb pkgConfigDB
-- + vanilla libs & exes, exe needs lib, recursive
-- + ghci or shared lib needed by TH, recursive, ghc version dependent

-- | Get the bin\/ directory that executables should reside in, assuming that
-- they are the result of an in-place build.
inplaceBinDirectory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, thank you!

pathAdditions :: Verbosity -> ProjectBuildContext -> IO [FilePath]
pathAdditions verbosity ProjectBuildContext{..} = do
debug verbosity $ "Considering the following directories for inclusion in PATH:"
mapM_ (debug verbosity) paths
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason to test if an executable exists on the physical file system, as opposed to unconditionally adding the binary directory to the path if the package in question defines some executables? You'd be able to save on a bunch of banging on the filesystem if you just go off the in-memory metadata.

-- | Returns a list of all entries in the directory except the special entries
-- @.@ and @..@.
listDirectory :: FilePath -> IO [FilePath]
listDirectory d = filter (`notElem` [".", ".."]) <$> getDirectoryContents d
Copy link
Contributor

@ezyang ezyang Feb 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be OK to delete if you stop probing right? (See my (hidden) comment below on cabal-install/Distribution/Client/CmdExec.hs )

@dmwit
Copy link
Collaborator Author

dmwit commented Feb 13, 2017

(I will look at doing tests+documentation, but wanted to get the code changes into the CI pipeline as quick as I could without waiting on that.)

@ezyang
Copy link
Contributor

ezyang commented Feb 13, 2017

No problem. (I kind of slagged the CI last night so it might take some time to build, sorry!)

@23Skidoo
Copy link
Member

If this is ready to go in, I can cherry-pick the patches into 2.0. Though there's a merge conflict right now.

@dmwit
Copy link
Collaborator Author

dmwit commented Feb 24, 2017

Hey. Sorry for the few days of radio silence. The code is ready to go in; I'm going to be looking at writing some docs and tests today. I'll take a look at the merge conflict.

dmwit added 12 commits February 24, 2017 15:10
Although we may not want to create GHC environment files in a standard
location after every build, it is still useful to be able to build them
in a custom place in special circumstances. So let's disable this
feature at the call-sites where it doesn't work properly rather than
inside the implementation of the feature.
This implements a bare-bones skeleton for cabal new-exec.

The old cabal exec gave programs access to a sandbox's package database.
By analogy, cabal new-exec should give programs access to the store's
package database; however, this database will be cluttered with many
non-project-related packages that may confuse issues. Therefore new-exec
selects just the packages that are in the current project's dependency
tree and makes them available to compiler tools. Currently only very new
GHCs are supported, via the GHC_ENVIRONMENT mechanism for selecting a
subset of some package databases.

Eventually we should probably also modify the PATH so that dependencies'
executables are available.
cabal new-exec should be allowed to run any executables available in the
dependencies for the current project. This patch introduces that
ability: we walk the installation plan, picking out a bin/ directory for
each package, then including it in the PATH if there are any executables
in that directory.
Previously, whichever version of compiler tools that were on the PATH
would be used. Going through the ProgramDb should let us use the
appropriate versions of these tools, instead.
Previously, cabal new-exec would modify the PATH by including each
package's bin directory if it contained executables. This has some
drawbacks:

* Lots of filesystem probing
* Can have false positives due to dynamic libraries
* Mildly unpredictable

The new plan is to add each package's bin directory if the package
includes an executable stanza. This may include some directories that
have no executables in them (e.g. if the executable is not yet built or
is not in the install plan), but is more reproducible and requires less
filesystem access.
Previously, new-exec would use the `build` directory when looking for
executables built inplace. But these executables are actually in
subdirectories of `build`, so the search would fail (potentially falling
back to an executable in the user's PATH instead). Now we place one
directory in the PATH for each executable in each inplace-build package
in the install plan.
@23Skidoo
Copy link
Member

np. I think it's too late for this PR to go into cabal-install 2.0, but I can consider including it in 2.0.1.

@BardurArantsson
Copy link
Collaborator

BardurArantsson commented Feb 25, 2017

EDIT: Removed, see #4359 instead.

@grayjay
Copy link
Collaborator

grayjay commented Feb 25, 2017

I agree. I also like merges better because they show which commits belong to each pull request.

@BardurArantsson
Copy link
Collaborator

Sorry, this was a stupid place to put that comment of mine -- I've opened #4359 for discussion. (+ Removed original comment.)

@ezyang
Copy link
Contributor

ezyang commented Feb 26, 2017

@dmwit Looks like some of the errors are due to changes I landed. Do you need help resolving them?

@dcoutts
Copy link
Contributor

dcoutts commented Mar 15, 2017

The new cli handling stuff is in now and it'll have to be updated for this. Just copy & paste from one of the other Cmd* modules to fit the new pattern.

@phadej
Copy link
Collaborator

phadej commented Aug 17, 2017

Is this obsoleted by new-run? ping @hvr, @23Skidoo

@23Skidoo
Copy link
Member

@fgaz can answer that question better than me, since he has been working on this recently.

@hvr
Copy link
Member

hvr commented Aug 17, 2017

@phadej new-run and new-exec have different purpose; if anything, .ghc.environment.* files compete with new-exec in terms of use-case overlap; but IMO not new-run and new-exec.

@phadej
Copy link
Collaborator

phadej commented Aug 17, 2017

that would be very confusing if we get new-run and new-exec, which are different :S

@cocreature
Copy link
Collaborator

that would be very confusing if we get new-run and new-exec, which are different :S

It’s not any more confusing than the cabal run and cabal exec we have now, right?

@dmwit
Copy link
Collaborator Author

dmwit commented Aug 18, 2017

My two cents: there's a difference in intention and a difference in behavior (and these are related).

new-exec is about debugging the build process, so it is about running build-tools manually but in the "same way" that cabal itself would try to run them. So new-exec is about setting up the right package database for GHC, invoking alex and happy with the same extra command line options, and so forth.

new-run is about debugging the program you're building, so it is about running cabal-built programs without having to install them first. So new-run is about making sure that the program is up-to-date and that we're running the executable produced from the code we're editing.

That difference in intention motivates what I think of as the defining difference in behavior between the two: new-exec never builds anything, and may modify your command to be "more like what cabal would do" by adding command line arguments or modifying the environment it runs in, while new-run always makes sure the run target is up-to-date but passes through arguments and environment variables unmodified.

I think those two modes of operation are sufficiently different -- and both sufficiently useful -- that it would be a shame to have either one and not the other in the new-* regime.

@fgaz
Copy link
Member

fgaz commented Aug 24, 2017

See #4722 for updates.

@fgaz fgaz closed this Aug 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.