diff --git a/doc/architecture.md b/doc/architecture.md deleted file mode 100644 index 05cb9d7cc5..0000000000 --- a/doc/architecture.md +++ /dev/null @@ -1,190 +0,0 @@ -
- -# Architecture - -__NOTE__ MSS 2018-08-22 This document is out of date, and will be made -more out of date by -[#3922](https://github.com/commercialhaskell/stack/issues/3922). I -intend to update it when implementing #3922. Tracked in -[#4251](https://github.com/commercialhaskell/stack/issues/4251). - -## Terminology - -* Package identifier: a package name and version, e.g. text-1.2.1.0 -* GhcPkgId: a package identifier plus the unique hash for the generated binary, - e.g. text-1.2.1.0-bb83023b42179dd898ebe815ada112c2 -* Package index: a collection of packages available for download. This is a - combination of an index containing all of the .cabal files (either a tarball - downloaded via HTTP(S) or a Git repository) and some way to download package - tarballs. - * By default, stack uses a single package index (the Github/S3 mirrors of - Hackage), but supports customization and adding more than one index -* Package database: a collection of metadata about built libraries -* Install root: a destination for installing packages into. Contains a bin path - (for generated executables), lib (for the compiled libraries), pkgdb (for the - package database), and a few other things -* Snapshot: an LTS Haskell or Stackage Nightly, which gives information on a - complete set of packages. This contains a lot of metadata, but importantly it - can be converted into a mini build plan... -* Mini build plan: a collection of package identifiers and their build flags - that are known to build together -* Resolver: the means by which stack resolves dependencies for your packages. - The two currently supported options are snapshot (using LTS or Nightly), and - GHC (which installs no extra dependencies). Others may be added in the future - (such as a SAT-based dependency solver). These packages are always taken from - a package index -* extra-deps: additional packages to be taken from the package index for - dependencies. 
This list will *shadow* packages provided by the resolver -* Local packages: source code actually present on your file system, and - referred to by the `packages` field in your stack.yaml file. Each local - package has exactly one .cabal file -* Project: a stack.yaml config file and all of the local packages it refers to. - -## Databases - -Every build uses three distinct install roots, which means three separate -package databases and bin paths. These are: - -* Global: the packages that ship with GHC. We never install anything into this - database -* Snapshot: a database shared by all projects using the same snapshot. Packages - installed in this database must use the exact same dependencies and build - flags as specified in the snapshot, and cannot be affected by user flags, - ensuring that one project cannot corrupt another. There are two caveats to - this: - * If different projects use different package indices, then their - definitions of what package foo-1.2.3 are may be different, in which case - they *can* corrupt each other's shared databases. This is warned about in - the FAQ - * Turning on profiling may cause a package to be recompiled, which will - result in a different GhcPkgId -* Local: extra-deps, local packages, and snapshot packages which depend on them - (more on that in shadowing) - -## Building - -### Shadowing - -Every project must have precisely one version of a package. If one of your -local packages or extra dependencies conflicts with a package in the snapshot, -the local/extradep *shadows* the snapshot version. The way this works is: - -* The package is removed from the list of packages in the snapshot -* Any package that depends on that package (directly or indirectly) is moved - from the snapshot to extra-deps, so that it is available to your packages as - dependencies. 
- * Note that there is no longer any guarantee that this package will build, - since you're using an untested dependency - -After shadowing, you end up with what is called internally a `SourceMap`, which -is `Map PackageName PackageSource`, where a `PackageSource` can be either a -local package, or a package taken from a package index (specified as a version -number and the build flags). - -### Installed packages - -Once you have a `SourceMap`, you can inspect your three available databases and -decide which of the installed packages you wish to use from them. We move from -the global, to snapshot, and finally local, with the following rules: - -* If we require profiling, and the library does not provide profiling, do not - use it -* If the package is in the `SourceMap`, but belongs to a difference database, - or has a different version, do not use it -* If after the above two steps, any of the dependencies are unavailable, do not - use it -* Otherwise: include the package in the list of installed packages - -We do something similar for executables, but maintain our own database of -installed executables, since GHC does not track them for us. - -### Plan construction - -When running a build, we know which packages we want installed (inventively -called "wanteds"), which packages are available to install, and which are -already installed. In plan construction, we put this information together to -decide which packages must be built. The code in Stack.Build.ConstructPlan is -authoritative on this and should be consulted. 
The basic idea though is: - -* If any of the dependencies have changed, reconfigure and rebuild -* If a local package has any files changed, rebuild (but don't bother - reconfiguring) -* If a local package is wanted and we're running tests or benchmarks, run the - test or benchmark even if the code and dependencies haven't changed - -### Plan execution - -Once we have the plan, execution is a relatively simple process of calling -`runghc Setup.hs` in the correct order with the correct parameters. See -Stack.Build.Execute for more information. - -## Configuration - -stack has two layers of configuration: project and non-project. All of these -are stored in stack.yaml files, but the former has extra fields (resolver, -packages, extra-deps, and flags). The latter can be monoidally combined so that -a system config file provides defaults, which a user can override with -`~/.stack/config.yaml`, and a project can further customize. In addition, -environment variables STACK\_ROOT and STACK\_YAML can be used to tweak where -stack gets its configuration from. - -stack follows a simple algorithm for finding your project configuration file: -start in the current directory, and keep going to the parent until it finds a -`stack.yaml`. When using `stack ghc` or `stack exec` as mentioned above, you'll -sometimes want to override that behavior and point to a specific project in -order to use its databases and bin directories. To do so, simply set the -`STACK_YAML` environment variable to point to the relevant `stack.yaml` file. - -## Snapshot auto-detection - -When you run `stack build` with no stack.yaml, it will create a basic -configuration with a single package (the current directory) and an -auto-detected snapshot. The algorithm it uses for selecting this snapshot is: - -* Try the latest two LTS major versions at their most recent minor version - release, and the most recent Stackage Nightly. 
For example, at the time of - writing, this would be lts-2.10, lts-1.15, and nightly-2015-05-26 -* For each of these, test the version bounds in the package's .cabal file to - see if they are compatible with the snapshot, choosing the first one that - matches -* If no snapshot matches, uses the most recent LTS snapshot, even though it - will not compile - -If you end up in the no compatible snapshot case, you typically have three -options to fix things: - -* Manually specify a different snapshot that you know to be compatible. If you - can do that, great, but typically if the auto-detection fails, it means that - there's no compatible snapshot -* Modify version bounds in your .cabal file to be compatible with the selected - snapshot -* Add `extra-deps` to your stack.yaml file to fix compatibility problems - -Remember that running `stack build` will give you information on why your build -cannot occur, which should help guide you through the steps necessary for the -second and third option above. Also, note that those options can be -mixed-and-matched, e.g. you may decide to relax some version bounds in your -.cabal file, while also adding some extra-deps. - -## Explicit breakage - -As mentioned above, updating your package indices will not cause stack to -invalidate any existing package databases. That's because stack is always -explicit about build plans, via: - -1. the selected snapshot -2. the extra-deps -3. local packages - -The only way to change a plan for packages to be installed is by modifying one -of the above. This means that breakage of a set of installed packages is an -*explicit* and *contained* activity. Specifically, you get the following -guarantees: - -* Since snapshots are immutable, the snapshot package database will not be - invalidated by any action. If you change the snapshot you're using, however, - you may need to build those packages from scratch. -* If you modify your extra-deps, stack may need to unregister and reinstall - them. 
-* Any changes to your local packages trigger a rebuild of that package and its - dependencies. diff --git a/doc/build-overview.md b/doc/build-overview.md new file mode 100644 index 0000000000..3b23e09180 --- /dev/null +++ b/doc/build-overview.md @@ -0,0 +1,259 @@ +
+ +# Build Overview + +__NOTE__ This document should *not be considered accurate* until this +note is removed. + +This is a work-in-progress document covering the build process used by Stack. +It was started following the Pantry rewrite work in Stack (likely to +land as Stack 2.0), and contains some significant changes/simplifications from +how things used to work. This document will likely not be fully reflected in +the behavior of Stack itself until late in the Stack 2.0 development cycle. + +## Terminology + +* Project package: anything listed in `packages` in stack.yaml +* Dependency: anything listed in extra-deps or a snapshot +* Target: package and/or component listed on the command line to be built. Can + be either a project package or a dependency. If none is specified, automatically + targets all project packages +* Immutable package: a package which comes from Hackage, an archive, or a + repository. In contrast to... +* Mutable package: a package which comes from a local file path. The contents + of such a package are assumed to mutate over time. +* Write only database: a package database and set of executables for a given set + of _immutable_ packages. Only packages from immutable sources and which + depend exclusively on other immutable packages can be in this database. + *NOTE* formerly this was the _snapshot database_. +* Mutable database: a package database and set of executables for packages which + are either mutable or depend on such mutable packages. Importantly, packages + in this database can be unregistered, replaced, etc., depending on what happens + with the source packages. *NOTE* formerly this was the *local database*. 
+ +Outdated terminology to be purged: + +* Wanted +* Local +* Snapshot package + +## Inputs + +Stack pays attention to the following inputs: + +* Current working directory, used for finding the default `stack.yaml` file and + resolving relative paths +* The `STACK_YAML` environment variable +* Command line arguments (CLI args), as will be referenced below + +Given these inputs, Stack attempts the following process when performing a build. + +## Find the `stack.yaml` file + +* Check for a `--stack-yaml` CLI arg, and use that +* Check for a `STACK_YAML` env var +* Look for a `stack.yaml` in this directory or ancestor directories +* Fall back to the default global project + +This file is parsed to provide the following config values: + +* `resolver` (required field) +* `compiler` (optional field) +* `packages` (optional field, defaults to `["."]`) +* `extra-deps` (optional field, defaults to `[]`) +* `flags` (optional field, defaults to `{}`) +* `ghc-options` (optional field, defaults to `{}`) + +`flags` and `ghc-options` break down into both _by name_ (applied to a +specific package) and _general_. + +## Wanted compiler, dependencies, and project packages + +* If the `--resolver` CLI arg is present, ignore the `resolver` and + `compiler` config values +* Load up the snapshot indicated by the `resolver` (either config + value or CLI arg). This will provide: + * A map from package name to package location, flags, GHC options, + and whether a package should be hidden. All package locations here + are immutable. + * A wanted compiler version, e.g. `ghc-8.4.3` +* If the `--compiler` CLI arg is set, or the `compiler` config value + is set (and `--resolver` CLI arg is not set), ignore the wanted + compiler from the snapshot and use the specified wanted compiler +* Parse `extra-deps` into a `Map PackageName PackageLocation`, + containing both mutable and immutable package locations. Parse + `packages` into a `Map PackageName ProjectPackage`. 
+* Ensure there are no duplicates between these two sets of packages +* Delete any packages from the snapshot packages that appear in + `packages` or `extra-deps` +* Perform a left biased union between the immutable `extra-deps` + values and the snapshot packages. Ignore any settings in the + snapshot packages that have been replaced. +* Apply the `flags` and `ghc-options` by name to these packages. If + any values are specified but no matching package is found, it's an + error. +* We are now left with the following: + * A wanted compiler version + * A map from package name to immutable packages with package config (flags, GHC options, hidden) + * A map from package name to mutable packages as dependencies with package config + * A map from package name to mutable packages as project packages with package config + +## Get actual compiler + +Use the wanted compiler and various other Stack config values (not all +listed here) to find the actual compiler, potentially installing it in +the process. + +## Global package sources + +With the actual compiler discovered, list out the packages available +in its database and create a map from package name to +version/GhcPkgId. Remove any packages from this map which are present +in one of the other three maps mentioned above. + +## Resolve targets + +Take the CLI args for targets as raw text values and turn them into +actual targets. 
+ +* Do a basic parse of the values into one of the following: + * Package name + * Package identifier + * Package name + component + * Directory +* An empty target list is equivalent to listing the package names of + all project packages +* For any directories specified, find all project packages in that + directory or subdirectories thereof and convert to those package + names +* For all package identifiers, ensure that either the package name + does not exist in any of the three parsed maps from the "wanted + compiler" step above, or that the package is present as an immutable + dependency from Hackage. If so, create an immutable dependency entry + with default flags, GHC options, and hidden status, and add this + package to the set of immutable package dependencies. +* For all package names, ensure the package is in one of the four maps + we have, and if so add to either the dependency or project package + target set. +* For all package name + component, ensure that the package is a + project package, and add that package + component to the set of + project targets. +* Ensure that no target has been specified multiple times. (*FIXME* + Mihai states: I think we will need an extra consistency step for + internal libraries. Sometimes stack needs to use the mangled name + (`z-package-internallibname-z..`), sometimes the + `package:internallibname` one. But I think this will become obvious + when doing the code changes.) + +We now have an updated set of four package maps, a new set of dependency +targets, and a new set of project package targets (potentially with +specific components). + +## Apply named CLI flags + +Named CLI flags are applied to specific packages by updating the +config in one of the four maps. If a flag is specified and no package +is found, it's an error. Note that flag settings are added _on top of_ +previous settings in this case, and do not replace them. 
That is, if +previously we have `singleton (FlagName "foo") True` and now add +`singleton (FlagName "bar") True`, both `foo` and `bar` will now be +true. + +## Apply CLI GHC options + +Apply GHC options from the command line to all _project package +targets_. *FIXME* confirm that this is in fact the correct behavior. + +## Apply general flags from CLI + +`--flag *:flagname[:bool]` specified on the CLI are applied to any +project package which uses that flag name. + +## Apply general GHC options + +*FIXME* list out the various choices here and which packages they +apply to. + +## Determine snapshot hash + +Use some deterministic binary serialization and SHA256 thereof to get +a hash of the following information: + +* Actual compiler (GHC version, path, *FIXME* probably some other + unique info from GHC, I've heard that `ghc --info` gives you + something) +* Global database map +* Immutable dependency map + +Motivation: Any package built from the immutable dependency map and +installed in this database will never need to be rebuilt. + +*FIXME* Caveat: do we need to take profiling settings into account +here? How about Haddock status? 
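The hashing step described above can be sketched in a few lines. This is only an illustrative model in Python, not Stack's implementation: canonical JSON stands in for the "deterministic binary serialization", and the function name `snapshot_hash` and the field names are hypothetical. It hashes exactly the three inputs listed (actual compiler, global database map, immutable dependency map) and leaves the open *FIXME* questions about profiling and Haddock unanswered.

```python
import hashlib
import json


def snapshot_hash(compiler, global_db, immutable_deps):
    """Return a SHA256 hex digest over the three inputs.

    Assumption: key-sorted JSON is used as a stand-in for a deterministic
    binary serialization; all field names here are hypothetical.
    """
    payload = {
        "compiler": compiler,
        "global_db": global_db,
        "immutable_deps": immutable_deps,
    }
    # sort_keys plus fixed separators make the encoded bytes independent of
    # dict insertion order, so equal inputs always give equal hashes.
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


# The same information supplied in a different key order.
first = snapshot_hash(
    {"version": "ghc-8.4.3", "path": "/opt/ghc/bin/ghc"},
    {"base": "4.11.1.0"},
    {"text": {"version": "1.2.3.0", "flags": {}}},
)
second = snapshot_hash(
    {"path": "/opt/ghc/bin/ghc", "version": "ghc-8.4.3"},
    {"base": "4.11.1.0"},
    {"text": {"flags": {}, "version": "1.2.3.0"}},
)
print(first == second)  # True: key order does not affect the hash
```

Any change to one of the three inputs, such as a different flag on an immutable dependency, yields a different hash and therefore a different database, which is what lets packages installed there go without rebuilding.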
+ +## Determine actual target components + +* Dependencies: "default" components (all libraries and executables) +* Project packages: + * If specific components named: only those, plus any libraries present + * If no specific components, include the following: + * All libraries, always + * All executables, always + * All test suites, _if_ `--test` specified on command line + * All benchmarks, _if_ `--bench` specified on command line + +## Construct build plan + +* Applied to every target (project package or dependency) +* Apply flags, platform, and actual GHC version to resolve + dependencies in any package analyzed +* Include all library dependencies for all enabled components +* Include all build tool dependencies for all enabled components + (using the fun backwards compat logic for `build-tools`) +* Apply the logic recursively to come up with a full build plan +* If a task depends exclusively on immutable packages, mark it as + immutable. Otherwise, it's mutable. The former go into the write only + database (formerly the snapshot database), the latter into the mutable + database (formerly the local database). + +We now have a set of tasks of packages/components to build, with full +config information for each package, and dependencies that must be +built first. + +*FIXME* There's some logic to deal with cyclic dependencies between +test suites and benchmarks, where a task can be broken up into +individual components versus being kept as a single task. Need to +document this better. Currently it's the "all in one" logic. 
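The immutable/mutable marking rule in the build plan can be modelled as a recursive walk over the dependency graph. The sketch below is a simplified Python illustration and not Stack's code: the function `classify_tasks`, its argument shapes, and the package names are all hypothetical. It treats each package as a single task and rejects cycles outright, so it does not cover the test suite and benchmark cycle handling mentioned in the *FIXME*.

```python
def classify_tasks(deps, mutable_sources):
    """Label every task "mutable" or "immutable".

    deps: hypothetical map from package name to its direct dependencies.
    mutable_sources: names of packages that come from a local file path.
    A task is mutable if its own source is mutable or if anything it
    depends on, directly or transitively, is mutable.
    """
    classification = {}

    def is_mutable(name, in_progress):
        if name in classification:
            return classification[name] == "mutable"
        if name in in_progress:
            # Simplification: this sketch does not model cyclic plans.
            raise ValueError(f"dependency cycle involving {name}")
        in_progress.add(name)
        # Build a list (not a lazy generator) so every dependency is visited.
        dep_results = [is_mutable(dep, in_progress) for dep in deps.get(name, ())]
        in_progress.remove(name)
        mutable = name in mutable_sources or any(dep_results)
        classification[name] = "mutable" if mutable else "immutable"
        return mutable

    for name in deps:
        is_mutable(name, set())
    return classification


plan = classify_tasks(
    {
        "my-app": ["my-lib", "text"],
        "my-lib": ["bytestring"],
        "helper": ["my-lib"],  # immutable source, but depends on my-lib
        "text": ["bytestring"],
        "bytestring": [],
    },
    mutable_sources={"my-app", "my-lib"},
)
print(plan["helper"])  # mutable
print(plan["text"])    # immutable
```

In the terms of the Terminology section, tasks labelled immutable would be installed into the write only database and the rest into the mutable database. `helper` shows why the source location alone is not enough: its source is immutable, yet it depends on a mutable package.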
+ +## Unregister local modified packages + +* For all mutable packages in the set of tasks, see if any files have + changed since last successful build and, if so, unregister + delete + their executables +* For anything which depends on them directly or transitively, + unregister + delete their executables + +## Perform the tasks + +* Topological sort, find things which have no dependencies remaining +* Check if already installed in the relevant database + * Check package database + * Check Stack specific "is installed" flags, necessary for + non-library packages + * For project packages, need to also check which components were + built, if tests were run, if we need to rerun tests, etc +* If all good: do nothing +* Otherwise, for immutable tasks: check the precompiled cache for an + identical package installation (same GHC, dependencies, etc). If + present: copy that over, and we're done. +* Otherwise, perform the build, register, write to the Stack specific + "is installed" stuff, and (for immutable tasks) register to the + precompiled cache + +"Perform the build" consists of: + +* Do a cabal configure, if needed +* Build the desired components +* For all test suites built, unless "no rerun tests" logic is on and + we already ran the test, _or_ "no run tests" is on, run the test +* For all benchmarks built, unless "no run benchmarks" is on, run the + benchmark diff --git a/doc/terminology.md b/doc/terminology.md deleted file mode 100644 index 336d561b72..0000000000 --- a/doc/terminology.md +++ /dev/null @@ -1,22 +0,0 @@ -
-# Terminology - -This is a work-in-progress document covering terminology used by -Stack. It was started following the Pantry rewrite work in Stack -(likely to land as Stack 2.0), and contains some significant -changes/simplifications from previous terms. - -__NOTE__ This document should *not be considered accurate* until this -note is removed. - -Correct, new terminology - -* Project package: anything listed in `packages` in stack.yaml -* Dependency: anything listed in extra-deps or a snapshot -* Target: package and/or component listed on the command line to be built. Can be either project package or dependency. If none specified, automatically targets all project packages - -Outdated terminology to be purged: - -* Wanted -* Local -* Snapshot package