diff --git a/doc/src/manual/code-loading.md b/doc/src/manual/code-loading.md index c990de601e5aa3..e09b6d4aaf2296 100644 --- a/doc/src/manual/code-loading.md +++ b/doc/src/manual/code-loading.md @@ -3,12 +3,12 @@ Julia has two mechanisms for loading code: 1. **Code inclusion:** e.g. `include("source.jl")`. Inclusion allows you to split a single program across multiple source files. The expression `include("source.jl")` causes the contents of the file `source.jl` to be evaluated in the global scope of the module where the `include` call occurs. If `include("source.jl")` is called multiple times, `source.jl` is evaluated multiple times. The included path, `source.jl`, is interpreted relative to the file where the `include` call occurs. This makes it simple to relocate a subtree of source files. In the REPL, included paths are interpreted relative to the current working directory, `pwd()`. -2. **Package loading:** e.g. `import X` or `using X`. The import mechanism allows you to load a package—i.e. an independent, reusable collection of Julia code, wrapped in a module—and makes the resulting module available by the name `X` inside of the importing module. If the same `X` package is imported multiple times in the same Julia session, it is only loaded the first time—on subsequent imports, the importing module gets a reference to the same module. It should be noted, however, that `import X` can load different packages in different contexts: `X` can refer to one package named `X` in the main project but potentially different packages named `X` in each dependency. More on this below. +2. **Package loading:** e.g. `import X` or `using X`. The import mechanism allows you to load a package—i.e. an independent, reusable collection of Julia code, wrapped in a module—and makes the resulting module available by the name `X` inside of the importing module. If the same `X` package is imported multiple times in the same Julia session, it is only loaded the first time—on subsequent imports, the importing module gets a reference to the same module. Note though, that `import X` can load different packages in different contexts: `X` can refer to one package named `X` in the main project but potentially different packages named `X` in each dependency. More on this below. -Code inclusion is quite straightforward: it simply parses and evaluates a source file in the context of the caller. Package loading is built on top of code inclusion and is quite a bit more complex. The rest of this chapter, therefore, focuses on the behavior and mechanics of package loading. +Code inclusion is quite straightforward: it simply parses and evaluates a source file in the context of the caller. Package loading is built on top of code inclusion and is a lot more complex. Therefore, the rest of this chapter focuses on the behavior and mechanics of package loading. !!! note - You only need to read this chapter if you want to understand the technical details of package loading in Julia. If you just want to install and use packages, simply use Julia's built-in package manager to add packages to your environment and write `import X` or `using X` in your code to load packages that you've added. + You only need to read this chapter if you want to understand the technical details of package loading. If you just want to install and use packages, simply use Julia's built-in package manager to add packages to your environment and write `import X` or `using X` in your code to load packages that you've added. A *package* is a source tree with a standard layout providing functionality that can be reused by other Julia projects. A package is loaded by `import X` or `using X` statements. These statements also make the module named `X`, which results from loading the package code, available within the module where the import statement occurs. The meaning of `X` in `import X` is context-dependent: which `X` package is loaded depends on what code the statement occurs in. The effect of `import X` depends on two questions: @@ -19,9 +19,9 @@ Understanding how Julia answers these questions is key to understanding package ## Federation of packages -Julia supports federated management of packages. This means that multiple independent parties can maintain both public and private packages and registries of them, and that projects can depend on a mix of public and private packages from different registries. Packages from various registries are installed and managed using a common set of tools and workflows. The `Pkg` package manager ships with Julia 0.7/1.0 and lets you install and manage dependencies of your projects, by creating and manipulating project files, which describe what your project depends on, and manifest files that snapshot exact versions of your project's complete dependency graph. +Julia supports federated management of packages. This means that multiple independent parties can maintain both public and private packages and registries of them, and that projects can depend on a mix of public and private packages from different registries. Packages from various registries are installed and managed using a common set of tools and workflows. The `Pkg` package manager ships with Julia 0.7/1.0 and lets you install and manage your projects' dependencies. It does this by creating and manipulating project files that describe what your project depends on, and manifest files that snapshot exact versions of your project's complete dependency graph. -One consequence of federation is that there cannot be a central authority for package naming. Different entities may use the same name to refer to unrelated packages. This possibility is unavoidable since these entities do not coordinate and may not even know about each other. Because of the lack of a central naming authority, a single project can quite possibly end up depending on different packages with the same name. Julia's package loading mechanism handles this by not requiring package names to be globally unique, even within the dependency graph of a single project. Instead, packages are identified by [universally unique identifiers](https://en.wikipedia.org/wiki/Universally_unique_identifier) (UUIDs) which are assigned to them before they are registered. The question *"what is `X`?"* is answered by determining the UUID of `X`. +One consequence of federation is that there cannot be a central authority for package naming. Different entities may use the same name to refer to unrelated packages. This possibility is unavoidable since these entities do not coordinate and may not even know about each other. Because of the lack of a central naming authority, a single project can quite possibly end up depending on different packages that have the same name. Julia's package loading mechanism handles this by not requiring package names to be globally unique, even within the dependency graph of a single project. Instead, packages are identified by [universally unique identifiers](https://en.wikipedia.org/wiki/Universally_unique_identifier) (UUIDs) which are assigned to them before they are registered. The question *"what is `X`?"* is answered by determining the UUID of `X`. Since the decentralized naming problem is somewhat abstract, it may help to walk through a concrete scenario to understand the issue. Suppose you're developing an application called `App`, which uses two packages: `Pub` and `Priv`. `Priv` is a private package that you created, whereas `Pub` is a public package that you use but don't control. When you created `Priv`, there was no public package by that name. Subsequently, however, an unrelated package also named `Priv` has been published and become popular. In fact, the `Pub` package has started to use it. Therefore, when you next upgrade `Pub` to get the latest bug fixes and features, `App` will end up—through no action of yours other than upgrading—depending on two different packages named `Priv`. `App` has a direct dependency on your private `Priv` package, and an indirect dependency, through `Pub`, on the new public `Priv` package. Since these two `Priv` packages are different but both required for `App` to continue working correctly, the expression `import Priv` must refer to different `Priv` packages depending on whether it occurs in `App`'s code or in `Pub`'s code. Julia's package loading mechanism allows this by distinguishing the two `Priv` packages by context and UUID. How this distinction works is determined by environments, as explained in the following sections. @@ -35,9 +35,9 @@ An *environment* determines what `import X` and `using X` mean in various code c These three kinds of environment each serve a different purpose: -* Project environments provide **reproducibility.** By checking a project environment into version control—e.g. a git repository—along with the rest of the project's source code, you can reproduce the exact state of the project _and_ all of its dependencies since the manifest file captures the exact version of every dependency and can be rematerialized easily. -* Package directories provide low-overhead **convenience** when a project environment would be overkill: are handy when you have a set of packages and just want to put them somewhere and use them as they are without having to create and maintain a project environment for them. -* Stacked environments allow for **augmentation** of the primary environment with additional tools. You can push an environment including development tools onto the stack and they will be available from the REPL and scripts but not from inside of packages. +* Project environments provide **reproducibility.** By checking a project environment into version control—e.g. a git repository—along with the rest of the project's source code, you can reproduce the exact state of the project _and_ all of its dependencies since the manifest file captures the exact version of every dependency. +* Package directories provide low-overhead **convenience** when a project environment isn't needed. Package directories are handy when you have a set of packages that you just want to put somewhere and use them as they are, without having to create and maintain a project environment for them. +* Stacked environments allow for **augmentation** of the primary environment with additional tools. You can push an environment including development tools onto the stack and they will be available from the REPL and scripts but not from inside packages. As an abstraction, an environment provides three maps: `roots`, `graph` and `paths`. When resolving the meaning of `import X`, `roots` and `graph` are used to determine the identity of `X` and answer the question *"what is `X`?"*, while the `paths` map is used to locate the source code of `X` and answer the question *"where is `X`?"* The specific roles of the three maps are: @@ -51,18 +51,18 @@ As an abstraction, an environment provides three maps: `roots`, `graph` and `pat - **paths:** `uuid::UUID` × `name::Symbol` ⟶ `path::String` - The `paths` map assigns to each package UUID-name pair, the location of the entry-point source file of that package. After the identity of `X` in `import X` has been resolved to a UUID via `roots` or `graph` (depending on whether it is loaded from the main project or an dependency), Julia determines what file to load to acquire `X` by looking up `paths[uuid,:X]` in the environment. Including this file should create a module named `X`. After the first time this package is loaded, any import resolving to the same `uuid` will simply create a new binding to the same already-loaded package module. + The `paths` map assigns to each package UUID-name pair, the location of that package's entry-point source file. After the identity of `X` in `import X` has been resolved to a UUID via `roots` or `graph` (depending on whether it is loaded from the main project or a dependency), Julia determines what file to load to acquire `X` by looking up `paths[uuid,:X]` in the environment. Including this file should create a module named `X`. Once this package is loaded, i.e. after its first import, any subsequent import resolving to the same `uuid` will simply create a new binding to the original already-loaded package module. Each kind of environment defines these three maps differently, as detailed in the following sections. !!! note - For clarity of exposition, the examples throughout this chapter include fully materialized data structures for `roots`, `graph` and `paths`. However, these maps are really only abstractions—for efficiency, Julia's package loading code does not actually materialize them. Instead, it queries them through internal APIs and lazily computes only as much of each structure as is necessary to load a given package. + For ease of understanding, the examples throughout this chapter show full data structures for `roots`, `graph` and `paths`. However, for efficiency, Julia's package loading code does not actually create them. Instead, it queries them through internal APIs and lazily computes only as much of each structure as it needs to load a given package. ### Project environments -A project environment is determined by a directory containing a project file, `Project.toml`, and optionally a manifest file, `Manifest.toml`. These files can also be named `JuliaProject.toml` and `JuliaManifest.toml`, in which case `Project.toml` and `Manifest.toml` are ignored; this allows for coexistence with other tools that might consider files named `Project.toml` and `Manifest.toml` significant. For pure Julia projects, however, the names `Project.toml` and `Manifest.toml` should be preferred. The `roots`, `graph` and `paths` maps of a project environment are defined as follows. +A project environment is determined by a directory containing a project file called `Project.toml`, and optionally a manifest file called `Manifest.toml`. These files may also be called `JuliaProject.toml` and `JuliaManifest.toml`, in which case `Project.toml` and `Manifest.toml` are ignored. (This allows for coexistence with other tools that might consider files called `Project.toml` and `Manifest.toml` significant.) For pure Julia projects, however, the names `Project.toml` and `Manifest.toml` are preferred. The `roots`, `graph` and `paths` maps of a project environment are defined as follows. -**The roots map** of the environment is determined by the contents of the project file, specifically, its top-level `name` and `uuid` entries and its `[deps]` section (all optional). Consider the following example project file for the hypothetical application, `App`, as described above: +**The roots map** of the environment is determined by the contents of the project file, specifically, its top-level `name` and `uuid` entries and its `[deps]` section (all optional). Consider the following example project file for the hypothetical application, `App`, as described earlier: ```toml name = "App" @@ -73,7 +73,7 @@ Priv = "ba13f791-ae1d-465a-978b-69c3ad90f72b" Pub = "c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1" ``` -This project file implies the following `roots` map, if it were materialized as a Julia dictionary: +This project file implies the following `roots` map, if it was represented by a Julia dictionary: ```julia roots = Dict( @@ -83,9 +83,9 @@ roots = Dict( ) ``` -Given this `roots` map, in the code of `App` the statement `import Priv` will cause Julia to look up `roots[:Priv]`, which yields `ba13f791-ae1d-465a-978b-69c3ad90f72b`, the UUID of the `Priv` package that is to be loaded in that context. This UUID identifies which `Priv` package to load and use when the main application evaluates `import Priv`. +Given this `roots` map, in `App`'s code the statement `import Priv` will cause Julia to look up `roots[:Priv]`, which yields `ba13f791-ae1d-465a-978b-69c3ad90f72b`, the UUID of the `Priv` package that is to be loaded in that context. This UUID identifies which `Priv` package to load and use when the main application evaluates `import Priv`. -**The dependency graph** of a project environment is determined by the contents of the manifest file, if present, or if there is no manifest file, `graph` is empty. A manifest file contains a stanza for each direct or indirect dependency of a project, including for each one, its UUID and a source tree hash or an explicit path to the source code. Consider the following example manifest file for `App`: +**The dependency graph** of a project environment is determined by the contents of the manifest file, if present. If there is no manifest file, `graph` is empty. A manifest file contains a stanza for each of a project's direct or indirect dependencies, including for each one, its UUID and a source tree hash or an explicit path to the source code. Consider the following example manifest file for `App`: ```toml [[Priv]] # the private one @@ -120,7 +120,7 @@ This manifest file describes a possible complete dependency graph for the `App` * The public `Priv` has no dependencies. - The application also depends on the `Pub` package, which in turn depends on the public `Priv ` and the same `Zebra` package which the private `Priv` package depends on. -A materialized representation of this dependency `graph` looks like this: +This dependency `graph` represented as a dictionary, looks like this: ```julia graph = Dict{UUID,Dict{Symbol,UUID}}( @@ -147,7 +147,7 @@ Given this dependency `graph`, when Julia sees `import Priv` in the `Pub` packag graph[UUID("c07ecb7d-0dc9-4db7-8803-fadaaeaf08e1")][:Priv] ``` -and gets `2d15fe94-a1f7-436c-a4d8-07a9a496e01c` , which indicates that in the context of the `Pub` package, `import Priv` refers to the public `Priv` package, rather than the private one which the app depends on directly. This is how the name `Priv` can refer to different packages in the main project than it does in one of the packages dependencies, which allows for name collisions in the package ecosystem. +and gets `2d15fe94-a1f7-436c-a4d8-07a9a496e01c`, which indicates that in the context of the `Pub` package, `import Priv` refers to the public `Priv` package, rather than the private one which the app depends on directly. This is how the name `Priv` can refer to different packages in the main project than it does in one of its package's dependencies, which allows for duplicate names in the package ecosystem. What happens if `import Zebra` is evaluated in the main `App` code base? Since `Zebra` does not appear in the project file, the import will fail even though `Zebra` *does* appear in the manifest file. Moreover, if `import Zebra` occurs in the public `Priv` package—the one with UUID `2d15fe94-a1f7-436c-a4d8-07a9a496e01c`—then that would also fail since that `Priv` package has no declared dependencies in the manifest file and therefore cannot load any packages. The `Zebra` package can only be loaded by packages for which it appear as an explicit dependency in the manifest file: the `Pub` package and one of the `Priv` packages. @@ -167,7 +167,7 @@ If, on the other hand, Julia was loading the *other* `Priv` package—the one wi Julia uses the first of these that exists to load the public `Priv` package. -Here is a materialized `paths` map for the `App` project environment: +Here is a representation of the `paths` map for the `App` project environment: ```julia paths = Dict{Tuple{UUID,Symbol},String}( @@ -192,13 +192,13 @@ paths = Dict{Tuple{UUID,Symbol},String}( This example map includes three different kinds of package locations: -1. The private `Priv` package is "[vendored](https://stackoverflow.com/a/35109534/659248)" inside of `App` repository. +1. The private `Priv` package is "[vendored](https://stackoverflow.com/a/35109534/659248)" inside the `App` repository. 2. The public `Priv` and `Zebra` packages are in the system depot, where packages installed and managed by the system administrator live. These are available to all users on the system. 3. The `Pub` package is in the user depot, where packages installed by the user live. These are only available to the user who installed them. ### Package directories -Package directories provide a kind of environment that approximates package loading in Julia 0.6 and earlier, and which resembles package loading in many other dynamic languages. The set of packages available in a package directory corresponds to the set of subdirectories it contains that look like packages: if `X/src/X.jl` is a file in a package directory, then `X` is considered to be a package and `X/src/X.jl` is the file you load to get `X`. Which packages can "see" each other as dependencies depends on whether they contain project files or not and what appears in the `[deps]` sections of those project files. +Package directories provide a kind of environment that approximates package loading in Julia 0.6 and earlier, and which resembles package loading in many other dynamic languages. The set of packages available in a package directory corresponds to the set of subdirectories it contains that look like packages: if `X/src/X.jl` is a file in a package directory, then `X` is considered to be a package and `X/src/X.jl` is the file Julia loads to get `X`. Which packages can "see" each other as dependencies depends on whether they contain project files, and if they do, on what appears in those project files' `[deps]` sections. **The roots map** is determined by the subdirectories `X` of a package directory for which `X/src/X.jl` exists and whether `X/Project.toml` exists and has a top-level `uuid` entry. Specifically `:X => uuid` goes in `roots` for each such `X` where `uuid` is defined as: @@ -246,7 +246,7 @@ Dingo/ # no imports ``` -Here is a corresponding `roots` structure, materialized as a dictionary: +Here is a corresponding `roots` structure, represented as a dictionary: ```julia roots = Dict{Symbol,UUID}( @@ -257,7 +257,7 @@ roots = Dict{Symbol,UUID}( ) ``` -Here is the corresponding `graph` structure, materialized as a dictionary: +Here is the corresponding `graph` structure, represented as a dictionary: ```julia graph = Dict{UUID,Dict{Symbol,UUID}}( @@ -285,12 +285,12 @@ Observe the following specific instances of these rules in our example: * `Aardvark` can import on any of `Bobcat`, `Cobra` or `Dingo`; it does import `Bobcat` and `Cobra`. * `Bobcat` can and does import both `Cobra` and `Dingo`, which both have project files with UUIDs and are declared as dependencies in `Bobcat`'s `[deps]` section. -* `Bobcat` cannot possibly depend on `Aardvark` since `Aardvark` does not have a project file. +* `Bobcat` cannot depend on `Aardvark` since `Aardvark` does not have a project file. * `Cobra` can and does import `Dingo`, which has a project file and UUID, and is declared as a dependency in `Cobra`'s `[deps]` section. * `Cobra` cannot depend on `Aardvark` or `Bobcat` since neither have real UUIDs. * `Dingo` cannot import anything because it has a project file without a `[deps]` section. -**The paths map** in a package directory is simple: it maps subdirectory names to their corresponding entry-point paths. In other words, if the path to our example project directory is `/home/me/animals` then the `paths` map would be materialized as this dictionary: +**The paths map** in a package directory is simple: it maps subdirectory names to their corresponding entry-point paths. In other words, if the path to our example project directory is `/home/me/animals` then the `paths` map could be represented by this dictionary: ```julia paths = Dict{Tuple{UUID,Symbol},String}( @@ -321,7 +321,7 @@ paths = reduce(merge, reverse([paths₁, paths₂, …])) The subscripted `rootsᵢ`, `graphᵢ` and `pathsᵢ` variables correspond to the subscripted environments, `envᵢ`, contained `stack`. The `reverse` is present because `merge` favors the last argument rather than first when there are collisions between keys in its argument dictionaries. That's all there is to stacked environments. There are a couple of noteworthy features of this design: -1. The *primary environment*—i.e.the first environment in a stack—is faithfully embedded in a stacked environment. The full dependency graph of the first environment in a stack is guaranteed to be included intact in the stacked environment including the same versions of all dependencies. +1. The *primary environment*—i.e. the first environment in a stack—is faithfully embedded in a stacked environment. The full dependency graph of the first environment in a stack is guaranteed to be included intact in the stacked environment including the same versions of all dependencies. 2. Packages in non-primary environments can end up using incompatible versions of their dependencies even if their own environments are entirely compatible. This can happen when one of their dependencies is shadowed by a version in an earlier environment in the stack. Since the primary environment is typically the environment of a project you're working on, while environments later in the stack contain additional tools, this is the right tradeoff: it's better to break your dev tools but keep the project working. When such incompatibilities occur, you'll typically want to upgrade your dev tools to versions that are compatible with the main project.