PEP 9999: Recording the source URL of an installed distribution #1009

Closed
260 changes: 260 additions & 0 deletions pep-9999.rst
@@ -0,0 +1,260 @@
PEP: 9999
Title: Recording the source URL of an installed distribution
Author: Stéphane Bidoul <[email protected]>
Sponsor: ??? <???>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 21-Apr-2019
Post-History:


Abstract
========

This PEP specifies a ``source_url.json`` file in the Database of Installed
Python Distributions (PEP376_). The purpose of this file is to record, in an
extensible manner, the origin of distributions that were installed
from direct URL references (as opposed to distributions that were installed
from a package index).

The main use case of this file is to allow tools that attempt to "freeze" the
state of a Python environment to work in a wider range of situations.

Motivation
==========

Installation from URL
---------------------

Python installers such as pip are capable of downloading and installing
distributions from package indexes. They are also capable of downloading
and installing source code from requirements specifying arbitrary URLs of
source archives and Version Control Systems (VCS) repositories,
as standardized in `PEP440 Direct References`_.

In the latter mode, installers typically download the source code into a
temporary directory, invoke the PEP 517 build backend to produce a wheel,
install the wheel, and delete the temporary directory.

After installation, no trace of the URL used to download the source code is
left on the user's system.

Freezing an environment
-----------------------

pip also provides a command named ``pip freeze``, which examines the Database
of Installed Python Distributions to generate a list of requirements. The main
goal of this command is to help users generate a list of requirements that
will later allow the re-installation of the same environment with the highest
possible fidelity.

The ``pip freeze`` command outputs a ``name==version`` line for each installed
distribution. To achieve the goal of reinstalling the same environment, this
requires the (name, version) tuple to refer to an immutable version of the
distribution which, combined with the knowledge of which package index to use,
satisfies the requirement. The immutability is guaranteed by package indexes
such as Warehouse. When the package was installed from an arbitrary URL,
the (name, version) tuple is obviously not sufficient to reinstall the same
distribution.

The reasoning above is equally applicable to tools, other than ``pip freeze``,
that would attempt to generate a ``Pipfile.lock`` or any other similar format
from the Database of Installed Python Distributions. Unless specified
otherwise, "freeze" is used in this document as a generic term for such
an operation.

The importance of installing from (VCS) URLs for application integrators
------------------------------------------------------------------------

For an application developer, it is important to be able to reliably install
and freeze unreleased versions of Python distributions.
For instance, when a developer needs to deploy an unreleased patched version
of a dependency, it is common to install the dependency directly from a VCS
branch that has the patch, while waiting for the maintainer to release an
updated version.

In such cases, it is important for "freeze" to pin the exact VCS
reference (commit-hash if available) that was installed, in order to create
reproducible builds with the highest possible fidelity.

Note about "editable" installs
------------------------------

The so-called editable installation mode of pip roughly lets a user insert a
local directory in ``sys.path`` for development purposes. This mode is somewhat
abused to work around the fact that a non-editable install from a VCS URL
loses trace of the origin after installation.
Indeed, editable installs implicitly record the VCS origin in the checkout
directory, so the information can be recovered when running "freeze".

This workaround, although useful, is fragile, creates confusion
about the purpose of the editable mode, and works only when the distribution
can be installed with setuptools (i.e. it is not usable with other PEP 517
build backends).

For the sake of clarity, it is important to note that this PEP is otherwise
unrelated to editable installs.

Rationale
=========

This PEP specifies a new ``source_url.json`` metadata file in the
``.dist-info`` directory of an installed distribution.

The fields specified are sufficient to reproduce the source archive and VCS
URLs supported by pip. They are also sufficient to reproduce
`PEP440 Direct References`_, as well as `Pipfile and Pipfile.lock`_ entries.

Since at least the three different ways above to encode the information exist,
this PEP uses a key-value format, so as not to make any assumption on how a
source URL must ultimately be encoded in a requirement or lockfile.

Information has been taken from Ruby's bundler manual to verify it has similar
capabilities and to inform the selection and naming of fields in this
specification.

The JSON format allows additional fields to be added in the future.

Specification
=============

This PEP specifies a ``source_url.json`` file in the ``.dist-info`` directory
of an installed distribution.

This file MUST be created by installers when installing a distribution
from a source archive URL or VCS URL requirement in non-editable mode.

This file MUST NOT be created when installing a distribution from any other
type of requirement (i.e. a non-URL requirement, or a URL in editable mode).

If present, it MUST contain at least one field with name ``url``.

The ``url`` value MUST be stripped of any authentication information,
for security reasons.
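
As a minimal sketch of the stripping rule above, an installer could drop the
user information part of the URL before writing it. The helper name
``strip_auth`` is hypothetical and is not defined by this PEP; the sketch
assumes credentials only ever appear in the network location part of the URL.

.. code:: python

    from urllib.parse import urlsplit, urlunsplit


    def strip_auth(url: str) -> str:
        """Return ``url`` without any ``user:password@`` prefix on the host.

        Hypothetical helper, for illustration only.
        """
        parts = urlsplit(url)
        # Everything before the last "@" in the netloc is user information.
        host = parts.netloc.rpartition("@")[2]
        return urlunsplit(parts._replace(netloc=host))


    print(strip_auth("https://user:secret@example.com/repo/app.git"))

URLs that carry no user information are returned unchanged.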

URL references SHOULD specify a secure transport mechanism (such as https).

When ``url`` refers to a VCS repository:

- A ``vcs`` field MUST be present, containing the name of the VCS
  (i.e. one of git, hg, bzr, svn). Other VCS SHOULD be registered by
  amending this PEP.
- The ``url`` value MUST be compatible with the corresponding VCS,
  so an installer can hand it off without transformation to a
  checkout/download command of the VCS.
- If the VCS supports commit-hash based revision identifiers, a ``commit-hash``
  field MUST be set by the installer in order to reference the immutable
  version of the source code that was installed.
- In addition, a ``ref`` field MAY be present to reference a
  branch/tag/revision compatible with the VCS.

When ``url`` is a direct reference to a source archive or wheel:

- A ``hash`` field SHOULD be present, with value
  ``<hash-algorithm>=<expected-hash>``.
  It is RECOMMENDED that only hashes which are unconditionally provided by
  the latest version of the standard library's hashlib module be used for
  source archive hashes. At the time of writing, that list consists of 'md5',
  'sha1', 'sha224', 'sha256', 'sha384', and 'sha512'.
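
The ``hash`` value described above could be produced along the following
lines. This is only a sketch: the function name ``archive_hash`` and the
``RECOMMENDED_ALGORITHMS`` constant are hypothetical, and the archive is
assumed to be already available in memory as bytes.

.. code:: python

    import hashlib

    # The algorithms listed in this PEP as recommended at the time of writing.
    RECOMMENDED_ALGORITHMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


    def archive_hash(data: bytes, algorithm: str = "sha256") -> str:
        """Return ``<hash-algorithm>=<expected-hash>`` for the given bytes.

        Hypothetical helper, for illustration only.
        """
        if algorithm not in RECOMMENDED_ALGORITHMS:
            raise ValueError(f"unsupported hash algorithm: {algorithm!r}")
        digest = hashlib.new(algorithm, data).hexdigest()
        return f"{algorithm}={digest}"


    print(archive_hash(b"example archive contents"))

For large archives, an installer would more likely feed the file to the hash
object in chunks rather than read it whole.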

A ``subdirectory`` field MAY be present containing a directory path,
relative to the root of the VCS repository or source archive,
to specify where ``pyproject.toml`` or ``setup.py`` is located.
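
The field rules in this section can be sketched as a small checker. The name
``check_source_url`` is hypothetical, and the set of VCS treated as supporting
commit hashes (git and hg) is an assumption made for this illustration, not
something this PEP states.

.. code:: python

    from urllib.parse import urlsplit

    KNOWN_VCS = {"git", "hg", "bzr", "svn"}
    # Assumption for this sketch: only these VCS have commit-hash identifiers.
    VCS_WITH_COMMIT_HASH = {"git", "hg"}


    def check_source_url(data: dict) -> list:
        """Return a list of rule violations; an empty list means none found.

        Hypothetical helper, for illustration only.
        """
        url = data.get("url")
        if not url:
            return ["missing required 'url' field"]
        problems = []
        if "@" in urlsplit(url).netloc:
            problems.append("'url' contains authentication information")
        vcs = data.get("vcs")
        if vcs is not None:
            if vcs not in KNOWN_VCS:
                problems.append(f"unknown vcs: {vcs!r}")
            if vcs in VCS_WITH_COMMIT_HASH and "commit-hash" not in data:
                problems.append("'commit-hash' is required for this vcs")
        return problems


    print(check_source_url({"url": "https://example.com/repo/app.git", "vcs": "git"}))

The checker only covers MUST-level rules; the SHOULD-level recommendations
(secure transport, presence of ``hash``) are deliberately left out.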

Examples
========

Example source_url.json
-----------------------

Source archive:

.. code::

    {
        "url": "https://github.com/pypa/pip/archive/1.3.1.zip",
        "hash": "sha256=2dc6b5a470a1bde68946f263f1af1515a2574a150a30d6ce02c6ff742fcc0db8"
    }

Git URL with tag and commit hash:

.. code::

    {
        "url": "https://github.com/pypa/pip.git",
        "vcs": "git",
        "ref": "1.3.1",
        "commit-hash": "7921be1537eac1e97bc40179a57f0349c2aee67d"
    }
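
A tool could read such a file through the standard library's
``importlib.metadata`` module, roughly as sketched below. The function name
``read_source_url`` is hypothetical; ``Distribution.read_text`` returns
``None`` when the requested file is absent from the ``.dist-info`` directory.

.. code:: python

    import json
    from importlib.metadata import Distribution


    def read_source_url(dist: Distribution):
        """Return the parsed ``source_url.json`` of ``dist``, or None if absent.

        Hypothetical helper, for illustration only.
        """
        text = dist.read_text("source_url.json")
        if text is None:
            return None
        return json.loads(text)

For an installed distribution named ``app`` (a placeholder name), this would
be called as ``read_source_url(importlib.metadata.distribution("app"))``.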

Example pip commands and their effect in source_url.json
--------------------------------------------------------

Commands that generate a ``source_url.json``:

* pip install https://example.com/app-1.0.tgz
* pip install https://example.com/app-1.0.whl
* pip install "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
* pip install ./app
* pip install file:///home/user/app

Commands that *do not* generate a ``source_url.json``:

* pip install app
* pip install app --no-index --find-links https://example.com/
* pip install --editable "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
* pip install -e ./app

Use cases
=========

"Freezing" an environment

Tools, such as ``pip freeze``, which generate requirements from the Database
of Installed Python Distributions SHOULD exploit ``source_url.json``
if it is present, and give it priority over the Version metadata in order
to generate a higher fidelity output.
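
As an illustration of how a freeze tool might give ``source_url.json``
priority, the sketch below turns the recorded fields into a single
requirement line. The function name ``freeze_line`` is hypothetical, and the
output syntax (a PEP 440 direct reference with pip-style ``vcs+url@rev`` and
``#subdirectory=`` fragments) is one possible encoding among those mentioned
in the Rationale, not one mandated by this PEP.

.. code:: python

    def freeze_line(name, version, source_url=None):
        """Build one requirement line for an installed distribution.

        Hypothetical helper, for illustration only. ``source_url`` is the
        parsed content of ``source_url.json``, or None if the file is absent.
        """
        if source_url is None:
            # No recorded origin: fall back to the usual name==version form.
            return f"{name}=={version}"
        url = source_url["url"]
        fragments = []
        if "vcs" in source_url:
            # Prefer the immutable commit hash over a movable branch or tag.
            revision = source_url.get("commit-hash") or source_url.get("ref")
            url = f"{source_url['vcs']}+{url}"
            if revision:
                url = f"{url}@{revision}"
        elif "hash" in source_url:
            fragments.append(source_url["hash"])
        if "subdirectory" in source_url:
            fragments.append(f"subdirectory={source_url['subdirectory']}")
        if fragments:
            url = f"{url}#{'&'.join(fragments)}"
        return f"{name} @ {url}"


    print(freeze_line("pip", "1.3.1"))
    print(freeze_line("pip", "1.3.1", {
        "url": "https://github.com/pypa/pip.git",
        "vcs": "git",
        "ref": "1.3.1",
        "commit-hash": "7921be1537eac1e97bc40179a57f0349c2aee67d",
    }))

Pinning the commit hash rather than ``ref`` is what gives the frozen output
its fidelity, since a tag or branch may later point to a different commit.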

Backwards Compatibility
=======================

Since this PEP specifies a new file in the ``.dist-info`` directory,
there are no backward compatibility implications.

Open Issues
===========

* The now withdrawn PEP426_ specifies a ``source_url`` metadata entry.
  It is also implemented in distlib. The only known limitation of this format
  is that it lacks support for the subdirectory option of pip URLs.
  The same limitation is present in PEP440 direct references.
  The introduction of URL fragments in PEP440 (subdirectory being the first
  one to be documented) would allow that specification to be used for
  ``source_url`` too.
* Examine what to do for VCS where the branch can be part of the URL
  (for svn?).

References
==========

.. _PEP376: http://www.python.org/dev/peps/pep-0376
.. _PEP426: http://www.python.org/dev/peps/pep-0426
.. _PEP440: http://www.python.org/dev/peps/pep-0440
.. _PEP440 Direct References: https://www.python.org/dev/peps/pep-0440/#direct-references
.. _Pipfile and Pipfile.lock: https://github.com/pypa/pipfile

Copyright
=========

This document has been placed in the public domain.


..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: