-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
PEP 571: Updated version of the manylinux ABI (GH-565)
manylinux1 is getting old enough now to start making things difficult (specifically around network security), so it's time for a refresh to a slightly newer baseline.
- Loading branch information
1 parent
69eb650
commit a5f87a7
Showing
1 changed file
with
343 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,343 @@ | ||
PEP: 571 | ||
Title: The manylinux2 Platform Tag | ||
Version: $Revision$ | ||
Last-Modified: $Date$ | ||
Author: Mark Williams <[email protected]> | ||
BDFL-Delegate: Nick Coghlan <[email protected]> | ||
Discussions-To: Distutils SIG <[email protected]> | ||
Status: Active | ||
Type: Informational | ||
Content-Type: text/x-rst | ||
Created: | ||
Post-History: | ||
Resolution: | ||
|
||
|
||
Abstract | ||
======== | ||
|
||
This PEP proposes the creation of a ``manylinux2`` platform tag to | ||
succeed the ``manylinux1`` tag introduced by PEP 513 [1]_. It also | ||
proposes that PyPI and ``pip`` both be updated to support uploading, | ||
downloading, and installing ``manylinux2`` distributions on compatible | ||
platforms. | ||
|
||
Rationale | ||
========= | ||
|
||
True to its name, the ``manylinux1`` platform tag has made the | ||
installation of binary extension modules a reality on many Linux | ||
systems. Libraries like ``cryptography`` [2]_ and ``numpy`` [3]_ are | ||
more accessible to Python developers now that their installation on | ||
common architectures does not depend on fragile development | ||
environments and build toolchains. | ||
|
||
``manylinux1`` wheels achieve their portability by allowing the | ||
extension modules they contain to link against only a small set of | ||
system-level shared libraries that export versioned symbols old enough | ||
to benefit from backwards-compatibility policies. Extension modules | ||
in a ``manylinux1`` wheel that rely on ``glibc``, for example, must be | ||
built against version 2.5 or earlier; they may then be run systems | ||
that provide more recent ``glibc`` version that still export the | ||
required symbols at version 2.5. | ||
|
||
PEP 513 drew its whitelisted shared libraries and their symbol | ||
versions from CentOS 5.11, which was the oldest supported CentOS | ||
release at the time of its writing. Unfortunately, CentOS 5.11 | ||
reached its end-of-life on March 31st, 2017 with a clear warning | ||
against its continued use. [4]_ No further updates, such as security | ||
patches, will be made available. This means that its packages will | ||
remain at obsolete versions that hamper the efforts of Python software | ||
packagers who use the ``manylinux1`` Docker image. | ||
|
||
CentOS 6 is now the oldest supported CentOS release, and will receive | ||
maintenance updates through November 30th, 2020. [5]_ We propose that | ||
a new PEP 425-style [6]_ platform tag called ``manylinux2`` be derived | ||
from CentOS 6 and that the ``manylinux`` toolchain, PyPI, and ``pip`` | ||
be updated to support it. | ||
|
||
|
||
The ``manylinux2`` policy | ||
========================= | ||
|
||
The following criteria determine a ``linux`` wheel's eligibility for | ||
the ``manylinux2`` tag: | ||
|
||
1. The wheel may only contain binary executables and shared objects | ||
compiled for one of the two architectures supported by CentOS 6: | ||
x86_64 or i686. [5]_ | ||
2. The wheel's binary executables or shared objects may not link | ||
against externally-provided libraries except those in the following | ||
whitelist: :: | ||
|
||
libgcc_s.so.1 | ||
libstdc++.so.6 | ||
libm.so.6 | ||
libdl.so.2 | ||
librt.so.1 | ||
libcrypt.so.1 | ||
libc.so.6 | ||
libnsl.so.1 | ||
libutil.so.1 | ||
libpthread.so.0 | ||
libresolv.so.2 | ||
libX11.so.6 | ||
libXext.so.6 | ||
libXrender.so.1 | ||
libICE.so.6 | ||
libSM.so.6 | ||
libGL.so.1 | ||
libgobject-2.0.so.0 | ||
libgthread-2.0.so.0 | ||
libglib-2.0.so.0 | ||
|
||
This list is identical to the externally-provided libraries | ||
whitelisted for ``manylinux1``, minus ``libncursesw.so.5`` and | ||
``libpanelw.so.5``. [7]_ ``libpythonX.Y`` remains ineligible for | ||
inclusion for the same reasons outlined in PEP 513. | ||
|
||
On Debian-based systems, these libraries are provided by the packages: | ||
|
||
============ ======================================================= | ||
Package Libraries | ||
============ ======================================================= | ||
libc6 libdl.so.2, libresolv.so.2, librt.so.1, libc.so.6, | ||
libpthread.so.0, libm.so.6, libutil.so.1, libcrypt.so.1, | ||
libnsl.so.1 | ||
libgcc1 libgcc_s.so.1 | ||
libgl1 libGL.so.1 | ||
libglib2.0-0 libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0 | ||
libice6 libICE.so.6 | ||
libsm6 libSM.so.6 | ||
libstdc++6 libstdc++.so.6 | ||
libx11-6 libX11.so.6 | ||
libxext6 libXext.so.6 | ||
libxrender1 libXrender.so.1 | ||
============ ======================================================= | ||
|
||
On RPM-based systems, they are provided by these packages: | ||
|
||
============ ======================================================= | ||
Package Libraries | ||
============ ======================================================= | ||
glib2 libglib-2.0.so.0, libgthread-2.0.so.0, libgobject-2.0.so.0 | ||
glibc libresolv.so.2, libutil.so.1, libnsl.so.1, librt.so.1, | ||
libcrypt.so.1, libpthread.so.0, libdl.so.2, libm.so.6, | ||
libc.so.6 | ||
libICE libICE.so.6 | ||
libX11 libX11.so.6 | ||
libXext: libXext.so.6 | ||
libXrender libXrender.so.1 | ||
libgcc: libgcc_s.so.1 | ||
libstdc++ libstdc++.so.6 | ||
mesa libGL.so.1 | ||
============ ======================================================= | ||
|
||
3. If the wheel contains binary executables or shared objects linked | ||
against any whitelisted libraries that also export versioned | ||
symbols, they may only depend on the following maximum versions:: | ||
|
||
GLIBC_2.12 | ||
CXXABI_1.3.3 | ||
GLIBCXX_3.4.13 | ||
GCC_4.3.0 | ||
|
||
As an example, ``manylinux2`` wheels may include binary artifacts | ||
that require ``glibc`` symbols at version ``GLIBC_2.4``, because | ||
this an earlier version than the maximum of ``GLIBC_2.12``. | ||
4. If a wheel is built for any version of CPython 2 or CPython | ||
versions 3.0 up to and including 3.2, it *must* include a CPython | ||
ABI tag indicating its Unicode ABI. A ``manylinux2`` wheel built | ||
against Python 2, then, must include either the ``cpy27mu`` tag | ||
indicating it was built against an interpreter with the UCS-4 ABI | ||
or the ``cpy27m`` tag indicating an interpeter with the UCS-2 | ||
ABI. [8]_ [9]_ | ||
5. A wheel *must not* require the ``PyFPE_jbuf`` symbol. This is | ||
achieved by building it against a Python compiled *without* the | ||
``--with-fpectl`` ``configure`` flag. | ||
|
||
Compilation of Compliant Wheels | ||
=============================== | ||
|
||
Like ``manylinux1``, the ``auditwheel`` tool adds ```manylinux2`` | ||
platform tags to ``linux`` wheels built by ``pip wheel`` or | ||
``bdist_wheel`` in a ``manylinux2`` Docker container. | ||
|
||
Docker Images | ||
------------- | ||
|
||
``manylinux2`` Docker images based on CentOS 6 x86_64 and i686 are | ||
provided for building binary ``linux`` wheels that can reliably be | ||
converted to ``manylinux2`` wheels. [10]_ These images come with a | ||
full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` | ||
4.8.2) as well as the latest releases of Python and ``pip``. | ||
|
||
Compatibility with kernels that lack ``vsyscall`` | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
A Docker container assumes that its userland is compatible with its | ||
host's kernel. Unfortunately, an increasingly common kernel | ||
configuration breaks breaks this assumption for x86_64 CentOS 6 Docker | ||
images. | ||
|
||
Versions 2.14 and earlier of ``glibc`` require the kernel provide an | ||
archaic system call optimization known as ``vsyscall`` on x86_64. [11]_ | ||
To effect the optimization, the kernel maps a read-only page of | ||
frequently-called system calls -- most notably ``time(2)`` -- into | ||
each process at a fixed memory location. ``glibc`` then invokes these | ||
system calls by dereferencing a function pointer to the appropriate | ||
offset into the ``vsyscall`` page and calling it. This avoids the | ||
overhead associated with invoking the kernel that affects normal | ||
system call invocation. ``vsyscall`` has long been deprecated in | ||
favor of an equivalent mechanism known as vDSO, or "virtual dynamic | ||
shared object", in which the kernel instead maps a relocatable virtual | ||
shared object containing the optimized system calls into each | ||
process. [12]_ | ||
|
||
The ``vsyscall`` page has serious security implications because it | ||
does not participate in address space layout randomization (ASLR). | ||
Its predictable location and contents make it a useful source of | ||
gadgets used in return-oriented programming attacks. [13]_ At the same | ||
time, its elimination breaks the x86_64 ABI, because ``glibc`` | ||
versions that depend on ``vsyscall`` suffer from segmentation faults | ||
when attempting to dereference a system call pointer into a | ||
non-existent page. As a compromise, Linux 3.1 implemented an | ||
"emulated" ``vsyscall`` that reduced the executable code, and thus the | ||
material for ROP gadgets, mapped into the process. [14]_ | ||
``vsyscall=emulated`` has been the default configuration in most | ||
distribution's kernels for many years. | ||
|
||
Unfortunately, ``vsyscall`` emulation still exposes predicatable code | ||
at a reliable memory location, and continues to be useful for | ||
return-oriented programming. [15]_ Because most distributions have now | ||
upgraded to ``glibc`` versions that do not depend on ``vsyscall``, | ||
they are beginning to ship kernels that do not support ``vsyscall`` at | ||
all. [16]_ | ||
|
||
CentOS 5.11 and 6 both include versions of ``glibc`` that depend on | ||
the ``vsyscall`` page (2.5 and 2.12.2 respectively), so containers | ||
based on either cannot run under kernels provided with many | ||
distribution's upcoming releases. [17]_ Continuum Analytics faces a | ||
related problem with its conda software suite, and as they point out, | ||
this will pose a significant obstacle to using these tools in hosted | ||
services. [18]_ If Travis CI, for example, begins running jobs under | ||
a kernel that does not provide the ``vsyscall`` interface, Python | ||
packagers will not be able to use our Docker images there to build | ||
``manylinux`` wheels. [19]_ | ||
|
||
We have derived a patch from the ``glibc`` git repository that | ||
backports the removal of all dependencies on ``vsyscall`` to the | ||
version of ``glibc`` included with our ``manylinux2`` image. [20]_ | ||
Rebuilding ``glibc``, and thus building ``manylinux2`` image itself, | ||
still requires a host kernel that provides the ``vsyscall`` mechanism, | ||
but the resulting image can be both run on hosts that provide it and | ||
those that do not. Because the ``vsyscall`` interface is an | ||
optimization that is only applied to running processes, the | ||
``manylinux2`` wheels built with this modified image should be | ||
identical to those built on an unmodified CentOS 6 system. Also, the | ||
``vsyscall`` problem applies only to x86_64; it is not part of the | ||
i686 ABI. | ||
|
||
Auditwheel | ||
---------- | ||
|
||
The ``auditwheel`` tool has also been updated to produce | ||
``manylinux2`` wheels. [21]_ Its behavior and purpose are otherwise | ||
unchanged from PEP 513. | ||
|
||
|
||
Platform Detection for Installers | ||
================================= | ||
|
||
Platforms may define a ``manylinux2_compatible`` boolean attribute on | ||
the ``_manylinux`` module described in PEP 513. A platform is | ||
considered incompatible with ``manylinux2`` if the attribute is | ||
``False``. | ||
|
||
|
||
Backwards compatibility with ``manylinux1`` wheels | ||
================================================== | ||
|
||
As explained in PEP 513, the specified symbol versions for | ||
``manylinux1`` whitelisted libraries constitute an *upper bound*. The | ||
same is true for the symbol versions defined for ``manylinux2`` in | ||
this PEP. As a result, ``manylinux1`` wheels are considered | ||
``manylinux2`` wheels. A ``pip`` that recognizes the ``manylinux2`` | ||
platform tag will thus install ``manylinux1`` wheels for | ||
``manylinux2`` platforms -- even when explicitly set -- when no | ||
``manylinux2`` wheels are available. [22]_ | ||
|
||
PyPI Support | ||
============ | ||
|
||
PyPI should permit wheels containing the ``manylinux2`` platform tag | ||
to be uploaded in the same way that it permits ``manylinux1``. It | ||
should not attempt to verify the compatibility of ``manylinux2`` | ||
wheels. | ||
|
||
|
||
References | ||
========== | ||
|
||
.. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions | ||
(https://www.python.org/dev/peps/pep-0513/) | ||
.. [2] pyca/cryptography | ||
(https://cryptography.io/) | ||
.. [3] numpy | ||
(https://numpy.org) | ||
.. [4] CentOS 5.11 EOL announcement | ||
(https://lists.centos.org/pipermail/centos-announce/2017-April/022350.html) | ||
.. [5] CentOS Product Specifications | ||
(https://web.archive.org/web/20180108090257/https://wiki.centos.org/About/Product) | ||
.. [6] PEP 425 -- Compatibility Tags for Built Distributions | ||
(https://www.python.org/dev/peps/pep-0425/) | ||
.. [7] ncurses 5 -> 6 transition means we probably need to drop some | ||
libraries from the manylinux whitelist | ||
(https://github.com/pypa/manylinux/issues/94) | ||
.. [8] PEP 3149 | ||
https://www.python.org/dev/peps/pep-3149/ | ||
.. [9] SOABI support for Python 2.X and PyPy | ||
https://github.com/pypa/pip/pull/3075 | ||
.. [10] manylinux2 Docker images | ||
(https://hub.docker.com/r/markrwilliams/manylinux2/) | ||
.. [11] On vsyscalls and the vDSO | ||
(https://lwn.net/Articles/446528/) | ||
.. [12] vdso(7) | ||
(http://man7.org/linux/man-pages/man7/vdso.7.html) | ||
.. [13] Framing Signals -- A Return to Portable Shellcode | ||
(http://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf) | ||
.. [14] ChangeLog-3.1 | ||
(https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.1) | ||
.. [15] Project Zero: Three bypasses and a fix for one of Flash's Vector.<*> mitigations | ||
(https://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html) | ||
.. [16] linux: activate CONFIG_LEGACY_VSYSCALL_NONE ? | ||
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852620) | ||
.. [17] [Wheel-builders] Heads-up re: new kernel configurations breaking the manylinux docker image | ||
(https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html) | ||
.. [18] Due to glibc 2.12 limitation, static executables that use | ||
time(), cpuinfo() and maybe a few others cannot be run on systems | ||
that do not support or use `vsyscall=emulate` | ||
(https://github.com/ContinuumIO/anaconda-issues/issues/8203) | ||
.. [19] Travis CI | ||
(https://travis-ci.org/) | ||
.. [20] remove-vsyscall.patch | ||
https://github.com/markrwilliams/manylinux/commit/e9493d55471d153089df3aafca8cfbcb50fa8093#diff-3eda4130bdba562657f3ec7c1b3f5720 | ||
.. [21] auditwheel manylinux2 branch | ||
(https://github.com/markrwilliams/auditwheel/tree/manylinux2) | ||
.. [22] pip manylinux2 branch | ||
https://github.com/markrwilliams/pip/commits/manylinux2 | ||
Copyright | ||
========= | ||
|
||
This document has been placed into the public domain. | ||
|
||
.. | ||
Local Variables: | ||
mode: indented-text | ||
indent-tabs-mode: nil | ||
sentence-end-double-space: t | ||
fill-column: 70 | ||
coding: utf-8 | ||
End: |