Drop support for Python < 2.7.9 #4350

Closed
dstufft opened this issue Mar 20, 2017 · 34 comments
Labels
auto-locked Outdated issues that have been locked by automation kind: backwards incompatible Would be backward incompatible type: maintenance Related to Development and Maintenance Processes type: security Has potential security implications

Comments

@dstufft
Member

dstufft commented Mar 20, 2017

Older versions of Python 2.7 (prior to 2.7.9) don't have the capability to have a good TLS configuration, and thus it would be great to drop support for them. We're not currently at the point that we can do that, but I wanted to open this issue to both track that we should do that at some point, and also take notes about the current state and how to query in the future.

Results:

Python Version Download Count Percent
>=2.7.9 268456729 53%
<2.7.9 155669164 31%
3.5 35605564 7%
3.4 23114420 5%
2.6 13023950 3%
3.6 11948118 2%
3.3 1278506 0.3%
3.7 231670 0.05%

The results can be queried with:

SELECT
  CASE
    WHEN REGEXP_MATCH(details.python, r"2\.7\.(9|\d\d)") THEN '>=2.7.9'
    WHEN REGEXP_MATCH(details.python, r"2\.7\.") THEN '<2.7.9'
    ELSE REGEXP_EXTRACT(details.python, r"^([^\.]+\.[^\.]+)")
  END AS python_version,
  COUNT(*) AS download_count,
FROM
  TABLE_DATE_RANGE( [the-psf:pypi.downloads], DATE_ADD(CURRENT_TIMESTAMP(), -31, "day"), DATE_ADD(CURRENT_TIMESTAMP(), -1, "day") )
WHERE
  details.installer.name = 'pip'
GROUP BY
  python_version,
ORDER BY
  download_count DESC
LIMIT
  100
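To sanity-check the bucketing, here is a minimal Python sketch of the same CASE logic. It assumes `REGEXP_MATCH` behaves like an unanchored search (`re.search`); the `bucket` helper is a made-up name for illustration and is not part of the query.

```python
import re


def bucket(version):
    """Mirror the CASE expression for a platform.python_version() string."""
    # 2.7.9 itself, or any two-digit micro version (2.7.10 and later).
    if re.search(r"2\.7\.(9|\d\d)", version):
        return ">=2.7.9"
    # Any other 2.7.x release.
    if re.search(r"2\.7\.", version):
        return "<2.7.9"
    # Everything else is grouped by major.minor.
    match = re.search(r"^([^\.]+\.[^\.]+)", version)
    return match.group(1) if match else None


for v in ("2.7.5", "2.7.9", "2.7.13", "2.6.9", "3.5.2"):
    print(v, "->", bucket(v))
```

The order of the branches matters: the second pattern also matches 2.7.9 and later, so it only works as the "<2.7.9" bucket because the first branch has already claimed those versions.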
@alex
Member

alex commented Mar 20, 2017

Your query isn't totally correct, 2.7.9 itself doesn't match the >= regex, it matches the second one.

@dstufft
Member Author

dstufft commented Mar 20, 2017

Ah right, I blame doing this early in the morning :)

@dstufft
Member Author

dstufft commented Mar 20, 2017

Ok, updated the query and results in my first post to reflect the real numbers. Still not enough to drop support for it yet, but the numbers look a lot better this way.

@nicktimko
Contributor

nicktimko commented Mar 27, 2017

Does RedHat/CentOS 7 Python 2.7.5 count in those stats and also as having "bad TLS"? I'm a CentOS noob, but supposedly the RH-packaged Python has some fixes backported, i.e. are you talking about TLS 1.2?

Is your plan about denying pip based on Python <2.7.9, or a breaking change that earlier versions won't like (TLS 1.2)?

$ cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core) 
$ python -V
Python 2.7.5
$ python -c "import json, urllib2; print json.load(urllib2.urlopen(\
    'https://www.howsmyssl.com/a/check'))['tls_version']"
TLS 1.2

@dstufft
Member Author

dstufft commented Mar 27, 2017

@nicktimko Those statistics count whatever is returned by platform.python_version() on that Python, so presumably the RHEL/CentOS 2.7.5 is showing up as 2.7.5 and thus is <2.7.9. I don't know what patches they've applied.

To be clear, this is not a short-term issue. It is mostly a placeholder to track how >=2.7.9 adoption is going, so we can decide when it is the right time to drop support. That is unlikely to be before <2.7.9 gets into single-digit usage.
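Since the version string alone can't tell a patched distro build from a stock one, a feature probe is one way to see what a given interpreter actually offers. This is only a sketch: the `tls_capabilities` helper is hypothetical, and whether any particular RHEL/CentOS build passes these checks is not something this thread establishes.

```python
import platform
import ssl


def tls_capabilities():
    """Report the version string next to the ssl features actually present."""
    return {
        # What the download statistics would record for this interpreter.
        "python_version": platform.python_version(),
        # The OpenSSL build the ssl module is linked against.
        "openssl": ssl.OPENSSL_VERSION,
        # Features that arrived in 2.7 with the 2.7.9 ssl backport.
        "has_sslcontext": hasattr(ssl, "SSLContext"),
        "has_create_default_context": hasattr(ssl, "create_default_context"),
        "has_sni": getattr(ssl, "HAS_SNI", False),
        "has_tlsv1_2": hasattr(ssl, "PROTOCOL_TLSv1_2"),
    }


for name, value in sorted(tls_capabilities().items()):
    print(name, "=", value)
```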

@nicktimko
Contributor

nicktimko commented Mar 27, 2017

True, I guess I'm just alluding to how RHEL versions are super-clingy (10-year support, so 7, which has 2.7.5, might be around in reasonable numbers until 2024). That might keep the "<2.7.9" number artificially high, while not truly being reflective of the users you'd disrupt.

@RonnyPfannschmidt
Contributor

Is there a reasonable/reliable way to identify the actual Linux distribution that's being used?

@dstufft
Member Author

dstufft commented Jun 1, 2017

Here are today's numbers:

Python Version Download Count Percent Delta
>=2.7.9 293349226 54% +1%
<2.7.9 141872623 26% -5%
3.5 46714212 9% +2%
3.4 23322504 4% +1%
3.6 22756263 4% +2%
2.6 14454324 3% +0%
3.3 1321803 0.2% -0.1%
3.7 285099 0.05% +0%

@nicktimko
Contributor

Please let me know if I'm way off-base and looking at the wrong thing... To examine that <2.7.9 group a bit:

Python Version   Distro                   Downloads   %        urllib2 uses TLS 1.2
2.7.6            Ubuntu 14.04             54809493    10.07%   yes*
2.7.6            null null                37728581    6.94%    ???
2.7.5            CentOS Linux 7           17152820    3.15%    yes*
2.7.3            Ubuntu 12.04             6890300     1.27%    yes*
2.7.5            CentOS Linux 7.3.1611    3596891     0.66%    yes
2.7.5            CentOS Linux 7.2.1511    2534648     0.47%    yes
2.7.3            null null                2504846     0.46%    ???
2.7.6            Ubuntu 12.04             2355638     0.43%    yes*
2.7.5            null null                2329032     0.43%    ???
2.7.5            RHEL Server 7.3          2223663     0.41%    yes
2.7.6            debian jessie/sid        841274      0.15%    ?‡
2.7.3            debian 7.11              748925      0.14%    ?‡

* depends on patch level
† this could plausibly be Ubuntu 14.04/16.04/CentOS 7?
‡ don't have any Debian systems

Query (full results):

SELECT
  details.python as python_version,
  details.distro.name as distro_name,
  details.distro.version as distro_version,
  COUNT(*) AS download_count,
FROM
  TABLE_DATE_RANGE( [the-psf:pypi.downloads], DATE_ADD(CURRENT_TIMESTAMP(), -31, "day"), DATE_ADD(CURRENT_TIMESTAMP(), -1, "day") )
WHERE
  details.installer.name = 'pip'
  AND REGEXP_MATCH(details.python, r'2\.7\.[0-8]($|[^\d])')
GROUP BY
  python_version, distro_name, distro_version
ORDER BY
  download_count DESC
LIMIT
  100

"urllib2 uses TLS 1.2" checked by the output of python -c "import json, urllib2; print json.load(urllib2.urlopen('https://www.howsmyssl.com/a/check'))['tls_version']"

  • Ubuntu 12.04.5 LTS / Python 2.7.3 (default, Oct 26 2016, 21:01:49)
  • Ubuntu 14.04.5 LTS / Python 2.7.6 (default, Oct 26 2016, 20:30:19)
  • Ubuntu 14.04.5 LTS / Python 2.7.6 (default, Jun 22 2015, 17:58:13)
  • CentOS Linux release 7.2.1511 / Python 2.7.5 (default, Aug 18 2016, 15:58:25)
  • CentOS Linux release 7.3.1611 / Python 2.7.5 (default, Nov 6 2016, 00:28:07)
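The one-liner above is Python 2 (urllib2, print statement). For anyone repeating the check on Python 3, a rough equivalent might look like the sketch below. The `opener` parameter is a hypothetical addition so the function can be exercised without network access; the URL and the `tls_version` key are the ones used above.

```python
import json
import urllib.request

CHECK_URL = "https://www.howsmyssl.com/a/check"


def negotiated_tls_version(url=CHECK_URL, opener=urllib.request.urlopen):
    """Return the TLS version the server reports for this client."""
    # The response body is JSON; only the 'tls_version' key is read here.
    with opener(url) as response:
        return json.load(response)["tls_version"]
```

Calling `print(negotiated_tls_version())` performs the real request. Note this reports what urllib negotiates on that interpreter, which is not necessarily what pip's vendored stack would negotiate.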

Based on a thread on Python-Dev, which points to https://github.com/ouspg/trytls/tree/shootout-0.2/shootout, I'm not sure whether there's a functional problem with TLS in older distros/Python 2.7.x's (insecure as it may be, but then older TLS versions are insecure anyway?), or whether the change would actually break things.

@nicktimko
Contributor

Oh, looks like the TLS protocol is logged to the DB:

Python   TLSv1   TLSv1.1   TLSv1.2   Totals
2.6      0.19%   0.01%     2.45%     2.66%
<2.7.9   0.27%   0.25%     25.55%    26.07%
≥2.7.9   2.66%   0.31%     50.93%    53.90%
3.3      0.00%   0.03%     0.21%     0.24%
3.4      0.04%   0.11%     4.14%     4.28%
3.5      0.06%   0.15%     8.37%     8.58%
3.6      0.00%   0.09%     4.09%     4.18%
3.7      -       0.01%     0.04%     0.05%
Totals   3.22%   0.96%     95.79%    99.96%

I guess I'm now even more confused as most of the old TLS connections (in number and proportion) are made by Python ≥2.7.9.

@dstufft
Member Author

dstufft commented Jun 1, 2017

@nicktimko I think that is macOS.

@dstufft
Member Author

dstufft commented Jun 1, 2017

I was curious how this compared over time, so here is the percent of downloads using >=2.7,<2.7.9:

Month Percent of Downloads >=2.7,<2.7.9 Delta
2017-06 24.5 -1.6%
2017-05 26.1 -3%
2017-04 29.1 -0.4%
2017-03 29.5 -3.4%
2017-02 32.9 -3.2%
2017-01 36.1 -0.9%
2016-12 37.0 -0.2%
2016-11 37.2 -1.6%
2016-10 38.8 -3.6%
2016-09 42.4 -3.3%
2016-08 45.7 -0.5%
2016-07 46.2 -1.5%
2016-06 47.7

Gotten using this query:

SELECT
  STRFTIME_UTC_USEC(timestamp, "%Y-%m") AS yyyymm,
  ROUND(100 * SUM(CASE
        WHEN REGEXP_MATCH(details.python, r"2\.7\.(9|\d\d)") THEN 0
        WHEN REGEXP_MATCH(details.python, r"2\.7\.") THEN 1
        ELSE 0 END) / COUNT(*), 1) AS percent_lt279,
  COUNT(*) AS download_count
FROM
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, "year"),
    CURRENT_TIMESTAMP()
  )
WHERE
  details.installer.name = 'pip'
GROUP BY
  yyyymm
ORDER BY
  yyyymm DESC
LIMIT
  100

@nicktimko
Contributor

What's the change that wants to be made that would break <2.7.9?

@dstufft
Member Author

dstufft commented Jun 1, 2017

@nicktimko Remove the need to continue to support emulation for SSLContext objects, which would also free up the ability to start trusting the platform network store on Linux machines and to allow Python to start validating the hostname instead of having to copy that functionality into requests. It also will allow us to start mandating TLSv1.2+ on the client side.
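As a rough illustration of what the ssl module offers once SSLContext can be relied on, here is a sketch of a client context that uses the platform trust store, validates hostnames, and refuses anything older than TLS 1.2. It uses the Python 3.7+ `minimum_version` attribute, so it is not what pip does or would do on 2.7.9; there the closest equivalent would be setting `OP_NO_TLSv1` and `OP_NO_TLSv1_1` in the context options. The `strict_client_context` name is made up for this example.

```python
import ssl


def strict_client_context():
    """Build a client-side context that requires TLS 1.2 or newer."""
    # create_default_context() loads the platform's default CA certificates
    # and enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse to negotiate TLS 1.0 or 1.1 (Python 3.7+ API).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = strict_client_context()
print(ctx.check_hostname, ctx.verify_mode, ctx.minimum_version)
```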

@pradyunsg
Member

I'm curious what dropping this support would look like: will the current warning become a fatal error with pip aborting, or will it still be allowed, albeit requiring the end user to jump through hoops?

I think it should be the former.

I was curious how this compared over time,

It looks to me like the share of requests from Python < 2.7.9 will stay significant for another 6+ months?

@dstufft
Member Author

dstufft commented Jun 13, 2017

@pradyunsg It's not entirely defined, but if we do it like we've done the other ones, we'll just drop support, update the python_requires in the setup.py and be done with it. That means that on pip<9 it will just install it and possibly fail at runtime from something incompatible and on pip>=9 it will ignore it when looking at PyPI and will fail if you attempt to install it anyways.

We could add an install-time check in setup.py as well if we felt so inclined.
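For illustration, the two mechanisms mentioned here could look roughly like the sketch below. Everything in it is an assumption for the sake of the example: the `PYTHON_REQUIRES` value, the helper names, and the error message are hypothetical and not taken from pip's setup.py.

```python
import sys

# Hypothetical specifier that would be passed as setup(python_requires=...).
PYTHON_REQUIRES = ">=2.7.9"
MINIMUM_PYTHON = (2, 7, 9)


def is_supported(version_info=sys.version_info):
    """Return True if the interpreter is at least MINIMUM_PYTHON."""
    return tuple(version_info[:3]) >= MINIMUM_PYTHON


def install_time_check(version_info=sys.version_info):
    """Abort early with a readable message on unsupported interpreters."""
    if not is_supported(version_info):
        found = ".".join(str(part) for part in version_info[:3])
        sys.exit("This release requires Python >= 2.7.9, found %s" % found)


# In a real setup.py this call would sit above setup(...).
install_time_check()
```

The metadata only helps installers that understand python_requires; the explicit check covers the case where an older installer picks the release anyway and would otherwise fail at runtime.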

@pradyunsg
Member

We could add an install-time check in setup.py as well if we felt so inclined.

Sounds good.

@pradyunsg pradyunsg added kind: backwards incompatible Would be backward incompatible type: maintenance Related to Development and Maintenance Processes type: security Has potential security implications labels Jun 28, 2017
@hugovk
Contributor

hugovk commented Dec 19, 2017

It looks to me like the share of requests from Python < 2.7.9 will stay significant for another 6+ months?

We're now 6 months on, how are the numbers looking now?

@pradyunsg
Member

Running the query @dstufft posted above (#4350 (comment)):

Month Downloads % of >=2.7,<2.7.9
2018-01 17.8
2017-12 17.6
2017-11 18.5
2017-10 19.7
2017-09 21.3
2017-08 22.8
2017-07 24.1
2017-06 24.0
2017-05 26.1
2017-04 29.1
2017-03 29.5

It's receded slower than I'd anticipated.

@hugovk
Contributor

hugovk commented Jan 26, 2018

Thanks for running the query. Here's the same numbers with deltas, combined with the earlier numbers from #4350 (comment):

Month Downloads % of >=2.7,<2.7.9 Delta
2018-01 17.8 0.2%
2017-12 17.6 -0.9%
2017-11 18.5 -1.2%
2017-10 19.7 -1.6%
2017-09 21.3 -1.5%
2017-08 22.8 -1.3%
2017-07 24.1 0.1%
2017-06 24 -2.1%
2017-05 26.1 -3.0%
2017-04 29.1 -0.4%
2017-03 29.5 -3.4%
2017-02 32.9 -3.2%
2017-01 36.1 -0.9%
2016-12 37 -0.2%
2016-11 37.2 -1.6%
2016-10 38.8 -3.6%
2016-09 42.4 -3.3%
2016-08 45.7 -0.5%
2016-07 46.2 -1.5%
2016-06 47.7  

And charted:

image

And with a trendline:

image
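For reference, the Delta column in the table above is just the difference between each month's percentage and the previous month's. A minimal sketch using the four oldest rows (the `month_over_month` helper is a made-up name):

```python
# (month, percent of downloads from >=2.7,<2.7.9), oldest first.
percent_by_month = [
    ("2016-06", 47.7),
    ("2016-07", 46.2),
    ("2016-08", 45.7),
    ("2016-09", 42.4),
]


def month_over_month(rows):
    """Return (month, change in percentage points) for each month after the first."""
    deltas = []
    for (_, previous), (month, current) in zip(rows, rows[1:]):
        # Round to one decimal place, matching the precision of the table.
        deltas.append((month, round(current - previous, 1)))
    return deltas


for month, delta in month_over_month(percent_by_month):
    print(month, delta)
```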

@pradyunsg
Member

Thanks @hugovk! ^.^

Looking at the general trend, I think we should come back to this in ~4/5 months from now. I propose 14 June 2018. :P

@pradyunsg
Member

pradyunsg commented May 7, 2018

@dstufft @di @ewdurbin Did pip on Python < 2.7.9 break due to the removal of TLS 1.0/1.1 support from PyPI?

(sorry if I'm too noisy)

@5j9

5j9 commented May 7, 2018

Did pip on Python < 2.7.9 break due to the removal of TLS 1.0/1.1 support from PyPI?

I know that AppVeyor jobs for Python < 2.7.9 (also 3.4.0 but not 3.4.1+), installed using msi Windows installers, all started to fail because of the TLS issue and pip not being able to communicate with PyPI. I don't know about Linux or source compiled installations though.

@dstufft
Member Author

dstufft commented May 24, 2018

Today's numbers look like:

Python Version Download Count Percent
>=2.7.9 258067087 54.76%
3.6 89836948 19.06%
<2.7.9 68652338 14.57%
3.5 36294439 7.70%
3.4 14786255 3.14%
2.6 2940512 0.62%
3.7 436672 0.09%
3.3 193578 0.04%

Month Downloads % of >=2.7,<2.7.9 Delta
2018-05 14.0 -1.1%
2018-04 15.1 -0.1%
2018-03 15.2 -0.7%
2018-02 15.9 -1.8%
2018-01 17.7 0.1%
2017-12 17.6 -0.9%
2017-11 18.5 -1.2%
2017-10 19.7 -1.6%
2017-09 21.3 -1.5%
2017-08 22.8 -1.3%
2017-07 24.1 0.1%
2017-06 24 -2.1%
2017-05 26.1 -3.0%
2017-04 29.1 -0.4%
2017-03 29.5 -3.4%
2017-02 32.9 -3.2%
2017-01 36.1 -0.9%
2016-12 37 -0.2%
2016-11 37.2 -1.6%
2016-10 38.8 -3.6%
2016-09 42.4 -3.3%
2016-08 45.7 -0.5%
2016-07 46.2 -1.5%
2016-06 47.7  

@hugovk
Contributor

hugovk commented May 24, 2018

Charted:

image

With trendline:

image

@nicktimko
Contributor

That linear trendline is misleading; it will be an exponential with a tail that will never go to 0 (until it's forced to)
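To make the exponential idea concrete, here is a sketch that fits a simple decay curve to the monthly percentages posted in this thread by doing a least-squares line fit on the logarithms. A single exponential is an assumption here, not something the data proves, and `fit_exponential` is a made-up helper.

```python
import math

# Percent of pip downloads from >=2.7,<2.7.9, 2016-06 through 2018-05,
# oldest first, taken from the tables in this thread.
percent = [
    47.7, 46.2, 45.7, 42.4, 38.8, 37.2, 37.0, 36.1, 32.9, 29.5, 29.1, 26.1,
    24.0, 24.1, 22.8, 21.3, 19.7, 18.5, 17.6, 17.7, 15.9, 15.2, 15.1, 14.0,
]


def fit_exponential(values):
    """Fit y = a * exp(b * t) by least squares on ln(y); return (a, b)."""
    n = len(values)
    ts = list(range(n))
    logs = [math.log(v) for v in values]
    t_mean = sum(ts) / n
    log_mean = sum(logs) / n
    # Ordinary least-squares slope and intercept of ln(y) against t.
    covariance = sum((t - t_mean) * (l - log_mean) for t, l in zip(ts, logs))
    variance = sum((t - t_mean) ** 2 for t in ts)
    b = covariance / variance
    a = math.exp(log_mean - b * t_mean)
    return a, b


a, b = fit_exponential(percent)
print("monthly decay rate:", b)
print("months to halve:", math.log(0.5) / b)
```

Unlike a straight line, the fitted curve `a * exp(b * t)` stays positive for every `t`, which is the "tail that never goes to 0" behaviour.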

@hugovk
Contributor

hugovk commented May 24, 2018

Yeah, it does demonstrate that the decline is slowing.

@pradyunsg
Member

Running this BigQuery again, for the past 6 months:

yyyymm percent_lt279 download_count
2019-10 3.5 1919909312
2019-09 3.7 3214786202
2019-08 4.2 3076675931
2019-07 4.5 3033745291
2019-06 5.3 2729569347
2019-05 5.9 2757051716
2019-04 6.2 828207478

It's less than 5% now, so I'm happy to completely drop support in our next release.

@pradyunsg
Member

/cc @dstufft ^

@nicktimko
Contributor

@pradyunsg general follow up to what I was on about way earlier: is this about removing TLS 1.0 support or explicitly checking and rejecting Python <2.7.9? If the former, can you query against the TLS version used rather than the Python version reported?

@hugovk
Contributor

hugovk commented Oct 17, 2019

Charted, this time with a polynomial trendline:

image

@dstufft
Member Author

dstufft commented Nov 16, 2019

@nicktimko I think the biggest thing is just simply reducing the surface area of support we have. Generally we decide this on an X.Y basis, because that draws the cleanest lines around major changes. However, due to 2.7's age, 2.7.9 is a bit of a special case in that it introduced a backported ssl module from Python 3.

This work would effectively allow us to take better advantage of the capabilities of that backported ssl module. For instance, we can use the configuration settings so that pip itself will not function with older versions of TLS, we can possibly start relying on the platform trust stores in more situations, etc.

But really the biggest thing is narrowing the supported configurations.

To @pradyunsg I think sub 5% is a perfectly fine point to drop support for older versions of 2.7.

@pradyunsg
Member

Filed #7362, where we can discuss how we'd do the removal. If someone wants to discuss if, when or why, please continue to do so in this issue.

@hugovk
Contributor

hugovk commented Apr 20, 2020

For some reason GitHub gives me a 500 error on #7362, so I can't see the close reason.

Anyway, do you think this should be swept into the general Python 2.7.* removals after pip 20.3 is released in October 2020?

#6148 (comment)

If so, close this?

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label May 20, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 20, 2020

7 participants