Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tunnel CONNECT response headers to httplib / http.client #69152

Closed
thomasbelhalfaoui mannequin opened this issue Aug 30, 2015 · 11 comments
Closed

Add tunnel CONNECT response headers to httplib / http.client #69152

thomasbelhalfaoui mannequin opened this issue Aug 30, 2015 · 11 comments
Labels
3.12 bugs and security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@thomasbelhalfaoui
Copy link
Mannequin

thomasbelhalfaoui mannequin commented Aug 30, 2015

BPO 24964
Nosy @terryjreedy, @vadmium, @OneMoreZanuda
PRs
  • gh-69152: Add _proxy_response_headers attribute to HTTPConnection #26152
  • Files
  • httplib.py: Proposed patch for httplib.py
  • http-detach.patch: HTTPConnection.detach() implementation only
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-08-30.17:41:50.140>
    labels = ['type-feature', 'library', '3.11']
    title = 'Add tunnel CONNECT response headers to httplib / http.client'
    updated_at = <Date 2021-05-15.21:19:25.060>
    user = 'https://bugs.python.org/thomasbelhalfaoui'

    bugs.python.org fields:

    activity = <Date 2021-05-15.21:19:25.060>
    actor = 'alexey.namyotkin'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2015-08-30.17:41:50.140>
    creator = 'thomas.belhalfaoui'
    dependencies = []
    files = ['40301', '40371']
    hgrepos = []
    issue_num = 24964
    keywords = ['patch']
    message_count = 10.0
    messages = ['249361', '249373', '249741', '249803', '249897', '249906', '249996', '250085', '393729', '393730']
    nosy_count = 4.0
    nosy_names = ['terry.reedy', 'martin.panter', 'thomas.belhalfaoui', 'alexey.namyotkin']
    pr_nums = ['26152']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue24964'
    versions = ['Python 3.11']

    Linked PRs

    @thomasbelhalfaoui
    Copy link
    Mannequin Author

    thomasbelhalfaoui mannequin commented Aug 30, 2015

    When using httplib / http.client to connect to an HTTPS website through a proxy (by making a tunnel with a CONNECT request), there is no way to retrieve the HTTP headers which the proxy sends back in response to that CONNECT request.

    This becomes a problem when using rotating proxy providers like ProxyMesh, who send useful information in those headers (for instance, "X-ProxyMesh-IP" contains the IP address of the proxy, which is necessary to keep the same address throughout the session).

    It would be nice to save those headers in a property of the HTTPConnection class (e.g. self._tunnel_response_headers), which would be set up inside the _tunnel method (as proposed in the attached patch, lines 748 and 827-831). This would allow to get the headers back and/or pass them to a higher-level library (such as requests).

    @thomasbelhalfaoui thomasbelhalfaoui mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Aug 30, 2015
    @vadmium
    Copy link
    Member

    vadmium commented Aug 30, 2015

    Such a change would involve adding a new API, so should go into a new version of Python.

    Thomas: a diff rather than a full copy of the changed file would be more convenient. Also, if this gets accepted, test cases and documentation would be needed.

    It is also useful to get the header of an unsuccessful CONNECT response. For example, see bpo-7291, where the Proxy-Authenticate header of the proxy’s 407 response needs to be accessible. In that issue, I started working on a patch tht may also be useful here. From memory, usage would be a bit like this:

    proxy_conn = HTTPConnection("proxy")
    proxy_conn.request("CONNECT", "website:443")
    proxy_resp = proxy_conn.getresponse()
    if proxy_resp.status == PROXY_AUTHENTICATION_REQUIRED:
        # Handle proxy_resp.msg["Proxy-Authenticate"]
        ...
    # Handle proxy_resp.msg["X-ProxyMesh-IP"]
    ...
    tunnel = proxy_conn.detach()  # Returns socket and any buffered data
    website_conn = HTTPSConnection("website", tunnel=tunnel)
    website_conn.request("GET", "/")
    ...
    website_conn.close()

    Thomas, let me know if this would be useful for you, and I can try and dig up my patch.

    @thomasbelhalfaoui
    Copy link
    Mannequin Author

    thomasbelhalfaoui mannequin commented Sep 4, 2015

    Martin: Thanks for your quick answer (and sorry for sending the whole file) !
    I think it is indeed a good idea to detach the proxy connection and treat it as any other connection, as you did in your patch. It would be great if you would be able to dig it up !

    @terryjreedy
    Copy link
    Member

    Thomas, please sign a contributor agreement for your patches to be considered.
    https://www.python.org/psf/contrib/
    https://www.python.org/psf/contrib/contrib-form/

    @vadmium
    Copy link
    Member

    vadmium commented Sep 5, 2015

    This is the patch I had in mind. It looks like it only implements the detach() method, so we would still need to add support for passing in the tunnel details to the HTTPSConnection constructor.

    This patch would allow doing stuff at a lower level than the existing tunnel functionality. The patch includes a test case for getting the proxy’s response header fields, and another test case illustrating how a plain text HTTP 2 upgrade could work.

    @thomasbelhalfaoui
    Copy link
    Mannequin Author

    thomasbelhalfaoui mannequin commented Sep 5, 2015

    Terry: Thanks for the form, I just filled it.

    Martin: Thanks for sending your patch. I will dive into it, and try to figure out how to add support for passing in the tunnel details to the HTTPSConnection constructor.

    @thomasbelhalfaoui
    Copy link
    Mannequin Author

    thomasbelhalfaoui mannequin commented Sep 6, 2015

    Martin, I went through your patch and made some simple tests, and I have a couple of questions.

    1. When I run the following code, I get a "Bad file descriptor" :
    conn = httplib.HTTPConnection("uk.proxymesh.com", 31280)
    conn.set_tunnel("www.google.com", 80)
    conn.request("GET", "/")
    resp = conn.getresponse()
    print(resp.read())

    So I tweaked the "getresponse" function so that it does not call "self.close()" (i.e. the connection stays open after the CONNECT request) in that case, and it seems to works fine.

    1. I added "self.sock, _ = tunnel" in HTTPConnection constructor, to try your use case, but I get "http.client.RemoteDisconnected: Remote end closed connection without response".

    Do you think it makes sense or am I missing something ?

    @vadmium
    Copy link
    Member

    vadmium commented Sep 7, 2015

    1. The real problem is when _tunnel() internally calls getresponse(), it notices the connection cannot be reused for another request, and closes the socket object. Perhaps I should rethink my logic; maybe move sock and detach() to HTTPResponse.

    2. With some rough experimentation, passing tunnel through the HTTPConnection (plain text HTTP) constructor seems to work for me. However if you meant HTTPSConnection (over TLS) instead, you will probably need to manually do the wrap_socket() step. Maybe that’s why your connection is being dropped.

    @terryjreedy
    Copy link
    Member

    Alexey, to repeat what I said to Thomas above: please sign a contributor agreement for your patches to be considered.
    https://www.python.org/psf/contrib/
    https://www.python.org/psf/contrib/contrib-form/

    @terryjreedy terryjreedy added the 3.11 only security fixes label May 15, 2021
    @OneMoreZanuda
    Copy link
    Mannequin

    OneMoreZanuda mannequin commented May 15, 2021

    Thanks, Terry. I signed it.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    orsenthil added a commit that referenced this issue May 5, 2023
    …6152)
    
    Add _proxy_response_headers attribute to HTTPConnection (#26152)
    
    ---------
    
    Co-authored-by: Senthil Kumaran <[email protected]>
    carljm added a commit to carljm/cpython that referenced this issue May 5, 2023
    * main:
      pythongh-99113: Add PyInterpreterConfig.own_gil (pythongh-104204)
      pythongh-104146: Remove unused var 'parser_body_declarations' from clinic.py (python#104214)
      pythongh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (pythongh-104205)
      pythongh-104108: Add the Py_mod_multiple_interpreters Module Def Slot (pythongh-104148)
      pythongh-99113: Share the GIL via PyInterpreterState.ceval.gil (pythongh-104203)
      pythonGH-100479: Add `pathlib.PurePath.with_segments()` (pythonGH-103975)
      pythongh-69152: Add _proxy_response_headers attribute to HTTPConnection (python#26152)
      pythongh-103533: Use PEP 669 APIs for cprofile (pythonGH-103534)
      pythonGH-96803: Add three C-API functions to make _PyInterpreterFrame less opaque for users of PEP 523. (pythonGH-96849)
    jbower-fb pushed a commit to jbower-fb/cpython-jbowerfb that referenced this issue May 8, 2023
    …on (python#26152)
    
    Add _proxy_response_headers attribute to HTTPConnection (python#26152)
    
    ---------
    
    Co-authored-by: Senthil Kumaran <[email protected]>
    @arhadthedev arhadthedev added 3.13 bugs and security fixes and removed 3.11 only security fixes labels May 11, 2023
    gpshead added a commit that referenced this issue May 16, 2023
    …ss (#104248)
    
    Add http.client.HTTPConnection method get_proxy_response_headers() - this is a followup to #26152 which added it as a non-public attribute. This way we don't pre-compute a headers dictionary that most users will never access. The new method is properly public and documented and triggers full proxy header parsing into a dict only when actually called.
    
    ---------
    
    Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
    Co-authored-by: Gregory P. Smith <[email protected]>
    @arhadthedev arhadthedev added 3.12 bugs and security fixes and removed 3.13 bugs and security fixes labels May 16, 2023
    @gpshead
    Copy link
    Member

    gpshead commented May 16, 2023

    thanks for the contribution!

    @gpshead gpshead closed this as completed May 16, 2023
    carljm added a commit to carljm/cpython that referenced this issue May 16, 2023
    * main:
      pythonGH-104510: Fix refleaks in `_io` base types (python#104516)
      pythongh-104539: Fix indentation error in logging.config.rst (python#104545)
      pythongh-104050: Don't star-import 'types' in Argument Clinic (python#104543)
      pythongh-104050: Add basic typing to CConverter in clinic.py (python#104538)
      pythongh-64595: Fix write file logic in Argument Clinic (python#104507)
      pythongh-104523: Inline minimal PGO rules (python#104524)
      pythongh-103861: Fix Zip64 extensions not being properly applied in some cases (python#103863)
      pythongh-69152: add method get_proxy_response_headers to HTTPConnection class (python#104248)
      pythongh-103763: Implement PEP 695 (python#103764)
      pythongh-104461: Run tkinter test_configure_screen on X11 only (pythonGH-104462)
      pythongh-104469: Convert _testcapi/watchers.c to use Argument Clinic (python#104503)
      pythongh-104482: Fix error handling bugs in ast.c (python#104483)
      pythongh-104341: Adjust tstate_must_exit() to Respect Interpreter Finalization (pythongh-104437)
      pythonGH-102613: Fix recursion error from `pathlib.Path.glob()` (pythonGH-104373)
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.12 bugs and security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants