fails to re-establish a connection after keep-alive timeout #468

Open
evgeni opened this issue Aug 7, 2019 · 5 comments · May be fixed by #605
Labels
ready requests issues related to requests library verified can replicate This issue has linked a toy repo that replicates the issue

@evgeni
Contributor

evgeni commented Aug 7, 2019

vcrpy 2.0.1, requests 2.22.0

We are using requests with sessions to call an API endpoint to trigger a job and then call another endpoint in a loop to get the job status and thus wait until the job is done. The server is an Apache httpd with KeepAliveTimeout 5 and a Rails app.

The code is working fine outside of VCR, but as soon as we enable VCR recording, the "wait" part of the code fails (requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",)) on Python 2.7, requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) on Python 3.7)

Example code:

import time

import urllib3
urllib3.disable_warnings()
import requests

import vcr

BASE = 'https://foreman.example.com'
STATUS = '{}/api/v2/status'.format(BASE)
DELETE = '{}/katello/api/organizations/10/subscriptions/delete_manifest'.format(BASE)


def wait(task):
    while task['state'] != 'stopped':
        print("sleeping for {}".format(task['id']))
        time.sleep(10)
        r = s.get("{}/foreman_tasks/api/tasks/{}".format(BASE, task['id']))
        task = r.json()

def delete():
    print("delete")
    r = s.post(DELETE, json={})
    print(r.headers)
    wait(r.json())

with vcr.use_cassette('katello.yaml'):
    s = requests.Session()
    s.verify = False
    s.auth = ('admin', 'changeme')

    delete()

output:

python test_katello_vcr.py
delete
{'status': '202 Accepted', 'x-request-id': 'd117a579-0f05-4e43-b2f9-7ebadba8e07f', 'x-xss-protection': '1; mode=block', 'x-download-options': 'noopen', 'x-content-type-options': 'nosniff', 'x-powered-by': 'Phusion Passenger 4.0.53', 'set-cookie': 'request_method=POST; path=/; secure; HttpOnly; SameSite=Lax, _session_id=20163f242bd29584ae7a8c5674508431; path=/; secure; HttpOnly; SameSite=Lax', 'strict-transport-security': 'max-age=631139040; includeSubdomains', 'foreman_version': '1.22.0', 'keep-alive': 'timeout=5, max=10000', 'server': 'Apache', 'content-security-policy': "default-src 'self'; child-src 'self'; connect-src 'self' ws: wss:; img-src 'self' data: *.gravatar.com; script-src 'unsafe-eval' 'unsafe-inline' 'self'; style-src 'unsafe-inline' 'self'", 'x-runtime': '0.209123', 'connection': 'Keep-Alive', 'x-permitted-cross-domain-policies': 'none', 'cache-control': 'no-cache', 'date': 'Wed, 07 Aug 2019 08:17:49 GMT', 'apipie-checksum': 'f287e71ef7536ad92a99a61830ffabc07a800b26', 'x-frame-options': 'sameorigin', 'content-type': 'application/json; charset=utf-8', 'foreman_api_version': '2'}
sleeping for ac06848f-1be6-462c-92a2-116a0d20a8eb
Traceback (most recent call last):
  File "test_katello_vcr.py", line 32, in <module>
    delete()
  File "test_katello_vcr.py", line 25, in delete
    wait(r.json())
  File "test_katello_vcr.py", line 18, in wait
    r = s.get("{}/foreman_tasks/api/tasks/{}".format(BASE, task['id']))
  File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))

As you can see, the server answers with 'keep-alive': 'timeout=5, max=10000' to the POST, but we use time.sleep(10) to wait.
Lowering the sleep to 4 (so below the timeout) avoids the problem.
However, it should be the responsibility of the underlying HTTP client to re-open the connection if needed, and requests does so just fine when used outside of VCR. Thus we think this is a bug in VCR.
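
To make the timing argument above concrete, here is a minimal sketch that reads the timeout out of a Keep-Alive header value like the one in the output and compares it with the idle time between requests. The helper names `keep_alive_timeout` and `connection_likely_stale` are hypothetical, invented for illustration; they are not part of requests, urllib3, or vcrpy.

```python
def keep_alive_timeout(header_value):
    """Return the timeout in seconds from a Keep-Alive header value, or None."""
    for part in header_value.split(","):
        name, _, value = part.strip().partition("=")
        if name.lower() == "timeout" and value.isdigit():
            return int(value)
    return None


def connection_likely_stale(header_value, idle_seconds):
    """Guess whether the server has already dropped an idle connection."""
    timeout = keep_alive_timeout(header_value)
    return timeout is not None and idle_seconds >= timeout


if __name__ == "__main__":
    header = "timeout=5, max=10000"  # value taken from the response headers above
    print(keep_alive_timeout(header))
    print(connection_likely_stale(header, 10))  # the failing time.sleep(10) case
    print(connection_likely_stale(header, 4))   # the working time.sleep(4) case
```

This only illustrates why sleeping for 10 seconds crosses the server's limit while 4 seconds does not. It is not a fix: the client is still expected to notice the closed connection and reconnect by itself.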

@neozenith
Collaborator

Thank you for reporting this bug! Much appreciated.

We recently released v2.1.0 with a lot of fixes, although this doesn't look like one of them. Could you try with the latest version too, so we can rule that out before proceeding?

Would you be open to contributing a pull request to address this? I'd be more than happy to collaborate getting the TravisCI to pass to get it over the line. Then we can put out a fix shortly.

If you haven't got the bandwidth, that's fine too. Is there a toy repo that we could clone to replicate the issue at least?

@evgeni
Contributor Author

evgeni commented Aug 13, 2019

Hey,

So, as you expected, 2.1.0 doesn't fix the issue. And I must admit I don't really have the bandwidth to dive into the details of VCR itself to help you fix this. However, I sat down and made you a smaller reproducer, so that's something ;)

import time
import requests
import vcr

URL = 'https://www.die-welt.net/'

with vcr.use_cassette('test.yaml', record_mode='all'):
    s = requests.Session()
    s.verify = False

    s.get(URL)
    time.sleep(10)
    s.get(URL)

If I am not mistaken, you can point URL at any Apache httpd in default config, and it will have KeepAliveTimeout 5 and thus fail if you try a second GET after 10 seconds.
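
For anyone without an Apache httpd at hand, the same server behaviour can probably be approximated locally with only the standard library. The sketch below is an assumption-laden stand-in, not the reproducer above: `IdleTimeoutHandler`, `start_server`, `get`, and `get_with_reconnect` are hypothetical names, the 0.5 second idle timeout is arbitrary, and it does not involve requests or vcrpy at all. It shows a reused `http.client` connection failing after the server's idle timeout, and a single reconnect-and-retry succeeding.

```python
import http.client
import http.server
import threading
import time


class IdleTimeoutHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow persistent (keep-alive) connections
    timeout = 0.5                  # server closes connections idle longer than this

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def start_server():
    """Start the test server on a free local port in a background thread."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), IdleTimeoutHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def get(conn, path="/"):
    """Send one GET on the given connection, reusing its socket if it has one."""
    conn.request("GET", path)
    response = conn.getresponse()
    return response.status, response.read()


def get_with_reconnect(conn, path="/"):
    """Retry once on a fresh socket if the reused one turns out to be dead."""
    try:
        return get(conn, path)
    except ConnectionError:  # includes http.client.RemoteDisconnected
        conn.close()         # the next request() opens a new socket
        return get(conn, path)


if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    print(get(conn))
    time.sleep(1.5)  # idle for longer than the server's timeout
    print(get_with_reconnect(conn))
    server.shutdown()
```

Retrying blindly like this is only safe for idempotent requests such as GET, which is one reason the reconnect logic belongs in the HTTP client rather than in application code.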

@neozenith
Collaborator

Awesome! I’ll have a crack at replicating it myself with that sample. That’s really helpful thanks.

@neozenith neozenith added ready requests issues related to requests library verified can replicate This issue has linked a toy repo that replicates the issue labels Aug 19, 2019
@neozenith neozenith added this to the v2.1.1 milestone Aug 24, 2019
@neozenith neozenith modified the milestones: v2.1.2, v4.0.x, v4.1.x Dec 14, 2019
@Dunedan

Dunedan commented Sep 15, 2020

I can confirm that this is still an issue with vcrpy 4.1.0. It took me a while to figure out that vcrpy is causing the problem and not the remote server, as the error message can be pretty misleading. I'd appreciate if this issue could get fixed.

peterisr added a commit to peterisr/vcrpy that referenced this issue Aug 31, 2021
With the old patching method, urllib3 never detected TCP connections that
were closed by the server side. For example, persistent HTTP connections
that were closed by the server (e.g., due to a timeout) were not
recognized as closed. Any following request that attempted to reuse
the same closed connection caused the following failure:

    urllib3.exceptions.ProtocolError: ('Connection aborted.',
       RemoteDisconnected('Remote end closed connection without response'
    ))

Fixes: kevin1024#468
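
As background to the commit message above, this is roughly how a client can tell that an idle connection was closed by the other side before reusing it. It is a minimal sketch using plain sockets, not the actual urllib3 or vcrpy code; `peer_has_closed` is a hypothetical helper, and it assumes an unencrypted socket, since `ssl.SSLSocket.recv` rejects non-zero flags such as `MSG_PEEK`.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if an idle socket has been closed by the remote end."""
    # A zero timeout makes select() a non-blocking poll.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending: the connection still looks alive
    try:
        # Peek without consuming; an empty result means end-of-stream.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True  # a reset by the peer also counts as dropped


if __name__ == "__main__":
    client, server_side = socket.socketpair()
    print(peer_has_closed(client))  # peer still open
    server_side.close()
    print(peer_has_closed(client))  # peer has gone away
    client.close()
```

A connection pool that runs a check like this before handing out a pooled connection can silently open a new one instead of failing with RemoteDisconnected. According to the commit message, the old patching method kept that detection from working.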
@peterisr

Hello! I am also one of the unfortunate ones who has encountered this issue. Thankfully I managed to debug and fix it. See PR #605.
