Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ConnectionPools): added connection pools class #85

Closed
wants to merge 1 commit into from

Conversation

william-silversmith
Copy link

@william-silversmith william-silversmith commented May 19, 2017

Adding connection pools capability, but will add the actual pools
to Storage when other PRs are merged first.

In an earlier iteration the ConnectionPool capped the number of connections that could be outstanding. However, after some more usage, it became apparent that there were threading issues.
I tried to write a thread safe version that was inspired by urlib3.HTTPConnectionPool, but it turned out that it was quite difficult to perform 'queue.put(block=True)' and accurately count the number of outstanding connections without triggering a deadlock.

e.g.

self._queue.put(block=True)
# This gap is dangerous and in testing resulted in inconsistencies
with self._lock:
    self.outstanding += 1

The new philosophy is that in reality a substantial number of connections can remain open without ill effects (dozens or hundreds), the important thing is to allow them to be reused, which will substantially slow their growth, maybe completely depending on how the code is written. Thus this version of ConnectionPool is thread safe, but does not cap the number of outstanding connections.

Testing Sufficiency Declaration
The pool is stressed in test_connections. A production run in a separate branch seemed to eliminate SSLEOFError that is raised from ssl.c that was probably caused by multithreading issues, and the presence of a pool seemed to reduce the number of handshake errors overall (except for DNS, which is a somewhat separate issue).

Adding connection pools capability, but will add the actual pools
to Storage when other PRs are merged first.
@william-silversmith
Copy link
Author

william-silversmith commented May 20, 2017

googleapis/google-cloud-python#1346

Presently, we use httplib2 by default but allow users to specify their own http library (#908) so long as it conforms to httplib2.Http's signature.

httplib2 was chosen because it's the underlying http client used by google/oauth2client. However, httplib2 has a variety of issues include not being thread-safe (#1274), not doing any connection pooling, etc.

We should consider what it would take to move to another http library. The major considerations are:

We must support using oauth2client's credentials with the library. The essential functionality is adding the auth header and performing refresh & retry. This can be done either here or within oauth2client.
The library must work on Google App Engine.

googleapis/google-cloud-python#1998

"Tear httplib2 out of gcloud.streaming "

@tartavull
Copy link

@william-silversmith what should we do with this PR?

@william-silversmith
Copy link
Author

Not necessary in this repo anymore due to PyPI CloudVolume.

@tartavull tartavull deleted the wms_connection_pools_2 branch September 11, 2017 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants