Fix SSL support for MongoDB and RabbitMQ under Python 3.x #4834

Merged
merged 13 commits into from
Dec 16, 2019

Kami
Member

@Kami Kami commented Dec 16, 2019

This pull request fixes SSL support for MongoDB and RabbitMQ when running under Python 3.6 (#4832).

Background / Details

When running under Python 3 and enabling SSL support for MongoDB, st2api and st2auth would fail with a cryptic error:

2019-12-15 08:08:45,891 140651884441824 INFO (unknown file) [-] Connecting to database "somedatabase" @ "mongodb-host:27017" as user "someuser".
2019-12-15 08:08:45,892 140651884441824 WARNING (unknown file) [-] Retry on ConnectionError - Cannot connect to database default :
maximum recursion depth exceeded while calling a Python object

After some more digging and manual instrumentation, I managed to track down the original exception / root cause - https://gist.github.com/Kami/ea8e63cdc539fd879fff41271969d650.

The root cause was an SSL error which happened because st2api and st2auth didn't perform eventlet monkey patching early enough. They performed monkey patching after some other module (likely mongoengine or pymongo) had already imported ssl. Patching after ssl has been imported results in undefined behavior, so SSL connections don't work.

Proposed Fix

This pull request fixes the issue by making sure we perform eventlet monkey patching as early as possible (before any other modules are imported) inside the st2api and st2auth services.
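
As a rough illustration of the ordering requirement (not the actual StackStorm code), an entry point can record which patch-sensitive modules are already loaded and only then call `eventlet.monkey_patch()`. The helper names `imported_too_early` and `patch_first` and the module list are hypothetical, chosen just for this sketch:

```python
import sys

# Illustrative list of modules that eventlet replaces with cooperative versions.
PATCH_SENSITIVE = ("ssl", "socket", "select", "threading")


def imported_too_early(loaded=None, sensitive=PATCH_SENSITIVE):
    """Return the patch-sensitive module names that are already imported."""
    loaded = sys.modules if loaded is None else loaded
    return [name for name in sensitive if name in loaded]


def patch_first():
    """Record what was already loaded, then monkey patch if eventlet is installed."""
    # This check has to run before "import eventlet", because importing
    # eventlet pulls in some of these modules itself.
    early = imported_too_early()
    try:
        import eventlet
    except ImportError:
        return early, False
    eventlet.monkey_patch()
    return early, True


# In a real entry point this would be the very first thing in the file:
#   early, patched = patch_first()
#   import mongoengine  # modules that use ssl are imported only afterwards
```

If `early` contains `"ssl"` at that point, something imported it before patching, which is the situation described above.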

In addition to that, I made another change by setting serverSelectionTimeoutMS MongoClient option to 5 seconds.

It defaults to 30 seconds, which means that if there is an SSL-related connection error (e.g. handshake failed), the exception won't be raised until 30 seconds have passed.

Keep in mind though that in such scenarios, only a "connection closed" exception will be returned to the user. For the actual root cause, the user will need to check the mongo server logs (that's a limitation of the client / server and there is nothing we can do about it).

This will also provide a better user experience because previously in many scenarios our code would wait 30 seconds for this timeout to be reached before propagating the connection error.
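
A minimal sketch of the timeout change using pymongo directly, assuming pymongo is installed. The helper names `build_client_kwargs` and `check_connection` are made up for this example; `serverSelectionTimeoutMS` is the real MongoClient option discussed above:

```python
def build_client_kwargs(host, port, username=None, password=None,
                        server_selection_timeout_ms=5000):
    """Build MongoClient keyword arguments with a short server selection timeout."""
    kwargs = {
        "host": host,
        "port": port,
        "tz_aware": True,
        # pymongo's default is 30000 ms; 5000 ms fails faster on fatal errors.
        "serverSelectionTimeoutMS": server_selection_timeout_ms,
    }
    if username is not None:
        kwargs["username"] = username
        kwargs["password"] = password
    return kwargs


def check_connection(**kwargs):
    """Connect and force server selection so connection errors surface quickly."""
    # Imported lazily so the helper above can be used without pymongo installed.
    from pymongo import MongoClient
    from pymongo.errors import ServerSelectionTimeoutError

    client = MongoClient(**kwargs)
    try:
        # MongoClient connects lazily; a ping triggers server selection.
        client.admin.command("ping")
    except ServerSelectionTimeoutError as exc:
        raise ConnectionError("Cannot connect to database: %s" % exc) from exc
    return client
```

For example, `check_connection(**build_client_kwargs("127.0.0.1", 27017))` would give up after roughly 5 seconds instead of 30 when no server can be selected.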

Affected Components / Services

Based on my digging, this issue only affected st2api and st2auth because other services already performed eventlet monkey patching as early as possible.

This also explains why the user reported in #4832 that st2-register-content worked (we intentionally don't perform any monkey patching there so it works fine).

Another thing worth keeping in mind is that we have two entry points for st2api and st2auth - the WSGI one (for production gunicorn deployments) and the direct eventlet WSGI server one (for local testing and development).

Both entry points were affected so I needed to fix both.

Configuration

For completeness, here is the st2.conf and mongod.conf configuration I used:

st2.conf:

...
[database]
ssl = true
host = 127.0.0.1
ssl_cert_reqs = none
username = stackstorm
password = ...
...

mongod.conf:

...
net:
  port: 27017
  bindIp: 127.0.0.1
  ssl:
    mode: requireSSL
    PEMKeyFile: /home/ubuntu/mongodb.pem
    CAFile: /home/ubuntu/rootCA.pem
    allowConnectionsWithoutCertificates: true  # NOTE: If this option is not set, the client also needs to specify a client certificate, otherwise connection / auth will fail
...
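
The `ssl_cert_reqs = none` value in the st2.conf above is a plain string, while Python's `ssl` module exposes the matching verification modes as constants. The sketch below shows one way such a string could be mapped to a constant; the function name `parse_ssl_cert_reqs` is hypothetical and this is not necessarily how StackStorm does the conversion:

```python
import ssl

# Map config file strings to the verification mode constants from the ssl module.
CERT_REQS = {
    "none": ssl.CERT_NONE,
    "optional": ssl.CERT_OPTIONAL,
    "required": ssl.CERT_REQUIRED,
}


def parse_ssl_cert_reqs(value):
    """Translate a config string such as "none" into an ssl.CERT_* constant."""
    try:
        return CERT_REQS[value.strip().lower()]
    except KeyError:
        raise ValueError("Unsupported ssl_cert_reqs value: %r" % value) from None
```

With `none`, the server certificate is not verified, which is convenient for a local test setup like the one above but not something to carry over to production.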

connection errors better.

Without specifying a lower serverSelectionTimeoutMS value, the client
would wait up to 30 seconds (default value) in case there are SSL
errors (e.g. handshake failed or similar).

With lower timeout we ensure a faster failure on fatal errors.
as early as possible.

This is important, because if we don't do it early enough and "ssl" module
is imported before monkey patching is performed, SSL support for
MongoDB won't work.

Fixes issue reported in #4832.
@pull-request-size pull-request-size bot added the size/M PR that changes 30-99 lines. Good size to review. label Dec 16, 2019
connection = mongoengine.connection.connect(db_name, host=db_host,
port=db_port, tz_aware=True,
username=username, password=password,
serverSelectionTimeoutMS=5000,
Member Author

Just in case, I will also make this configurable via st2.conf.
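
A sketch of what reading such a setting could look like, using the standard library `configparser` rather than whatever config loader StackStorm actually uses. The option name `server_selection_timeout_ms` is a placeholder invented for this example, not the name the PR ended up with:

```python
import configparser

DEFAULT_TIMEOUT_MS = 5000  # value hard-coded in the diff above


def read_timeout_ms(conf_text, option="server_selection_timeout_ms"):
    """Read the timeout from the [database] section, falling back to the default."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # The fallback is used when either the section or the option is missing.
    return parser.getint("database", option, fallback=DEFAULT_TIMEOUT_MS)
```

A config that does not mention the option keeps the 5 second default, so existing deployments would behave the same as with the hard-coded value.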

@Kami Kami added this to the 3.2.0 milestone Dec 16, 2019
@pull-request-size pull-request-size bot added size/L PR that changes 100-499 lines. Requires some effort to review. and removed size/M PR that changes 30-99 lines. Good size to review. labels Dec 16, 2019
Member

@arm4b arm4b left a comment

Thanks for investigating and fixing this! 👍

Kami and others added 4 commits December 16, 2019 15:37
Sadly due to how tests run, we need to add monkey patch to another file
which is imported before the actual affected test file (nose imports all
the tests in the same process and relies on the ordering).
Co-Authored-By: Eugen C. <[email protected]>
@Kami Kami merged commit dd6a2e9 into master Dec 16, 2019
@Kami Kami deleted the mongodb_connect_improvements branch December 16, 2019 21:09
Labels
mongodb python3 rabbitmq size/L PR that changes 100-499 lines. Requires some effort to review.
Development

Successfully merging this pull request may close these issues.

Connection to mongo db over SSL doesn't work
4 participants