Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python3.8 on Ubuntu 20.04 (requiring MongoDB 4.4) #5177

Merged
merged 70 commits into from
Apr 18, 2021

Conversation

nzlosh
Copy link
Contributor

@nzlosh nzlosh commented Mar 3, 2021

Work in Progress

Add support for Python3.8 on Ubuntu 20.04. To support this configuration, it implicitly includes MongoDB 4.4 unless we consider standalone instances of StackStorm can no longer host the mongodb instance locally for Ubuntu 20.04.

  • Bump eventlet/greenlet to latest to benefit from threading fix for python3.7+
  • Bump pymongo/mongoengine to support Mongo4.4 which is the only version packaged for Ubuntu 20.04.
  • Added Python3.8 into tox tests.

StackStorm/community#68

 - Bump eventlet/greenlet to latest to benefit from threading fix for
   python3.7+
 - Bump pymongo/mongoengine to support Mongo4.4 which is the only
   version packaged for Ubuntu 20.04.
 - Added Python3.8 into tox tests.
@pull-request-size pull-request-size bot added the size/L PR that changes 100-499 lines. Requires some effort to review. label Mar 3, 2021
@amanda11 amanda11 added this to the 3.5.0 milestone Mar 3, 2021
@CLAassistant
Copy link

CLAassistant commented Mar 6, 2021

CLA assistant check
All committers have signed the CLA.

@cognifloyd
Copy link
Member

We just reformatted the code with black. (Hooray!) And this PR got caught in the cross fire too. (Arrgh!)
This is just a note to let you know why it is going to be more work than normal when you merge master into this branch. Sorry!

@amanda11 amanda11 added deployment feature OS support Support/issues/PRs on a specific OS labels Mar 8, 2021
@winem
Copy link
Contributor

winem commented Mar 10, 2021

I'll use this PR to drop some findings I come across when running the tests:

  • a GET on /v1/executions/123123123123123123123123/attribute/result returns an internal server error if the execution does not exist:
{
    "faultstring": "Internal Server Error"
}

It works fine for existing executions.

@Kami Kami added the mongodb label Mar 12, 2021
@Kami
Copy link
Member

Kami commented Mar 12, 2021

Just to confirm - new version of pymongo still supports older versions of MongoDB, right?

On a related note, we should also add some upgrade notes to st2docs on how to upgrade MongoDB for existing installations.

@Kami
Copy link
Member

Kami commented Apr 18, 2021

Also, now that we use newer version of pymongo and MongoDB, we could also experiment / benchmark with zstandard compression for wire transport (this may be beneficial when working with larger execution results which usually compress quite well).

@Kami
Copy link
Member

Kami commented Apr 18, 2021

st2client install check started failing because a new version of importlib-metadata has been released (https://pypi.org/project/importlib-metadata/#history) and one of our transitive dependencies explicitly specified importlib-metadata < 4.0.

Will push a change which pins it to < 4.0 and hopefully that will work.

Recently st2client tests started failing due to new version of importlib-metadata being published
which is not supported by one of our transitive dependencies.
can specify which compressor client should try to use when talking to
MongoDB (aka for transport / network level compression).

Default (same as before) is no transport level compression.
@Kami
Copy link
Member

Kami commented Apr 18, 2021

While at it, I also decided to add new database.compressors config option which allows users to specify compression algorithms client supports / should use for network / transport level compression - 61d5520.

I also added zstandard dependency which means setting this value to zstd should work out of the box as long as the server is running MongoDB >= 4.2.

By default (same as before), no transport / network level compression is used. I did run some quick micro benchmarks locally with and without zstd compression and there is no perceived difference in the run time so I decided to leave it disabled by default.

We should perform more larger scale benchmarks and then decide if it perhaps makes sense to enable it by default at some point in the future. I also plan to add some micro benchmarks for that in the future which will cover various scenarios and document / data sizes.

In the mean time, I will open st2docs change with release notes + config change which tells user how to configure this option in to be released StackStorm v3.5.

@blag
Copy link
Contributor

blag commented Apr 18, 2021

If we plan to support two Python versions, another change we should probably make is to store virtualenv inside virtualenv- directory and update Makefile accordingly.

Aren't virtualenvs at least partially created like this already? virtualenv/lib/python3.6/site-packages/... Is it possible to use multiple Python versions with a single virtualenv?

@Kami
Copy link
Member

Kami commented Apr 18, 2021

@blag There are still other issues, for example, files in bin/ will be symlinks to a specific python version, same goes for virtualenv activation script.

@Kami
Copy link
Member

Kami commented Apr 18, 2021

Re network / transport level compression - interesting thing is that the existing micro benchmarks showed very little difference in operation duration when using zstd for network level compression, but when running unit tests with compression on they do finish a bit faster.
As mentioned above, I plan to add some compression specific micro benchmarks in the future. Existing ones are not really geared towards benchmarking transport level compression, but they should at least serve as good starting point to catch any regression in terms of performance with compression enabled.

@pull-request-size pull-request-size bot added size/XL PR that changes 500-999 lines. Consider splitting work into several ones that easier to review. and removed size/L PR that changes 100-499 lines. Requires some effort to review. labels Apr 18, 2021
@Kami
Copy link
Member

Kami commented Apr 18, 2021

I pushed some unit test optimizations for low hanging fruit and managed to save a minute or so from unit test run.

@Kami
Copy link
Member

Kami commented Apr 22, 2021

Re MongoDB 4.4 support.

It looks like that we might need to do more work. I got this error locally:

  File "/home/vagrant/st2/virtualenv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1360, in _cmd
    return server.run_operation_with_response(
  File "/home/vagrant/st2/virtualenv/lib/python3.8/site-packages/pymongo/server.py", line 135, in run_operation_with_response
    _check_command_response(first, sock_info.max_wire_version)
  File "/home/vagrant/st2/virtualenv/lib/python3.8/site-packages/pymongo/helpers.py", line 164, in _check_command_response
    raise OperationFailure(errmsg, code, response, max_wire_version)
pymongo.errors.OperationFailure: Path collision at context, full error: {'ok': 0.0, 'errmsg': 'Path collision at context', 'code': 31250, 'codeName': 'Location31250'}

No idea why no tests caught that or why it happened locally. Perhaps it could be related to some specific data being in the document.

Didn't have time to dig in and track it down yet. It looks like it's related to change in MongoDB and we may need to modify some queries or similar - https://docs.mongodb.com/manual/release-notes/4.4-compatibility/#path-collision-restrictions.

EDIT: It may be because we don't run end to end tests with MongoDB 4.4 yet.

@Kami
Copy link
Member

Kami commented Apr 22, 2021

From Slack:

It's an issue with "overlapping" projection query.

That's the projection for that query (default executions list API endpoint): ['end_timestamp', 'runner.runner_parameters', 'action.ref', 'parameters', 'id', 'context', 'start_timestamp', 'action.pack', 'action.parameters', 'action.uid', 'status', 'context.user'].

It should either be just context or context.user depending on what we want (all the context fields or just one). Previously in older versions in scenarios like that, all values from context would be returned.

In short:

  1. We need to fix this project query and audit rest of them to make sure there are no "overlapping values"
  2. Update our code to throw early and a more user-friendly error in case there are "overlapping" projections. Unit tests should then catch any such issue (or the validation code before it actually gets send to MongoDB)
  3. Add tests

One option would also be to handle this in the API layer and make it behave as before. E.g. if projection contains foo and foo.bar field, we would simply fold it to foo. That way the behavior would be exactly the same as in pre 4.4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployment feature mongodb OS support Support/issues/PRs on a specific OS size/XL PR that changes 500-999 lines. Consider splitting work into several ones that easier to review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants