-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Python Lots of fn runner test items cost exactly 5 seconds to run #22283
Comments
Found this issue is introduced in grpcio-1.46.0. Nailed this down to changes added to grpc master between March 21 to March 22, most likely this commit: grpc/grpc#29050 Also found the server that does not shutdown before grace time is mostly state_server; and control_server also prevents shutdown in some tests: no graceful shutsown for control_server alone:
|
I was able to bisect grpc commits to debug another issue, while building grpc from sources, in case it helps, my notes are in #22533 (comment) |
I think mocking the value in tests is fine if it does not result in additional flakiness. What happens if the server is shut down prematurely - will it restart or requests will just fail? |
Or is the issue that the test is waiting to shutdown the servers after completing execution of the scenario? |
Yes, the test is waiting for state_server and sometimes control_server to shutdown. This did not occur in grpc<1.46.0. |
Changing the priority. This is blocking the release according to the release email. Could we rollback to a previous grpc version to unblock the release? /cc @kileys |
Sorry, I meant that it may be a blocker. There is a performance regression as well |
Where is the perf regression tracked? |
I believe (though not sure) that this is due to fnapirunner closing grpc connections and should not be considered a perf regression |
I may not be able to look at this for now : / |
I assume this is no longer blocking the release, given that we have an RC ? |
Yes, not a release blocker |
If not blocking the release and not considered a performance regression, lowering to P2. |
.take-issue |
What happened?
Was investigating #22115 and find this interesting observation
pytest results with time consuming info:
There is a default 5 seconds to close the rpc server in
beam/sdks/python/apache_beam/runners/portability/fn_api_runner/worker_handlers.py
Line 453 in c14a4cf
Change this to None (or mock this constant in tests), tests still pass and running time decrease dramatically:
Suprisingly, if changing
_DEFAULT_SHUTDOWN_TIMEOUT_SECS
to a large number (100), the running time is IDENTICAL with setting it as 5 seconds---hundreds of tests costs exactly 5 seconds, likely there is a hard-coded waiting time of 5 seconds that prevented RpcServer from closing.Issue Priority
Priority: 2
Issue Component
Component: testing
The text was updated successfully, but these errors were encountered: