Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix connection linger for WebSTOMP #6789

Merged

Conversation

csicar
Copy link

@csicar csicar commented Jan 3, 2023

Proposed Changes

Resolve issue #6737 by delaying closing the websocket connection to happen after the ERROR frame is sent.

Types of Changes

Checklist

Further Comments

Cause of the issue seems to be that process_request enqueues an ERROR frame, but handle_data1 closes the connection immediately. By enqueuing a message close_websocket instead of closing the connection directly, we avoid this problem

@pivotal-cla
Copy link

@csicar Please sign the Contributor License Agreement!

Click here to manually synchronize the status of this Pull Request.

See the FAQ for frequently asked questions.

@pivotal-cla
Copy link

@csicar Thank you for signing the Contributor License Agreement!

@lukebakken lukebakken self-assigned this Jan 3, 2023
@lukebakken
Copy link
Collaborator

Thanks! I'll have some small changes to push to your branch / fork.

Copy link
Collaborator

@lukebakken lukebakken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new test fails:

cowboy_websocket_SUITE:sub_non_existent failed on line 192
Reason: {badmatch,{close,{1006,"Connection closed abnormally"}}}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Testing deps.rabbitmq_web_stomp.cowboy_websocket_SUITE: *** FAILED test case 5 of 11 ***

I can reproduce this locally by running this command:

make -C deps/rabbitmq_web_stomp ct-cowboy_websocket t=integration:sub_non_existent

Sometimes the test times out when I run it locally.

I'm looking into it.

@csicar
Copy link
Author

csicar commented Jan 5, 2023

That's surprising...
I tested with make tests and that worked in my machine. Seems like I get the same result now. I must have changed something during cleanup.

@lukebakken
Copy link
Collaborator

lukebakken commented Jan 5, 2023

I took some time to review the code that deals with WebSTOMP. It would be great if this fix could be made to work, because the "correct" (i.e. better use of cowboy_websocket) would involve quite a bit of rewriting.

@csicar
Copy link
Author

csicar commented Jan 9, 2023

I think I finally found what the problems are.

The websocket client rfc6455_client, which is used for testing, has two places for receive ing messages:

  1. In loop (which is started by init) where a TCP close message is interpreted as an error resulting in the error seen in my test: "Connection closed abnormally"
  2. In recv (which is called by recv_raw) where a close message is handled as a normal message and returned to the caller of recv

The server sends the correct messages in the correct sequence (which I confirmed with wireshark), but the websocket client does not make it accessible to the test. (Or I'm using it incorrectly)

Why the test blocked: Waiting for rfc6455_client:close(WS), when the server already closed the connection, will never finish. I suspect that this caused by how it is implemented:

close(WS, WsReason) ->
    WS ! {close, WsReason},
    receive
        {rfc6455, close, WS, R} ->
            {close, R}
    end.

When the connection is already closed, the client will never receive a close message; blocking indefinitely.

Also: There seem to be problems with how the tests are cleaned up: When cancelling a test, some erlang processes remain running (Confirmed by running ps aux | grep ct_run after make test finished)

@csicar
Copy link
Author

csicar commented Jan 9, 2023

because the "correct" (i.e. better use of cowboy_websocket) would involve quite a bit of rewriting.

Yeah.. I came to the same conclusion

- Server closes connection on the error case, so the client was not able
  to do so and consequently hung
- Do not test for the error code returned by the server, because it is
  not part of the spec and now accessible via the websocket client,
  which handles the TCP close instead of the Websocket close message
@lukebakken lukebakken force-pushed the fix/web_stomp_missing_subscription branch from 2683762 to 644bf2c Compare January 10, 2023 23:58
Copy link
Collaborator

@lukebakken lukebakken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@michaelklishin michaelklishin merged commit a158fc5 into rabbitmq:main Jan 11, 2023
michaelklishin added a commit that referenced this pull request Jan 11, 2023
Fix connection linger for WebSTOMP (backport #6789)
michaelklishin added a commit that referenced this pull request Jan 11, 2023
Fix connection linger for WebSTOMP (backport #6789) (backport #6838)
michaelklishin added a commit that referenced this pull request Jan 11, 2023
Fix connection linger for WebSTOMP (backport #6789) (backport #6838) (backport #6839)
@csicar
Copy link
Author

csicar commented Jan 11, 2023

Thanks for merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants