Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

websocket client loses first text message when packed together with http response (IDFGH-11300) #396

Closed
3 tasks done
0xFEEDC0DE64 opened this issue Oct 23, 2023 · 13 comments
Assignees

Comments

@0xFEEDC0DE64
Copy link
Contributor

Answers checklist.

  • I have read the documentation for esp-protocols components and the issue is not addressed there.
  • I have updated my esp-protocols branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

I need to first flush in my websocket server, before sending the first websocket text message, otherwise the websocket client on the ESP32 doesnt report the text message. All following text messages are received then correctly, just my hello message is missing in ESP32

image
image
image

@github-actions github-actions bot changed the title websocket client loses first text message when packed together with http response websocket client loses first text message when packed together with http response (IDFGH-11300) Oct 23, 2023
@bryghtlabs-richard
Copy link
Contributor

bryghtlabs-richard commented Mar 26, 2024

I'm seeing this using ESP websocket client to connect over WSS to an Engine.IO server. After the initial connection, the Engine.IO protocol sends a packet to the client(Engine.IO open(EIO ID=0)), and that packet sometimes goes missing when it is returned over HTTP just before the handover to websocket. Here's logging of it working, as it usually works:

I (6189) websocket_client: Reconnecting...
I (6965) transport_ws: esp_transport_read() returned 211 bytes: HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-Websocket-Accept: XZoG4M2HTfnMHsDixWW8VRXycqs=
Access-Control-Allow-Origin: http://localhost:5000
Vary: Origin
Via: 1.1 vegur


I (6977) Socket.IO Client: WS Connected!
I (7101) websocket_client: esp_transport_poll_read()=1
I (7101) websocket_client: esp_transport_read() returned 105: 0{"sid":"AcAvkjEq61vWi7gUAABo","upgrades":[],"pingInterval":2000,"pingTimeout":5000,"maxPayload":1000000}
I (7109) Socket.IO Client: EIO Open(ID=0) received, responding

But sometimes the EIO Connect is returned over HTTP just before the protocol switch, and thus dropped:

I (6213) websocket_client: Reconnecting...
I (6961) transport_ws: esp_transport_read() returned 318 bytes: HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-Websocket-Accept: i9RR9OLvS/c477j1rzeGRHtLuKM=
Access-Control-Allow-Origin: http://localhost:5000
Vary: Origin
Via: 1.1 vegur

����i0{"sid":"eW_zUzXVj6RAaQI0AABn","upgrades":[],"pingInterval":2000,"pingTimeout":5000,"maxPayload":1000000}
I (6985) Socket.IO Client: WS Connected!
I (8113) websocket_client: esp_transport_poll_read()=0
I (8113) websocket_client: Read poll timeout: skipping esp_transport_read()...

I think the issue may be in transport_ws, rather than esp-websocket-client, since that's where the message is pulled over TLS, and dropped. 0xFEEDC0DE64's workaround doesn't seem to be an option for our server.

@rsaxvc
Copy link

rsaxvc commented Apr 7, 2024

@gabsuren , it looks like this hasn't been touched in a bit. Is there anything we could do to help? In the event of an HTTP response during the upgrade, do we need to parse it and emit the event just after the connect event is processed?

@gabsuren
Copy link
Contributor

gabsuren commented Apr 8, 2024

@rsaxvc actually the fix has already been submitted for review and is expected to be merged shortly, likely within this week. Apologize for the delay
@euripedesrocha PTAL on your first convenience.

@gabsuren
Copy link
Contributor

gabsuren commented Apr 17, 2024

@0xFEEDC0DE64 @rsaxvc The fix is already available in ESP-IDF latest master branch.
Please take a look and let us know if it works for you.
This is the fix commit espressif/esp-idf@2267d4b

Thank you
Suren

@bryghtlabs-richard
Copy link
Contributor

Been testing the last two days - haven't seen our printout when we detect the dropped EIO.Open, so it seems to be working. I should add a printout when the EIO.Open is routed over HTTP rather than WS to be certain though.

@gabsuren
Copy link
Contributor

gabsuren commented Apr 22, 2024

Closing the issue as the fix already available in espressif/esp-idf@2267d4b, please don't hesitate to reopen this if you encounter any further problems. Thank you!

@bryghtlabs-richard
Copy link
Contributor

I finally caught it with a few printouts when EIO.Open was routed over HTTP before the WS changeover. The results were unexpected. @gabsuren I don't think I can reopen though. My annotations with '#':

#I added this to print the data that was routed over HTTP rather than WS. This looks good:
D (13581) transport_ws: preloading WS buffer:����j0{"sid":"uUPU4PoJnegQZmX-AACB","upgrades":[],"pingInterval":2000,"pingTimeout":10000,"maxPayload":1000000}
#but this server-key seems to be malformed:
D (13585) transport_ws: server key=d":1000000}DSoNjIuSTp5vnstw=, send_key=yCM5Fow39xL1q0xE023SXw==, expected_server_key=Sz3lYdKRUDFDSoNjIuSTp5vnstw=
E (13597) transport_ws: Invalid websocket key
E (13601) websocket_client: esp_transport_connect() failed with -1, transport_error=ESP_OK, tls_error_code=0, tls_flags=0, esp_ws_handshake_status_code=101, errno=119
I (13617) SIOC: WS Error!

If I understand the patch correctly, when there is a response delimiter present, memmove() the start of response over to the start. However, server_key is a pointer, not a copy, so the memmove can shift data over the server-key. When I diff the keys:

server key=d":1000000}DSoNjIuSTp5vnstw=
expected_server_key=Sz3lYdKRUDFDSoNjIuSTp5vnstw=

I see they both have a suffix of DSoNjIuSTp5vnstw=, but that the start of the server-key was overwritten with the end of the data included in HTTP response before WS switchover.

@rsaxvc
Copy link

rsaxvc commented Apr 27, 2024

Perhaps we should check the server-key;first, before shifting over the response data?

@bryghtlabs-richard
Copy link
Contributor

@gabsuren , I have a patch for this here: https://github.com/bryghtlabs-richard/esp-protocols/tree/pr-ws-fragfix which defers shifting the response over the headers until after the server-key is validated. I'm having trouble creating a pull-request though - github isn't seeming to parse the common commit ancestor.

@gabsuren
Copy link
Contributor

@bryghtlabs-richard Thank you for your contribution. I will create a new pull request incorporating your commit.

@bryghtlabs-richard
Copy link
Contributor

bryghtlabs-richard commented Apr 30, 2024

Please don't submit yet - we're having some connection issues here, we think they're on the server, but unable to test here for now.

Edit: this is a bug in my client-side patch that is only triggered when "Sec-WebSocket-Accept" is the last header in the response.

espressif-bot pushed a commit to espressif/esp-idf that referenced this issue May 23, 2024
When first fragment is sent over HTTP during websocket
connection, defer buffering of fragment until after the
websocket server-key is validated.

This order is required because the first fragment buffering
overwrites the memory holding the server-key headers.

Fixes 2267d4b
Fixes espressif/esp-protocols#396
PR #13724
espressif-bot pushed a commit to espressif/esp-idf that referenced this issue May 25, 2024
When first fragment is sent over HTTP during websocket
connection, defer buffering of fragment until after the
websocket server-key is validated.

This order is required because the first fragment buffering
overwrites the memory holding the server-key headers.

Fixes 2267d4b
Fixes espressif/esp-protocols#396
PR #13724
@gabsuren
Copy link
Contributor

@bryghtlabs-richard the changes has been merged to ESP-IDG Github. Thank you for your contirbution

@bryghtlabs-richard
Copy link
Contributor

Thanks @gabsuren

espressif-bot pushed a commit to espressif/esp-idf that referenced this issue Aug 21, 2024
When first fragment is sent over HTTP during websocket
connection, defer buffering of fragment until after the
websocket server-key is validated.

This order is required because the first fragment buffering
overwrites the memory holding the server-key headers.

Fixes 2267d4b
Fixes espressif/esp-protocols#396
PR #13724
espressif-bot pushed a commit to espressif/esp-idf that referenced this issue Aug 22, 2024
When first fragment is sent over HTTP during websocket
connection, defer buffering of fragment until after the
websocket server-key is validated.

This order is required because the first fragment buffering
overwrites the memory holding the server-key headers.

Fixes 2267d4b
Fixes espressif/esp-protocols#396
PR #13724
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants