Websocket internally errors "/usr/local/share/lua/5.3/http/websocket.lua:282: read: Connection timed out" #140
Comments
Could you run with the environment variable |
With SOCKET_DEBUG=1 it works correctly and logs the payloads I expect to see; it only errors without the debug variable set. |
That's extremely odd. The debug variable should only add logging, not change any behaviour. |
That seems likely, is there a way for me to collect the debug data into a file? |
Just pipe/redirect it? |
https://ghostbin.com/paste/a6czk You can see it cut off reading here. On further inspection, it seems to be crashing consistently. I think I jumped the gun a little when it managed to reconnect previously. |
I added an xpcall handler to inspect the internals. Here is an example of a frame which errors: {
"RSV2": false,
"RSV3": false,
"RSV1": false,
"MASK": false,
"length": 12964,
"FIN": true,
"opcode": 1
} |
Any idea what could cause this? When this happens my receive loop breaks down and I don't know if I can handle it without reconnecting. |
Your pastebin link has now expired. Could you repost it? (consider using a gist?) |
I'll try to get some more data (I've lost the old log). What seems to happen is that the websocket cannot read the frame with a 0 timeout: https://github.com/daurnimator/lua-http/blob/master/http/websocket.lua#L282 I've made this line retry after a I should also note that I've not found any further clues on how to reproduce it, aside from connecting without requesting compression. I think it's possibly some kind of race condition? |
Why should these reads be 0 timeout exactly? In practice this isn't readable without waiting.
    if frame.MASK then
        local key = assert(sock:xread(4, "b", 0))
        frame.key = { key:byte(1, 4) }
    end
    do
        local data = assert(sock:xread(frame.length, "b", 0))
        if frame.MASK then
            frame.data = apply_mask(data, frame.key)
        else
            frame.data = data
        end
    end
EDIT: This is a very naive solution I have been using to try and get a handle on what happens:
    local function should_retry(timer)
        if timer then
            return timer:status() == "pending"
        else
            return true
        end
    end
    local function read_again(deadline, socket, ...)
        local tries = 0
        local data, msg, code = socket:xread(...)
        local timeout = deadline and deadline - monotime()
        local timer = timeout and promise.new(sleep, timeout)
        while should_retry(timer) and (not data or code == ce.ETIMEDOUT) do
            tries = tries + 1
            sleep()
            data, msg, code = socket:xread(...)
        end
        used_tries = max(used_tries, tries)
        return data, msg, code
    end
Calling this function on the socket in place of the xreads and logging the used_tries:
This makes me think the read_frame is not consuming the deadline somewhere? |
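To illustrate what "consuming the deadline" would mean here, a minimal sketch in plain Lua 5.3. This is not lua-http's implementation: `remaining`, `read_exactly` and `fake_source` are hypothetical names, and the socket and clock are simulated so the snippet runs without cqueues. The idea is that each read is given whatever time is left until the deadline, rather than a fixed 0 timeout.

```lua
-- Hypothetical helper: turn an absolute deadline into the timeout for the next read.
local function remaining(deadline, now)
    if deadline == nil then
        return nil -- no deadline: the read may wait indefinitely
    end
    return math.max(0, deadline - now)
end

-- Keep reading until `want` bytes have arrived or no more data can be obtained in time.
-- `read_some(n, timeout)` is assumed to return up to n bytes, or nil if nothing arrived.
local function read_exactly(read_some, want, deadline, clock)
    local parts, have = {}, 0
    while have < want do
        local chunk = read_some(want - have, remaining(deadline, clock()))
        if not chunk then
            return nil, "incomplete", have
        end
        parts[#parts + 1] = chunk
        have = have + #chunk
    end
    return table.concat(parts)
end

-- Simulated peer (an assumption for the demo): delivers at most `chunk_size` bytes
-- per read, and every successful read takes one unit of simulated time.
local function fake_source(data, chunk_size)
    local pos, now = 1, 0
    local function clock()
        return now
    end
    local function read_some(n, timeout)
        if timeout ~= nil and timeout <= 0 then
            return nil -- a 0 timeout cannot wait for data that has not arrived yet
        end
        now = now + 1
        local chunk = data:sub(pos, pos + math.min(n, chunk_size) - 1)
        if chunk == "" then
            return nil -- source exhausted
        end
        pos = pos + #chunk
        return chunk
    end
    return read_some, clock
end

local read_some, clock = fake_source(string.rep("x", 10), 4)
print(#assert(read_exactly(read_some, 10, 5, clock)))
```

With a deadline of 5 time units the three partial reads complete; with a deadline of 2 the third read is left with a 0 timeout and the call gives up after 8 bytes, which is the same shape of failure as a payload arriving in several pieces while the reader refuses to wait.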
There is a preceding When the Line 229 in 1f30846
|
Are you still having problems? |
Hello, I'm so sorry I've not been more proactive with this. I'll try to get a reproducible example working. As far as I'm aware, yes, there's still something strange happening (it is mostly mitigated right now by requesting compressed messages from the websocket, but it is by no means fixed). |
I think being sent a large enough message over websocket is enough to trigger this.
It is usually this payload which causes this error when it is received by my client without compression. |
Could you please come up with a test case that I can run? |
You need to get a bot token by making a discord application. |
Very odd. |
Someone more proficient at lua-http websockets might try setting up a simple server which sends a payload as big as what the server, discord in both cases here, sends. |
I'm getting the same error |
Got the same error. The client version is RFC7692; is it irrelevant? |
I have the same problem on |
Hi there, I ran into this issue, here's a small test case to reproduce. I just use a public echo server to send a large-ish string of data (about 12kb) and check that I receive it.
|
I have gained an interest in fixing this. |
Also been getting that exact same error. I have not checked closely yet, but response sizes average around 70kb and are sometimes larger. |
Any updates on this? I get the same issue. The server is in fact sending the packets, but the websocket interface simply times out. However, if I step through this slowly while debugging, I actually get the expected, correct result which leads me to think that this is some sort of race condition. This issue, like the others above, only happens with large-ish payloads. Thanks |
Would you be able to write some code that launches a websocket server and reproduces this issue using a sample packet? It's really unreliable for me -- sometimes it happens, sometimes it doesn't.
|
Sorry, just got back to this -- been pretty busy. |
Isn't there a Line 244 in 90aa6d6
The 8-byte length case below does have it, and indeed, "big messages" over 64k are received properly. Edit: fwiw, adding the suggested bit of code makes the issue go away for me. |
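For context on the 2-byte versus 8-byte cases: RFC 6455 encodes a frame's payload length in one of three ways. Lengths up to 125 fit in the 7-bit field itself, a field value of 126 means a 16-bit big-endian length follows, and 127 means a 64-bit length follows. A rough sketch in plain Lua 5.3 (not lua-http code; `length_encoding` and `encode_length` are hypothetical names) showing which case a given size falls into:

```lua
-- Classify a payload length by the RFC 6455 length encoding it needs.
-- Returns a label and the number of extended-length bytes that follow the 7-bit field.
local function length_encoding(n)
    assert(math.type(n) == "integer" and n >= 0, "length must be a non-negative integer")
    if n <= 125 then
        return "7-bit", 0
    elseif n <= 0xFFFF then
        return "16-bit", 2
    else
        return "64-bit", 8
    end
end

-- Encode just the length portion of an unmasked frame header (MASK bit = 0).
local function encode_length(n)
    local kind = length_encoding(n)
    if kind == "7-bit" then
        return string.pack("B", n)
    elseif kind == "16-bit" then
        return string.pack(">BI2", 126, n)
    else
        return string.pack(">BI8", 127, n)
    end
end

for _, size in ipairs({ 100, 12964, 65535, 65536 }) do
    local kind, extra = length_encoding(size)
    print(size, kind, extra, #encode_length(size))
end
```

The 12964-byte frame posted earlier and the roughly 12kb echo payload both land in the 16-bit (2-byte) case, while anything over 65535 bytes takes the 8-byte path, which matches the observation that messages over 64k are received properly.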
@LBPHacker |
Can we make a new build/luarocks package with this change? |
This happens when receiving payloads from the websocket.
This happens regardless of whether a timeout is passed to the socket.
The weird thing is that when I request deflated messages (per message inflate/deflate inside my handling loop), I don't get this issue at all. The only time this happens is when I request raw json payloads. I'm using scm-0 from master (8ab5c30) to fix the timeout issues on the release rock.
This might be due to the payload size: heuristically, I know it's the biggest payload I'm likely to receive that causes this error. However, this is not an issue with other clients for the api I'm connecting to.