Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server crashes when the client disconnects unexpectedly #34

Open
joprice opened this issue Jul 29, 2023 · 10 comments
Open

server crashes when the client disconnects unexpectedly #34

joprice opened this issue Jul 29, 2023 · 10 comments

Comments

@joprice
Copy link

joprice commented Jul 29, 2023

I'm running the EIO server and client examples, and noticed that when I kill the client, the server crashes with the following error:

End_of_file Raised at Eio_linux__Low_level.readv.(fun) in file "lib_eio_linux/low_level.ml", line 158, characters 25-52
+Called from Eio_unix__Rcfd.use in file "lib_eio/unix/rcfd.ml", line 161, characters 10-14
+Re-raised at Eio_unix__Rcfd.use in file "lib_eio/unix/rcfd.ml", line 166, characters 6-41
+Called from Eio__Flow.single_read in file "lib_eio/flow.ml", line 16, characters 12-27
+Called from Gluten_eio.IO_loop.read_once.(fun) in file "eio/gluten_eio.ml", line 54, characters 10-45
+Called from Gluten_eio.IO_loop.read_once in file "eio/gluten_eio.ml", line 51, characters 4-192
+Called from Gluten_eio.IO_loop.read in file "eio/gluten_eio.ml", line 60, characters 10-31
+Called from Eio__core__Fiber.any.(fun).wrap in file "lib_eio/core/fiber.ml", line 97, characters 16-20
+Re-raised at Eio__core__Fiber.any in file "lib_eio/core/fiber.ml", line 138, characters 26-61
+Called from Gluten_eio.IO_loop.start.(fun).read_loop_step in file "eio/gluten_eio.ml", line 126, characters 21-44
+Re-raised at Eio__core__Switch.maybe_raise_exs in file "lib_eio/core/switch.ml", line 118, characters 21-56
+Called from Eio__core__Switch.run_internal in file "lib_eio/core/switch.ml", line 150, characters 4-21
+Called from Eio__core__Cancel.with_cc in file "lib_eio/core/cancel.ml", line 116, characters 8-12
+Re-raised at Eio__core__Cancel.with_cc in file "lib_eio/core/cancel.ml", line 118, characters 32-40
+Called from Dune__exe__Grpc_server.(fun).loop in file "grpc_test/grpc_server.ml", line 125, characters 12-28

I'm not sure if I'm missing somewhere I'm intended to handle this, or whether connection handling should be set up a different way to be more resilient, or if this is unexpected behavior.

@quernd
Copy link
Collaborator

quernd commented Jul 29, 2023

Thanks for reporting this. I haven't yet been able to reproduce it; what are the exact steps to trigger this error? Just Ctrl-C the client during a streaming call?

Since this exception is not caught by Gluten: which specific version of Gluten are you using? There have been a couple of commits recently in the Gluten main branch that affect exception handling.

@joprice
Copy link
Author

joprice commented Jul 29, 2023

I'm using the latest. I tried downgrading to a version before those changes were introduced, but had trouble finding a set of commits across h2-eio and gluten and a few other deps in my project that were compatible. Which versions are you building with?

@joprice
Copy link
Author

joprice commented Jul 29, 2023

I was just able to get it to build. I had to downgrade Eio to 0.10, along with gluten=0.4.1 and h2=0.10.0. Just tested with those and no longer see the error on disconnection.

@joprice
Copy link
Author

joprice commented Jul 29, 2023

As far as the server itself shutting down, this seems to be my misunderstanding of the lifetime of the new switch argument provided to create_connection_handler. Introducing a new switch in the anonymous function returned from connection_handler in the example allows the server to survive the exception:

 fun socket addr ->
   Eio.Switch.run (fun sw ->
       H2_eio.Server.create_connection_handler ~sw ?config:None
         ~request_handler ~error_handler addr socket)

But I'm new to eio and not sure that's correct.

@joprice
Copy link
Author

joprice commented Jul 29, 2023

I hit another case which I'm not sure yet is related to the recent gluten changes. I have to try to reproduce it in the reduced version with the older versions. I am now trying to terminate the server while the client is querying in a loop. What I see is the error handler gets called with Malformed_response and then it hangs.

@joprice
Copy link
Author

joprice commented Jul 29, 2023

I added some traces and looks like the code hangs in get_response_and_bodieswaiting for the response body when an error is hit: let response = Eio.Promise.await response in. Maybe the error handler needs to be taken into account and fail the promises?

@joprice
Copy link
Author

joprice commented Jul 29, 2023

I experimented with this here and it does fix the hanging: main...joprice:ocaml-grpc:clientErrors.

@quernd
Copy link
Collaborator

quernd commented Jul 31, 2023

I was just able to get it to build. I had to downgrade Eio to 0.10, along with gluten=0.4.1 and h2=0.10.0. Just tested with those and no longer see the error on disconnection.

Yeah, it can be tricky to find a set of dependency versions that work together. That's mostly because there isn't yet a version of H2 on OPAM that works with Eio 0.11. We enforced an upper bound in this commit but that's just for the examples, it seems.

In any case, if that error doesn't present with gluten 0.4.1, it may be worth opening an issue in the gluten repository to ask for clarification if this behavior is intended. Another thing that would be good to know is: are you able to reproduce the error only in conjunction with gRPC? From the backtrace you posted it looks to me like it would happen with any HTTP2 connection.

@quernd
Copy link
Collaborator

quernd commented Jul 31, 2023

I experimented with this here and it does fix the hanging: main...joprice:ocaml-grpc:clientErrors.

That looks reasonable! Would you mind opening a PR?

@joprice
Copy link
Author

joprice commented Sep 11, 2023

@quernd lost track of the in my inbox. I can put up a pr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants