Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't attempt read from closed fd in dune_rpc_lwt #7581

Merged

Conversation

gridbugs
Copy link
Collaborator

This fixes a bug where RPC clients built with dune_rpc_lwt would crash while disconnecting as they attempted to read from a channel whose underlying file descriptor had been closed.

A simple program that reproduces the bug:

let () =
  let init =
    Dune_rpc.V1.Initialize.create ~id:(Dune_rpc.V1.Id.make (Csexp.Atom "hello"))
  in
  let where = Dune_rpc_lwt.V1.Where.default ~build_dir:"_build" () in
  Lwt_main.run
    (let open Lwt.Syntax in
    let* input_channel, output_channel = Dune_rpc_lwt.V1.connect_chan where in
    Dune_rpc_lwt.V1.Client.connect (input_channel, output_channel) init
      ~f:(fun _client -> Lwt.return ()))

This program just connects to an RPC server and then immediately disconnects. Prior to this change it would crash while disconnection with the error:

Fatal error: exception Unix.Unix_error(Unix.EBADF, "check_descriptor", "")
Raised at Lwt_unix.check_descriptor in file "src/unix/lwt_unix.cppo.ml", line 379, characters 4-64
Called from Lwt_unix.blocking in file "src/unix/lwt_unix.cppo.ml", line 384, characters 2-21
Called from Lwt_unix.read_bigarray in file "src/unix/lwt_unix.cppo.ml", line 682, characters 4-15
Called from Lwt_io.perform_io in file "src/unix/lwt_io.ml", line 236, characters 10-35
Called from Lwt_io.Primitives.read_char in file "src/unix/lwt_io.ml", line 676, characters 6-15
Called from Lwt_io.Primitives.read_char_opt.(fun) in file "src/unix/lwt_io.ml", line 686, characters 17-29
Called from Lwt.Sequential_composition.catch in file "src/core/lwt.ml", line 2008, characters 16-20
Re-raised at Lwt.Miscellaneous.poll in file "src/core/lwt.ml", line 3077, characters 20-29
Called from Lwt_main.run.run_loop in file "src/unix/lwt_main.ml", line 31, characters 10-20
Called from Lwt_main.run in file "src/unix/lwt_main.ml", line 118, characters 8-13
Re-raised at Lwt_main.run in file "src/unix/lwt_main.ml", line 124, characters 4-13
Called from Dune__exe__Hello_rpc in file "hello_rpc.ml", line 6, characters 2-260

@gridbugs gridbugs force-pushed the dune-rpc-lwt-check-if-channel-is-closed branch from ff089b6 to de1df83 Compare April 19, 2023 05:35
@gridbugs gridbugs requested a review from rgrinberg April 19, 2023 05:36
@rgrinberg
Copy link
Member

I think I don't understand Lwt enough to understand why this is necessary. AFAIK, this is the line that does the closing:

        let write (_, o) = function
          | None -> Lwt_io.close o
          | Some csexps ->
            Lwt_list.iter_s
              (fun sexp -> Lwt_io.write o (Csexp.to_string sexp))
              csexps

Which should wake up any readers or writers with a Channel_closed exception and not an EBADF.

Any subsequent readers or writers would get the same exception.

@rgrinberg
Copy link
Member

Given the above, I don't see how checking if the fd is closed in the beginning is enough. It could still be closed in the middle of a read in the same way.

@gridbugs
Copy link
Collaborator Author

Here's a minimal program that crashes with Fatal error: exception Unix.Unix_error(Unix.EBADF, "check_descriptor", ""):

let make_socket () =
  let open Lwt.Syntax in
  let+ sock_fd =
    let fd = Lwt_unix.socket Unix.PF_UNIX Unix.SOCK_STREAM 0 in
    let+ () = Lwt_unix.connect fd (Unix.ADDR_UNIX "mysocket") in
    fd
  in
  let fd mode = Lwt_io.of_fd sock_fd ~mode in
  (fd Input, fd Output)

let run () =
  let open Lwt.Syntax in
  let* i, o = make_socket () in
  let* () = Lwt_io.close o in
  let* (_ch : char) = Lwt_io.read_char i in
  Lwt.return_unit

(* run socket server with `nc -lkU mysocket` *)
let () = Lwt_main.run (run ())

Lwt's docs say:

exception Channel_closed of string
  (** Exception raised when a channel is closed. The parameter is a
      description of the channel. *)

It's not clear to me under what conditions Channel_closed will be raised. I don't know if lwt keeps track of the association between the input and output channels when they share an underlying socket.

In any case you are correct that we should be checking if file is open before each read. Will fix.

@gridbugs gridbugs force-pushed the dune-rpc-lwt-check-if-channel-is-closed branch from de1df83 to 12e1be4 Compare April 21, 2023 04:44
This fixes a bug where RPC clients built with dune_rpc_lwt would crash
while disconnecting as they attempted to read from a channel whose
underlying file descriptor had been closed.

Also updates the dune-rpc-lwt expect test to expect the text "success"
to be printed by the client after its connection is closed. It wasn't
being printed prior to this change.

Signed-off-by: Stephen Sherratt <[email protected]>
@gridbugs gridbugs force-pushed the dune-rpc-lwt-check-if-channel-is-closed branch from 12e1be4 to 710533a Compare April 21, 2023 04:45
@gridbugs
Copy link
Collaborator Author

Updated to check that the channel is open before each read.

@rgrinberg
Copy link
Member

The end result still doesn't look satisfying to me. I have no idea why Lwt_io.close isn't waking up existing reads/writes. Another idea would be to make the write end do a shutdown on the write end of the socket with Unix.shutdown and let the callers manage their own fd's.

In any case, at least this is an improvement that's unblocking you. We'll accept it as is.

@gridbugs
Copy link
Collaborator Author

Agreed that the workaround is unsatisfying. I create #7624 so we can improve it when we get time. Could also be a good starter issue I think.

emillon added a commit to emillon/opam-repository that referenced this pull request May 3, 2023
CHANGES:

- Correctly set `MANPATH` in `dune exec`. Previously, we would use the `bin/`
  directory of the context. (ocaml/dune#7655, @rgrinberg)

- Allow overriding the `ocaml` binary with findlib configuration (ocaml/dune#7648,
  @rgrinberg)

- Modules that were declared in `(modules_without_implementation)`,
  `(private_modules)` or `(virtual_modules)` but not declared in `(modules)`
  will cause Dune to emit a warning which will become an error in 3.9. (ocaml/dune#7608,
  fixes ocaml/dune#7026, @Alizter)

- Dune can now detect Coq theories from outside the workspace. This allows for
  composition with installed theories (not necessarily installed with Dune).
  (ocaml/dune#7047, @Alizter, @ejgallego)

- Fix regression where Merlin was unable to handle filenames with uppercase
  letters under Windows. (ocaml/dune#7577, @nojb)

- On nix+macos, pass `-f` to the codesign hook to avoid errors when the binary
  is already signed (ocaml/dune#7183, fixes ocaml/dune#6265, @greedy)

- Fix bug where RPC clients built with dune-rpc-lwt would crash when closing
  their connection to the server (ocaml/dune#7581, @gridbugs)

- Introduce mdx stanza 0.4 requiring mdx >= 2.3.0 which updates the default
  list of files to include `*.mld` files (ocaml/dune#7582, @Leonidas-from-XIV)

- Fix RPC server on Windows (used for OCaml-LSP). (ocaml/dune#7666, @nojb)

- In `(executable)`, `(public_name -)` is now equivalent to no `(public_name)`.
  This is consistent with how `(executables)` handles this field.
  (ocaml/dune#7576 , fixes ocaml/dune#5852, @emillon)

- Change directory of odoc assets to `odoc.support` (was `_odoc_support`) so
  that it works with Github Pages out of the box. (ocaml/dune#7588, fixes ocaml/dune#7364,
  @emillon)
emillon added a commit to emillon/opam-repository that referenced this pull request May 23, 2023
CHANGES:

- Fix string quoting in the json file written by `--trace-file` (ocaml/dune#7773,
  @rleshchinskiy)

- Read `pkg-config` arguments from the `PKG_CONFIG_ARGN` environment variable
  (ocaml/dune#1492, ocaml/dune#7734, @anmonteiro)

- Correctly set `MANPATH` in `dune exec`. Previously, we would use the `bin/`
  directory of the context. (ocaml/dune#7655, @rgrinberg)

- Allow overriding the `ocaml` binary with findlib configuration (ocaml/dune#7648,
  @rgrinberg)

- merlin: ignore instrumentation settings for preprocessing. (ocaml/dune#7606, fixes
  ocaml/dune#7465, @Alizter)

- When a rule's action is interrupted, delete any leftover directory targets.
  This is consistent with how we treat file targets. (ocaml/dune#7564, @rgrinberg)

- Fix plugin loading with findlib. The functionality was broken in 3.7.0.
  (ocaml/dune#7556, @anmonteiro)

- Introduce a `public_headers` field on libraries. This field is like
  `install_c_headers`, but it allows to choose the extension and choose the
  paths for the installed headers. (ocaml/dune#7512, @rgrinberg)

- Load the host context `findlib.conf` when cross-compiling (ocaml/dune#7428, fixes
  ocaml/dune#1701, @rgrinberg, @anmonteiro)

- Add a `coqdoc_flags` field to the `coq.theory` stanza allowing the user to
  pass extra arguments to `coqdoc`. (ocaml/dune#7676, fixes ocaml/dune#7954 @Alizter)

- Resolve `ppx_runtime_libraries` in the target context when cross compiling
  (ocaml/dune#7450, fixes ocaml/dune#2794, @anmonteiro)

- Use `$PKG_CONFIG`, when set, to find the `pkg-config` binary  (ocaml/dune#7469, fixes
  ocaml/dune#2572, @anmonteiro)

- Modules that were declared in `(modules_without_implementation)`,
  `(private_modules)` or `(virtual_modules)` but not declared in `(modules)`
  will cause Dune to emit a warning which will become an error in 3.9. (ocaml/dune#7608,
  fixes ocaml/dune#7026, @Alizter)

- Preliminary support for Coq compiled intefaces (`.vos` files) enabled via
  `(mode vos)` in `coq.theory` stanzas. This can be used in combination with
  `dune coq top` to obtain fast re-building of dependencies (with no checking
  of proofs) prior to stepping into a file. (ocaml/dune#7406, @rlepigre)

- Fix dune crashing on MacOS in watch mode whenever `$PATH` contains `$PWD`
  (ocaml/dune#7441, fixes ocaml/dune#6907, @rgrinberg)

- Fix `dune install` when cross compiling (ocaml/dune#7410, fixes ocaml/dune#6191, @anmonteiro,
  @rizo)

- Find `pps` dependencies in the host context when cross-compiling,  (ocaml/dune#7410,
  fixes ocaml/dune#4156, @anmonteiro)

- Dune in watch mode no longer builds concurrent rules in serial (ocaml/dune#7395
  @rgrinberg, @jchavarri)

- Dune can now detect Coq theories from outside the workspace. This allows for
  composition with installed theories (not necessarily installed with Dune).
  (ocaml/dune#7047, @Alizter, @ejgallego)

- `dune coq top` now correctly respects the project root when called from a
  subdirectory. However, absolute filenames passed to `dune coq top` are no
  longer supported (due to being buggy) (ocaml/dune#7357, fixes ocaml/dune#7344, @rlepigre and
  @Alizter)

- Added a `--no-build` option to `dune coq top` for avoiding rebuilds (ocaml/dune#7380,
  fixes ocaml/dune#7355, @Alizter)

- RPC: Ignore SIGPIPE when clients suddenly disconnect (ocaml/dune#7299, ocaml/dune#7319, fixes
  ocaml/dune#6879, @rgrinberg)

- Always clean up the UI on exit. (ocaml/dune#7271, fixes ocaml/dune#7142 @rgrinberg)

- Bootstrap: remove reliance on shell. Previously, we'd use the shell to get
  the number of processors. (ocaml/dune#7274, @rgrinberg)

- Bootstrap: correctly detect the number of processors by allowing `nproc` to be
  looked up in `$PATH` (ocaml/dune#7272, @Alizter)

- Speed up file copying on macos by using `clonefile` when available
  (@rgrinberg, ocaml/dune#7210)

- Adds support for loading plugins in toplevels (ocaml/dune#6082, fixes ocaml/dune#6081,
  @ivg, @richardlford)

- Support commands that output 8-bit and 24-bit colors in the terminal (ocaml/dune#7188,
  @Alizter)

- Speed up rule generation for libraries and executables with many modules
  (ocaml/dune#7187, @jchavarri)

- Add `--watch-exclusions` to Dune build options (ocaml/dune#7216, @jonahbeckford)

- Do not re-render UI on every frame if the UI doesn't change (ocaml/dune#7186, fix
  ocaml/dune#7184, @rgrinberg)

- Make coq_db creation in scope lazy (@ejgallego, ocaml/dune#7133)

- Non-user proccesses such as version control or config checking are now run
  silently. (ocaml/dune#6994, fixes ocaml/dune#4066, @Alizter)

- Add the `--display-separate-messages` flag to separate the error messages
  produced by commands with a blank line. (ocaml/dune#6823, fixes ocaml/dune#6158, @esope)

- Accept the Ordered Set Language for the `modes` field in `library` stanzas
  (ocaml/dune#6611, @anmonteiro).

- dune install now respects --display quiet mode (ocaml/dune#7116, fixes ocaml/dune#4573, fixes
  ocaml/dune#7106, @Alizter)

- Stub shared libraries (dllXXX_stubs.so) in Dune-installed libraries could not
  be used as dependencies of libraries in the workspace (eg when compiling to
  bytecode and/or Javascript).  This is now fixed. (ocaml/dune#7151, @nojb)

- Allow the main module of a library with `(stdlib ...)` to depend on other
  libraries (ocaml/dune#7154, @anmonteiro).

- Bytecode executables built for JSOO are linked with `-noautolink` and no
  longer depend on the shared stubs of their dependent libraries (ocaml/dune#7156, @nojb)

- Added a new user action `(concurrent )` which is like `(progn )` but runs the
  actions concurrently. (ocaml/dune#6933, @Alizter)

- Allow `(stdlib ...)` to be used with `(wrapped false)` in library stanzas
  (ocaml/dune#7139, @anmonteiro).

- Allow parallel execution of inline tests partitions (ocaml/dune#7012, @hhugo)

- Support `(link_flags ...)` in `(cinaps ...)` stanza. (ocaml/dune#7423, fixes ocaml/dune#7416,
  @nojb)

- Allow `(package ...)` in any position within `(rule ...)` stanza (ocaml/dune#7445,
  @Leonidas-from-XIV)

- Always include `opam` files in the generated `.install` file. Previously, it
  would not be included whenever `(generate_opam_files true)` was set and the
  `.install` file wasn't yet generated. (ocaml/dune#7547, @rgrinberg)

- Fix regression where Merlin was unable to handle filenames with uppercase
  letters under Windows. (ocaml/dune#7577, @nojb)

- On nix+macos, pass `-f` to the codesign hook to avoid errors when the binary
  is already signed (ocaml/dune#7183, fixes ocaml/dune#6265, @greedy)

- Fix bug where RPC clients built with dune-rpc-lwt would crash when closing
  their connection to the server (ocaml/dune#7581, @gridbugs)

- Introduce mdx stanza 0.4 requiring mdx >= 2.3.0 which updates the default
  list of files to include `*.mld` files (ocaml/dune#7582, @Leonidas-from-XIV)

- Fix RPC server on Windows (used for OCaml-LSP). (ocaml/dune#7666, @nojb)

- Coq language versions less 0.8 are deprecated, and will be removed
  in an upcoming Dune version. All users are required to migrate to
  `(coq lang 0.8)` which provides the right semantics for theories
  that have been globally installed, such as those coming from opam
  (@ejgallego, @Alizter)

- Bump minimum version of the dune language for the melange syntax extension
  from 3.7 to 3.8 (ocaml/dune#7665, @jchavarri)
@gridbugs gridbugs deleted the dune-rpc-lwt-check-if-channel-is-closed branch October 11, 2023 02:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants