Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fsevents: move free to finalizer #7312

Merged
merged 1 commit into from
Mar 21, 2023
Merged

Conversation

gridbugs
Copy link
Collaborator

@gridbugs gridbugs commented Mar 14, 2023

Fixes #6151

This just a copy of #6215 rebased onto main. I've tested this on both an M1 and x86 mac and it fixes the problem.

@gridbugs
Copy link
Collaborator Author

(copied from #6151)

I did some digging today into why the crash happens when the fsevents callback calls Fsevents.stop.

The c function dune_fsevents_callback is called on each event, and invokes the user-provided callback (the ~f argument to Fsevents.create) before dereferencing a field of the dune_fsevents_t struct . The c function dune_fsevents_stop deallocates the dune_events_t struct and lucky for us the deallocator seems to corrupt its pointer fields. dune_fsevents_stop is called by the ocaml function Fsevents.stop. So when the fsevents callback calls Fsevents.stop dune_fsevents_callback tries to dereference a field of the dune_events_t after its pointer fields have been corrupted and a segfault happens.

This implies that we'd also see this crash if we ever started, stopped, and then re-started a Fsevents.t as stop causes the value to be deallocated but it's only allocated inside create, though is moot as the Fsevents.state state machine doesn't allow stopping and re-starting.

@emillon's fix (#6215) solves the problem by removing the call to caml_stat_free(t); from dune_fsevents_stop and registering a custom finalizer for the dune_fsevents_t object instead. I've copied to #7312 and rebased onto main.

@emillon
Copy link
Collaborator

emillon commented Mar 16, 2023

Sounds good of course but I think you could have pushed the original PR too.

My comments from there still apply - I'm not confident about the correctness of the memory management here but I think it's an improvement compared to the current situation (and it fixes the issue!).

Please add a changelog entry.

@gridbugs
Copy link
Collaborator Author

Sounds good of course but I think you could have pushed the original PR too.

I couldn't figure out how to push to your branch :)

My comments from there still apply - I'm not confident about the correctness of the memory management here but I think it's an improvement compared to the current situation (and it fixes the issue!).

Should we hold off merging this until we better understand the memory management?

Please add a changelog entry.

Done!

@emillon
Copy link
Collaborator

emillon commented Mar 17, 2023

Should we hold off merging this until we better understand the memory management?

I'd say it's fine.

CHANGES.md Outdated
@@ -60,6 +60,9 @@ Unreleased
- Allow `(stdlib ...)` to be used with `(wrapped false)` in library stanzas
(#7139, @anmonteiro).

- Fix segfault on macos when `Fsevents.stop` is called inside the callback to
`Fsevents.create`. (#7312, fixes #6151, @gridbugs, @emillon)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove such details from the CHANGE list. It's supposed to make sense to end users that know nothing about dune's code. Perhaps something like:

- Fix segfault on MacOS when dune was being shutdown while in watch mode.

@rgrinberg rgrinberg added this to the 3.8.0 milestone Mar 17, 2023
@rgrinberg
Copy link
Member

@nojb could you take a look please?

Copy link
Collaborator

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Another approach: using a custom block as in this PR, you can set the "data pointer" to NULL after deallocating the dune_fsevents_t data structure. The callback would check whether the field is NULL before trying to dereference it and not do anything otherwise. This avoids the use of the finalisation function.

@rgrinberg
Copy link
Member

@gridbugs what do you think about my suggesting for the CHANGES entry?

@gridbugs
Copy link
Collaborator Author

@gridbugs what do you think about my suggesting for the CHANGES entry?

Looks good. I've updated CHANGES with your suggestion.

turning off dune in watch mode on macos will no longer segfault
sporadically

Signed-off-by: Etienne Millon <[email protected]>
Signed-off-by: Stephen Sherratt <[email protected]>
Signed-off-by: Rudi Grinberg <[email protected]>
@rgrinberg rgrinberg merged commit eed25bf into ocaml:main Mar 21, 2023
emillon pushed a commit to emillon/dune that referenced this pull request Mar 30, 2023
turning off dune in watch mode on macos will no longer segfault
sporadically

Signed-off-by: Etienne Millon <[email protected]>
Signed-off-by: Stephen Sherratt <[email protected]>
Signed-off-by: Rudi Grinberg <[email protected]>
emillon pushed a commit to emillon/dune that referenced this pull request Mar 31, 2023
turning off dune in watch mode on macos will no longer segfault
sporadically

Signed-off-by: Etienne Millon <[email protected]>
Signed-off-by: Stephen Sherratt <[email protected]>
Signed-off-by: Rudi Grinberg <[email protected]>
emillon added a commit that referenced this pull request Mar 31, 2023
turning off dune in watch mode on macos will no longer segfault
sporadically

Signed-off-by: Etienne Millon <[email protected]>
Signed-off-by: Stephen Sherratt <[email protected]>
Signed-off-by: Rudi Grinberg <[email protected]>
Co-authored-by: Stephen Sherratt <[email protected]>
emillon added a commit to emillon/opam-repository that referenced this pull request Apr 4, 2023
…, dune-rpc, dune-rpc-lwt, dune-private-libs, dune-glob, dune-configurator, dune-build-info, dune-action-plugin and chrome-trace (3.7.1)

CHANGES:

- Fix segfault on MacOS when dune was being shutdown while in watch mode.
  (ocaml/dune#7312, fixes ocaml/dune#6151, @gridbugs, @emillon)

- Fix preludes not being recorded as dependencies in the `(mdx)` stanza (ocaml/dune#7109,
  fixes ocaml/dune#7077, @emillon).

- Pass correct flags when compiling `stdlib.ml`. (ocaml/dune#7241, @emillon)

- Handle "Too many links" errors when using Dune cache on Windows.  The fix in
  3.7.0 for this same issue was not effective due to a typo. (ocaml/dune#7472, @nojb)
@gridbugs gridbugs deleted the fsevents-finalize branch October 11, 2023 03:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fsevents segfaulting under M1 mac
4 participants