-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
benchmark: fix race condition on fs benchs #50035
Conversation
c44de49
to
bc9e25f
Compare
@nodejs/fs @nodejs/performance @nodejs/platform-windows can you review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 on an approach that uses setTimeout here, if we end up using setTimeout we should use platform dependent timeout probably.
In general code that does cleanup and may crash shouldn't clean up after itself but when starting which the code here already does - so I wonder if it isn't better to just remove the cleanup and rely on the test.
The benchmarks should likely use async/await instead of the callbackify-liike weirdness they do at the moment.
What do you mean by
Another approach for These changes are meant to fix the error that happens in these lines: Lines 48 to 71 in 19b470f
Due to race condition, the file cannot be deleted because the Maybe we could remove the |
https://github.com/nodejs/node/tree/main/test/common#platformtimeoutms
I guarantee you it has less or as much overhead as the "create a promise then go through extra closure and function" version we currently use.
We should never rely on timers for dealing with race conditions typically. The problem here is design, we should delete the file when the test starts and not when it ends.. |
Cant we have a after helper, like mocha/chai/tap has, in the benchmark framework and run it after the benchmark finished? |
What if the user stops the benchmark in the middle (or similarly in CI?)? What's the issue with only doing the cleanup before the test and not after? |
@benjamingr Doing it this way, it will clean up only during the start of the benchmark, and then the file will be there without being cleaned. It may or may not be a problem, but if we went this way we would need to remove the reference to About stopping the CI in the middle, the file is cleaned because of But I will try to use platform-specific timeout and see how it behaves with async/await. |
It's not (at least shouldn't be) it's a tmp dir, it's transient storage by design. It's "fine" to leave trash there.
Sure but I would strongly prefer it that we don't use a timers based solution since that inherently depends on how slow the machine running the benchmark/test is which means harder-to-track-and-fix flakes. |
@benjamingr Just to summarize the changes, now I wait for all the concurrent timers to finish until really stop the benchmark, in this approach I don't relly on platform dependant timeouts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@benjamingr I see you asked for some changes here, are you still blocking this to land? |
This comment has been minimized.
This comment has been minimized.
Landed in 33c87ec |
PR-URL: nodejs#50035 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
PR-URL: nodejs#50035 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
PR-URL: #50035 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
Some of these benchmarks fails even on main versions of Node.js.
This issue is caused because
afterRead
is being called and is locking the temp file which makes the.on('exit')
fails to clean the temp file.If we wait all the concurrent timers to finish, the issue is solved.
First found this issue on: #49962