performance and design comparison to nucleo #19
See announcement https://www.reddit.com/r/rust/comments/165lp5a/announcing_nucleo_a_fast_fuzzy_matcher_library/ and repo https://github.com/helix-editor/nucleo.

Comments
The announcement claims [...]
these are both already implemented in jfind. i can benchmark it against jfind once a matcher is made using it.
i did not realise it had already been implemented in helix. i'll try to do a benchmark now.
unfortunately, helix does not use [...]. benchmark recording: https://asciinema.org/a/VFVdJ5hyN4wMa5sDlzeZBwtfL. jfind was a lot faster. however, this is a crude benchmark and it would be nice if a standalone fuzzy finder were made using nucleo so it can be tested properly.
in response to helix-editor/nucleo#22, i do not see what is odd about including reading speed in a fuzzy finder benchmark. if your architecture results in terrible performance due to a reading-speed bottleneck, then that is a problem with your architecture. for the end user, reading speed is a very big deal: they open their fuzzy finder and start searching. nobody sits and waits for the loading to complete. if that were the case, why would there be such a thing as streaming input?
I found this by accident; if you are going to respond, it's kind of pointless to do it here since I don't get notifications for that. Helix sorts all files alphabetically, which requires a single-threaded traversal (and a lot more processing); fd does not. Fd also doesn't follow symlinks, but helix does. Logically it takes longer to traverse the FS as a result (see the traversal sketch below). Helix even uses the exact same rust library as [...].

You can't just compare two entirely different inputs and claim one tool is faster than the other. It's an apples-to-oranges comparison. Comparing tools during streaming is also not a good benchmark in general. The pipeline from the FS to the user is roughly: [...]
Unless your streaming implementation is complete garbage, it's never going to be the bottleneck anyway (your pipe reader could be, but that's still unlikely considering how slow readdir is), so it's also not that interesting to look at. For nucleo, injecting an item into the matcher involves three atomic writes and copying the actual pointer. That is just completely irrelevant compared to the time it takes to traverse the FS. It is also not a good benchmark because an unsorted FS traversal is not exactly deterministic (so depending on what your OS is doing in the background, it can take a varying amount of time for a file to stream in).

Instead, it makes much more sense to test how long the actual matching takes. One way to do that is of course to just microbenchmark the matcher itself (a sketch follows below). Doing system benchmarks is also interesting (but far more challenging to do right) since it takes the concurrent code into account. But that should be done once all files are streamed in, to remove as many unknown variables as possible. For example, one important variable is how many files you are matching, but unless you can time your typing to nanosecond accuracy, if you start matching during streaming the number of matched items is likely different. For a crude benchmark, the most accurate you can get is to wait for all files to stream in (making certain the number of files is the same) and then paste a pattern (so your typing speed doesn't have an effect).

It's possible that jfind has a faster matcher. It's also possible that nucleo is faster. I wasn't aware of jfind until you opened that issue, so I haven't looked at the matcher or done any comparison. If jfind has a faster matcher, I would be interested in looking at how I can improve, although I don't really have time to work on something like that right now. I just found your screencast a very odd benchmark. It's not in any way an objective comparison of any code you or I implemented. It's simply comparing how certain config flags of [...]
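[Editor's note: to make the traversal point above concrete, here is a minimal sketch of the two walking strategies using the `ignore` crate, the directory-walking library fd is built on. That helix's picker uses exactly these settings is an assumption for illustration, not a claim about its source.]

```rust
// Sketch only: contrasting a sorted, symlink-following walk with an
// unsorted parallel walk, both via the `ignore` crate. The path "." and
// the settings are illustrative assumptions.
use ignore::{WalkBuilder, WalkState};

fn main() {
    // Sorted, symlink-following walk: sorting forces a sequential,
    // single-threaded traversal because entries must come out in order.
    let sorted: Vec<_> = WalkBuilder::new(".")
        .follow_links(true)
        .sort_by_file_name(|a, b| a.cmp(b))
        .build()
        .filter_map(Result::ok)
        .map(|entry| entry.into_path())
        .collect();
    println!("sorted walk: {} entries", sorted.len());

    // Unsorted parallel walk (fd-style): entries are emitted from several
    // worker threads in whatever order the filesystem yields them.
    let (tx, rx) = std::sync::mpsc::channel();
    WalkBuilder::new(".").build_parallel().run(|| {
        let tx = tx.clone();
        Box::new(move |entry| {
            if let Ok(entry) = entry {
                tx.send(entry.into_path()).ok();
            }
            WalkState::Continue
        })
    });
    drop(tx);
    println!("parallel walk: {} entries", rx.iter().count());
}
```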
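[Editor's note: a crude matcher-only microbenchmark along the lines suggested above might look like the sketch below. It assumes the standalone `nucleo-matcher` crate with its `Matcher` and `Pattern` types as published on crates.io; the haystack contents, size, and pattern are made up, and exact signatures should be checked against the current docs.]

```rust
// Crude micro-benchmark sketch for the matcher alone: a fixed, fully
// loaded haystack takes streaming out of the picture, so only matching
// time is measured. API names assumed from the `nucleo-matcher` docs.
use std::time::Instant;

use nucleo_matcher::pattern::{CaseMatching, Normalization, Pattern};
use nucleo_matcher::{Config, Matcher};

fn main() {
    // Synthetic haystack; contents are made up for illustration.
    let haystack: Vec<String> = (0..1_000_000)
        .map(|i| format!("src/module_{}/file_{}.rs", i % 512, i))
        .collect();

    let mut matcher = Matcher::new(Config::DEFAULT);
    // Parsing the whole pattern once mirrors "pasting" it in one go,
    // rather than matching incrementally per keystroke.
    let pattern = Pattern::parse("file rs", CaseMatching::Ignore, Normalization::Smart);

    let start = Instant::now();
    let matches = pattern.match_list(&haystack, &mut matcher);
    println!("{} matches in {:?}", matches.len(), start.elapsed());
}
```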
previously when i have benchmarked various fuzzy finders, i ensured that the tests were fair by replacing the [...]. i made it very clear that this performance test was crude due to my inability to just replace [...]. you attribute this to slow single-threaded traversal and sorting results prior to matching. you are simply explaining why it is slower.
by that logic you could also claim fzf is faster than [...] by comparing [...].
you are exactly right that, from what i can tell, nucleo is not a fuzzy finder. nucleo is a fuzzy matching library that can be used to create fuzzy finders or used in other tools for fuzzy matching. there may or may not be performance penalties due to it being a library. the only way to see the end result is to see the matching implemented in a fuzzy finder. as it stands, helix is that fuzzy finder. so i tested jfind against helix, and from my very crude test, jfind was faster. you argue that it is unfair because helix has slower reading. there is no way for me to make the reading quicker. i cannot replace it with [...]. i don't know how else to explain it to you. also, i am not trying to undermine helix or nucleo. i hadn't heard of nucleo until this issue was opened and i was asked to compare performance and design with it.

edit: @pascalkuthe