Add an option to disable "support finding data with typos" from version 6.4.0/7.0.0 #154

dumbmatter · 2024-10-18T14:55:09Z

#153 is undoubtedly a great feature for many people, but not for me :)

The argument from @overengineered at #153 is:

match-sorter currently finds data with some typos, but not others. E.g. "canceled" would find "cancelled", but not "cacneled".

Personally I don't see the old behavior (allowing skipped letters) as a "typo", I often do that on purpose to search for multiple parts of a string. Like when searching a list of filenames, I type a few letters for the start of a filename, realize "oh this is a really common start of a filename, it's returning way too many results", and then start typing a later part to filter more precisely. Cause I'm used to how ctrl+p works in Sublime/VSCode/etc.
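For readers unfamiliar with the "skipped letters" behavior described above, here is a minimal sketch of that kind of match: every query letter has to appear in the candidate in order, with any number of candidate letters skipped in between. The function name `isSubsequenceMatch` is made up for illustration, and this is not match-sorter's actual implementation, only the general idea.

```typescript
// Illustrative only: not match-sorter's code. `isSubsequenceMatch` is a
// hypothetical helper name.
// Returns true when every character of `query` appears in `candidate`
// in the same order, with any number of characters skipped in between.
function isSubsequenceMatch(candidate: string, query: string): boolean {
  const haystack = candidate.toLowerCase();
  const needle = query.toLowerCase();
  let from = 0; // position in the candidate where the next search starts
  for (let i = 0; i < needle.length; i++) {
    const found = haystack.indexOf(needle.charAt(i), from);
    if (found === -1) return false; // letter missing, or only present earlier
    from = found + 1;
  }
  return true;
}

// Typing the start of a filename, then a later part of it:
const hit = isSubsequenceMatch("src/components/index.ts", "srcidx"); // true
// Letters out of order do not match:
const miss = isSubsequenceMatch("cancelled", "cacneled"); // false
```

Under this definition "canceled" matches "cancelled" (the second "l" is simply skipped), while "cacneled" does not, because no "n" follows the second "c".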

Now, sometimes the matches found by the new typo algorithm rank higher than ones that would have appeared previously with skipped letters. But more fundamentally, for my personal use, I never want typos (again, not considering the old behavior as "typos") returned. All they do is confuse my search by adding things I never want to select.

So an option to disable this behavior would be appreciated. #153 (comment) has a suggestion for how to do that, although to be honest I'm not totally sure which approach is intended to be "scattered" and which is supposed to be "partial". Or if you're able to enable/disable only the old approach or only the new approach. So I'd probably just make them separate booleans, maybe fuzzyGaps and fuzzySkipOne, idk, naming stuff is hard.
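A rough sketch of what two separate booleans could look like, purely to make the suggestion concrete. The option names `fuzzyGaps` and `fuzzySkipOne` come from the comment above and do not exist in match-sorter; the helper names and the reading of "skip one" as "ignore at most one letter of the query" are assumptions, not a description of what #153 actually does.

```typescript
// Hypothetical options; not a real match-sorter API.
interface FuzzyOptions {
  fuzzyGaps: boolean; // allow skipped candidate letters between query letters
  fuzzySkipOne: boolean; // allow one query letter to be ignored (assumed meaning)
}

// Base check: ordered match with gaps, or plain substring when gaps are off.
function baseMatch(candidate: string, query: string, gaps: boolean): boolean {
  if (!gaps) return candidate.indexOf(query) !== -1;
  let from = 0;
  for (let i = 0; i < query.length; i++) {
    const found = candidate.indexOf(query.charAt(i), from);
    if (found === -1) return false;
    from = found + 1;
  }
  return true;
}

function fuzzyMatch(candidate: string, query: string, opts: FuzzyOptions): boolean {
  const c = candidate.toLowerCase();
  const q = query.toLowerCase();
  if (baseMatch(c, q, opts.fuzzyGaps)) return true;
  if (!opts.fuzzySkipOne) return false;
  // Retry with each single query letter removed in turn.
  for (let i = 0; i < q.length; i++) {
    const shorter = q.slice(0, i) + q.slice(i + 1);
    if (shorter.length > 0 && baseMatch(c, shorter, opts.fuzzyGaps)) return true;
  }
  return false;
}

// Gaps only: "cacneled" is rejected. Gaps plus skip-one: it is accepted.
const gapsOnly = fuzzyMatch("cancelled", "cacneled", { fuzzyGaps: true, fuzzySkipOne: false }); // false
const both = fuzzyMatch("cancelled", "cacneled", { fuzzyGaps: true, fuzzySkipOne: true }); // true
```

With independent flags, a caller who only wants the ctrl+p style behavior would set `fuzzyGaps: true, fuzzySkipOne: false`.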

If you think this is a good idea and a PR would help move things along lmk, ideally with some comment on what the option(s) should be called.

kentcdodds · 2024-10-19T05:33:32Z

sometimes the matches found by the new typo algorithm rank higher than ones that would have appeared previously with skipped letters

I would consider this to be a bug. It's possible this change was ill-advised and I'm open to reverting it. I probably should have tried it out in an actual implementation to see how it feels. Sounds like that's what you did and you're suggesting it feels wrong/awkward. I'm more inclined to revert the change and let @overengineered have support for this via a personal fork instead.

I would rather do that than complicate the API with an option to disable this behavior. Thoughts?

overengineered · 2024-10-21T09:07:49Z

I think there are two kinds of datasets to search in: curated datasets and user-generated content. @dumbmatter seems to be searching in a curated dataset, and I can see how my change could make ordering suboptimal. My usage is searching user-generated content, where mistakes in the dataset sometimes make content unfindable.

I would propose introducing a "pseudo" ranking in the API to enable/disable this behaviour.

  MATCHES: 1,
+ PARTIALLY_MATCHES: 0.875,
  NO_MATCH: 0,

Skipping a letter from the query would be enabled only if the user of the library explicitly passes the PARTIALLY_MATCHES threshold.

EDIT: Upon further thought, it's best to have a real ranking for this, lower than MATCHES. For my use-case mixing partial and "scattered" matches would work better, but sorting partial matches below them is acceptable.
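A minimal, standalone sketch of how an opt-in ranking below MATCHES might behave. It does not call match-sorter. `MATCHES: 1` and `NO_MATCH: 0` follow the snippet above, while the value `0.5` for `PARTIALLY_MATCHES`, the helper names, and the "drop one query letter" rule are assumptions chosen only for illustration.

```typescript
// MATCHES and NO_MATCH mirror the snippet above; the PARTIALLY_MATCHES value
// is a placeholder that only needs to sit between them.
const rankings = { MATCHES: 1, PARTIALLY_MATCHES: 0.5, NO_MATCH: 0 };

// Ordered match with gaps ("scattered" match).
function isSubsequence(candidate: string, query: string): boolean {
  let from = 0;
  for (let i = 0; i < query.length; i++) {
    const found = candidate.indexOf(query.charAt(i), from);
    if (found === -1) return false;
    from = found + 1;
  }
  return true;
}

function rank(candidate: string, query: string): number {
  const c = candidate.toLowerCase();
  const q = query.toLowerCase();
  if (isSubsequence(c, q)) return rankings.MATCHES;
  // Assumed "partial" rule: the query matches once a single letter is dropped.
  for (let i = 0; i < q.length; i++) {
    const shorter = q.slice(0, i) + q.slice(i + 1);
    if (shorter.length > 0 && isSubsequence(c, shorter)) {
      return rankings.PARTIALLY_MATCHES;
    }
  }
  return rankings.NO_MATCH;
}

// Partial matches are returned only when the caller lowers the threshold,
// and they always sort below full matches.
function search(items: string[], query: string, threshold: number = rankings.MATCHES): string[] {
  return items
    .map((item) => ({ item, score: rank(item, query) }))
    .filter((r) => r.score > rankings.NO_MATCH && r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .map((r) => r.item);
}

const data = ["cacnel", "cancelled"]; // the first entry contains a typo
const strict = search(data, "cancel"); // ["cancelled"]
const lenient = search(data, "cancel", rankings.PARTIALLY_MATCHES); // ["cancelled", "cacnel"]
```

The default threshold keeps the pre-6.4.0 style results, so only callers who explicitly pass the lower threshold see entries like the misspelled "cacnel".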

kentcdodds · 2024-10-21T13:17:47Z

Let's do that and see how it goes
