Add an option to disable "support finding data with typos" from version 6.4.0/7.0.0 #154

Open
dumbmatter opened this issue Oct 18, 2024 · 3 comments

Comments

@dumbmatter

#153 is undoubtedly a great feature for many people, but not for me :)

The argument from @overengineered at #153 is:

match sorter currently finds data with some typos, but not others. E.g. "canceled" would find "cancelled", but not "cacneled".

Personally, I don't see the old behavior (allowing skipped letters) as a "typo". I often do that on purpose to search for multiple parts of a string. For example, when searching a list of filenames, I type a few letters for the start of a filename, realize "oh, this is a really common start of a filename, it's returning way too many results", and then start typing a later part to filter more precisely, because I'm used to how ctrl+p works in Sublime/VSCode/etc.
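For readers unfamiliar with the skipped-letter behavior described above, here is a minimal sketch of in-order (subsequence) matching. The helper name `isSubsequence` and the sample filenames are made up for illustration; this is not match-sorter's actual implementation.

```typescript
// Hypothetical helper, not match-sorter's real code.
// Returns true when every character of `query` appears in `candidate`
// in the same order, with any number of other characters in between.
function isSubsequence(query: string, candidate: string): boolean {
  const q = query.toLowerCase();
  const c = candidate.toLowerCase();
  let qi = 0; // index of the next query character still to be found
  for (let ci = 0; ci < c.length && qi < q.length; ci++) {
    if (c[ci] === q[qi]) qi++;
  }
  return qi === q.length;
}

// Typing the start of a filename, then a later part, narrows the list.
const files = ["userController.ts", "userService.ts", "userServiceTest.ts"];
const hits = files.filter((f) => isSubsequence("usertest", f));
// hits is ["userServiceTest.ts"]

// The example from #153: letters in order match, transposed letters do not.
const findsCancelled = isSubsequence("canceled", "cancelled"); // true
const findsTransposed = isSubsequence("cacneled", "cancelled"); // false
```

Under this reading, "canceled" finding "cancelled" is just a query whose letters all appear in order, which is why it can be used deliberately rather than only as typo tolerance.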

Now, sometimes the matches found by the new typo algorithm rank higher than ones that would have appeared previously with skipped letters. But more fundamentally, for my personal use, I never want typos (again, not considering the old behavior as "typos") returned. All they do is confuse my search by adding things I never want to select.

So an option to disable this behavior would be appreciated. #153 (comment) has a suggestion for how to do that, although to be honest I'm not totally sure which approach is intended to be "scattered" and which is supposed to be "partial". Or if you're able to enable/disable only the old approach or only the new approach. So I'd probably just make them separate booleans, maybe fuzzyGaps and fuzzySkipOne, idk, naming stuff is hard.

If you think this is a good idea and a PR would help move things along lmk, ideally with some comment on what the option(s) should be called.

@kentcdodds
Owner

sometimes the matches found by the new typo algorithm rank higher than ones that would have appeared previously with skipped letters

I would consider this to be a bug. It's possible this change was ill-advised and I'm open to reverting it. I probably should have tried it out in an actual implementation to see how it feels. Sounds like that's what you did and you're suggesting it feels wrong/awkward. I'm more inclined to revert the change and let @overengineered have support for this via a personal fork instead.

I would rather do that than complicate the API with an option to disable this behavior. Thoughts?

@overengineered
Contributor

overengineered commented Oct 21, 2024

I think there are two kinds of datasets to search in: curated datasets and user-generated content. @dumbmatter seems to be searching in a curated dataset, and I can see how my change could make the ordering suboptimal. My use case is searching user-generated content, where mistakes in the dataset sometimes make some content unfindable.

I would propose to introduce "pseudo" ranking in the API to enable/disable this behaviour.

  MATCHES: 1,
+ PARTIALLY_MATCHES: 0.875,
  NO_MATCH: 0,

Skipping a letter from the query would be enabled only if the user of the library explicitly passes the PARTIALLY_MATCHES threshold.

EDIT: Upon further thought, it's best to have a real ranking for this, lower than MATCHES. For my use-case mixing partial and "scattered" matches would work better, but sorting partial matches below them is acceptable.
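A rough sketch of how such a threshold-gated ranking could behave, assuming "partial" means the query matches once a single query letter is dropped. The function names and the gating logic are assumptions for illustration only; the rank values are the ones from the proposal above, and none of this is match-sorter's real implementation.

```typescript
// Illustrative only: values from the proposal, names and logic assumed.
const RANK = {
  MATCHES: 1, // all query letters found in order, gaps allowed
  PARTIALLY_MATCHES: 0.875, // matches only after dropping one query letter
  NO_MATCH: 0,
} as const;

// True when the letters of `query` appear in `candidate` in order.
function inOrder(query: string, candidate: string): boolean {
  let qi = 0;
  for (let ci = 0; ci < candidate.length && qi < query.length; ci++) {
    if (candidate[ci] === query[qi]) qi++;
  }
  return qi === query.length;
}

// True when removing exactly one letter from `query` makes it match in order.
function matchesWithOneLetterSkipped(query: string, candidate: string): boolean {
  if (query.length < 2) return false; // avoid matching everything with an empty query
  for (let i = 0; i < query.length; i++) {
    const shortened = query.slice(0, i) + query.slice(i + 1);
    if (inOrder(shortened, candidate)) return true;
  }
  return false;
}

// Partial matches are considered only when the caller lowers the threshold.
function rank(query: string, candidate: string, threshold: number = RANK.MATCHES): number {
  if (inOrder(query, candidate)) return RANK.MATCHES;
  if (threshold <= RANK.PARTIALLY_MATCHES && matchesWithOneLetterSkipped(query, candidate)) {
    return RANK.PARTIALLY_MATCHES;
  }
  return RANK.NO_MATCH;
}

const exact = rank("canceled", "cancelled"); // 1
const offByDefault = rank("cacneled", "cancelled"); // 0
const optedIn = rank("cacneled", "cancelled", RANK.PARTIALLY_MATCHES); // 0.875
```

Because PARTIALLY_MATCHES is lower than MATCHES, sorting by rank in descending order keeps partial matches below the "scattered" ones, and callers who never pass the lower threshold see no partial matches at all.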

@kentcdodds
Owner

Let's do that and see how it goes
