Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prototype-beta: Tokenizer customization #1563

Conversation

atoulmet
Copy link
Contributor

@atoulmet atoulmet commented Aug 9, 2023

Pull Request

Related issue

Fixes #1555
Fixes #1556

What does this PR do?

Add dictionary settings:

  • Modify the indexes file accordingly
  • Create tests
  • Add Dictionary type

Add separator tokens and non separator tokens settings:

  • Modify the indexes file accordingly
  • Create tests
  • Add Dictionary type

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

@atoulmet atoulmet added enhancement New feature or request beta Changes not meant to be merged on main labels Aug 9, 2023
@atoulmet atoulmet marked this pull request as draft August 9, 2023 13:01
@atoulmet atoulmet force-pushed the prototype-beta/prototype-tokenizer-customization branch from 8a81d8c to 774ef7b Compare August 9, 2023 14:03
@atoulmet atoulmet marked this pull request as ready for review August 9, 2023 14:03
@@ -1,6 +1,6 @@
{
"name": "meilisearch",
"version": "0.34.1",
"version": "0.34.2-tokenizer-customization.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bidoubiwa do we really need to release a meilisearch-js beta? Should we test it with instant-meilisearch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instant-meilisearch is not impacted by this new change!

I released it as it is part of the prototyping process, so that people can try it out. If you think it's not necessary, it is less work for us to not release it hehe

Copy link
Member

@curquiza curquiza Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the process is too complex for what we want to do
We just want to test the prototype is working with the library. Can we have a risk of not releasing it? I mean, can we have a difference of usage between using it from source, or by using the NPM package?

From my POV, releasing a NPM package is useful only if you want to use it with another library, like IS? Am I right?
For example, we will not release a beta for the PHP integration.
I also can understand sometimes it's useful for the users to test the beta version of the library however, especially for front end oriented features!

So I would say: if you agree, let's show the full process to @atoulmet (so release the beta version) even if it's not necessary, but in the future, we would rather go to simplicity and do only what is useful for us: ensuring the library can be compatible with the prototype.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, can we have a difference of usage between using it from source, or by using the NPM package?

No reasons there should be any difference.

From my POV, releasing a NPM package is useful only if you want to use it with another library, like IS? Am I right?

This would be the biggest reason. Another reason we had in the past is if we wanted users to be able to try the feature in their javascript environment without having to use a HTTP lib.

f you agree, let's show the full process to @atoulmet (so release the beta version) even if it's not necessary, but in the future, we would rather go to simplicity and do only what is useful for us: ensuring the library can be compatible with the prototype.

i completely agree!

@atoulmet atoulmet force-pushed the prototype-beta/prototype-tokenizer-customization branch from 774ef7b to 1441d4a Compare August 9, 2023 14:44
@atoulmet atoulmet force-pushed the prototype-beta/prototype-tokenizer-customization branch from 1441d4a to 223b252 Compare August 9, 2023 15:16
Copy link
Contributor

@bidoubiwa bidoubiwa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✨✨✨✨✨✨ GOOD JOB ✨✨✨✨✨✨✨

@brunoocasali brunoocasali changed the base branch from main to bump-meilisearch-v1.4.0 September 5, 2023 20:49
@brunoocasali brunoocasali merged commit 7546b44 into bump-meilisearch-v1.4.0 Sep 5, 2023
7 of 10 checks passed
@brunoocasali brunoocasali deleted the prototype-beta/prototype-tokenizer-customization branch September 5, 2023 20:49
meili-bors bot added a commit that referenced this pull request Sep 25, 2023
1578: Update version for the next release (v0.35.0) r=curquiza a=meili-bot

## CHANGELOG 

This version introduces features released on Meilisearch v1.4.0 🎉
Check out the changelog of [Meilisearch v1.4.0](https://github.com/meilisearch/meilisearch/releases/tag/v1.4.0) for more information on the changes. 

⚠️ If you want to adopt new features of this release, **update the Meilisearch server** to the according version.

## 🚀 Enhancements

- Add support for the new setting: `dictionary` (#1563) `@atoulmet` 
```js
client.index('books').getDictionary()
client.index('books').updateDictionary(['W.E.B'])
client.index('books').resetDictionary()
```
- Add support for the new setting: `separator-tokens` (#1563) `@atoulmet` 
```js
client.index('books').getSeparatorTokens()
client.index('books').updateSeparatorTokens(['`@'])`
client.index('books').resetSeparatorTokens()
```
- Add support for the new setting: `non-separator-tokens` (#1563) `@atoulmet` 
```js
client.index('books').getNonSeparatorTokens()
client.index('books').updateNonSeparatorTokens(['.', ','])
client.index('books').resetNonSeparatorTokens()
```

## ⚠️ Warning usage with v1.4.0

A bug fix in Meilisearch v1.4.0 introduces a breaking change in the filter usage. It only concerns users using the `filter` search parameter with `\`.
Explanation and change to apply are detailed in the [Meilisearch v1.4.0](https://github.com/meilisearch/meilisearch/releases/tag/v1.4.0)

Thanks to `@atoulmet` and `@bidoubiwa!` 🎉  



Co-authored-by: meili-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta Changes not meant to be merged on main enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prototype: Separators settings api Prototype: User dictionary settings API
4 participants