Searching for references should take sentence casing into account #531

Closed
4 tasks done
jzohrab opened this issue Dec 7, 2024 · 2 comments
Labels: bug (Something isn't working); fixed (Fixed in develop or master, to be launched.)

Comments

Collaborator

jzohrab commented Dec 7, 2024

Currently, searching for a term's references only finds sentences that match the term's case. For example, if I search for DOG, I don't think it will find references containing dog. At the very least, it won't work for languages with unusual casing rules, such as Turkish.
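
To illustrate why Turkish is the awkward case: Turkish has dotted and dotless forms of "i" (İ/i and I/ı), and a locale-agnostic lowercase maps them the wrong way for Turkish text. The following is a minimal sketch only; `turkish_lower` is a hypothetical helper written for this example, not Lute's actual code.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish rules for the dotted/dotless i.

    Hypothetical helper for illustration: map the two special capitals
    first, then let str.lower() handle every other character.
    """
    return text.replace("İ", "i").replace("I", "ı").lower()


# Python's default lowercasing is not locale-aware:
default_isik = "IŞIK".lower()          # "işik", but Turkish expects "ışık"
default_dotted = "İ".lower()           # "i" followed by U+0307 (two code points)

turkish_isik = turkish_lower("IŞIK")   # "ışık"
turkish_city = turkish_lower("İSTANBUL")  # "istanbul"
```

A case-sensitive search for "ışık" would therefore miss a sentence containing "IŞIK" unless both sides are lowercased with the same language-aware rule.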

To do:

  • add a test for finding references that illustrates the problem, e.g. a Turkish test
  • add a lowercase sentence field to the sentences table
  • add some kind of data cleanup job to fix the case for existing sentences. This might need to be done piecemeal, depending on the size of the database, e.g. fixing 10K sentences at a time. Optionally, it could run as a check at startup and write some kind of progress to the console. Need to check performance.
  • add sentence downcasing to whatever job loads the table
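
The to-do items above could look roughly like the sketch below. Everything named here is an assumption for illustration: the `sentences` table layout, the `text_lc` column, and the `backfill_lowercase` and `find_references` functions are hypothetical, not Lute's real schema or code. The lowercasing is done in Python rather than in SQL because SQLite's built-in `lower()` only handles ASCII by default.

```python
import sqlite3


def backfill_lowercase(conn, batch_size=10_000, lower=str.lower, progress=print):
    """Fill the (hypothetical) text_lc column in batches; return rows updated."""
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, text FROM sentences WHERE text_lc IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE sentences SET text_lc = ? WHERE id = ?",
            [(lower(text), sid) for sid, text in rows],
        )
        conn.commit()  # commit per batch so a restart resumes where it left off
        total += len(rows)
        progress(total)  # report the running count, e.g. to the console


def find_references(conn, term, lower=str.lower):
    """Return original sentences whose lowercase form contains the term."""
    cur = conn.execute(
        "SELECT text FROM sentences WHERE instr(text_lc, ?) > 0 ORDER BY id",
        (lower(term),),
    )
    return [row[0] for row in cur]


# Small in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sentences (id INTEGER PRIMARY KEY, text TEXT NOT NULL, text_lc TEXT)"
)
conn.executemany(
    "INSERT INTO sentences (text) VALUES (?)",
    [("The DOG barked.",), ("A dog slept.",), ("No cats here.",)],
)
progress_log = []
updated = backfill_lowercase(conn, batch_size=2, progress=progress_log.append)
matches = find_references(conn, "Dog")
```

The `lower` parameter is there so a language-specific function (for example, one with Turkish rules) can be passed in for both the backfill and the search; the two must use the same rule for matches to line up.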

jzohrab added the bug label Dec 7, 2024
jzohrab added this to Lute-v3 Dec 7, 2024
jzohrab moved this to Todo in Lute-v3 Dec 13, 2024
jzohrab moved this from Todo to In Progress in Lute-v3 Dec 14, 2024
jzohrab self-assigned this Dec 14, 2024
jzohrab added the fixed label Dec 14, 2024

Collaborator Author

jzohrab commented Dec 14, 2024

Phew, merged to develop. Seems fine in my dev env, tested with mecab load and unload as well.

Note for the launch docs: the data cleanup may take a while to run on your machine, depending on how much data you have and the kinds of languages you're studying.

jzohrab moved this from In Progress to Done in Lute-v3 Dec 14, 2024

Collaborator Author

jzohrab commented Dec 21, 2024

Launched in 3.7.0

jzohrab closed this as completed Dec 21, 2024