Jdom/plugin/thai #510

Merged
merged 6 commits into from
Nov 2, 2024

Conversation

jaydom28
Contributor

@jaydom28 jaydom28 commented Nov 2, 2024

No description provided.

is_word_char = re.match(pattern, word) is not None
is_end_of_sentence = word in language.regexp_split_sentences
if is_end_of_sentence:
    is_word_char = False
Contributor Author

Just wanted to leave a comment about this logic here.

In Thai the "period" punctuation is and in my tests, the parser was showing it as a word. I figured it didn't make sense for punctuation to be words, so that is why I have is_word_char set to False if the text is a sentence delimiter.
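
For anyone reading along, here is a minimal standalone sketch of the override described above. It is not the plugin's actual code: the `Language` dataclass, the `word_chars` field, the `is_word_token` helper, and the delimiter values are all hypothetical stand-ins. In particular, `ฯ` is only a placeholder for a delimiter that happens to sit inside the Thai Unicode block, which is the situation where the word pattern alone would misclassify it.

```python
import re
from dataclasses import dataclass


@dataclass
class Language:
    # Hypothetical stand-in for the project's language settings object.
    word_chars: str              # body of a regex character class for word characters
    regexp_split_sentences: str  # characters treated as sentence delimiters


def is_word_token(token: str, language: Language) -> bool:
    """Return True if the token should be treated as a word."""
    pattern = f"^[{language.word_chars}]+$"
    is_word_char = re.match(pattern, token) is not None
    # A delimiter can fall inside the word-character range, so check it
    # explicitly and override the pattern result.
    is_end_of_sentence = token in language.regexp_split_sentences
    if is_end_of_sentence:
        is_word_char = False
    return is_word_char


# Assumed values for illustration only: the Thai Unicode block as word
# characters, and a placeholder delimiter set.
thai = Language(word_chars="\u0e00-\u0e7f", regexp_split_sentences="ฯ!?")

print(is_word_token("ไทย", thai))  # ordinary Thai word
print(is_word_token("ฯ", thai))    # matches the word pattern, but is a delimiter
print(is_word_token("!", thai))    # never matched the word pattern
```

Note that `token in language.regexp_split_sentences` is a substring test, so this only behaves as intended when the setting is a plain string of delimiter characters and each token is a single segment.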

Collaborator

Right, sometimes there may be funny cases like this where you have to hardcode it.

@jzohrab jzohrab self-assigned this Nov 2, 2024
@jzohrab jzohrab added the enhancement New feature or request label Nov 2, 2024
Collaborator

jzohrab commented Nov 2, 2024

Looks great, thanks!

@jzohrab jzohrab merged commit 0d06197 into LuteOrg:develop Nov 2, 2024
Collaborator

jzohrab commented Nov 3, 2024

Released to pypi woot.

Labels
enhancement New feature or request

2 participants