-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to get character indicies #19
Conversation
Hi, the PR looks great however, in this case, I would opt for 2 more return keys "clusters_char" and "cluster_heads_chars" or even better, replace the old keys to deprecate this behaviour and go for a more tokenization-agnostic implementation. |
Thanks for taking a look. Returning the chars in separate return keys makes sense to me. Wouldn't your second suggestion just be my PR but with |
Yes it will be a breaking change but also allow for a more uniform integration with non spacy tokenization. I prefer to have one good working default option that can be used for both instead of having everything separate and endlessly configurable. On 16 Mar 2023, at 10:03, Mathijs B ***@***.***> wrote:
Thanks for taking a look. Returning the chars separately makes sense to me. Wouldn't your second suggestion just be my PR but with char_indices forced to be on? I imagine that's quite the breaking change. Maybe I misunderstand?
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were assigned.Message ID: ***@***.***>
|
Sounds fine to me. Shall I then just remove the option and make it so it always returns char indices? I can clean the code up a bit doing that. I'll also update the readme to reflect this change. |
@davidberenstein1957 I've implemented your suggestions, let me know if this is what you had in mind. If so, I'll update the readme. |
@davidberenstein1957 any chance you can take a look at this pr? |
Yes, I will take a look on Wednesday.On 3 Apr 2023, at 08:41, Mathijs B ***@***.***> wrote:
@davidberenstein1957 any chance you can take a look at this pr?
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LFTM
char_indices
parameter, default=falseImplements #17