Selecting beginning and end of token instead of labeling each token #3

Open
behdaad opened this issue Nov 30, 2018 · 2 comments

behdaad commented Nov 30, 2018

Since all tokens are in the form of consecutive words, it would be much faster to select a range of words and then pick the label once. The labels of all the words in the range could be inferred this way. Take this example:
پیست اسکی نسار بیجار استثنایی‌ترین ... (roughly: "Nesar ski slope of Bijar, the most exceptional ...")
[Screenshot: screen shot 1397-09-09 at 12 01 09]
You would only need to select پیست as the starting word and بیجار as the ending word, choose the label مکان ("location"), and it would be done. No need to label each token separately.
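To make the inference concrete, here is a minimal sketch of how one label per word could be derived from a single selection. It assumes a BIO-style scheme (`B-` for the first word, `I-` for the rest, `O` for words outside the range) and whitespace-split words; the function name `labels_from_span` and the tag format are hypothetical and are not taken from this project's API.

```python
from typing import List


def labels_from_span(tokens: List[str], start: int, end: int, label: str) -> List[str]:
    """Return one BIO tag per token for a single inclusive range [start, end].

    Hypothetical helper: the BIO scheme is an assumption, not this project's format.
    """
    if not (0 <= start <= end < len(tokens)):
        raise ValueError("span must satisfy 0 <= start <= end < len(tokens)")
    tags = ["O"] * len(tokens)          # everything outside the selection
    tags[start] = "B-" + label          # first selected word begins the span
    for i in range(start + 1, end + 1):
        tags[i] = "I-" + label          # remaining selected words continue it
    return tags


# The example from this issue: select the first word and the fourth word,
# then apply the label once.
tokens = ["پیست", "اسکی", "نسار", "بیجار", "استثنایی‌ترین"]
tags = labels_from_span(tokens, 0, 3, "مکان")
for token, tag in zip(tokens, tags):
    print(token, tag)
```

With this approach the annotator makes three choices (start, end, label) regardless of how many words the range contains, and the beginning marker falls out of the selection instead of being entered separately.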

However, I'm not sure if it's wise to make this the only way to label tokens. I'm not sure if there are examples of this method not working, but I'm almost certain you can find weird examples that cannot be labeled using this method.

This method may or may not be exposed in the API, but I believe it would make labeling by hand in the web interface much easier and faster. (Honestly, since submitting labels reloads the page, labeling tokens is tiresome. Combining this feature with #2 would make manual labeling much faster.)

Hameds (Member) commented Nov 30, 2018

Thank you. It's a good suggestion and we will add it to the development backlog, but as you mentioned, it shouldn't be the only way to label tokens. We are currently working on an improved version of the user panel that may resolve some UX issues such as this one.

Hameds added the "Type: enhancement" (New feature or request) label Nov 30, 2018
Hameds added this to the Backlog milestone Dec 3, 2018

shayan72 commented Mar 25, 2019

https://github.com/chakki-works/doccano is an open source annotation tool that takes an approach similar to what has been suggested here and in #42. I don't see why we need to specify the beginning of a token separately.
