Position index not corresponding to index in text #13

Open
3 tasks
alexhebing opened this issue Jul 2, 2019 · 0 comments
Labels
bug Something isn't working

Comments

alexhebing commented Jul 2, 2019

When an IntegratedNamedEntity instance is asked to return the text for the entity (i.e. via get_text()), it does 'something smart' to retrieve the text of choice (taking user-configurable settings into account where needed). However, it does not update the position of the text when that is needed. multiNER deals correctly with entities that overlap (e.g. 'John Doe' and 'John', both at index 14) by considering them the same entity. This is also true for an example like the following, where the suggested entities do not start at the same position but overlap nonetheless:

'''
[{ 'text': 'La Cassa Rurale di Trento', 'pos': 22, 'type': 'LOC' }, { 'text': 'Trento', 'pos': 38, 'type': 'LOC' }]
'''
However, if get_text() is called in a case like the above and the 'something smart' does its work, the position might be completely off: multiNER might return something like { 'text': ' Trento', 'pos': 22, 'type': 'LOC' } (note the incorrect index).

  • Add unit test to prove the above
  • Fix the bug
  • Modify unit test to prove that multiNER can deal with cases like this
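
As a starting point for the unit test, here is a minimal sketch of the position correction. It assumes the merged entity knows the start and end of the full overlapping span in the source text. The helper `position_of_chosen_text` and the sample sentence are hypothetical and not part of multiNER's actual API.

```python
def position_of_chosen_text(document, span_start, span_end, chosen_text):
    """Return the index in `document` at which `chosen_text` starts,
    searching only inside the merged entity span [span_start, span_end).

    Hypothetical helper, not multiNER code. If the chosen text occurs
    more than once inside the span, the first occurrence wins.
    """
    index = document.find(chosen_text, span_start, span_end)
    if index == -1:
        raise ValueError("chosen text does not occur inside the entity span")
    return index


# Illustrative sentence (an assumption, not taken from real input).
document = "He opened an account at La Cassa Rurale di Trento last year."

# The longest suggestion defines the merged span.
longest = "La Cassa Rurale di Trento"
span_start = document.find(longest)
span_end = span_start + len(longest)

# Suppose the 'something smart' picks the shorter text.
chosen = "Trento"
pos = position_of_chosen_text(document, span_start, span_end, chosen)

# The bug described above corresponds to returning span_start here.
entity = {"text": chosen, "pos": pos, "type": "LOC"}
```

A test for the bug could then assert that the text found in the document at the returned position equals the returned text, which fails whenever the start of the longer span is reused for the shorter text.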