To Reproduce
from flair.models import SequenceTagger
from flair.data import Sentence

tagger = SequenceTagger.load("flair/ner-english-large")
text = "Hello my name is \u200c Chris Kamphuis, and I live in \u200c the Netherlands."
sentence = Sentence(text)
tagger.predict(sentence)
for span in sentence.get_spans():
    print(text[span.start_position:span.end_position])
chriskamphuis changed the title to "[Bug]: Problematic character removal does not correct for position in source text." on Nov 21, 2024
Describe the bug
Some problematic characters are removed before creating a Sentence object:
flair/flair/data.py
Line 1108 in 7eb8533
When this happens, character positions no longer line up with the source text. Perhaps these characters should only be removed after the tokens are created, so that positions in the source text are preserved.
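To illustrate the kind of drift described here, and one possible way around it, below is a minimal sketch in plain Python. It does not use flair and is not meant to reproduce flair's actual cleaning logic. The helpers `strip_with_offsets` and `to_original_span` are hypothetical names, and the set of removed characters is assumed to contain only `\u200c`. The idea is to record, for every character that survives cleaning, its index in the original string, so that spans found in the cleaned text can be mapped back.

```python
# Hypothetical helpers, not part of flair: strip characters while remembering
# where each kept character came from in the original string.
PROBLEMATIC = {"\u200c"}  # assumption: only the zero-width non-joiner from this report


def strip_with_offsets(text, problematic=PROBLEMATIC):
    """Return the cleaned text and a list mapping cleaned indices to original indices."""
    kept_chars = []
    original_index = []  # original_index[i] is the position in `text` of cleaned[i]
    for i, ch in enumerate(text):
        if ch not in problematic:
            kept_chars.append(ch)
            original_index.append(i)
    return "".join(kept_chars), original_index


def to_original_span(start, end, original_index):
    """Map a non-empty half-open [start, end) span in the cleaned text to the original."""
    return original_index[start], original_index[end - 1] + 1


text = "Hello my name is \u200c Chris Kamphuis, and I live in \u200c the Netherlands."
cleaned, original_index = strip_with_offsets(text)

# Locate a span in the cleaned text, as a tagger working on cleaned text would.
start = cleaned.index("Chris Kamphuis")
end = start + len("Chris Kamphuis")

# Using cleaned-text offsets directly on the original text is off by one here.
naive = text[start:end]

# Mapping the offsets back first recovers the intended substring.
orig_start, orig_end = to_original_span(start, end, original_index)
mapped = text[orig_start:orig_end]

print(repr(naive))
print(repr(mapped))
```

Keeping an explicit index map costs one integer per character, but it stays correct no matter how many characters are removed or where they sit, which a single fixed offset correction would not.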
To Reproduce
produces:
Expected behavior
It should produce: