Inconsistent exception is raised when a Series containing NaNs is passed to nlpretext.basic.preprocess.remove_stopwords
#205
Labels
bug
🐛 Bug Report
When the remove_stopwords function is applied to a text column that has empty values, nlpretext raises inconsistent exceptions (about the language choice).
🔬 How To Reproduce
Steps to reproduce the behavior:
Load the data, convert it to a DataFrame, and concatenate the two text columns without a space between them. Some rows will be empty.
Call remove_stopwords on the resulting column.
Code sample
Environment
Screenshots
First exception:
Then, when replacing 'fr' with 'fr_spacy':
📈 Expected behavior
Remove the stopwords without errors (convert NaNs to strings?), or raise an exception saying "your text column contains NaNs, please fix it".
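To make the two options above concrete, here is a minimal sketch of a guard that could run before stopword removal. It is not part of nlpretext: `is_missing` and `prepare_texts` are hypothetical names used only for illustration, and the sketch only iterates over its input, so a plain list works and a pandas Series should as well.

```python
import math


def is_missing(value):
    """True for None or a float NaN (hypothetical helper, not an nlpretext function)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def prepare_texts(texts, fill_value=None):
    """Hypothetical guard to run before stopword removal.

    With fill_value=None, raise a clear error if any entry is missing.
    Otherwise, replace each missing entry with fill_value.
    """
    cleaned = []
    missing = []
    for position, value in enumerate(texts):
        if is_missing(value):
            missing.append(position)
            cleaned.append(fill_value)
        else:
            cleaned.append(value)
    if missing and fill_value is None:
        raise ValueError(
            f"text column contains {len(missing)} missing value(s) (NaN/None), "
            f"first positions: {missing[:5]}; fill or drop them before removing stopwords"
        )
    return cleaned
```

Calling `prepare_texts(column, fill_value="")` would correspond to the "convert NaNs to strings" option, while calling it without `fill_value` gives the explicit error message instead of an unrelated exception about the language.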
📎 Additional context
Workaround:
data["text"] = data["tagline"] + " " + data["overview"]
solves it, as all rows will be non-empty strings.