Different results/scores for readability #176

Closed Answered by HLasse
KimSteyaert asked this question in Q&A

Hi @KimSteyaert, thanks for using the package and for opening this discussion.

The most likely reason for these deviations is that the online tools you link to use different tokenization and hyphenation models than TextDescriptives. We use spaCy's tokenizer and the pyphen module for hyphenation, whereas the online calculators do not share their code, so we cannot check exactly how they count words and syllables. Differences in hyphenation change the number of "complex words" in the Gunning Fog Index and the total number of syllables in the Flesch Reading Ease formula, which in turn changes the resulting scores.
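
To make this concrete, here is a minimal sketch of how the syllable counter alone can move both scores. It is not how TextDescriptives computes readability: the two counters below (`count_vowel_groups` and `count_vowel_groups_silent_e`) are hypothetical stand-ins for two different hyphenation models, neither of them is pyphen, and the regex-based word and sentence splitting is a simplification of spaCy's tokenizer. Only the two formulas are the standard Flesch Reading Ease and Gunning Fog definitions.

```python
import re


def count_vowel_groups(word):
    """Hypothetical counter A: one syllable per run of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def count_vowel_groups_silent_e(word):
    """Hypothetical counter B: like A, but drops a trailing silent 'e'."""
    w = word.lower()
    # Keep the final 'e' in '-le' endings such as "people" or "reasonable".
    if w.endswith("e") and not w.endswith("le") and len(w) > 2:
        w = w[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", w)))


def readability(text, count_syllables):
    """Compute both scores with a pluggable syllable counter."""
    # Simplified splitting (an assumption, not spaCy's tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = [count_syllables(w) for w in words]

    n_sentences, n_words = len(sentences), len(words)
    words_per_sentence = n_words / n_sentences

    # Flesch Reading Ease depends on the total syllable count.
    flesch = 206.835 - 1.015 * words_per_sentence - 84.6 * (sum(syllables) / n_words)

    # Gunning Fog depends on how many words have three or more syllables.
    complex_words = sum(1 for s in syllables if s >= 3)
    fog = 0.4 * (words_per_sentence + 100 * (complex_words / n_words))

    return {"flesch_reading_ease": flesch, "gunning_fog": fog}


text = (
    "The committee made a reasonable estimate. "
    "Some people disagree with the outcome."
)

scores_a = readability(text, count_vowel_groups)
scores_b = readability(text, count_vowel_groups_silent_e)

print("counter A:", scores_a)
print("counter B:", scores_b)
```

The text and both formulas are identical in the two runs. The scores differ only because the counters disagree on words like "made", "estimate" and "outcome", which is the same kind of disagreement you would expect between pyphen and whatever an online calculator uses.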

Small deviations like these are expected across implementations of readability metrics simply due to …

Answer selected by KimSteyaert