Different results/scores for readability #176
-
I had an issue with the calculation of certain readability scores. When I calculate these measures for different texts, I keep getting scores that differ remarkably from the scores calculated on https://charactercalculator.com/flesch-reading-ease/ and http://gunning-fog-index.com/. For the LOTR text, these websites give the following scores: Flesch reading ease 107.88 and Gunning fog 3.943. The other sample text, however, has the following scores on the aforementioned websites: Flesch 41.78 and Gunning fog 12.25. Is it possible that different ways of calculating these scores have been used?

```python
import spacy
import textdescriptives as td  # registers the TextDescriptives pipeline components

nlp = spacy.load("en_core_web_lg")
# Assumes TextDescriptives >= 2.0; earlier versions use nlp.add_pipe("textdescriptives")
nlp.add_pipe("textdescriptives/readability")

doc1 = nlp(
    "The world is changed. I feel it in the water. I feel it in the earth. "
    "I smell it in the air. Much that once was is lost, for none now live "
    "who remember it."
)
doc2 = nlp(
    "English texts for beginners to practice reading and comprehension online "
    "and for free. Practicing your comprehension of written English will both "
    "improve your vocabulary and understanding of grammar and word order. The "
    "texts below are designed to help you develop while giving you an instant "
    "evaluation of your progress."
)

# All attributes are stored as a dict in the ._.readability attribute
print(doc1._.readability)
```

Output for `doc1` (the LOTR text):

```
{'flesch_reading_ease': 107.87857142857146, 'flesch_kincaid_grade': -0.048571428571428044, 'smog': 5.683917801722854, 'gunning_fog': 3.942857142857143, 'automated_readability_index': -2.4542857142857173, 'coleman_liau_index': -0.7085714285714317, 'lix': 12.714285714285715, 'rix': 0.4}
```
-
Hi @KimSteyaert, thanks for using the package and for opening this discussion.

The reason you're getting these deviations is most likely that the online tools you link to use different tokenization and hyphenation models than TextDescriptives. We use spaCy's tokenizer and the pyphen module for hyphenation, whereas the online calculators do not share their code. Differences in hyphenation lead to differences in the number of "complex words" in the Gunning Fog Index and in the total number of syllables in the Flesch Reading Ease formula, which in turn leads to differences in the output.

Small deviations like these are expected across implementations of readability metrics, simply because there are many different ways to tokenize and work with text. The deviations are usually fairly small (e.g. a Gunning fog of 12.25 vs. 13.06 is negligible for most purposes). My suggestion is to stick to one calculator (either TextDescriptives or a website) to ensure that your results are comparable and calculated in the same manner.

The code for calculating the readability metrics in TextDescriptives can be found here if you want to double-check the implementation.
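To make the hyphenation dependency concrete, here is a minimal sketch of the textbook Flesch Reading Ease and Gunning Fog formulas with pyphen-based syllable counting. This is not TextDescriptives' actual implementation: the whitespace tokenizer and the "3+ syllables = complex word" rule are simplifying assumptions, and the `en_US` hyphenation dictionary is just one possible choice. Swapping any of these pieces changes the syllable and complex-word counts, and therefore the scores.

```python
import pyphen

dic = pyphen.Pyphen(lang="en_US")  # a different dictionary gives different syllable counts

def count_syllables(word: str) -> int:
    # pyphen inserts hyphens at hyphenation points; syllables = hyphen points + 1
    return dic.inserted(word).count("-") + 1

def readability(text: str, n_sentences: int) -> dict:
    # Naive whitespace tokenization -- spaCy and the online calculators
    # each tokenize differently, which is one source of the deviations.
    words = [w.strip(".,!?;:\"'") for w in text.split()]
    words = [w for w in words if w]
    n_words = len(words)
    n_syllables = sum(count_syllables(w) for w in words)
    # Approximation: Gunning Fog's "complex word" also excludes proper nouns,
    # familiar jargon and common suffixes, which implementations handle differently.
    n_complex = sum(1 for w in words if count_syllables(w) >= 3)

    flesch = 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)
    fog = 0.4 * (n_words / n_sentences + 100 * n_complex / n_words)
    return {"flesch_reading_ease": flesch, "gunning_fog": fog}

lotr = ("The world is changed. I feel it in the water. I feel it in the earth. "
        "I smell it in the air. Much that once was is lost, for none now live "
        "who remember it.")
print(readability(lotr, n_sentences=5))
```

Because the word, sentence, syllable, and complex-word counts all feed directly into the formulas, even a single word that is hyphenated (or tokenized) differently is enough to shift the final score, which is exactly the kind of deviation you're seeing between tools.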