This repository contains the data file and evaluation script for the paper "Exploiting Transformer based Multitask Learning for the Detection of Media Bias in News Articles".
All multitask models can be found at: https://drive.google.com/drive/folders/1s5netzimpnld-TMo3aZzyggBViERF3yT?usp=sharing
- "bias_dataset.xlsx": 1700 sentences extracted from news articles and labeled by 8 expert annotators. The labels ("Biased" or "Non-biased") are provided at the sentence level.
Columns:
- "text": sentences extracted from news articles and labeled in terms of bias and opinion.
- "news_link": URL of the news article from which the sentence was extracted.
- "outlet": news platform publishing the news article.
- "topic": news topic.
- "type": political orientation of news platform according to mediacloud.org.
- "label_bias": bias label for the sentence ("Biased" or "Non-biased").
- "label_opinion": opinion label for the sentence ("Expresses writer's opinion", "Somewhat factual but also opinionated", or "Entirely factual").
- "biased_words": words marked as biased by the annotators.
- "MTL_evaluation.ipynb": notebook that evaluates the MTL models on the bias dataset.
- "Appendix.pdf": appendix containing detailed information about the data sets we use for MTL training, experimental setup, and model parameters.
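
For a quick look at the data, the column layout described above can be checked with a few lines of Python. This is a minimal sketch and not part of the repository: it assumes pandas and openpyxl are installed and that "bias_dataset.xlsx" sits in the working directory with the columns on its first sheet. The helper names `load_bias_dataset` and `label_counts` are hypothetical.

```python
from pathlib import Path

import pandas as pd

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "text", "news_link", "outlet", "topic",
    "type", "label_bias", "label_opinion", "biased_words",
]


def load_bias_dataset(path="bias_dataset.xlsx"):
    """Read the spreadsheet and check that the documented columns exist.

    Assumption: the data is on the first sheet; reading .xlsx files
    with pandas requires the openpyxl package.
    """
    df = pd.read_excel(path)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df


def label_counts(df, column="label_bias"):
    """Return how many sentences carry each label in the given column."""
    return df[column].value_counts().to_dict()


# Only runs when the data file is actually present.
if __name__ == "__main__" and Path("bias_dataset.xlsx").exists():
    data = load_bias_dataset()
    print(label_counts(data))                   # "Biased" vs. "Non-biased"
    print(label_counts(data, "label_opinion"))  # opinion labels
```

Calling `label_counts(data, "outlet")` or `label_counts(data, "topic")` in the same way gives the number of sentences per outlet or topic.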