This is the GitHub repository for the research project Emotions in Drama (EmoDrama).
The repository stores data and other material created during the project that is intended for publication or that serves as supplementary material for papers.
Please refer to the reference section for information about publications and scientific contributions from this project.
Some general information about the repository:
- All data larger than 100 MB is either zipped or linked via Google Drive or Seafile.
- English/German-translation information can be found here.
- Some names used in the data differ from the terms used in the papers: tag_type = sub-emotions, base_polarity = (positive, negative, being moved, possibly no annotation), polarity = (positive/negative only).
- Despite great care, errors in the data or folder structure can always occur. If you think you noticed a problem, feel free to reach out: [email protected]
The main components of this repository are described as follows:
The annotation folder contains the final annotations of 18 plays according to our guidelines (Dennerlein, Schmidt & Wolff, 2022c) and has the following structure:
- Emotions: Emotion annotations for 18 plays including implicit source/target annotations.
- Raw_Emotion_Annotations: Unfiltered (raw) emotion annotations by two annotators each.
- Filtered_Emotion_Annotations_withNoAnnotation: Emotion annotations filtered by disagreements and including the no annotation class (non-annotated material). These are the annotations used to train the final classification model.
- Source_Target: Implicit and explicit source/target annotations for 18 plays and for all plays summed up.
- Please refer to Dennerlein, Schmidt and Wolff (2022c), Dennerlein, Schmidt and Wolff (2023b) and Dennerlein, Schmidt and Wolff (2023a) for explanations of annotation data, processes and German/English translations.
- Main annotation attributes are: tag_type (sub-emotions), main_emotion_class, base_polarity (positive, negative, being moved, possibly no annotation).
- Annotators can be identified via the annotator attribute, which holds a two-letter abbreviation (e.g. VH, LS; exception: LSch).
- The annotations were performed under guidance (technical issues: Thomas Schmidt, history of drama: Katrin Dennerlein) by the following students: Carlina Eizenberger (CE), Viola Hippler (VH), Nadine Kastenhofer (NK), Julia Jäger (JJ), Emma Russ (ER), Leon Sautter (LS), Lisa Schattmann (LSch).
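The relation between the raw and the filtered annotations can be illustrated with a small sketch. It shows one plausible reading of "filtered by disagreements": keep a text unit only if both annotators assigned the same tag_type. This is an assumption for illustration, not the project's documented procedure. The `unit_id` field and the label values are hypothetical; `annotator` and `tag_type` are the attribute names described above.

```python
from collections import defaultdict

def agreed_annotations(rows, unit_key="unit_id", label_key="tag_type"):
    """Keep one entry per unit on which at least two annotators agree."""
    by_unit = defaultdict(list)
    for row in rows:
        by_unit[row[unit_key]].append(row)

    agreed = []
    for unit, group in by_unit.items():
        annotators = {r["annotator"] for r in group}
        labels = {r[label_key] for r in group}
        # Agreement: two or more annotators, exactly one distinct label.
        if len(annotators) >= 2 and len(labels) == 1:
            agreed.append({unit_key: unit, label_key: labels.pop()})
    return agreed

# Hypothetical raw annotations; the label values are placeholders.
rows = [
    {"unit_id": 1, "annotator": "VH", "tag_type": "joy"},
    {"unit_id": 1, "annotator": "LS", "tag_type": "joy"},
    {"unit_id": 2, "annotator": "VH", "tag_type": "joy"},
    {"unit_id": 2, "annotator": "LS", "tag_type": "anger"},
]
result = agreed_annotations(rows)
```

Here only unit 1 survives, because the two annotators disagree on unit 2.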
The Classification_Model folder contains a link to the transformer-based large language model used in Dennerlein, Schmidt and Wolff (2023a) and Dennerlein, Schmidt and Wolff (2024), as well as evaluation information. The model is based on gbert-large by deepset and achieves an average F1-score of 72% for sub-emotion classification. It was fine-tuned on the filtered annotations for four epochs with a batch size of 32, a learning rate of 4e-5, and the Adam optimizer on a Tesla P100 GPU.
Please refer to the Hugging Face Transformers documentation for how to use the unzipped model and the other models in this repository. This folder also includes a link to a Google Colab notebook with a basic usage example.
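As a rough orientation, a minimal sketch of loading the unzipped model with the Transformers `pipeline` API might look like the following. It assumes the unzipped directory holds a standard sequence-classification checkpoint including its tokenizer; the path is hypothetical, and the Colab notebook remains the authoritative usage example. `best_label` is an illustrative helper, not part of the repository.

```python
def load_classifier(model_dir):
    """Build a text-classification pipeline from a local model directory."""
    # Imported lazily so the helper below also works without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_dir, tokenizer=model_dir)

def best_label(predictions):
    """Return the label with the highest score from a list of {label, score} dicts."""
    return max(predictions, key=lambda p: p["score"])["label"]

# Usage (requires `pip install transformers torch` and the unzipped model):
# classifier = load_classifier("path/to/unzipped_model")  # hypothetical path
# print(classifier("Ich bin so glücklich!"))
```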
This folder contains the results of the application of the classification model on the sentences of 313 plays. Subsets of these results are used in Dennerlein, Schmidt and Wolff (2023a) and Dennerlein, Schmidt and Wolff (2024). The original plays are from GerDracor, TextGrid or based on our own work.
Some information regarding the classification results:
- To get the exact corpora used in our papers please refer to the Additional_Data_Per_Paper section.
- Main annotation attributes are: pre1_tag_type (sub-emotions), pre1_main_emotion_class (main emotion class), pre1_base_polarity (positive, negative, being moved, possibly no annotation).
- The type attribute distinguishes between character speeches (spoken text) and stage directions (Dennerlein, Schmidt & Wolff, 2024).
- Additional metadata is derived from the data in the Metadata folder.
- Only plays of the GerDracor corpus contain consistent gender information and character ids.
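The attributes above lend themselves to simple aggregations, for example counting predicted polarities separately for character speeches and stage directions. The sketch below assumes the results are available as CSV with `type` and `pre1_base_polarity` columns; the file format and the `type` values `speech` and `stage` are placeholders, so check the actual files before adapting it.

```python
import csv
import io
from collections import Counter

def polarity_by_type(csv_file):
    """Count rows per (type, pre1_base_polarity) pair."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[(row["type"], row["pre1_base_polarity"])] += 1
    return counts

# Hypothetical sample standing in for one result file.
sample = io.StringIO(
    "type,pre1_base_polarity\n"
    "speech,positive\n"
    "speech,negative\n"
    "stage,negative\n"
    "speech,positive\n"
)
counts = polarity_by_type(sample)
```

With a real file, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) instead of the in-memory sample.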
The Metadata folder contains metadata about the plays used in the classification process. Plays can be identified via the file attribute. The metadata was acquired during the corpus preparation process of the project.
Some information regarding the metadata:
- The correct publication year (as used for Dennerlein, Schmidt & Wolff, 2024) is stored in the "Sortierdatum"-attribute (sorting date). Other year attributes offer additional information.
- Genre is stored in "genre classification": S=Schauspiel (drama), T=Tragedy, K=Comedy. Via the subgenre attribute, satirical Saxon type comedies ("Sächsische Typenkomödien") are marked as they are used in Dennerlein, Schmidt and Wolff (2023a).
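To illustrate how these attributes can be combined, the following sketch selects plays of one genre and orders them by their sorting date. It assumes the metadata has been loaded into dictionaries keyed by the attribute names above and that "Sortierdatum" holds a plain year; the file names and years are invented for the example.

```python
# Genre codes as described above.
GENRE_LABELS = {"S": "Schauspiel (drama)", "T": "tragedy", "K": "comedy"}

def plays_by_year(metadata_rows, genre_code):
    """Return the plays of one genre, ordered by their sorting date."""
    selected = [r for r in metadata_rows if r["genre classification"] == genre_code]
    # Assumption: "Sortierdatum" is a year that can be parsed as an integer.
    return sorted(selected, key=lambda r: int(r["Sortierdatum"]))

# Hypothetical metadata rows.
metadata = [
    {"file": "play_a", "genre classification": "K", "Sortierdatum": "1767"},
    {"file": "play_b", "genre classification": "K", "Sortierdatum": "1741"},
    {"file": "play_c", "genre classification": "T", "Sortierdatum": "1755"},
]
comedies = [row["file"] for row in plays_by_year(metadata, "K")]
```

The `file` value can then be used to look up the matching classification results for each play.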
This folder contains additional data separated by papers and publications:
- 2021_vDHd_Using_Deep_Learning_For_Emotion_Analysis: Evaluation results, models and further data for Schmidt, Dennerlein and Wolff (2021c).
- 2021_LDK_Towards_A_Corpus_Historical_German_Plays_With_Emotion_Annotations: All relevant data for Schmidt, Dennerlein and Wolff (2021b) can be found in the Annotations-folder in the main branch.
- 2021_LaTeCHCLfL_Emotion_Classification_In_German_Plays: Evaluation results, models, and further data for Schmidt, Dennerlein and Wolff (2021a).
- 2022_DHd_Emotionen_Im_Kulturellen_Gedaechtnis_bewahren: Additional annotation trend results (tables and visualizations) for Dennerlein, Schmidt and Wolff (2022b).
- 2022_DHd_Evaluation_Computergestuetzter_Verfahren_der_Emotionsklassifikation: Additional data (evaluation results, models etc.) for Schmidt, Dennerlein and Wolff (2022) can be found in the folder 2021_vDHd_Using_Deep_Learning_For_Emotion_Analysis.
- 2022_DH_Emotion_Courses_In_German_Historical_Comedies_And_Tragedies: Additional material including annotation and classification trend results (tables and visualizations), models, distribution data, graphs, and classification results for Dennerlein, Schmidt and Wolff (2022a).
- 2023_DH_Results_Of_Emotion_Annotation: All relevant data for Schmidt, Dennerlein and Wolff (2023) can be found in the Annotations-folder in the main branch.
- 2023_DSH_Computational_Emotion_Classification_For_Genre_Corpora: All additional material for Dennerlein, Schmidt and Wolff (2023a) including distribution analysis, visualizations, and classification results. The model used in this study can be found in the Classification_Model folder of the main branch.
- 2023_ZfdG_EmoDrama_Ein_Korpus_mit_Emotionsinformationen: All relevant data for Dennerlein, Schmidt and Wolff (2023b) can be found in the Annotations-folder in the main branch.
- 2024_CDA_Emotions_in_Stage_Directions: All additional material for Dennerlein, Schmidt and Wolff (2024) including distribution tables, visualizations, classification results, and word frequency data like word clouds. Please note that data regarding stage directions uses the abbreviation "_stages" and character speeches (spoken text) "_cp". The model used in this study can be found in the Classification_Model folder of the main branch.
This folder contains all presentation slides used for meetings of the priority programme Computational Literary Studies, including meetings of the working group on annotation and sentiment analysis.
Brandes, Ph., Dennerlein, K., Jacke, J., Marshall, S., Pielström, St., Schneider, F. (2022). Modelling and Operationalizing Concepts in Computational Literary Studies. In DH2022 Local Organizing Committee (Ed.), Responding to Asian Diversity. Digital Humanities 2022 Conference Abstracts. (pp. 70–73). Alliance of Digital Humanities Organizations (ADHO). https://dh-abstracts.library.virginia.edu/works/11818
Dennerlein, K., Schmidt, T., & Wolff, C. (2022a). Emotion courses in German historical comedies and tragedies. In DH2022 Local Organizing Committee (Ed.), Responding to Asian Diversity. Digital Humanities 2022 Conference Abstracts. (pp. 193–197). Alliance of Digital Humanities Organizations (ADHO). https://dh-abstracts.library.cmu.edu/works/11929
Dennerlein, K., Schmidt, T., & Wolff, C. (2022b). Emotionen im kulturellen Gedächtnis bewahren. In M. Geierhos, P. Trilcke, I. Börner, S. Seifert, A. Busch, & P. Helling (Eds.), DHd 2022 Kulturen des digitalen Gedächtnisses. 8. Tagung des Verbands “Digital Humanities im deutschsprachigen Raum” (DHd 2022) (pp. 93–98). Zenodo. https://doi.org/10.5281/zenodo.6327957
Dennerlein, K., Schmidt, T., & Wolff, C. (2023a). Computational emotion classification for genre corpora of German tragedies and comedies from 17th to early 19th century. Digital Scholarship in the Humanities, 38(4), 1466–1481. https://doi.org/10.1093/llc/fqad046
Dennerlein, K., Schmidt, T., & Wolff, C. (2023b). EmoDrama. Ein Korpus mit Emotionsinformationen in Dramen von 1650-1815. Zeitschrift für digitale Geisteswissenschaften (ZfdG). https://doi.org/10.17175/2023_010
Dennerlein, K., Schmidt, T., & Wolff, C. (2024; In publication). Emotions in Stage Directions in German Drama of the Early Modern Period: Explorations via Computational Emotion Classification. In M. Andresen & N. Reiter (Eds.), Computational Drama Analysis. Reflecting Methods and Interpretation. (pp. 166–194). De Gruyter.
Schmidt, T., Dennerlein, K., & Wolff, C. (2021a). Emotion Classification in German Plays with Transformer-based Language Models Pretrained on Historical and Contemporary Language. In S. Degaetano-Ortlieb, A. Kazantseva, N. Reiter, & S. Szpakowicz (Eds.), Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (pp. 67–79). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.latechclfl-1.8
Schmidt, T., Dennerlein, K., & Wolff, C. (2021b). Towards a Corpus of Historical German Plays with Emotion Annotations. In D. Gromann, G. Sérasset, T. Declerck, J. P. McCrae, J. Gracia, J. Bosque-Gil, F. Bobillo, & B. Heinisch (Eds.), 3rd Conference on Language, Data and Knowledge (LDK 2021) (Vol. 93, p. 9:1-9:11). Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/OASIcs.LDK.2021.9
Schmidt, T., Dennerlein, K., & Wolff, C. (2021c). Using Deep Learning for Emotion Analysis of 18th and 19th Century German Plays. In M. Burghardt, L. Dieckmann, T. Steyer, P. Trilcke, N.-O. Walkowski, J. Weis, & U. Wuttke (Eds.), Fabrikation von Erkenntnis: Experimente in den Digital Humanities. Teilband 1. Melusina Press. https://doi.org/10.26298/melusina.8f8w-y749-udlf
Schmidt, T., Dennerlein, K., & Wolff, C. (2022). Evaluation computergestützter Verfahren der Emotionsklassifikation für deutschsprachige Dramen um 1800. In M. Geierhos, P. Trilcke, I. Börner, S. Seifert, A. Busch, & P. Helling (Eds.), DHd 2022 Kulturen des digitalen Gedächtnisses. 8. Tagung des Verbands “Digital Humanities im deutschsprachigen Raum” (DHd 2022) (pp. 107–113). Zenodo. https://doi.org/10.5281/zenodo.6328169
Schmidt, T., Dennerlein, K., & Wolff, C. (2023). Results of Emotion Annotation in German Drama from 1650-1815. In A. Baillot, T. Tasovac, W. Scholger, & G. Vogeler (Eds.), Digital Humanities 2023. Collaboration as Opportunity (DH2023) (pp. 181–183). Alliance of Digital Humanities Organizations (ADHO). https://doi.org/10.5281/ZENODO.8107952
Dennerlein, K., Schmidt, T., & Wolff, C. (2022c). Figurenemotionen in deutschsprachigen Dramen annotieren. Zenodo. https://doi.org/10.5281/zenodo.6228152
Dennerlein, K. (2021). Emotion und Gattung. Zur Analyse von Dramen um 1800. Göttingen, Germany. (Presentation at the University Göttingen).
Dennerlein, K. & Schmidt, T. (2021). Annotating and quantifying sentiment and emotions in German plays from around 1800. In Sentiment Analysis in Literary Studies (Workshop). Graz, Austria. (Keynote presentation). Video of the presentation: https://www.youtube.com/watch?v=WvJ8BvaSJCw
Schmidt, T., Dennerlein, K. & Wolff, C. (2022). Insights and Perspectives of the Research project ‘Emotions in Drama’. In Computational Stylistics Workshop on Emotion and Sentiment Analysis in Literature. Paris, France. (Presentation at the University Paris).
Wolff, C., Dennerlein, K. & Schmidt, T. (2020). Emotions in Drama - Emotionen im Drama. Projektvorstellung. In Digital Humanities Day Leipzig 2020 (DHDL 2020). (Poster presentation). Link to poster: https://fdhl.info/wp-content/uploads/2020/12/Poster_DINA4.pdf Link to video: https://youtube.com/watch?v=9DdybUzN92E
This project was funded by the DFG (German Research Foundation) in the priority programme Computational Literary Studies (SPP 2207/1) with two grants (project number 424207618; grants DE 2188/3-1 and WO 835/4-1). https://dfg-spp-cls.github.io/projects_en/2020/01/24/TP-Emotions_in_Drama/
All material in this repository is licensed under the CC BY 4.0 license.
That means if you use any material of this repository, please cite one or more of the papers in the publications section depending on what material you use and what fits your usage best.
If you use the annotations in general or the metadata without a focus on a specific publication, please cite the following publications:
- Dennerlein, K., Schmidt, T., & Wolff, C. (2023b). EmoDrama. Ein Korpus mit Emotionsinformationen in Dramen von 1650-1815. Zeitschrift für digitale Geisteswissenschaften (ZfdG). https://doi.org/10.17175/2023_010
- Schmidt, T., Dennerlein, K., & Wolff, C. (2023). Results of Emotion Annotation in German Drama from 1650-1815. In A. Baillot, T. Tasovac, W. Scholger, & G. Vogeler (Eds.), Digital Humanities 2023. Collaboration as Opportunity (DH2023) (pp. 181–183). Alliance of Digital Humanities Organizations (ADHO). https://doi.org/10.5281/ZENODO.8107952
- Schmidt, T., Dennerlein, K., & Wolff, C. (2021a). Emotion Classification in German Plays with Transformer-based Language Models Pretrained on Historical and Contemporary Language. In S. Degaetano-Ortlieb, A. Kazantseva, N. Reiter, & S. Szpakowicz (Eds.), Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (pp. 67–79). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.latechclfl-1.8
- Dennerlein, K., Schmidt, T., & Wolff, C. (2023a). Computational emotion classification for genre corpora of German tragedies and comedies from 17th to early 19th century. Digital Scholarship in the Humanities, 38(4), 1466–1481. https://doi.org/10.1093/llc/fqad046