Introduction to Treebanking Part I (February 10)
Giuseppe G.A. Celano (University of Leipzig) and Dag Haug (University of Oslo)
Sunoikisis Digital Classics 2016
Session 2: Introduction to treebanking
February 10, 2016, 17h–18h15 CET
The lecture aims to be a gentle introduction to what treebanking is and how it can be useful for (linguistic) research.
Outline of class
- Introduction to treebanking (GC 30 min)
- Using and querying treebanks (DH 30 min)
- Use cases for treebanks
- Querying with INESS query language: http://iness.uib.no
Required tasks
- Go through the analysis of at least the first 20 sentences of Phaedrus' Fables and Aesop's Fables (Aesop's Fables also have semantic annotation: click on the SG tab on the right to see it).
- Alternatively, if you cannot follow Latin, focus on English. Note, however, that the rules for Latin annotation differ from those for English in a few respects; documentation can be found on the Universal Dependencies website, which is linked in the required readings below.
  - Go to Stanford CoreNLP, which annotates sentences automatically (simply type a sentence into the form field). Examples you can use are:
    - A Wolf and a Lamb had come to the same stream.
    - This fable is applicable to those men who, under false pretences, oppress the innocent.
    - A Wolf indicted a Fox upon a charge of theft.
Required readings
- Dag Haug, “Treebanks in historical linguistic research”, in Carlotta Viti (ed.), Perspectives on Historical Syntax, Benjamins 2015, pp. 188-202. A preprint is available here. For the published version, contact Dag Haug at [email protected]
- Celano, Giuseppe G. A. 2014. Guidelines for the annotation of the Ancient Greek Dependency Treebank 2.0. https://github.com/PerseusDL/treebank_data/edit/master/AGDT2/guidelines (only Chapter 3, including analysis of the hyperlinked examples)
- Bamman, David et al. 2008. Guidelines for the Syntactic Annotation of Latin Treebanks (v. 1.3). http://nlp.perseus.tufts.edu/syntax/treebank/1.3/docs/guidelines.pdf (only pp. 3-21, 24, 26)
- Universal Dependencies: http://universaldependencies.github.io/docs/#language-en (In particular: Introduction and Syntax: General Principles)
Practical exercise
- Create an account on Perseids, then insert the following text using the Text input method described in 3.1:
  - Beneficiorum simplex ratio est: tantum erogatur; si reddet aliquid, lucrum est, si non reddet, damnum non est. Ego illud dedi, ut darem. Nemo beneficia in calendario scribit nec avarus exactor ad horam et diem appellat. Numquam illa vir bonus cogitat nisi admonitus a reddente; alioqui in formam crediti transeunt. Turpis feneratio est beneficium expensum ferre.
- Treebank the passage (there should be at least two independent annotators, i.e., they should not discuss their annotation of the passage with each other)
- Query exercise to come
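Once two annotators have treebanked the passage independently, their trees can be compared token by token. The following is a minimal Python sketch of such a comparison, not part of the Perseids tooling. It assumes each annotation has been reduced by hand to one (head, relation) pair per token. The function name `agreement`, the data layout, and the two sample analyses of "Ego illud dedi" are illustrative assumptions, and punctuation is left out for brevity.

```python
from typing import List, Tuple

# One annotation = one (head, relation) pair per token, in token order.
# head is the 1-based position of the governing token; 0 marks the root.
Annotation = List[Tuple[int, str]]


def agreement(a: Annotation, b: Annotation) -> Tuple[float, float]:
    """Return (unlabeled, labeled) agreement for two annotations of one sentence.

    Unlabeled agreement counts tokens given the same head by both annotators;
    labeled agreement also requires the same relation label.
    """
    if len(a) != len(b):
        raise ValueError("annotations must cover the same tokens")
    if not a:
        raise ValueError("annotations must not be empty")
    same_head = sum(1 for (head_a, _), (head_b, _) in zip(a, b) if head_a == head_b)
    same_both = sum(1 for pair_a, pair_b in zip(a, b) if pair_a == pair_b)
    return same_head / len(a), same_both / len(a)


# Hypothetical analyses of "Ego illud dedi" (tokens 1-3), for illustration only.
annotator_a: Annotation = [(3, "SBJ"), (3, "OBJ"), (0, "PRED")]
annotator_b: Annotation = [(3, "SBJ"), (3, "ATR"), (0, "PRED")]

unlabeled, labeled = agreement(annotator_a, annotator_b)
print(f"unlabeled: {unlabeled:.2f}, labeled: {labeled:.2f}")
```

In this made-up case the annotators agree on every head but disagree on one label. Tokens where they differ are the natural starting point for discussion after the independent annotation is finished.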