The Lithuanian dependency treebank ALKSNIS v3.0 (Vytautas Magnus University). Version 3.0 was developed from v2.1 during the project "Semantika2" (Nr. 02.3.1-CPVA-V-527-01-0002).
This is a new, corrected and enhanced version of the ALKSNIS Lithuanian treebank. It is annotated in a style derived from the Prague Dependency Treebank of Czech.
The previous version, ALKSNIS v2.1, consists of 2,355 syntactically annotated sentences. Each node of a tree corresponds to a word, a punctuation mark or another text element (symbol, digit, etc.) within a sentence. ALKSNIS v2.1 is published in the CLARIN-LT repository at http://hdl.handle.net/20.500.11821/10. (Some users experience DNS errors when trying to access the repository; configuring the client machine to use 8.8.8.8 as the DNS server may help. See also http://clarin-lt.lt/?page_id=86.)
A version of the MULTEXT-East (http://nl.ijs.si/ME/V4/msd/html/index.html) tag set is used in ALKSNIS v2.1. The following information is given for each node: 1) the word form as used in the text; 2) a lemma; 3) a morphological tag; and 4) a syntactic function (subject, object, etc.). Dependencies are shown by links between words.
ALKSNIS v3.0 was developed from v2.1 during the Vytautas Magnus University project “Semantika2” (Nr. 02.3.1-CPVA-V-527-01-0002). It consists of 3,643 syntactically annotated sentences.
Modifications from v2.1 to v3.0 (2019-07-08)
- The syntactic information of the older version underwent a full review, based on improved guidelines, to enhance annotation quality.
- New layer added: non-compositional multiword expressions (light verbs and idioms).
- Added new data: scientific abstracts and reviews, additional administrative texts.
- Schema version changed to 3.0.
- The Jablonskis tagset, which is human-friendly, is used instead of the MULTEXT-East tagset.
- Some syntactic relations were corrected or modified (details to be published in the improved guidelines).
- CoNLL-U files are provided together with the PML files (the RMQ CoNLL-U files do not keep the mwe field).
- ALKSNIS-3.0.ZIP - The Lithuanian dependency treebank files.
- Jablonskis-LT.pdf - Morphological annotation standard used in ALKSNIS.
- ALksnio-3.0_sandara.docx - the structure of ALKSNIS v.3.0 files
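Since the treebank is also distributed as CoNLL-U files, the per-node information (form, lemma, morphological tag, syntactic function, head) can be read with a few lines of Python. The following is only a minimal sketch, assuming the standard ten-column, tab-separated CoNLL-U layout. The sample sentence is made up for illustration and is not taken from ALKSNIS; its XPOS column is left empty rather than guessing a Jablonskis tag, and the names Token, parse_conllu and SAMPLE are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """First eight CoNLL-U columns of one token line (all kept as strings)."""
    id: str
    form: str
    lemma: str
    upos: str
    xpos: str
    feats: str
    head: str
    deprel: str


def parse_conllu(text: str) -> List[List[Token]]:
    """Split CoNLL-U text into sentences, each a list of Token objects."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comment lines (e.g. "# text = ...") are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        current.append(Token(*cols[:8]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample, NOT taken from the treebank ("He reads a book.").
SAMPLE = (
    "# text = Jis skaito knygą.\n"
    "1\tJis\tjis\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tskaito\tskaityti\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tknygą\tknyga\tNOUN\t_\t_\t2\tobj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

sentences = parse_conllu(SAMPLE)
for tok in sentences[0]:
    print(tok.id, tok.form, tok.lemma, tok.deprel, "head:", tok.head)
```

To read a real file, the same function could be applied to the contents of a .conllu file from the archive, opened with encoding="utf-8". The sketch does not treat multiword-token ranges or empty nodes specially, and the mwe layer is only available in the PML files.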
The project "Semantika2" (Nr. 02.3.1-CPVA-V-527-01-0002), during which v3.0 was developed from v2.1, was funded by the European Structural Funds.
For ALKSNIS v.2.1: • Agnė Bielinskienė, Loïc Boizou, Jolanta Kovalevskaitė, Erika Rimkutė (2016): Lithuanian Dependency Treebank ALKSNIS. In: I. Skadiņa and R. Rozis (Eds.): Human Language Technologies – The Baltic Perspective, pp. 107–114. Amsterdam: IOS Press. doi:10.3233/978-1-61499-701-6-107 http://fcim.vdu.lt/~erika_rimkute/straipsniai/Alksnis_HLT.pdf, http://ebooks.iospress.nl/volumearticle/45523
- License: CC BY-SA 4.0;
- Includes text: yes;
- Genre: news nonfiction legal scientific;
- Lemmas: manual native;
- UPOS: converted from manual;
- XPOS: manual native;
- Features: converted from manual;
- Relations: converted from manual;
- Contributors: Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina;
- Contact: [email protected], [email protected].