Supplementary Data for the paper "Microsyntactic Unit Detection using Word Embedding Models: Experiments on Slavic Languages"
This repository contains supplementary data for the submission titled "Microsyntactic Unit Detection using Word Embedding Models: Experiments on Slavic Languages." The paper explores the performance of word embedding models in detecting microsyntactic units in Slavic languages.
The repository is organized as follows:
- src/: This directory contains the evaluation and processing code used for analyzing the results.
- data/: This directory contains the multilingual data used in the experiments: microsyntactic units, organized by category, in six languages (Belarusian, Bulgarian, Czech, Polish, Ukrainian, Russian), together with their compositional counterparts.
- trained_models/ (to be published later): This directory will contain the trained word embedding models used in the experiments.
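
As a quick sanity check after cloning, the layout above can be verified with a small script. This is only a sketch and is not part of the repository: the function name `check_layout` is hypothetical, and the only things it relies on are the three directory names listed above. Since trained_models/ has not been published yet, it is expected to be reported as missing.

```python
from pathlib import Path

# Directory names taken from the layout described above.
# trained_models/ may be absent until the models are published.
EXPECTED_DIRS = ("src", "data", "trained_models")


def check_layout(repo_root="."):
    """Hypothetical helper: map each expected directory name to whether it exists."""
    root = Path(repo_root)
    return {name: (root / name).is_dir() for name in EXPECTED_DIRS}


if __name__ == "__main__":
    # Run from the repository root to see which directories are present.
    for name, present in check_layout().items():
        print(f"{name}/: {'found' if present else 'missing'}")
```

Nothing here assumes a particular file format inside data/ or src/; it only checks that the top-level directories exist.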