In Turkish, the “de/da” clitic is a conjunction that is written as a separate word and corresponds to "as well", "too", or "also" in English. The same forms, “-de” and “-da”, also serve as locative suffixes meaning “at” or “in”. For example, the word “araba” (car) with the suffix “-da” (“arabada”) means “in the car”. Although the conjunction “de/da” must always be written separately, it is commonly confused with the locative suffix and incorrectly attached to the preceding word. This project focuses on this common class of spelling errors in Turkish, namely the spelling of the “de/da” and "ki" clitics.
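To make the ambiguity concrete, here is a minimal Python sketch that lists the tokens in a sentence where “de/da” appears either as a separate word or attached to the end of a word. It is purely illustrative and is not the method used in this repository: the function name `find_de_da_candidates` and the labels `"separate"` and `"attached"` are hypothetical, and the check looks only at the surface forms “de” and “da”.

```python
import re

# Surface forms discussed above. This sketch does not handle any other
# variants; it only looks at the literal strings "de" and "da".
DE_DA_FORMS = ("de", "da")


def find_de_da_candidates(sentence):
    """Return (token, kind) pairs for tokens that involve "de/da".

    kind is "separate" when the token is exactly "de" or "da" (the
    conjunction spelling) and "attached" when a longer token ends in
    "de" or "da" (a locative suffix OR a misspelled conjunction).
    """
    candidates = []
    # \w matches Unicode letters in Python 3, so Turkish characters are kept.
    for token in re.findall(r"\w+", sentence):
        lowered = token.lower()
        if lowered in DE_DA_FORMS:
            candidates.append((token, "separate"))
        elif len(lowered) > 2 and lowered.endswith(DE_DA_FORMS):
            candidates.append((token, "attached"))
    return candidates


print(find_de_da_candidates("Ben de arabada kaldım"))
```

Tokens labeled `"attached"` are only candidates: “arabada” is a correct locative, while “Bende” in “Bende geldim” may be a misspelling of “Ben de”. Telling the two apart requires sentence context, which a string check like this cannot provide.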
A detailed explanation of the project is available in the project document.
git clone https://github.com/asumansaree/TurkishSpellChecker
cd TurkishSpellChecker
To test "de/da" separation:
cd Data/
Edit test_sentences_de.txt (or the corresponding test_sentences file for another model) and save it
cd ../Test
python3 test_for_de_separation.py
The output is printed to the terminal (or, if you are using Colab, directly in the notebook)
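If you prefer to prepare the test file from Python instead of editing it by hand, the following sketch may help. It assumes that the test file holds one sentence per line in UTF-8; that format is not documented here, so check the existing file in Data/ before relying on it. The helper name `write_test_sentences` is hypothetical and not part of the repository.

```python
from pathlib import Path


def write_test_sentences(path, sentences):
    """Write non-empty sentences to `path`, one per line, as UTF-8.

    Assumption: the test script reads one sentence per line. Verify this
    against the existing test_sentences_de.txt before using the helper.
    Returns the number of sentences written.
    """
    cleaned = [s.strip() for s in sentences if s.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


# Example call, run from the repository root (overwrites the existing file):
# write_test_sentences("Data/test_sentences_de.txt", ["Bende geldim.", "Ben de geldim."])
```

After writing the file, run `python3 test_for_de_separation.py` from the Test directory as described above.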
For any problems or questions, contact me: [email protected]
- "Detecting Clitics Related Orthographic Errors in Turkish", Proceedings of Recent Advances in Natural Language Processing, pages 71–76, Varna, Bulgaria, Sep 2–4, 2019.