This is the code for the paper "Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling".
This is a collaborative work by Adeep Hande, Karthik Puranik, Konthala Yasaswini, Ruba Priyadharshini, Sajeetha Thavareesan, Anbukkarasi Sampath, Kogilavani Shanmugavadivel, Durairaj Thenmozhi, and Bharathi Raja Chakravarthi.
This approach could be used for any multilingual dataset. The weights of the fine-tuned models are available on the Hugging Face account AdWeeb.
We have provided the notebooks for reference.
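For readers new to the idea, here is a minimal, generic sketch of confidence-thresholded pseudo-labeling in plain Python. It is not the procedure from the paper or the notebooks. The function `pseudo_label`, the `threshold` value, the toy classifier `toy_predict_proba`, and the example sentences are all hypothetical and exist only for illustration. In practice, `predict_proba` would wrap one of the fine-tuned models.

```python
from typing import Callable, List, Sequence, Tuple


def pseudo_label(
    unlabeled: Sequence[str],
    predict_proba: Callable[[str], Sequence[float]],
    threshold: float = 0.9,
) -> List[Tuple[str, int]]:
    """Return (text, predicted_label) pairs whose top class probability
    is at least `threshold`. Less confident predictions are discarded."""
    selected = []
    for text in unlabeled:
        probs = predict_proba(text)
        # Index of the most probable class.
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((text, label))
    return selected


# Hypothetical stand-in for a fine-tuned classifier (assumption, not a real model).
# Returns [P(not offensive), P(offensive)].
def toy_predict_proba(text: str) -> List[float]:
    if "idiot" in text.lower():
        return [0.05, 0.95]
    if "?" in text:
        return [0.55, 0.45]  # deliberately uncertain
    return [0.92, 0.08]


# Made-up examples: 0 = not offensive, 1 = offensive.
labeled = [("nalla padam", 0), ("you idiot", 1)]
unlabeled = ["super movie bro", "what is this?", "IDIOT fellow"]

pseudo = pseudo_label(unlabeled, toy_predict_proba, threshold=0.9)
augmented = labeled + pseudo  # training set extended with pseudo-labeled text
```

Under these assumptions, the uncertain sentence is left out and only the two confident predictions are added to the training set. A higher threshold trades coverage for cleaner pseudo-labels.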
[To be updated soon]
If you use our dataset and/or find our code useful, please cite our paper:
@misc{hande2021offensive,
  title={Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling},
  author={Adeep Hande and Karthik Puranik and Konthala Yasaswini and Ruba Priyadharshini and Sajeetha Thavareesan and Anbukkarasi Sampath and Kogilavani Shanmugavadivel and Durairaj Thenmozhi and Bharathi Raja Chakravarthi},
  year={2021},
  eprint={2108.12177},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}