KODOLI

KODOLI is a novel KOrean Dataset for Offensive Language Identification.

Warning: it contains highly offensive expressions.

KODOLI uses fine-grained offensiveness categories: not offensive, likely offensive, and offensive.
Likely offensive language refers to texts with implicit offensiveness, or to abusive language used without offensive intent.
In addition, we propose two auxiliary tasks to help identify offensive language: abusive language detection and sentiment analysis.
- You can also use the auxiliary task for toxicity detection. (Be careful with the raw expressions.)
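
As a rough illustration of the label scheme described above, the sketch below models one example with the three offensiveness categories and the two auxiliary labels. The class names, field names, the sentiment value, and the `needs_review` helper are all hypothetical and are not taken from the repository; check the files in `data` for the actual schema before relying on any of them.

```python
from dataclasses import dataclass
from enum import Enum


class Offensiveness(Enum):
    # The three categories named in this README.
    NOT_OFFENSIVE = "not offensive"
    LIKELY_OFFENSIVE = "likely offensive"
    OFFENSIVE = "offensive"


@dataclass
class Example:
    # Hypothetical field names; the real column names may differ.
    text: str
    offensiveness: Offensiveness
    abusive: bool   # auxiliary task: abusive language detection
    sentiment: str  # auxiliary task: sentiment analysis (label set assumed)


def needs_review(example: Example) -> bool:
    """Flag any text that is abusive or not clearly 'not offensive'."""
    return (
        example.abusive
        or example.offensiveness is not Offensiveness.NOT_OFFENSIVE
    )


# Abusive wording without offensive intent falls under "likely offensive".
sample = Example(
    text="<raw text omitted>",
    offensiveness=Offensiveness.LIKELY_OFFENSIVE,
    abusive=True,
    sentiment="neutral",
)
clean = Example(
    text="<raw text omitted>",
    offensiveness=Offensiveness.NOT_OFFENSIVE,
    abusive=False,
    sentiment="positive",
)
```

Keeping the abusive flag separate from the offensiveness label is one way to mirror the distinction drawn above: a text can contain abusive wording and still be only likely offensive.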

Download

You can download the KODOLI benchmark from this repository. Please follow the data's license.

Dataset Description

Source

Texts are mainly collected and sampled from online communities and news articles.

Statistics

Guideline Details

Guideline(ENG.)

[Guideline(KOR.)] Coming Soon

Updates

Apr 20, 2023: We released 3.6k examples for the offensive language identification task.

Citation

@inproceedings{park2023feel,
  title={“Why do I feel offended?”-Korean Dataset for Offensive Language Identification},
  author={Park, San-Hee and Kim, Kang-Min and Lee, O-joun and Kang, Youjin and Lee, Jaewon and Lee, Su-min and Lee, Sangkeun},
  booktitle={Findings of the Association for Computational Linguistics: EACL 2023},
  pages={1112--1123},
  year={2023}
}

Contributors

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
