Research NLP/slang literature #3

Open
3 tasks done
rohildshah opened this issue Jan 16, 2024 · 0 comments

rohildshah commented Jan 16, 2024

  • Slang detection and identification: https://aclanthology.org/K19-1082.pdf

  • "Throughout our experiments, we found that the flexibility of Part-of-Speech (POS) feature is most diagnostic of slang usage: Slang often entails structured POS transformation of existing syntactic uses of words"

  • "Slang-less sentence dataset: 15-thousand non-slang sentences from Wall Street News (2011-2016) in Penn Treebank (Marcus et al., 1993) as the negative examples." (there is also some additional postprocessing mentioned in the paper)

  • "Slang-specific sentence dataset: 15-thousand sentences that contain slang words from Online Slang Dictionary (http://onlineslangdictionary.com/) as the positive examples."

  • "All models are trained using the Adam optimizer (Kingma and Ba, 2015) with a learning rate 0.001 with β1 = 0.9 and β2 = 0.999."

  • Precision and recall are used as evaluation statistics (see Wikipedia), along with the F1-score, which is the harmonic mean of precision and recall ([Wikipedia](https://en.wikipedia.org/wiki/F-score)).

  • Slang generation (similar authors as above): https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00378/100687/A-Computational-Framework-for-Slang-Generation

  • References paper by Vivek Kulkarni and William Wang from UCSB! It is here.

  • "We collected lexical entries of slang and conventional words/phrases from three separate online dictionaries: 1) Online Slang Dictionary (OSD), 2) Green’s Dictionary of Slang (GDoS), and 3) an open source subset of Urban Dictionary (UD) data from Kaggle. In addition, we gathered dictionary definitions of conventional senses of words from the online version of Oxford Dictionary (OD)."

  • "We next performed a temporal analysis to evaluate whether our model explains slang emergence over time"

  • Slang growth and decline: https://aclanthology.org/D18-1467.pdf

  • "large-scale analysis of the frequencies of nonstandard words in Reddit."

  • We analyze a set of public monthly Reddit comments posted between 1 June 2013 and 31 May 2016, totalling T = 36 months of data. This dataset has been analyzed in prior work (Hessel et al., 2016; Tan and Lee, 2015) and has been noted to have some missing data (Gaffney and Matias, 2018)

  • "Dissemination across many linguistic contexts is a predictor of success: words that appear in more linguistic contexts grow faster and survive longer. Furthermore, social dissemination plays a less important role in explaining word growth and decline than previously hypothesized."

  • Interestingly, they found linguistic dissemination to be a better explainer of word growth even after controlling for social dissemination, so linguistic dissemination is incredibly vital.
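
As a note on the precision, recall, and F1 statistics listed under the slang detection paper: the snippet below is a minimal sketch of how the three numbers relate for a binary slang / non-slang classifier. The function name `precision_recall_f1` and the toy labels are made up for illustration and do not come from any of the papers.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class.

    Hypothetical helper for illustration; not taken from the papers above.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted slang and actually slang.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted slang but actually not slang.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually slang but predicted not slang.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels (assumed): 1 = sentence contains slang, 0 = no slang.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is often reported alongside the two individual numbers.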

@rohildshah rohildshah changed the title Research recent slang-identification research Research NLP/slang literature Jan 23, 2024