rohildshah changed the title from "Research recent slang-identification research" to "Research NLP/slang literature" on Jan 23, 2024
Slang detection and identification: https://aclanthology.org/K19-1082.pdf
"Throughout our experiments, we found that the flexibility of Part-of-Speech (POS) feature is most diagnostic of slang usage: Slang often entails structured POS transformation of existing syntactic uses of words"
"Slang-less sentence dataset: 15-thousand non-slang sentences from Wall Street News (2011-2016) in Penn Treebank (Marcus et al., 1993) as the negative examples." (there is also some additional postprocessing mentioned in the paper)
"Slang-specific sentence dataset: 15-thousand sentences that contain slang words from Online Slang Dictionary (http://onlineslangdictionary.com/) as the positive examples."
"All models are trained using the Adam optimizer (Kingma and Ba, 2015) with a learning rate 0.001 with β1 = 0.9 and β2 = 0.999."
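To make the quoted hyperparameters concrete, here is a minimal sketch of a single Adam update for one scalar parameter, using the learning rate and β values from the quote. The function name `adam_step`, the `eps` value of 1e-8, and the toy objective are assumptions for illustration; the paper's own training code is not reproduced here.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter. `t` is the 1-based step count.

    lr, beta1 and beta2 match the values quoted above; eps is an assumed default.
    """
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy usage (hypothetical objective): minimize f(x) = x**2, whose gradient is 2x.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    x, m, v = adam_step(x, 2 * x, m, v, t)
print(x)
```

In practice a framework optimizer would be used instead of a hand-written update; the sketch only shows what the three quoted numbers control.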
Precision and recall are used as the evaluation metrics, along with the F1-score, which is the harmonic mean of precision and recall ([Wikipedia](https://en.wikipedia.org/wiki/F-score)).
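As a quick reference for those metrics, the sketch below computes precision, recall, and F1 for a binary slang/non-slang labelling. The function name and the toy labels are made up for illustration and are not taken from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the `positive` class of a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = sentence contains slang, 0 = no slang.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(precision_recall_f1(y_true, y_pred))
```

With these toy labels there are 2 true positives, 1 false positive, and 1 false negative, so all three metrics come out to 2/3.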
Slang generation (similar authors as above): https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00378/100687/A-Computational-Framework-for-Slang-Generation
References a paper by Vivek Kulkarni and William Wang from UCSB! It is here.
"We collected lexical entries of slang and conventional words/phrases from three separate online dictionaries: 1) Online Slang Dictionary (OSD), 2) Green’s Dictionary of Slang (GDoS), and 3) an open source subset of Urban Dictionary (UD) data from Kaggle. In addition, we gathered dictionary definitions of conventional senses of words from the online version of Oxford Dictionary (OD)."
"We next performed a temporal analysis to evaluate whether our model explains slang emergence over time"
Slang growth and decline: https://aclanthology.org/D18-1467.pdf
"large-scale analysis of the frequencies of nonstandard words in Reddit."
"We analyze a set of public monthly Reddit comments posted between 1 June 2013 and 31 May 2016, totalling T = 36 months of data. This dataset has been analyzed in prior work (Hessel et al., 2016; Tan and Lee, 2015) and has been noted to have some missing data (Gaffney and Matias, 2018)."
"Dissemination across many linguistic contexts is a predictor of success: words that appear in more linguistic contexts grow faster and survive longer. Furthermore, social dissemination plays a less important role in explaining word growth and decline than previously hypothesized."
Interestingly, they found linguistic dissemination to be a better explainer of word growth even after controlling for social dissemination: linguistic dissemination is incredibly vital.
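To get an intuition for "appearing in more linguistic contexts", here is a rough sketch that counts, for each word, how often it occurs and how many distinct (previous word, next word) contexts it occurs in. This is only an illustrative proxy under my own assumptions (whitespace tokenization, a one-word window, made-up comments); it is not the dissemination measure defined in the paper.

```python
from collections import defaultdict


def context_counts(sentences):
    """Map each word to (frequency, number of distinct neighbor contexts).

    A context here is the (previous token, next token) pair, with <s> and </s>
    as sentence boundary markers. This is an assumed, simplified proxy.
    """
    freq = defaultdict(int)
    contexts = defaultdict(set)
    for sent in sentences:
        tokens = ["<s>"] + sent.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            word = tokens[i]
            freq[word] += 1
            contexts[word].add((tokens[i - 1], tokens[i + 1]))
    return {w: (freq[w], len(contexts[w])) for w in freq}


# Hypothetical comments, not real Reddit data.
comments = [
    "that movie was lit",
    "the party was lit",
    "lit party last night",
    "that movie was yeet",
    "that movie was yeet",
]
stats = context_counts(comments)
print(stats["lit"], stats["yeet"])
```

Here "lit" shows up in two distinct contexts while "yeet" shows up in only one, which is the kind of difference the quoted finding associates with faster growth and longer survival.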