# Deep Learning for NLP

A list of resources dedicated to deep learning for natural language processing tasks.
## Word Vectors
- Bengio Y, Ducharme R, Vincent P, et al. A Neural Probabilistic Language Model[J]. Journal of Machine Learning Research, 2003, 3:1137-1155.
—— Introduces a neural language model that jointly learns a distributed representation for each word and the probability function for word sequences (a minimal forward-pass sketch follows this list).
- Morin F, Bengio Y. Hierarchical Probabilistic Neural Network Language Model[C]// AISTATS, 2005.
—— Introduces the hierarchical softmax, a tree-structured output layer that provides an exponential speed-up when computing conditional word probabilities (see the hierarchical softmax sketch after this list).
- Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[J]. arXiv preprint arXiv:1301.3781, 2013.
—— CBOW and Skip-gram are two log-linear model architectures for learning distributed representations of words. They can learn high-quality word vectors from data sets with billions of tokens and vocabularies of millions of words (see the Skip-gram sketch after this list).
- Gutmann M U, Hyvärinen A. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics[J]. Journal of Machine Learning Research, 2012, 13(1):307-361.
—— Noise-Contrastive Estimation (NCE) is an objective function for estimating both normalized and unnormalized models; a simplified variant of NCE, called Negative Sampling, is used in word2vec to speed up training (see the negative-sampling sketch after this list).
- Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and their Compositionality[J]. Advances in Neural Information Processing Systems, 2013, 26:3111-3119.
—— Describes the architecture behind Google's word2vec: an extension of the Skip-gram model with subsampling of frequent words and Negative Sampling as an alternative to the hierarchical softmax.
- Goldberg Y, Levy O. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method[J]. arXiv preprint arXiv:1402.3722, 2014.
—— A detailed derivation of the negative-sampling objective.
- Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Conference on Empirical Methods in Natural Language Processing, 2014.
—— GloVe is a global log-bilinear regression model that combines the advantages of the two major model families: global matrix factorization and local context window methods (see the GloVe objective sketch after this list).
- Bojanowski P, Grave E, Joulin A, et al. Enriching Word Vectors with Subword Information[J]. arXiv preprint arXiv:1607.04606, 2016.
—— The language model behind Facebook's fastText: an extension of the Skip-gram model with a scoring function that takes the internal structure of words (character n-grams) into account (see the n-gram sketch after this list).
- word2vec中的数学原理详解 ("The Mathematics Behind word2vec Explained in Detail")
—— A Chinese blog series on the mathematics behind word2vec.
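
The core of Bengio et al.'s neural probabilistic language model fits in a few lines of NumPy: context words are looked up in a shared feature matrix `C`, concatenated, and passed through a tanh hidden layer and a softmax over the vocabulary. This is a minimal sketch with made-up dimensions; it omits the direct input-to-output connections and bias terms of the original model.

```python
import numpy as np

rng = np.random.default_rng(0)
V, m, h, n = 1000, 30, 50, 3   # vocab size, embedding dim, hidden units, context length

C = rng.normal(scale=0.1, size=(V, m))      # shared word feature matrix
H = rng.normal(scale=0.1, size=(h, n * m))  # hidden-layer weights
U = rng.normal(scale=0.1, size=(V, h))      # output-layer weights

def nplm_probs(context_ids):
    """P(w_t | previous n words): concatenate the context embeddings,
    apply a tanh hidden layer, then a softmax over the vocabulary."""
    x = C[context_ids].reshape(-1)   # (n*m,) concatenated embeddings
    a = np.tanh(H @ x)               # (h,)
    logits = U @ a                   # (V,) one score per vocabulary word
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = nplm_probs([12, 7, 104])     # arbitrary toy context word ids
print(probs.shape, probs.sum())      # (1000,) 1.0
```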
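
Hierarchical softmax (Morin & Bengio) replaces the flat softmax with a binary tree over the vocabulary: a word's probability is the product of binary decisions along its root-to-leaf path, so evaluation costs O(log |V|) instead of O(|V|). Below is a toy sketch with a hypothetical hand-built tree over four words; the paper derives its tree from WordNet, and word2vec later used a Huffman tree.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical complete binary tree over a 4-word vocabulary: 3 inner
# nodes, each with its own parameter vector. For every word we store the
# inner nodes on its root-to-leaf path and the turn taken at each node
# (+1 = left, -1 = right).
path_nodes = {"cat": [0, 1], "dog": [0, 1], "car": [0, 2], "bus": [0, 2]}
path_signs = {"cat": [+1, +1], "dog": [+1, -1], "car": [-1, +1], "bus": [-1, -1]}

dim = 8
rng = np.random.default_rng(0)
inner = rng.normal(scale=0.1, size=(3, dim))  # one vector per inner node

def hs_probability(word, context_vec):
    """P(word | context) as a product of sigmoid decisions along the path:
    only O(log |V|) dot products instead of |V| for a flat softmax."""
    p = 1.0
    for node, sign in zip(path_nodes[word], path_signs[word]):
        p *= sigmoid(sign * (inner[node] @ context_vec))
    return p

context = rng.normal(size=dim)
# By construction the leaf probabilities sum to 1.
print(sum(hs_probability(w, context) for w in path_nodes))  # ~1.0
```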
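
Skip-gram trains a center word to predict the words around it. The sketch below generates (center, context) training pairs, including the frequent-word subsampling heuristic from the word2vec papers, under which a token w is discarded with probability 1 - sqrt(t / f(w)). The corpus and threshold here are toy values.

```python
import random
from collections import Counter

def skipgram_pairs(tokens, window=2, t=1e-3, rng=random.Random(0)):
    """Yield (center, context) Skip-gram training pairs.

    A token w is kept with probability min(1, sqrt(t / f(w))), i.e.
    discarded with probability 1 - sqrt(t / f(w)): the frequent-word
    subsampling heuristic from the word2vec papers.
    """
    freq = Counter(tokens)
    total = len(tokens)
    kept = [w for w in tokens
            if rng.random() < min(1.0, (t * total / freq[w]) ** 0.5)]
    for i, center in enumerate(kept):
        lo, hi = max(0, i - window), min(len(kept), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, kept[j]

corpus = "the quick brown fox jumps over the lazy dog".split()
# t=1.0 effectively disables subsampling so the toy corpus survives intact.
for pair in skipgram_pairs(corpus, window=1, t=1.0):
    print(pair)
```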
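
Negative sampling replaces the softmax normalization with a handful of binary classifications: maximize σ(u_pos·v) for the observed context word and σ(-u_neg·v) for k noise words drawn from the unigram distribution raised to the 3/4 power. A minimal NumPy sketch of the loss and its gradients; the function names and toy setup are illustrative, not word2vec's actual code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def draw_negatives(counts, k, rng):
    """Sample k negative word ids from the unigram distribution ^ 0.75."""
    probs = counts ** 0.75
    probs /= probs.sum()
    return rng.choice(len(counts), size=k, p=probs)

def sgns_loss_and_grads(center_vec, pos_vec, neg_vecs):
    """One (center, context) pair: loss = -log σ(u_pos·v) - Σ log σ(-u_neg·v),
    plus the gradients needed for an SGD step."""
    pos_score = sigmoid(pos_vec @ center_vec)    # scalar
    neg_scores = sigmoid(neg_vecs @ center_vec)  # (k,)
    loss = -np.log(pos_score) - np.log(1.0 - neg_scores).sum()
    grad_center = (pos_score - 1.0) * pos_vec + neg_scores @ neg_vecs
    grad_pos = (pos_score - 1.0) * center_vec
    grad_negs = np.outer(neg_scores, center_vec)
    return loss, grad_center, grad_pos, grad_negs

rng = np.random.default_rng(0)
counts = np.array([50.0, 30.0, 10.0, 5.0, 5.0])  # toy unigram counts
neg_ids = draw_negatives(counts, k=3, rng=rng)
v, u_pos = rng.normal(size=20), rng.normal(size=20)
u_negs = rng.normal(size=(3, 20))
print(sgns_loss_and_grads(v, u_pos, u_negs)[0])  # the scalar loss
```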
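
GloVe fits word vectors to the logarithm of a co-occurrence matrix with a weighted least-squares objective, J = Σᵢⱼ f(Xᵢⱼ)(wᵢ·w̃ⱼ + bᵢ + b̃ⱼ - log Xᵢⱼ)², where f down-weights rare pairs and caps very frequent ones (paper defaults: x_max = 100, α = 3/4). A sketch of the weighting function and the loss; the dense-matrix loop is for clarity, not efficiency.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """f(X_ij): down-weights rare co-occurrences, caps frequent ones."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def glove_loss(W, W_tilde, b, b_tilde, X):
    """Weighted least-squares objective over a dense co-occurrence matrix X;
    zero entries are skipped since f(0) = 0 and log 0 is undefined."""
    loss = 0.0
    for i, j in zip(*np.nonzero(X)):
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        loss += glove_weight(X[i, j]) * diff ** 2
    return loss

rng = np.random.default_rng(0)
V, d = 6, 10                       # toy vocabulary size and dimension
X = rng.integers(0, 50, size=(V, V)).astype(float)
W, W_t = rng.normal(size=(V, d)), rng.normal(size=(V, d))
b, b_t = np.zeros(V), np.zeros(V)
print(glove_loss(W, W_t, b, b_t, X))
```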
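
fastText represents each word as the bag of its character n-grams plus the word itself, wrapped in angle brackets; a word's vector is the sum of its n-gram vectors, so even out-of-vocabulary words get a representation. A sketch of the n-gram extraction with the paper's default lengths 3 to 6:

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams as in fastText: the word is wrapped in angle
    brackets and the full (wrapped) word is also kept as one extra unit."""
    wrapped = f"<{word}>"
    grams = {wrapped[i:i + n]
             for n in range(n_min, n_max + 1)
             for i in range(len(wrapped) - n + 1)}
    grams.add(wrapped)
    return sorted(grams)

print(char_ngrams("where"))
# e.g. ['<wh', '<whe', '<wher', '<where', '<where>', 'ere', ...]
# The word's vector would be the sum of the vectors of these units.
```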
## Tools
- Gensim is a free Python library designed to extract semantic topics from documents automatically, as efficiently (computer-wise) and painlessly (human-wise) as possible. It can also be used to train word2vec models (see the usage sketch after this list).
- TensorFlow is an open source software library for machine intelligence. Its flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API (see the minimal example after this list).
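
A minimal word2vec training run with Gensim, assuming the Gensim 4.x API (where the dimensionality parameter is `vector_size`); the toy corpus is illustrative.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences.
sentences = [["deep", "learning", "for", "nlp"],
             ["word", "vectors", "capture", "meaning"],
             ["deep", "learning", "learns", "word", "vectors"]]

# sg=1 selects Skip-gram (sg=0 is CBOW); negative=5 trains with
# negative sampling using 5 noise words per positive example.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 sg=1, negative=5, epochs=50, seed=0)

print(model.wv["word"].shape)         # (50,)
print(model.wv.most_similar("deep"))  # nearest neighbours in vector space
```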
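
A minimal TensorFlow example, assuming TensorFlow 2.x with eager execution: two toy vectors and their cosine similarity, with the same code running on CPU or GPU without changes.

```python
import tensorflow as tf

# Two toy "word vectors" and their cosine similarity, executed eagerly.
a = tf.constant([1.0, 2.0, 3.0])
b = tf.constant([2.0, 4.0, 6.0])

cos = tf.tensordot(a, b, axes=1) / (tf.norm(a) * tf.norm(b))
print(float(cos))  # 1.0, since b is a scalar multiple of a
```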