CNN for sentiment analysis

Overview

This project grew out of my goal of implementing a research paper for the first time. I chose Yoon Kim's Convolutional Neural Networks for Sentence Classification because I already have some domain experience from my earlier attempt to build an LSTM recurrent neural network (found here).

My approach in this notebook is to work through the different models that the paper introduces and on which it performs ablation studies across various datasets.

Lastly, this project's other goal is to dive deeper into the PyTorch framework and to write more streamlined functions that avoid code repetition.

Model architectures

The paper describes four CNN model variants.

1) `CNN-rand`

This model's architecture is: Embedding -> Conv2d -> MaxPool1d -> Dropout -> Linear. The Embedding layer uses no pre-trained word embeddings; its parameters are randomly initialized and learned during training.
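The layer chain above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the notebook's actual code: the class name `CNNRand`, the argument names, and the small embedding size are assumptions made for the example. The filter windows of 3, 4, and 5 with 100 feature maps each and the dropout rate of 0.5 follow the settings reported in Kim's paper, and a ReLU is applied after each convolution as in the paper, even though it is not listed in the chain above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNRand(nn.Module):
    """Hypothetical sketch of Embedding -> Conv2d -> MaxPool1d -> Dropout -> Linear."""

    def __init__(self, vocab_size, embed_dim=100, n_filters=100,
                 filter_sizes=(3, 4, 5), n_classes=2, dropout=0.5):
        super().__init__()
        # Randomly initialized embeddings, learned during training.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One Conv2d per window size; each kernel spans the full embedding width.
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, n_filters, (fs, embed_dim)) for fs in filter_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(n_filters * len(filter_sizes), n_classes)

    def forward(self, tokens):
        # tokens: (batch, seq_len) tensor of word indices
        x = self.embedding(tokens).unsqueeze(1)  # (batch, 1, seq_len, embed_dim)
        # Each feature map: (batch, n_filters, seq_len - fs + 1)
        feats = [F.relu(conv(x)).squeeze(3) for conv in self.convs]
        # Max-over-time pooling: keep the strongest response of each filter.
        pooled = [F.max_pool1d(f, f.shape[2]).squeeze(2) for f in feats]
        out = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(out)  # (batch, n_classes) unnormalized scores


# Usage with dummy data: 4 sentences of 10 tokens from a 50-word vocabulary.
model = CNNRand(vocab_size=50)
logits = model(torch.randint(0, 50, (4, 10)))
```

Note that every sentence in a batch must be at least as long as the largest filter window (5 here), so shorter sentences would need padding.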

2) `CNN-static`

This model has the same architecture as CNN-rand; the only difference is the Embedding layer. It is initialized with pre-trained GloVe word embeddings, and its parameters are not updated during training (i.e., requires_grad is set to False on this layer's weight).
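One way to get a frozen embedding layer in PyTorch is `nn.Embedding.from_pretrained` with `freeze=True`, sketched below. The random `pretrained_vectors` tensor is only a stand-in for a real GloVe matrix, and the variable names are assumptions made for this example rather than names taken from the notebook.

```python
import torch
import torch.nn as nn

# Stand-in for a real GloVe matrix of shape (vocab_size, embed_dim).
pretrained_vectors = torch.randn(50, 100)

# freeze=True sets requires_grad to False on the embedding weight,
# so an optimizer will never update these vectors.
static_embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)

# Collect the parameters an optimizer would actually train (none here).
trainable = [p for p in static_embedding.parameters() if p.requires_grad]
```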

3) `CNN-non-static`

This model has the same architecture as CNN-rand. Like CNN-static, its Embedding layer is initialized with pre-trained word embeddings. Unlike CNN-static, the layer's parameters are fine-tuned during training.
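The effect of fine-tuning can be illustrated with a single optimizer step on a trainable embedding layer. This is a sketch under assumptions: the random tensor stands in for real pre-trained vectors, and the summed output is a dummy loss used only to produce gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a real pre-trained matrix of shape (vocab_size, embed_dim).
pretrained_vectors = torch.randn(50, 100)

# freeze=False keeps requires_grad=True, so the vectors are fine-tuned.
# clone() keeps the original matrix untouched for the comparison below.
embedding = nn.Embedding.from_pretrained(pretrained_vectors.clone(), freeze=False)
optimizer = torch.optim.SGD(embedding.parameters(), lr=0.1)

tokens = torch.tensor([[1, 2, 3]])  # one sentence using words 1, 2 and 3
loss = embedding(tokens).sum()      # dummy loss, for illustration only
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Only the rows of words that appeared in the batch moved away from
# their pre-trained values; every other row received a zero gradient.
changed = (embedding.weight.detach() != pretrained_vectors).any(dim=1)
```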

4) `CNN-multichannel`

This model combines the two previous ideas in a "2-channel" embedding matrix with a static channel and a non-static channel. Apart from the embedding layer, it has the same architecture as the other models above. The idea is to give the model two views of each word. The static channel is used to regularize the learned parameters in the non-static channel. As a result, the word embeddings of the static channel still maintain the relationships between words from the pre-trained word embedding. Meanwhile, the non-static channel learns more about how words relate to each other in the context of movie reviews.
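A possible PyTorch sketch of the two-channel idea is shown below. It is an illustration under assumptions, not the notebook's implementation: the class name `CNNMultichannel` and its arguments are hypothetical, the random tensor stands in for a real pre-trained matrix, and using `Conv2d` with two input channels (which sums the per-channel responses) is just one way to realize the paper's description of applying each filter to both channels and adding the results.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNMultichannel(nn.Module):
    """Hypothetical sketch: frozen and fine-tuned copies of the same vectors."""

    def __init__(self, pretrained_vectors, n_filters=100,
                 filter_sizes=(3, 4, 5), n_classes=2, dropout=0.5):
        super().__init__()
        embed_dim = pretrained_vectors.shape[1]
        # Static channel: keeps the pre-trained vectors fixed.
        self.static_embedding = nn.Embedding.from_pretrained(
            pretrained_vectors.clone(), freeze=True)
        # Non-static channel: starts from the same vectors but is fine-tuned.
        self.non_static_embedding = nn.Embedding.from_pretrained(
            pretrained_vectors.clone(), freeze=False)
        # in_channels=2: each filter reads both channels and sums the responses.
        self.convs = nn.ModuleList(
            [nn.Conv2d(2, n_filters, (fs, embed_dim)) for fs in filter_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(n_filters * len(filter_sizes), n_classes)

    def forward(self, tokens):
        # Stack the two views: (batch, 2, seq_len, embed_dim)
        x = torch.stack(
            [self.static_embedding(tokens), self.non_static_embedding(tokens)],
            dim=1,
        )
        feats = [F.relu(conv(x)).squeeze(3) for conv in self.convs]
        pooled = [F.max_pool1d(f, f.shape[2]).squeeze(2) for f in feats]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))


# Usage with a random stand-in for the pre-trained matrix.
vectors = torch.randn(50, 100)
model = CNNMultichannel(vectors)
logits = model(torch.randint(0, 50, (4, 10)))
```

Only the non-static copy receives gradient updates, so the static copy preserves the pre-trained word relationships throughout training.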

Credits

Thanks to Yoon Kim's paper for inspiring me to create a CNN-based sentiment analysis model. Also, thanks to this repository (in particular, the Jupyter notebook titled 4 - Convolutional Sentiment Analysis) for guiding me in scaffolding the CNN-rand model architecture.

francislata/CNN-Sentiment-Analysis

Folders and files

Latest commit

History

Repository files navigation

CNN for sentiment analysis

Overview

Model architectures

1) CNN-rand

2) CNN-static

3) CNN-non-static

4) CNN-multichannel

Credits

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

1) `CNN-rand`

2) `CNN-static`

3) `CNN-non-static`

4) `CNN-multichannel`

Packages