
EEE443-A-Comparative-Study-of-Image-Captioning-Methods

The purpose of this project is to generate textual descriptions of images: sentences that capture the states of objects and the behaviors of people and animals in the corresponding images. The dataset, provided by the instructor, contains training and test images along with their captions. Image features are extracted with a well-known pretrained network, ResNetV2. Neural network architectures combining Convolutional Neural Networks (CNNs) with variations of LSTM and GRU recurrent neural networks are used to produce captions that describe the contents of an image. In addition, trainable and non-trainable GloVe 300d embedding vectors are compared to study the effect of the embedding layer. Training and validation loss curves are recorded for each model, and BLEU scores are computed on the test dataset to compare model performance. In total, we trained and tested six networks and compared their results in order to select the best-performing model.
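For readers who want a concrete picture of the pipeline, the sketch below outlines the three stages described above: ResNetV2 feature extraction, a caption decoder with a GloVe 300d embedding layer (frozen or trainable), and BLEU evaluation. It is an illustrative, minimal sketch rather than the exact architecture used in the report; ResNet50V2 is assumed as the ResNetV2 variant, and names such as vocab_size, max_len, and glove_matrix are placeholder assumptions.

```python
# Minimal sketch of the captioning pipeline (illustrative only, not the exact project code).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# 1) Image feature extraction with a pretrained ResNet50V2 (assumed ResNetV2 variant).
feature_extractor = tf.keras.applications.ResNet50V2(
    weights="imagenet", include_top=False, pooling="avg")  # 2048-d global features

def extract_features(images):
    """images: float array of shape (N, 224, 224, 3)."""
    x = tf.keras.applications.resnet_v2.preprocess_input(images)
    return feature_extractor.predict(x, verbose=0)

# 2) Caption decoder: GloVe 300d embeddings (frozen or trainable) merged with image features.
vocab_size, max_len, embed_dim = 10000, 20, 300        # hypothetical sizes
glove_matrix = np.random.rand(vocab_size, embed_dim)   # stand-in for the real GloVe 300d weights

def build_decoder(trainable_embeddings: bool) -> Model:
    img_in = layers.Input(shape=(2048,))
    img_vec = layers.Dense(256, activation="relu")(layers.Dropout(0.5)(img_in))

    seq_in = layers.Input(shape=(max_len,))
    emb = layers.Embedding(
        vocab_size, embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(glove_matrix),
        trainable=trainable_embeddings, mask_zero=True)(seq_in)
    seq_vec = layers.LSTM(256)(layers.Dropout(0.5)(emb))  # swap in layers.GRU(256) to compare

    merged = layers.add([img_vec, seq_vec])
    out = layers.Dense(vocab_size, activation="softmax")(
        layers.Dense(256, activation="relu")(merged))
    model = Model([img_in, seq_in], out)
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    return model

# 3) Evaluation: corpus-level BLEU over generated captions on the test split.
from nltk.translate.bleu_score import corpus_bleu
# bleu = corpus_bleu(list_of_reference_token_lists, list_of_generated_token_lists)
```

Training both a frozen-embedding and a trainable-embedding variant of the same decoder, as well as LSTM versus GRU cells, yields the kind of model comparison (via loss curves and BLEU scores) reported in this project.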

Example Results

Installation

The library installation process is fairly long, so we recommend reading the Jupyter notebook version of the code, where each installation step is explained in detail.

Contributing

The contributors to this implementation are as follows:
