Image Captioning in Nepali Language.

This project aims to develop a deep learning-based system for image captioning in the Nepali language. The system takes an input image and generates a paragraph-length caption in Nepali. It uses a pre-trained CNN encoder with LSTM and Transformer decoders to produce accurate and meaningful captions.

Abstract

Deep neural networks have made image captioning, the task of generating text that describes the contents of an image, far more feasible. Most work on this task targets English; very little has been done for other languages, Nepali in particular. Research on Nepali is harder still because of the language's difficult grammatical structure and vast language domain. Moreover, the little existing work on Nepali generates only a single sentence, whereas we focus on generating paragraph-length captions of 3-4 coherent sentences. We used the Stanford human-genome dataset, translated into Nepali with the Google Translate API, and manually curated a second dataset of 800 images of cultural sites in Nepal together with their Nepali captions. The two datasets were combined to train the deep learning model. The system follows an encoder-decoder architecture: a pre-trained CNN (Inception-V3) acts as the encoder that extracts features from the images, and we tried two decoder architectures, LSTM and Transformer, to see which works better. We used the BLEU score as the evaluation metric. Experiments showed that the Transformer works better than the LSTM on this Nepali captioning task.
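
The abstract names BLEU as the evaluation metric. As a rough illustration of what that score measures, the sketch below computes a sentence-level BLEU from clipped n-gram precisions and a brevity penalty. It is not the project's evaluation code: the function names, the whitespace tokenization, and the absence of smoothing are assumptions made for this example, and real evaluations usually rely on an established library and corpus-level scores.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights (illustrative only)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, one empty n-gram order zeroes the score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


if __name__ == "__main__":
    # Placeholder tokens; Nepali captions would be tokenized the same way here.
    reference = "w1 w2 w3 w4 w5 w6".split()
    candidate = "w1 w2 w3 w4 w5".split()
    print(round(bleu(candidate, [reference]), 4))
```

Because any missing n-gram order drives this unsmoothed score to zero, short captions often score 0 here; smoothed or corpus-level variants are commonly used for that reason.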

Overall Tech Used:

Model ( /jupyter notebook)
- LSTM
  - RESNET152 ( For Feature Extraction)
- Transformer
  - Inception-V3 ( For encoder)
Frontend ( /Project UI)
- React
- axios
- scss
Backend ( /Server & /Backend)
- Flask
- Python
- MongoDB
- Node JS

Setup Guide

Clone the repository. Note that the /Data Collection folder is large, so the clone may take a while and needs a fair amount of disk space:
```
git clone https://github.com/chhetri123/Major_Project.git
```
Once the clone succeeds, open the project in your IDE.

Backend ( Trained Model Setup)

Once the above steps are done, open a terminal in your IDE and change into the /Server folder:
```
cd Server
```

Then create the virtual environment:

```
# For Windows
python -m venv venv

# OR

# For macOS
python3 -m venv venv
```

Activate the virtual environment using the command below (this form is for macOS/Linux; on Windows, run `venv\Scripts\activate` instead):
```
source venv/bin/activate
```
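
To confirm that activation worked before installing anything, a small check like the following can help. It is only a sketch: the function name `in_virtualenv` is made up for this example, and it relies on the standard behaviour that `sys.prefix` and `sys.base_prefix` differ inside an environment created with `venv`.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, the packages installed in the next step would go into the system-wide Python rather than into `venv`.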

Install the required packages:

```
# For Windows
pip install -r requirements.txt

# OR

# For macOS
pip3 install -r requirements.txt
```
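
After the install finishes, it can be useful to verify that the key modules are importable from the active environment. The sketch below does this with the standard library only. The helper name `missing_modules` is hypothetical, and the list passed to it is an assumption: `flask` is taken from the tech list above, while the actual contents of `requirements.txt` are not shown here.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module list; adjust it to match requirements.txt.
    wanted = ["flask"]
    missing = missing_modules(wanted)
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("all listed modules are importable")
```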

With everything in place, run the model server:

```
# For Windows
python app.py

# OR

# For macOS
python3 app.py
```
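
Once the server is running, it could be exercised from Python without the frontend. The sketch below builds a multipart image upload using only the standard library. Everything about the endpoint is an assumption for illustration: the URL `http://localhost:5000/predict`, the form field name `image`, and the helper names are hypothetical, so check `app.py` for the real route, port, and expected payload.

```python
import urllib.request
import uuid


def build_caption_request(image_bytes, filename="photo.jpg",
                          url="http://localhost:5000/predict", field="image"):
    """Build (but do not send) a multipart/form-data POST carrying one image.

    The url and field defaults are placeholders, not the project's actual API.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="{field}"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def request_caption(request, timeout=60):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage (needs the server to be running and a real image file):
#   with open("temple.jpg", "rb") as f:
#       print(request_caption(build_caption_request(f.read(), "temple.jpg")))
```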

Frontend Setup

Open another terminal and change into the /Project UI folder (the name contains a space, so quote it):
```
cd "Project UI"
```
Install the required packages:
```
yarn install
```
Then start the development server:
```
yarn run dev
```

You can then view the page at http://localhost:3000
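
If the page does not load, a quick way to tell whether the development server is listening at all is to probe the port. The helper below is a minimal sketch with a made-up name (`port_is_open`); port 3000 comes from the URL above, and the same function could be pointed at whichever port the backend uses.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("frontend reachable on port 3000:", port_is_open("localhost", 3000))
```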

Team Members

- Manish Chhetri
- Nabraj Subedi
- Nirajan Paudel

Contributions and License

Contributions to the project are welcome! If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request. The project is licensed under the MIT License.
