NanoGPT is a lightweight custom GPT (Generative Pre-trained Transformer) model designed for the WMT2014 English-to-Hindi machine translation task. The model is implemented in PyTorch and is inspired by the Transformer architecture introduced in the paper "Attention is All You Need" by Vaswani et al.
- Introduction
- Features
- Installation
- Usage
- Model Architecture
- Training
- Inference
- Acknowledgements
Machine translation is a challenging natural language processing task: the model must carry the meaning of a sentence from one language into another. NanoGPT targets the English-to-Hindi direction specifically, using the Transformer architecture with the goal of strong translation quality from a small, lightweight model.
- Lightweight GPT model tailored for English-to-Hindi translation.
- Based on the Transformer architecture introduced in the paper "Attention is All You Need."
- Easy-to-use PyTorch implementation for training and inference.
pip install torch
# Additional dependencies may be required. Refer to the requirements.txt file.
To use NanoGPT for English-to-Hindi translation, follow these steps:
- Clone the repository:
git clone https://github.com/your_username/nanogpt.git
cd nanogpt
- Install dependencies:
pip install -r requirements.txt
- Train the model using your dataset or use a pre-trained model.
- Perform inference on new English sentences to get Hindi translations.
The NanoGPT model architecture is based on the Transformer introduced in the "Attention is All You Need" paper. It consists of multiple layers of self-attention and feedforward neural networks, enabling effective learning of contextual information for translation.
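To make the self-attention step concrete, here is a minimal sketch in plain Python rather than PyTorch. It is an illustration under assumptions, not the code in this repository: the function names are hypothetical, and the causal mask (each token sees only itself and earlier tokens) is assumed because that is the usual choice for GPT-style models.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position i attends to positions <= i.

    queries, keys, values: lists of equal-length float vectors, one per token.
    Returns one output vector per token.
    """
    dim = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Causal mask (assumed): only positions 0..i are visible to token i.
        scores = [
            sum(qc * kc for qc, kc in zip(q, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input: three tokens with 2-dimensional embeddings, used as Q, K and V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_self_attention(tokens, tokens, tokens)
```

A full Transformer layer would add learned query, key and value projections, multiple heads, residual connections, layer normalization and the feedforward network around this step.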
To train NanoGPT on the WMT2014 English-to-Hindi dataset, use the following command:
python train.py --dataset_path /path/to/wmt2014/dataset
Additional training options and hyperparameters can be configured in the config.py file.
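As a rough sketch of how a command-line flag and config defaults can fit together, the example below parses `--dataset_path` and layers it over a set of default hyperparameters. Only `--dataset_path` comes from the command above; `TrainConfig`, its fields, its default values and the extra flags are hypothetical and may not match the actual train.py or config.py.

```python
import argparse
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # All fields and defaults here are hypothetical placeholders.
    dataset_path: str = ""
    batch_size: int = 32
    learning_rate: float = 3e-4
    max_epochs: int = 10


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train NanoGPT (sketch)")
    # --dataset_path is the one flag shown in the training command.
    parser.add_argument("--dataset_path", required=True)
    # Hypothetical overrides; a default of None means "keep the config value".
    parser.add_argument("--batch_size", type=int, default=None)
    parser.add_argument("--learning_rate", type=float, default=None)
    return parser.parse_args(argv)


def build_config(args):
    """Start from the defaults and apply any flags the user actually passed."""
    config = TrainConfig()
    for name, value in vars(args).items():
        if value is not None:
            setattr(config, name, value)
    return config


config = build_config(parse_args(["--dataset_path", "/path/to/wmt2014/dataset"]))
```

Keeping defaults in one place and letting flags override them means a run can be reproduced from the command line alone.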
To run inference, load a pre-trained model and pass it an English sentence to get the Hindi translation:
from nanogpt import NanoGPT, translate_sentence
model = NanoGPT.load_model('/path/to/pretrained/model')
english_sentence = "Hello, how are you?"
hindi_translation = translate_sentence(model, english_sentence)
print(f"English: {english_sentence}\nHindi: {hindi_translation}")
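The internals of `translate_sentence` are not shown here. One common way for a GPT-style model to produce a translation is greedy decoding: feed in the source tokens, then repeatedly append the highest-scoring next token until an end-of-sequence token appears. The sketch below illustrates that loop with a toy scoring function standing in for the model; `greedy_decode`, `toy_scores` and the token ids are hypothetical and are not part of the nanogpt package.

```python
def greedy_decode(next_token_scores, prompt_ids, eos_id, max_new_tokens=50):
    """Generate token ids one at a time, always taking the top-scoring token.

    next_token_scores: callable mapping a list of token ids to a list of
        scores, one per vocabulary entry (stands in for a model forward pass).
    """
    ids = list(prompt_ids)
    generated = []
    for _ in range(max_new_tokens):
        scores = next_token_scores(ids)
        # Index of the highest score is the chosen token id.
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == eos_id:
            break
        ids.append(best)
        generated.append(best)
    return generated


def toy_scores(ids):
    """Toy stand-in for a model: 4-token vocabulary, always prefers last id + 1."""
    target = min(ids[-1] + 1, 3)
    return [1.0 if token == target else 0.0 for token in range(4)]


# Prompt is token 0; token 3 plays the role of end-of-sequence.
translation_ids = greedy_decode(toy_scores, prompt_ids=[0], eos_id=3)
```

In a real pipeline the prompt would be the tokenized English sentence and the generated ids would be decoded back into Hindi text. Beam search or sampling are common alternatives to the greedy choice.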
Acknowledgements
This project is built upon the Transformer architecture and is heavily influenced by the work of Vaswani et al. in "Attention is All You Need."