This repository contains the assignments from the Natural Language Processing course at the University of Tehran. The assignments cover a range of NLP concepts, from tokenization to knowledge-based QA systems. Below is an overview of each assignment.
- **Assignment 1**
  - Topics: Tokenization, Custom Tokenizers, BERT, GPT, N-Gram Language Models.
  - Key Tasks:
    - Implement a custom tokenizer using regular expressions.
    - Compare tokenization methods in BERT and GPT.
    - Build N-gram models for text completion (see the sketch after this list).
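A minimal sketch of how this assignment's pieces might fit together: a regular-expression tokenizer feeding a bigram model that proposes the next token. The regex pattern, the toy corpus, and the function names are illustrative assumptions, not the assignment's actual code or data.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    # Split lowercase text into word tokens and standalone punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text.lower())

def train_bigram(sentences):
    # Count how often each token follows another, with sentence boundary markers.
    counts = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + tokenize(sent) + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def complete(counts, prev_token):
    # Propose the most frequent continuation of prev_token.
    if not counts[prev_token]:
        return "</s>"
    return counts[prev_token].most_common(1)[0][0]

corpus = ["natural language processing is fun", "language models complete text"]
model = train_bigram(corpus)
print(tokenize("NLP, it's fun!"))   # ['nlp', ',', 'it', "'", 's', 'fun', '!']
print(complete(model, "language"))  # most frequent continuation of "language"
```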
- **Assignment 2**
  - Topics: Sentiment Analysis, Sarcasm Detection, Word Embeddings (GloVe), Logistic Regression.
  - Key Tasks:
    - Perform sentiment analysis using Naive Bayes.
    - Detect sarcasm using Logistic Regression and GloVe embeddings (see the sketch after this list).
    - Explore word similarities with skip-gram.
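As a rough illustration of the sarcasm-detection pipeline, the sketch below averages pre-trained GloVe vectors into sentence features and fits a scikit-learn logistic regression. The GloVe file path, the embedding dimension, and the two toy examples are assumptions for demonstration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def load_glove(path):
    # Parse a GloVe text file into {word: vector}.
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def embed(sentence, vectors, dim=100):
    # Average the vectors of known words; all-zeros if no word is known.
    words = [vectors[w] for w in sentence.lower().split() if w in vectors]
    return np.mean(words, axis=0) if words else np.zeros(dim, dtype=np.float32)

glove = load_glove("glove.6B.100d.txt")    # assumed local copy of GloVe vectors
texts = ["oh great , another monday", "the weather is lovely today"]
labels = [1, 0]                            # toy labels: 1 = sarcastic
X = np.stack([embed(t, glove) for t in texts])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```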
- **Assignment 3**
  - Topics: Semantic Role Labeling (SRL), LSTM and GRU Encoders, Encoder-Decoder Models.
  - Key Tasks:
    - Perform semantic role labeling on input sentences.
    - Implement LSTM- and GRU-based encoders for SRL (see the sketch after this list).
    - Convert SRL tasks into question-answer pairs.
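One common way to realize the recurrent encoders for SRL is a bidirectional tagger that predicts a role label for every token. The PyTorch sketch below uses placeholder vocabulary and label sizes; the actual assignment may use different dimensions, features, or an encoder-decoder formulation.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Token-level role tagger: embeddings -> BiLSTM -> per-token role logits."""

    def __init__(self, vocab_size=5000, emb_dim=100, hidden=128, num_roles=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_roles)   # one role label per token

    def forward(self, token_ids):
        x = self.embed(token_ids)      # (batch, seq_len, emb_dim)
        h, _ = self.lstm(x)            # (batch, seq_len, 2 * hidden)
        return self.out(h)             # (batch, seq_len, num_roles)

model = BiLSTMTagger()
dummy_batch = torch.randint(0, 5000, (2, 12))   # 2 sentences, 12 tokens each
print(model(dummy_batch).shape)                  # torch.Size([2, 12, 20])
```

Swapping `nn.LSTM` for `nn.GRU` gives the GRU variant with an identical interface, which makes the comparison between the two encoders straightforward.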
- **Assignment 4**
  - Topics: Fine-Tuning, LoRA, QLoRA, In-Context Learning (ICL).
  - Key Tasks:
    - Fine-tune large models such as RoBERTa and LLaMA (see the LoRA sketch after this list).
    - Use zero-shot and one-shot learning.
    - Analyze model performance with LoRA and P-Tuning.
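For the parameter-efficient fine-tuning part, a typical pattern is to wrap a pre-trained classifier with LoRA adapters via the Hugging Face `peft` library. The rank, dropout, and `target_modules` below are illustrative choices, not the assignment's exact configuration.

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                               # low-rank adapter dimension
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"], # attention projections to adapt
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()      # only the small adapter matrices train
```

QLoRA follows the same recipe but loads the base model in 4-bit quantization before attaching the adapters, which is what makes fine-tuning models like LLaMA feasible on a single GPU.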
- **Assignment 5**
  - Topics: Machine Translation, BPE, LSTM, Transformer Models.
  - Key Tasks:
    - Build an English-to-Farsi translation system using Fairseq.
    - Train LSTM and Transformer models with BPE tokenization.
    - Evaluate outputs using BLEU and COMET scores (see the BLEU sketch after this list).
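BLEU evaluation of the translation outputs can be scripted with `sacrebleu`, as in the sketch below; the hypothesis and reference sentences are made-up stand-ins for real system output and test data, and COMET scoring (a separate learned metric) is omitted here.

```python
import sacrebleu

# Hypotheses: one system translation per test sentence (toy examples).
hypotheses = ["the cat sat on the mat", "he reads a book every night"]
# References: one list per reference set, aligned with the hypotheses.
references = [["the cat is sitting on the mat", "he reads one book each night"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```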
- **Assignment 6**
  - Topics: Knowledge-Based QA, LangChain, Chain-of-Thought Reasoning.
  - Key Tasks:
    - Build a multi-step QA system using LangChain (see the sketch after this list).
    - Implement relevancy checks and context-based answers.
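A minimal sketch of the multi-step QA flow: first ask the model whether the retrieved context is relevant, then answer only from that context. It uses LangChain's prompt/chat-model piping; the model name, prompts, and helper function are assumptions rather than the assignment's actual chain.

```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)   # assumed model choice

relevancy_prompt = PromptTemplate.from_template(
    "Context:\n{context}\n\nQuestion: {question}\n"
    "Is the context relevant to the question? Answer yes or no."
)
answer_prompt = PromptTemplate.from_template(
    "Answer the question using only the context below.\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def answer(question, context):
    # Step 1: relevancy check; step 2: context-grounded answer.
    verdict = (relevancy_prompt | llm).invoke(
        {"context": context, "question": question}
    ).content
    if "yes" not in verdict.lower():
        return "The retrieved context does not answer this question."
    return (answer_prompt | llm).invoke(
        {"context": context, "question": question}
    ).content
```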
Each assignment is organized in its own folder containing code, data, and reports. Implementations are in Python, using libraries such as PyTorch, Fairseq, and Hugging Face Transformers.