SQuAD_Agent_Experiment

title	app_file	sdk	sdk_version	python_version
SQuAD_Agent_Experiment	app.py	gradio	5.0.1	3.11.9

Overview

The project is built with Transformers Agents 2.0 and uses the Stanford Question Answering Dataset (SQuAD) for training. The chatbot is designed to answer questions about the dataset, using conversational context and a set of tools to make the conversation more natural and engaging.

At the time of writing, the project is available on Hugging Face Spaces.

Getting Started

Install dependencies:

Requires Python >= 3.11.9

pip install -r pre-requirements.txt
pip install -r requirements.txt

Set up required keys:

Create a .env file and set the following environment variables:

HF_TOKEN=<your token>
OPENAI_API_KEY=<your key>
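
How the app reads these values is not shown in this README. As a minimal sketch, assuming a plain KEY=VALUE .env file in the working directory, the variables could be loaded into the process environment and checked before startup as follows. The helper names load_env_file and missing_keys are hypothetical and not part of the project; a library such as python-dotenv is the more common choice.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file and export them (hypothetical helper)."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Values already set in the environment take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def missing_keys(required=("HF_TOKEN", "OPENAI_API_KEY")) -> list[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]


if __name__ == "__main__":
    load_env_file()
    absent = missing_keys()
    if absent:
        print("Missing environment variables:", ", ".join(absent))
```

Checking for missing keys up front gives a clearer error than a failed API call later in the session.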

Run the app:

python app.py

Methods Used

SQuAD Dataset: The chatbot is trained on the Stanford Question Answering Dataset (SQuAD), which contains over 100,000 question-answer pairs drawn from 500+ articles.
RAG: Retrieval-augmented generation (RAG) improves a chatbot's accuracy by grounding its answers in a custom knowledge base. In this project, the SQuAD dataset serves as the knowledge base.
Transformers Agents 2.0: Transformers Agents 2.0 is a Hugging Face framework for building tool-using agents. It is used in this project to build the chatbot.
Tools: A SquadRetrieverTool integrates a fine-tuned BERT model into the agent, and a TextToImageTool offers a playful way to engage with the question-answering agent.
Gradio: Gradio is used to create the chatbot interface, in app.py.
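
The retrieval step behind RAG can be illustrated with a small, self-contained sketch. It is not the project's SquadRetrieverTool: it swaps the fine-tuned BERT model for a simple bag-of-words cosine similarity, and the passages are toy sentences written for this example rather than rows from SQuAD. The function names (tokenize, retrieve, build_prompt) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k passages ranked by similarity to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query_vec, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, passages: list[str], top_k: int = 2) -> str:
    """Place the retrieved passages in front of the question for the model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Toy knowledge base (illustrative sentences, not taken from the dataset).
PASSAGES = [
    "The Normans gave their name to Normandy, a region in France.",
    "Photosynthesis converts light energy into chemical energy.",
    "The Amazon rainforest covers much of the Amazon basin in South America.",
]

if __name__ == "__main__":
    print(build_prompt("Where is Normandy?", PASSAGES, top_k=1))
```

In a real retriever the token counts would be replaced by dense embeddings, but the ranking and prompt-assembly steps keep the same shape.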

Evaluation

SemScore, a metric based on the semantic textual similarity between a response and a reference answer, is used to evaluate the chatbot's responses. The evaluation is in the notebook benchmarking.ipynb.

See SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
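
The core of a semantic-similarity score of this kind is a cosine similarity between the embedding of the model's response and the embedding of the reference answer. The sketch below shows only that scoring step, on precomputed vectors. It makes no claim about how semscore.py is written, and the embedding model that produces the vectors is deliberately left out; the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("Embeddings must have the same dimension.")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # A zero vector carries no direction to compare.
    return dot / (norm_u * norm_v)


def mean_similarity(
    pred_embeddings: Sequence[Sequence[float]],
    ref_embeddings: Sequence[Sequence[float]],
) -> float:
    """Average the per-example similarity over a benchmark set."""
    if len(pred_embeddings) != len(ref_embeddings):
        raise ValueError("Need one reference embedding per prediction.")
    scores = [
        cosine_similarity(p, r) for p, r in zip(pred_embeddings, ref_embeddings)
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

A score near 1 means the response and the reference point in nearly the same direction in embedding space; a score near 0 means they are unrelated.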

In this experiment, the agent is evaluated with three different system prompting approaches:

The default prompting approach, which is the default system prompt used in Hugging Face Transformers Agents 2.0, with only an example of using the squad_retriever tool added.
A succinct prompting approach, which guides the agent to be as concise as possible while still answering the question.
A focused prompting approach, which reframes the chatbot's entire purpose around the specific task of answering questions about the SQuAD dataset, while still being open to exploring other topics.
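
A comparison across the three approaches can be sketched as a loop that runs the same questions under each system prompt and averages a score per approach. This is an assumed harness, not the code in benchmarking.ipynb: compare_prompts, answer_fn, and score_fn are hypothetical names, and the prompt strings are placeholders rather than the project's actual prompts.

```python
from statistics import mean
from typing import Callable


def compare_prompts(
    system_prompts: dict[str, str],
    examples: list[tuple[str, str]],
    answer_fn: Callable[[str, str], str],
    score_fn: Callable[[str, str], float],
) -> dict[str, float]:
    """Return the mean score per prompting approach.

    answer_fn(system_prompt, question) stands in for a call to the agent;
    score_fn(response, reference) stands in for the evaluation metric.
    """
    results: dict[str, float] = {}
    for name, system_prompt in system_prompts.items():
        scores = [
            score_fn(answer_fn(system_prompt, question), reference)
            for question, reference in examples
        ]
        results[name] = mean(scores) if scores else 0.0
    return results


# Placeholder prompts; the real text is not reproduced here.
SYSTEM_PROMPTS = {
    "default": "<default system prompt>",
    "succinct": "<succinct system prompt>",
    "focused": "<focused system prompt>",
}
```

Keeping the agent call and the metric as injected functions makes it easy to run the loop with stubs before spending API quota on a full benchmark.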

Results

Limitations

This experiment is not designed for multiple users. While it has in-session memory, simply refreshing the browser will reset the chat history, which is convenient for experimentation.
Some of the agent's underlying engines, models, and tools use keys that have usage limits, so the app may not work if those limits have been reached.
- To avoid those limits, clone the repo and run the code with your own keys.
