Repo for evaluating SQL Agents across different databases and their corresponding evaluation datasets.
Two main notebooks:
build_evaluation_dataset.ipynb
for building and saving evaluation datasets. You can ignore it.
evaluate_agent.ipynb
for evaluating SQL Agents.