collection of text2cypher datasets, evaluations, and finetuning instructions
Updated Jun 13, 2024 - Jupyter Notebook
Repository for organizing datasets and papers used in Open LLM.
A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ
A collection of LLM-related papers, theses, tools, datasets, courses, open-source models, and benchmarks
Efficiently fetch and perform sentiment analysis (Turkish Only) on eksisozluk.com entries using Rust
Python source code from a number of well-known repositories, collected in this repo as plain local documents for training code AI models
WikiText syntax dataset generation pipeline and open dataset for auto UI generation in TiddlyWiki. (WIP)
Synthetically generates intent-aware information-seeking dialogues. Useful for tasks such as training and evaluating user intent predictors, with the option of training or evaluating on real human dialogues. The backbone LLM of SOLID is Zephyr-7b-beta.
PARROT (Performance Assessment of Reasoning and Responses On Trivia) is a novel benchmarking framework designed to evaluate Large Language Models (LLMs) on real-world, complex, and ambiguous QA tasks.
A modified dataset consisting of English dialogs between a user and an assistant discussing movie preferences in natural language.
Collection of ETL scripts used to create a dataset of text in Spanish to train Large Language Models.