Navigator Helpers 🚀

Gretel Navigator is a compound AI-based system for generating high-quality synthetic data using contextual tags and an evolutionary approach. This method iteratively improves, validates, and evaluates outputs to create synthetic data with greater quality than an underlying LLM could do on its own, and to combat hallucinations in AI-generated content.

Installation

Clone the repository:

git clone https://github.com/gretelai/navigator-helpers.git
cd navigator-helpers

Create a virtual environment and activate it:

python3 -m venv venv
source venv/bin/activate

Install the required dependencies:
```
pip install -r requirements.txt
```

Basic Usage

Configuration

The data synthesis configuration is defined in a YAML file. Here's a simplified example:

# Generator configuration
api_key: prompt
llm_model: gretelai/gpt-auto
log_level: INFO
use_reflection: true
output_filename: synthetic_data.jsonl
evolution_generations: 1
num_examples: 1000

# Data model definition
generation_instructions: |
  You are a seasoned SQL expert specializing in crafting intricate, context-rich queries and explanations. 
  Use the provided contextual tags as instructions for generation.

fields:
  - name: sql_context
    type: str
    description: A single string comprising multiple valid PostgreSQL CREATE TABLE statements.
    validator: sql:postgres

  - name: prompt
    type: str
    description: A detailed, nuanced natural language prompt that a user might ask to a database for a particular task.

contextual_tags:
  tags:
    - name: sql_complexity
      values:
        - value: Moderate
          weight: 0.4
        - value: Complex
          weight: 0.3
        - value: Very Complex
          weight: 0.2
        - value: Expert
          weight: 0.1

evolution:
  rate: 0.1
  strategies:
    - Enhance the schema to include domain-specific tables and data types.
    - Add relevant indexes, constraints, and views reflecting real-world designs.

Generating Synthetic Data

To generate synthetic data, use the run_generation.py script:

python examples/run_generation.py examples/example_nl2sql.yml

This script will:

Load the YAML configuration
Create a DataModel from the YAML
Initialize the SyntheticDataGenerator
Generate the data

The output will be saved to the file specified by output_filename in the YAML configuration.

Key Features

YAML Configuration: All configuration is centralized in a YAML file, making it easier to manage and modify.
Evolutionary Process: Controlled by the evolution section in the YAML config, including strategies and rate.
Contextual Tags: Defined in the YAML config under contextual_tags, supporting weighted values.
Field-specific Configuration: Each field can have its own type, description, and validator.
Flexible Validation: Validators can be specified for each field in the YAML config.
Reflection: Controlled by the use_reflection parameter in the YAML config.
Customizable Output: The output_filename in the YAML config determines the output file name.

For more detailed information on these features, please refer to the source code and comments within the project.

Examples

Example YAML configurations and usage scenarios can be found in the examples/ directory.

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.

License

This project is licensed under the Gretel License. See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 114 Commits
.github/workflows		.github/workflows
config		config
docs		docs
examples		examples
experiments/code_validation		experiments/code_validation
navigator_helpers		navigator_helpers
tests		tests
.gitignore		.gitignore
ADVANCED_USAGE.md		ADVANCED_USAGE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
requirements-dev.txt		requirements-dev.txt
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Navigator Helpers 🚀

Table of Contents

Installation

Basic Usage

Configuration

Generating Synthetic Data

Key Features

Examples

Contributing

License

About

Releases

Packages

Contributors 7

Languages

License

gretelai/navigator-helpers

Folders and files

Latest commit

History

Repository files navigation

Navigator Helpers 🚀

Table of Contents

Installation

Basic Usage

Configuration

Generating Synthetic Data

Key Features

Examples

Contributing

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 7

Languages

Packages