Integrate GaussianCopula model into the Synthesizer. #241

Merged
merged 1 commit into from
Nov 19, 2024

@jalr4ever jalr4ever commented Nov 19, 2024

Description

GaussianCopula now extends SynthesizerModel.

Motivation and Context

The core data pre- and post-processing pipeline is triggered inside Synthesizer. As a result, data is not processed if you only instantiate a model and use it directly to generate synthetic data; the model has to be assigned to a Synthesizer.

Synthesizer now supports model=GaussianCopulaSynthesizerModel(), so code like the following works:

from sdgx.data_connectors.csv_connector import CsvConnector
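# Import path below is an assumption; adjust it to wherever GaussianCopulaSynthesizerModel lives in your sdgx version
from sdgx.models.statistics.single_table.copula import GaussianCopulaSynthesizerModel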
from sdgx.synthesizer import Synthesizer
from sdgx.utils import download_demo_data

# This will download demo data to ./dataset
dataset_csv = download_demo_data()

# Create data connector for csv file
data_connector = CsvConnector(path=dataset_csv)

# Initialize synthesizer, use GaussianCopula model
synthesizer = Synthesizer(
    model=GaussianCopulaSynthesizerModel(),  # For quick demo
    data_connector=data_connector,
)

# Fit the model
synthesizer.fit()

# Sample
sampled_data = synthesizer.sample(1000)
print(sampled_data)
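
To illustrate the motivation above, here is a minimal toy sketch of why a model has to be wrapped for the pipeline to run. The classes `ToyLabelEncoder`, `ToyModel`, and `ToySynthesizer` are hypothetical stand-ins written for this example; they are not sdgx classes and do not reflect how sdgx implements its processors.

```python
import random


class ToyLabelEncoder:
    """Hypothetical processor: maps string categories to integer codes and back."""

    def fit(self, values):
        self.categories = sorted(set(values))
        self.codes = {c: i for i, c in enumerate(self.categories)}

    def convert(self, values):
        return [self.codes[v] for v in values]

    def reverse_convert(self, codes):
        return [self.categories[c] for c in codes]


class ToyModel:
    """Hypothetical model that only understands integer codes."""

    def fit(self, codes):
        if not all(isinstance(c, int) for c in codes):
            raise TypeError("ToyModel expects integer-encoded data")
        self.observed = list(codes)

    def sample(self, n, seed=0):
        rng = random.Random(seed)
        return [rng.choice(self.observed) for _ in range(n)]


class ToySynthesizer:
    """Hypothetical wrapper: pre-processes before fit, post-processes after sample."""

    def __init__(self, model, processor):
        self.model = model
        self.processor = processor

    def fit(self, raw):
        self.processor.fit(raw)
        self.model.fit(self.processor.convert(raw))

    def sample(self, n):
        return self.processor.reverse_convert(self.model.sample(n))


raw = ["red", "green", "red", "blue"]

# Using the model on its own skips the pipeline, so raw strings are rejected.
try:
    ToyModel().fit(raw)
    direct_ok = True
except TypeError:
    direct_ok = False

# Wrapping the model lets the pipeline encode on the way in and decode on the way out.
synth = ToySynthesizer(ToyModel(), ToyLabelEncoder())
synth.fit(raw)
samples = synth.sample(5)
print(direct_ok, samples)
```

In this sketch the wrapper owns the encode and decode steps, which mirrors the point of the PR: once the model can be passed to the Synthesizer, the processing no longer has to be done by hand around the model.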

How has this been tested?

Types of changes

  • Maintenance (no change in code, maintain the project's CI, docs, etc.)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@jalr4ever jalr4ever marked this pull request as ready for review November 19, 2024 08:00

@Wh1isper Wh1isper left a comment

LGTM

@jalr4ever jalr4ever merged commit edfab2e into main Nov 19, 2024
11 checks passed
@jalr4ever jalr4ever deleted the jalr4ever-gaussian-intregration branch November 21, 2024 07:14