Add ConstInspector and ConstValueTransformer for Handling Constant Columns #202
for more information, see https://pre-commit.ci
The management of metadata fields may be flawed; the __eq__ method or the way fields are retrieved needs to be examined. We will open a separate pull request to address this issue.
Erroneous references to certain pytest.fixture instances in the tests can be resolved by using deepcopy.
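The commit message does not say which fixtures were affected or why, so the following is only a minimal sketch of the general pattern, assuming the problem was a fixture object being shared and mutated across tests. The names `shared_metadata` and `mark_constant` are hypothetical and are not part of the SDG code base.

```python
import copy

# Stand-in for an object returned by a module- or session-scoped pytest
# fixture (hypothetical; the real fixtures are not shown in this PR page).
shared_metadata = {"column_list": ["id", "city"], "const_columns": set()}


def mark_constant(metadata, column):
    """Mutate the metadata in place, as a test body that updates it might."""
    metadata["const_columns"].add(column)
    return metadata


# Work on a deep copy so nested containers are duplicated as well.
# A shallow copy.copy() would still share the inner set with the original.
private_metadata = copy.deepcopy(shared_metadata)
mark_constant(private_metadata, "city")

# The shared object is untouched and remains safe for other tests to use.
print(shared_metadata["const_columns"])
print(private_metadata["const_columns"])
```

Inside a real test, the same idea would be to call `copy.deepcopy` on the fixture value at the top of the test body before changing it.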
…s to ensure they are comprehensive and reflect the latest functionality.
We have observed that several Metadata objects initialized within the same function or the same batch of unit tests seem to interfere with one another, leading to inaccurate table metadata. This might be a bug, and we should open a separate issue and PR to address it. For example, we can look at the error in the test , in |
Description
This pull request introduces several enhancements and fixes to the Synthetic Data Generator (SDG) framework, focusing on the handling of constant columns in tabular data. The changes include:
- ConstInspector class to identify columns with constant values in a DataFrame.
- ConstValueTransformer class to transform and reverse transform data by replacing specified columns with constant values.
Motivation and Context
This change is required to improve the quality and utility of the synthetic data generated by the SDG framework.
By identifying and handling constant columns, we ensure that the synthetic data maintains the integrity of the original data.
This enhancement also addresses the need for more robust data transformation capabilities, allowing for more accurate and controlled generation of synthetic data.
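To make the idea of identifying and handling constant columns concrete, here is a minimal pandas sketch. It is not the ConstInspector or ConstValueTransformer implementation from this pull request: the function names below are hypothetical, and it assumes that the forward transform drops constant columns and the reverse transform writes the recorded values back.

```python
import pandas as pd


def find_constant_columns(df: pd.DataFrame) -> dict:
    """Map each constant column name to its single value (hypothetical helper)."""
    constants = {}
    for name in df.columns:
        # nunique(dropna=False) counts NaN as a value, so an all-NaN
        # column is also reported as constant.
        if len(df) > 0 and df[name].nunique(dropna=False) == 1:
            constants[name] = df[name].iloc[0]
    return constants


def drop_constant_columns(df: pd.DataFrame, constants: dict) -> pd.DataFrame:
    """Forward step (assumed): remove constant columns before modelling."""
    return df.drop(columns=list(constants))


def restore_constant_columns(df: pd.DataFrame, constants: dict, column_order: list) -> pd.DataFrame:
    """Reverse step (assumed): write the constants back in the original order."""
    restored = df.copy()
    for name, value in constants.items():
        restored[name] = value  # a scalar is broadcast to every row
    return restored[column_order]


raw = pd.DataFrame({"age": [31, 45, 27], "country": ["US", "US", "US"]})
constants = find_constant_columns(raw)
reduced = drop_constant_columns(raw, constants)

# Stand-in for a synthesizer: resample the reduced table with replacement.
sampled = reduced.sample(n=5, replace=True, random_state=0).reset_index(drop=True)
synthetic = restore_constant_columns(sampled, constants, list(raw.columns))
```

Under these assumptions a generative model never has to learn a column that carries no variation, and the restored output is guaranteed to contain exactly the original constant value.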
How has this been tested?
The changes have been thoroughly tested using unit tests that cover the new functionality introduced by ConstInspector and ConstValueTransformer.
Types of changes
Checklist: