WIP - dataframe caching #36

Closed
wants to merge 7 commits into from


Collaborator

@shouples shouples commented Sep 2, 2022

This is a somewhat overdue cleanup now that a lot of the push-down filtering logic has been prototyped. The prototype needed a lot more tracking of relationships between display_id and other dataframe info than I initially expected, which led to a lot of abuse of global variables. It "worked" but wasn't clean and won't scale well, so this PR is going to fix at least the bulk of that.

Going forward, we should create DXDataFrameCache objects during the display formatting process. These objects will generate display_id values and track the original attributes of the data, while handling the cleaning and any sampling/truncating. When push-down filtering occurs, these objects should be referenced to get any necessary information and to keep making associations between parent and child datasets.
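To make the idea above concrete, here is a rough sketch of what replacing the globals with a cache keyed by display_id could look like. It is not the code in this PR: `FrameCache`, `CachedFrame`, `register`, `push_down_filter`, and `lineage` are hypothetical names, the `max_rows` truncation is an assumed stand-in for the real sampling/truncating logic, and plain lists of dicts stand in for dataframes so the example has no dependencies.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


@dataclass
class CachedFrame:
    """Metadata tracked for one displayed dataset (hypothetical shape)."""

    display_id: str
    display_rows: List[Row]          # what is actually sent to the frontend
    original_row_count: int          # size before any truncation
    original_columns: List[str]
    truncated: bool
    parent_display_id: Optional[str] = None


class FrameCache:
    """Holds display_id -> dataset associations in one object instead of globals."""

    def __init__(self, max_rows: int = 1000) -> None:
        self.max_rows = max_rows
        self._frames: Dict[str, CachedFrame] = {}
        self._originals: Dict[str, List[Row]] = {}

    def register(
        self, rows: List[Row], parent_display_id: Optional[str] = None
    ) -> CachedFrame:
        # Generate the display_id here so every displayed dataset is tracked.
        display_id = str(uuid.uuid4())
        frame = CachedFrame(
            display_id=display_id,
            display_rows=rows[: self.max_rows],  # assumed truncation rule
            original_row_count=len(rows),
            original_columns=list(rows[0]) if rows else [],
            truncated=len(rows) > self.max_rows,
            parent_display_id=parent_display_id,
        )
        self._frames[display_id] = frame
        self._originals[display_id] = rows
        return frame

    def get(self, display_id: str) -> CachedFrame:
        return self._frames[display_id]

    def push_down_filter(
        self, display_id: str, predicate: Callable[[Row], bool]
    ) -> CachedFrame:
        # Filter the untruncated parent data, then register the result as a child.
        matching = [row for row in self._originals[display_id] if predicate(row)]
        return self.register(matching, parent_display_id=display_id)

    def lineage(self, display_id: str) -> List[str]:
        # Walk parent links from a dataset back to its root.
        chain: List[str] = []
        current: Optional[str] = display_id
        while current is not None:
            chain.append(current)
            current = self._frames[current].parent_display_id
        return chain
```

With something like this, a filter request only needs to carry the parent's display_id: the cache supplies the original data and records the new child's link back to it.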

@shouples
Collaborator Author

closing and resubmitting to separate branch/PR after restructuring in #34

@shouples shouples closed this Sep 22, 2022
@shouples shouples deleted the djs/df-caching branch March 17, 2023 01:13