Support loading ensemble from pandas and dask dataframes #224

Merged 6 commits on Sep 8, 2023

Conversation

wilsonbb
Collaborator

@wilsonbb wilsonbb commented Sep 8, 2023

Adds the loader functions Ensemble.from_pandas and Ensemble.from_dask_dataframe. These let users pass TAPE pandas or Dask dataframes that they have built from their own preferred sources and I/O, as outlined in issue #223.

Ensemble.from_source_dict and Ensemble.from_parquet were simplified by moving some of their functionality into Ensemble.from_dask_dataframe.

@wilsonbb wilsonbb linked an issue Sep 8, 2023 that may be closed by this pull request
@codecov

codecov bot commented Sep 8, 2023

Codecov Report

Patch coverage: 86.48% and project coverage change: +0.13% 🎉

Comparison is base (776d05e) 92.44% compared to head (43f111d) 92.57%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #224      +/-   ##
==========================================
+ Coverage   92.44%   92.57%   +0.13%     
==========================================
  Files          22       22              
  Lines        1125     1132       +7     
==========================================
+ Hits         1040     1048       +8     
+ Misses         85       84       -1     
Files Changed Coverage Δ
src/tape/ensemble.py 89.46% <86.48%> (+0.37%) ⬆️


@wilsonbb wilsonbb marked this pull request as ready for review September 8, 2023 05:48
@wilsonbb
Collaborator Author

wilsonbb commented Sep 8, 2023

I will add another test to raise the code coverage, but I wanted this to be ready for a first review pass.

Contributor

@hombit hombit left a comment

Thank you, it looks great! I've listed some general questions and minor code suggestions.

Review threads:
tests/tape_tests/test_ensemble.py (outdated, resolved)
src/tape/ensemble.py (outdated, resolved)
tests/tape_tests/conftest.py (outdated, resolved)
src/tape/ensemble.py (resolved)
src/tape/ensemble.py (resolved)
@wilsonbb wilsonbb merged commit c00341b into main Sep 8, 2023
8 of 9 checks passed
@dougbrn dougbrn deleted the tape_load_dask branch December 11, 2023 19:26
Development

Successfully merging this pull request may close these issues.

Construct Ensemble from Dask data-frames
2 participants