Reduce Pandas deprecation warnings. #263

Open: jimwhite wants to merge 8 commits into main

@jimwhite jimwhite commented Sep 18, 2024

Hi Stefan,

This is the bulk of changes needed to reduce/eliminate Pandas future/deprecation warnings when using Pandas 2.2. There were about 33000 when running the tests and this gets down to around 1500. This change is a WIP but I wanted to create the PR early and get your feedback on how to proceed. I'm also still dealing with errors in tests/data/test_daily_bars.py and tests/test_bar_data.py (I think the issue has to do with the timestamp format in the test data init).
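When a test run emits tens of thousands of warnings, grouping them by category and message usually shows that a handful of call sites account for most of the volume. The following is a minimal sketch using only the standard library; `legacy_call` and its warning messages are hypothetical stand-ins for illustration, not actual Zipline or Pandas code.

```python
import warnings
from collections import Counter


def legacy_call(n):
    """Hypothetical stand-in for code that triggers deprecation warnings."""
    for _ in range(n):
        # Invented message, for illustration only.
        warnings.warn("example: old frequency alias", FutureWarning, stacklevel=2)
    warnings.warn("example: old helper", DeprecationWarning, stacklevel=2)


def tally_warnings(func, *args, **kwargs):
    """Run func and count emitted warnings, grouped by (category, message)."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" records every occurrence instead of only the first per location.
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return Counter((w.category.__name__, str(w.message)) for w in caught)


counts = tally_warnings(legacy_call, 3)
for (category, message), n in counts.most_common():
    print(f"{n:>5}  {category}: {message}")
```

Sorting by count puts the noisiest warning first, which is usually where a single fix removes the most output.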

I've only tested with Python 3.11 on macOS, so I expect to see issues from the other versions in the GitHub Actions testing.

I really appreciate the work you've done to keep Zipline alive. I've got a Polygon.io stock data bundle in progress (https://github.com/fovi-llc/zipline-polygon-bundle) which is finally working end-to-end (though there is still much to do).

Thanks!
Jim

@stefan-jansen
Owner

Thanks @jimwhite, please let me know if you have any questions. Hope the new release suits your situation.

@jimwhite
Author

jimwhite commented Sep 26, 2024

> Thanks @jimwhite, please let me know if you have any questions. Hope the new release suits your situation.

Well, the problem (a high warning count when running the tests) has grown: the count is now 53371 (up from around 31K):

=================================================================================== short test summary info ===================================================================================
FAILED tests/pipeline/test_factor.py::SummaryTestCase::test_summaries_after_fillna - zipline.testing.core.SubTestFailures: failures:
FAILED tests/pipeline/test_factor.py::SummaryTestCase::test_summary_methods - zipline.testing.core.SubTestFailures: failures:
FAILED tests/pipeline/test_statistical.py::TestStatisticalBuiltIns::test_correlation_factors[4-2] - KeyError: TestingDataSet<US>.float_col::float64
FAILED tests/pipeline/test_statistical.py::TestStatisticalBuiltIns::test_simple_beta_target - assert Equity(1 [A]) is Equity(1 [A])
FAILED tests/pipeline/test_statistical.py::StatisticalMethodsTestCase::test_factor_correlation_methods - zipline.testing.core.SubTestFailures: failures:
================================================= 5 failed, 3163 passed, 16 skipped, 5 xfailed, 1 xpassed, 53371 warnings in 96.53s (0:01:36) =================================================
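One way to stop a count like this from creeping back up is to escalate a chosen warning category to an exception, so a new occurrence fails loudly with a traceback pointing at the call site. This is a minimal standard-library sketch, not something taken from the PR; `emit_future_warning` and `run_strict` are hypothetical names.

```python
import warnings


def emit_future_warning():
    """Hypothetical stand-in for code that calls a deprecated API."""
    warnings.warn("example: behavior will change", FutureWarning, stacklevel=2)
    return "ok"


def run_strict(func):
    """Call func with FutureWarning escalated to an exception."""
    with warnings.catch_warnings():
        # Inside this block, any FutureWarning is raised instead of printed.
        warnings.simplefilter("error", category=FutureWarning)
        return func()


try:
    run_strict(emit_future_warning)
    escalated = False
except FutureWarning as exc:
    escalated = True
    print(f"caught: {exc}")
```

The filter change is scoped to the `with` block, so other code keeps its normal warning behavior. pytest offers the same idea at the test-suite level through its `-W` command-line option and `filterwarnings` ini setting.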

My reason for trying to get the warning count down to something more manageable is that I'm thinking in terms of future maintenance and enhancements for zipline-reloaded.

For example, I'm working on a Polygon.io bundle (https://github.com/fovi-llc/zipline-polygon-bundle), and the fact that bcolz has now also been reloaded has me thinking about replacing it with PyArrow Hive (or adding that as an alternative).

Shall I update my PR to this new HEAD?
