Broken datasets due to pandas API changes #13

Closed
francois-rozet opened this issue Aug 23, 2023 · 2 comments

@francois-rozet

Hello @gpapamak,

Due to API changes in pandas, the GAS and HEPMASS datasets are no longer usable. Notably, the DataFrame.as_matrix method was deprecated in pandas 0.23.0 and removed in pandas 1.0.0, and the DataFrame pickling format of pandas<2.0 is not compatible with pandas>=2.0. There is also an issue with Counter.iteritems, which was removed in Python 3.0 in favor of Counter.items.
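
For reference, here is a minimal sketch of the modern equivalents of the two calls mentioned above, assuming pandas>=0.24 (where DataFrame.to_numpy was introduced). The toy DataFrame and Counter are made up for illustration and are not taken from the repository's code.

```python
from collections import Counter

import pandas as pd

# Toy data, for illustration only (not the GAS or HEPMASS data).
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

# Old: df.as_matrix()  (removed in pandas 1.0.0)
# New: df.to_numpy() returns the underlying values as a NumPy array.
values = df.to_numpy()

counts = Counter(["x", "y", "x"])

# Old: counts.iteritems()  (Python 2 only)
# New: counts.items() yields the same (key, count) pairs in Python 3.
pairs = sorted(counts.items())
```

This does not address the pickle incompatibility: DataFrames pickled with pandas<2.0 would still need to be re-exported from an environment that can read them.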

I don't think modifying this repository to fix these issues is a good idea, as it could break the existing code. Instead, I made a lightweight fork (francois-rozet/uci-datasets) of the repo's UCI datasets and wrote instructions to generate environment-agnostic .npy files containing the processed data. These .npy files can then be used without relying on the original code and its dependencies. I hope that's OK with you.
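
To illustrate why .npy files are environment-agnostic, here is a minimal sketch of a save/load round trip that needs only NumPy on the reading side. The array contents and the file name `train.npy` are hypothetical and do not reflect the fork's actual file layout.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for a processed dataset split.
data = np.arange(12, dtype=np.float32).reshape(4, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.npy")  # hypothetical file name

    # Writing stores the raw array plus its shape and dtype, no pickled objects.
    np.save(path, data)

    # Reading requires only NumPy, not pandas or the original preprocessing code.
    loaded = np.load(path)
```

Since plain numeric arrays are stored without pickling, the files can be read by `np.load` with its default `allow_pickle=False`.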

@francois-rozet francois-rozet changed the title Unsable datasets due to pandas API changes Broken datasets due to pandas API changes Aug 23, 2023
@gpapamak
Owner

Thanks François for the bug report and for creating the fork! I agree, it's better to use your fork rather than modifying the repository. I added a comment to the README linking to francois-rozet/uci-datasets. Thanks again!

(Sorry it took me so long to respond... I don't actively maintain this repository anymore, so I barely check for issues.)

@gpapamak gpapamak pinned this issue Jul 28, 2024
@gpapamak gpapamak added the bug label Jul 28, 2024
@francois-rozet
Author

I had forgotten about that issue 😂 Thank you for taking a look.
