Hello @gpapamak,

Due to API changes in pandas, the GAS and HEPMASS datasets are no longer usable. Notably, the `DataFrame.as_matrix` method was deprecated in pandas 0.23.0 and removed in pandas 1.0, and the `DataFrame` pickling format of pandas<2.0 is not compatible with pandas>=2.0. There is also an issue with `Counter.iteritems`, which was removed in Python 3.0.
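For anyone hitting these errors, here is a minimal sketch of the usual modern replacements for the two removed calls. The toy `DataFrame` and `Counter` below are made up for illustration and are not taken from the repository's loaders; the pickle incompatibility is a separate problem that this does not address.

```python
from collections import Counter

import pandas as pd

# Toy stand-ins for the real data (hypothetical values).
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
counts = Counter(["x", "y", "x"])

# Old call: data = df.as_matrix()
# DataFrame.to_numpy() (pandas >= 0.24) returns the same kind of ndarray.
data = df.to_numpy()

# Old call: for key, n in counts.iteritems(): ...
# In Python 3, items() is the equivalent method.
pairs = sorted(counts.items())
```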
I don't think modifying this repository to fix these issues is a good idea, as it could break the code. Instead, I made a lightweight fork (francois-rozet/uci-datasets) of the repo's UCI datasets and wrote instructions to generate environment-agnostic `.npy` files containing the processed data. These `.npy` files can then be used without relying on the original code and its dependencies. I hope that's OK with you.
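To illustrate why `.npy` files sidestep the dependency problem, here is a small sketch of the round trip using only NumPy. The array contents and the file name `train.npy` are placeholders I chose for the example; the fork's actual file layout may differ.

```python
import os
import tempfile

import numpy as np

# Placeholder for a processed dataset split (hypothetical values).
processed = np.arange(12, dtype=np.float32).reshape(4, 3)

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "train.npy")
np.save(path, processed)

# Loading needs only NumPy: no pandas and no pickled DataFrame.
restored = np.load(path)
```

Because the file stores a plain array rather than a pickled pandas object, it can be read back regardless of which pandas version (if any) is installed.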
francois-rozet changed the title from "Unsable datasets due to pandas API changes" to "Broken datasets due to pandas API changes" on Aug 23, 2023
Thanks François for the bug report and for creating the fork! I agree, it's better to use your fork rather than modifying the repository. I added a comment to the README linking to francois-rozet/uci-datasets. Thanks again!
(Sorry it took me so long to respond... I don't actively maintain this repository anymore, so I barely check for issues.)