Home
James Bergstra edited this page Mar 14, 2013 · 12 revisions
Welcome to the skdata (scikit-data) wiki!
This is not the main entry point for the project; that is the skdata project home page.
The goal of the skdata project is to standardize the representation of community benchmark data sets (including large and awkward ones) and to facilitate the development of broadly applicable machine learning algorithm implementations. Skdata is meant to interoperate with other Python machine learning software (such as scikit-learn, PyBrain, or custom algorithms), but it does not aim to provide machine learning algorithms itself.
- The code of the library is currently usable (and frequently used).
- The API should not be considered stable; it will probably remain a work in progress for some time.
- Tests cover some but not all of the code (an estimated 50% of lines).
- Some of the older data set modules do not use the newer "dataset.py", "view.py" code layout; forward-porting them is on the TODO list.
- Data set list enumerates the data sets that skdata supports.
- Managing Data explains how to configure where large files are stored.
- Evaluation Protocol introduces the communication protocol between view objects and learning algorithms.
- How to create a new dataset explains how data set submodules should be organized and what they should do.