Add a datasets.toy_dataframe()? #1189

Open
jeromedockes opened this issue Dec 7, 2024 · 1 comment
Labels
documentation (Add or improve the documentation), good first issue (Good for newcomers)

Comments

@jeromedockes
Member

(or some better name)

A function that creates and returns a small dataframe with columns of various types that we can use in the docstring examples. The goal is to avoid having to create the data in the examples, and also to avoid downloading an actual dataset.
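A minimal sketch of what such a function could look like, assuming pandas. The name `toy_dataframe`, the column names, and the values are all hypothetical placeholders for illustration, not an agreed design:

```python
import pandas as pd


def toy_dataframe():
    """Return a small dataframe with columns of various types.

    Hypothetical sketch: the columns and values are placeholders.
    """
    return pd.DataFrame(
        {
            # integer column
            "id": [1, 2, 3, 4],
            # free-text column
            "name": ["Alice", "Bob", "Carol", "Dan"],
            # low-cardinality categorical column
            "department": pd.Categorical(["sales", "R&D", "R&D", "sales"]),
            # float column with one missing value
            "salary": [52000.0, 61000.5, float("nan"), 48000.0],
            # datetime column
            "hired": pd.to_datetime(
                ["2019-03-01", "2020-07-15", "2021-11-30", "2018-01-08"]
            ),
            # boolean column
            "remote": [True, False, True, False],
        }
    )


df = toy_dataframe()
print(df.dtypes)
```

A docstring example could then start with a single call to the function instead of building a dataframe inline.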

The first step is to make a list of docstrings that could benefit from such a function and see if we can come up with one that would suit a good proportion of them.

An alternative could be to use one of the scikit-learn datasets that don't need network access, but IIRC their columns don't have many different types. Maybe the titanic one?
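A quick way to check the column types of a bundled scikit-learn dataset, using `load_iris` purely as an example (the choice of dataset is an assumption; the other `load_*` datasets can be inspected the same way):

```python
import pandas as pd
from sklearn.datasets import load_iris

# load_iris ships with scikit-learn, so no network access is needed
iris = load_iris(as_frame=True).frame
print(iris.dtypes)

# True when every column, including the target, has a numeric dtype
all_numeric = all(pd.api.types.is_numeric_dtype(t) for t in iris.dtypes)
print(all_numeric)
```

If every column comes out numeric, as expected for iris, the dataset would not show off string, categorical, or datetime handling in the docstrings.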

Note that we have a similar function as a pytest fixture, but its goal is different: it covers many weird combinations of int, float, missing values, etc., whereas the new function would be for illustration, not testing.

@jeromedockes added the documentation and discussion labels on Dec 7, 2024
@GaelVaroquaux
Member

Good idea!

@jeromedockes added the good first issue label and removed the discussion label on Dec 9, 2024