fillna() and dropna() in Table and Column #10

Closed
dorianheidorn opened this issue Jan 27, 2023 · 2 comments · Fixed by #97

@dorianheidorn
Contributor

Is your feature request related to a problem? Please describe

When working with tables that contain missing values, a built-in function to handle those values would be useful.

Desired solution

Thinking of pandas, the API could look like Table.dropna(column_names: list[str]) and Table.fillna(column_name: str, replace_value), plus Column.dropna() and Column.fillna(replace_value).

@dorianheidorn dorianheidorn added the enhancement 💡 New feature or request label Jan 27, 2023
@lars-reimann lars-reimann transferred this issue from Safe-DS/DSL Mar 4, 2023
@lars-reimann lars-reimann added wontfix This will not be worked on and removed wontfix This will not be worked on labels Mar 27, 2023
@lars-reimann
Member

These transformations can already be done using the current API:

  • fillna corresponds to our Imputer with a Constant strategy.
  • dropna (columns) can be done with drop_columns and list_columns_with_missing_values on a Table.
  • dropna (rows) can be done with filter on Table.

I would definitely object to fillna, since the Imputer does exactly what you want in a more general way and can also be applied to a test/validation set. dropna might be useful if we find that dropping columns/rows with missing values is very common and the current solution is too cumbersome. In that case, however, the methods should not be named dropna but rather:

  • drop_columns_with_missing_values
  • drop_rows_with_missing_values

These operations should also only be added to Table, not Column.
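To illustrate why an imputer generalizes fillna, here is a minimal plain-Python sketch. It is not the Safe-DS Imputer: the `ConstantImputer` class, its `fit`/`transform` signatures, and the dict-of-lists table representation (with `None` marking a missing value) are all assumptions made for this example. The point is only that the same fitted object can be reused on a test/validation set.

```python
from typing import Any, Optional

# Hypothetical stand-in for a table: column name -> values, None marks a missing value.
Table = dict[str, list[Any]]


class ConstantImputer:
    """Illustrative only; not the Safe-DS Imputer."""

    def __init__(self, value: Any) -> None:
        self._value = value
        self._column_names: Optional[list[str]] = None  # set by fit()

    def fit(self, table: Table, column_names: list[str]) -> "ConstantImputer":
        """Remember which columns to impute and return a new, fitted imputer."""
        unknown = [name for name in column_names if name not in table]
        if unknown:
            raise KeyError(f"Unknown columns: {unknown}")
        fitted = ConstantImputer(self._value)
        fitted._column_names = list(column_names)
        return fitted

    def transform(self, table: Table) -> Table:
        """Return a new table with None replaced by the constant in the fitted columns."""
        if self._column_names is None:
            raise RuntimeError("Call fit() before transform().")
        return {
            name: [
                self._value if value is None and name in self._column_names else value
                for value in values
            ]
            for name, values in table.items()
        }


# Fit once on the training data, then reuse the same imputer on the test data.
train = {"age": [30, None], "city": ["Bonn", None]}
test = {"age": [None, 41], "city": [None, "Kiel"]}
imputer = ConstantImputer(0).fit(train, ["age"])
imputed_train = imputer.transform(train)
imputed_test = imputer.transform(test)
```

Only the `age` column is touched in both tables; `city` keeps its missing values because it was not passed to `fit`.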

@lars-reimann lars-reimann self-assigned this Mar 27, 2023
@lars-reimann lars-reimann linked a pull request Mar 27, 2023 that will close this issue
lars-reimann added a commit that referenced this issue Mar 27, 2023
Closes #10.

### Summary of Changes

Add methods to `Table` to handle missing values:
* `drop_columns_with_missing_values` returns a `Table` without the
columns that have missing values.
* `drop_rows_with_missing_values` returns a `Table` without the rows
that have missing values.
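The intended semantics of the two methods can be sketched in plain Python. This is not the code from the pull request: the free functions below only borrow the method names, and the dict-of-lists table representation (with `None` marking a missing value) is an assumption made for this example.

```python
from typing import Any

# Hypothetical stand-in for a table: column name -> values, None marks a missing value.
Table = dict[str, list[Any]]


def drop_columns_with_missing_values(table: Table) -> Table:
    """Return a new table without the columns that contain at least one None."""
    return {
        name: list(values)
        for name, values in table.items()
        if all(value is not None for value in values)
    }


def drop_rows_with_missing_values(table: Table) -> Table:
    """Return a new table without the rows that contain at least one None."""
    n_rows = len(next(iter(table.values()), []))
    # Indices of the rows in which every column has a value.
    keep = [
        i
        for i in range(n_rows)
        if all(values[i] is not None for values in table.values())
    ]
    return {name: [values[i] for i in keep] for name, values in table.items()}


table = {"a": [1, None, 3], "b": [4, 5, 6]}
without_columns = drop_columns_with_missing_values(table)  # keeps only column "b"
without_rows = drop_rows_with_missing_values(table)  # keeps the first and last row
```

Both functions return a new table and leave the input unchanged, matching the "returns a `Table`" wording above.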

---------

Co-authored-by: lars-reimann <[email protected]>
lars-reimann pushed a commit that referenced this issue Mar 27, 2023
## [0.6.0](v0.5.0...v0.6.0) (2023-03-27)

### Features

* allow calling `correlation_heatmap` with non-numerical columns ([#92](#92)) ([b960214](b960214)), closes [#89](#89)
* function to drop columns with non-numerical values from `Table` ([#96](#96)) ([8f14d65](8f14d65)), closes [#13](#13)
* function to drop columns/rows with missing values ([#97](#97)) ([05d771c](05d771c)), closes [#10](#10)
* remove `list_columns_with_XY` methods from `Table` ([#100](#100)) ([a0c56ad](a0c56ad)), closes [#94](#94)
* rename `keep_columns` to `keep_only_columns` ([#99](#99)) ([de42169](de42169))
* rename `remove_outliers` to `drop_rows_with_outliers` ([#95](#95)) ([7bad2e3](7bad2e3)), closes [#93](#93)
* return new model when calling `fit` ([#91](#91)) ([165c97c](165c97c)), closes [#69](#69)

### Bug Fixes

* handling of missing values when dropping rows with outliers ([#101](#101)) ([0a5e853](0a5e853)), closes [#7](#7)
@lars-reimann
Member

🎉 This issue has been resolved in version 0.6.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the released Included in a release label Mar 27, 2023