-
Notifications
You must be signed in to change notification settings - Fork 211
[WIP] Added TabularRegressionDataFrameDataSource wrapping TimeSeriesDataSet and TabularRegressionPreprocess stub #588
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov Report
@@ Coverage Diff @@
## master #588 +/- ##
===========================================
- Coverage 91.41% 79.06% -12.36%
===========================================
Files 117 132 +15
Lines 7200 7829 +629
===========================================
- Hits 6582 6190 -392
- Misses 618 1639 +1021
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
else: | ||
DataFrame = object | ||
|
||
from pytorch_forecasting import TimeSeriesDataSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should check if pytorch_forecasting is available.
from flash.tabular.data import TabularData, TabularDataFrameDataSource, TabularPreprocess | ||
|
||
|
||
class TabularRegressionDataFrameDataSource(TabularDataFrameDataSource): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing that TabularRegression
is used to timeseries.
This should be a TimeSeriesForecasting
Task using from_data_frame
or from_csv
function :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sumanmichael great work so far! It's definitely getting there. I think it would be simpler to just extend the base Preprocess and DataSource classes for tabular regression since regression and classification don't share many arguments or any functionality.
from flash.tabular.data import TabularData, TabularDataFrameDataSource, TabularPreprocess | ||
|
||
|
||
class TabularRegressionDataFrameDataSource(TabularDataFrameDataSource): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this would be better just extending DataSource as it doesn't use any of the functionality from the TabularDataFrameDataSource
) | ||
|
||
|
||
class TabularRegressionPreprocess(TabularPreprocess): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, maybe should just be a Preprocess? That way you can remove the arguments to init that are not used for regression
num_workers: Optional[int] = None, | ||
**preprocess_kwargs: Any, | ||
): | ||
super().from_data_frame( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be super().from_data_source("data_frame", ...
Maybe this structure would be more appropriate:
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
What does this PR do?
Created TabularRegressionPreprocess, TabularRegressionDataFrameDataSource wrapping TimeSeriesDataSet in pytorch-forecasting
Here's what I did.
in load_data, we return a TimeSeriesDataset with the dataframe and the args from the init
Fixes #528 #529
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃