Implement Redshift historical retrieval #1720
Signed-off-by: Tsotne Tabidze <[email protected]>
Directionally this PR looks good to me.
[30%] looks right to me
Fixed get_historical_features for the case where entity_df is a SQL query, while keeping the utility functions common between Redshift and BigQuery. `infer_event_timestamp_from_entity_df` and `assert_expected_columns_in_entity_df` are now based on the entity schema rather than the dataframe. I also completely removed the min/max timestamp inference, since it could not be shared between the two stores (it needed to query BigQuery and Redshift). Instead, I moved that logic inside the SQL templates, reducing the code complexity.
Signed-off-by: Tsotne Tabidze <[email protected]>
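For readers following along, here is a rough sketch of what "based on the entity schema rather than the dataframe" could look like. It is not the code from this PR: the function names, the `DEFAULT_EVENT_TIMESTAMP_COL` constant, and the choice to represent the schema as a plain mapping of column name to type name are all assumptions made for illustration. The point is that a schema can be obtained for either a dataframe or a SQL query, so the same checks can serve both cases.

```python
from typing import Dict, Iterable

# Assumed default column name; hypothetical for this sketch.
DEFAULT_EVENT_TIMESTAMP_COL = "event_timestamp"


def infer_event_timestamp_column(entity_schema: Dict[str, str]) -> str:
    """Pick the event timestamp column using only column names and types."""
    # Prefer the conventional column name when it is present.
    if DEFAULT_EVENT_TIMESTAMP_COL in entity_schema:
        return DEFAULT_EVENT_TIMESTAMP_COL

    # Otherwise accept a single column whose type name looks like a timestamp.
    candidates = [
        column
        for column, type_name in entity_schema.items()
        if type_name.lower().startswith(("timestamp", "datetime"))
    ]
    if len(candidates) == 1:
        return candidates[0]

    raise ValueError(
        "Expected exactly one timestamp column in the entity schema, "
        f"found {len(candidates)}"
    )


def assert_expected_columns(
    entity_schema: Dict[str, str],
    join_keys: Iterable[str],
    event_timestamp_col: str,
) -> None:
    """Raise if a join key or the timestamp column is absent from the schema."""
    expected = set(join_keys) | {event_timestamp_col}
    missing = expected - set(entity_schema)
    if missing:
        raise ValueError(f"Entity schema is missing columns: {sorted(missing)}")
```

Because neither helper touches row data, a caller would only need some way to describe the columns of the entity source, whether that source is an in-memory dataframe or the result of a query.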
Codecov Report
@@            Coverage Diff             @@
##           master    #1720      +/-   ##
==========================================
+ Coverage   78.69%   79.01%   +0.31%
==========================================
  Files          80       80
  Lines        6765     6849      +84
==========================================
+ Hits         5324     5412      +88
+ Misses       1441     1437       -4
Co-authored-by: Willem Pienaar <[email protected]>
Signed-off-by: Tsotne Tabidze <[email protected]>
44154ee to 1da26bf
sdk/python/tests/integration/offline_store/test_historical_retrieval.py
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tsotnet, woop
Signed-off-by: Tsotne Tabidze [email protected]
What this PR does / why we need it: This PR implements `RedshiftOfflineStore:get_historical_features`. It also refactors a couple of methods from bigquery.py and moves them into common_utils.py so that both RedshiftOfflineStore and BigQueryOfflineStore can use them.
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: