feat: Feast AWS Athena offline store (again) #3044
I noticed there were a bunch of weird merge artifacts, so I fixed them. I also removed Athena from the integration tests that run as part of CI, since this is a contrib plugin. Can you sign your commits? None of them seem to be signed. Also: I have no way of reproducing the test results because I don't have Athena set up. That's OK for this PR, but maybe leave a comment in the Makefile to specify that the user needs their own custom Athena setup?
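Besides a Makefile comment, the "bring your own Athena setup" requirement could be made explicit with a small pre-flight check that runs before the tests. This is only a sketch: the environment variable names below are hypothetical placeholders, not taken from this PR, so check the `test-python-universal-athena` target for the names it actually uses.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Hypothetical variable names, for illustration only; the real Makefile
# target may use different ones.
REQUIRED_ATHENA_ENV = (
    "ATHENA_DATA_SOURCE",
    "ATHENA_DATABASE",
    "ATHENA_S3_BUCKET_NAME",
)


def missing_athena_env(
    env: Optional[Mapping[str, str]] = None,
    required: Iterable[str] = REQUIRED_ATHENA_ENV,
) -> List[str]:
    """Return the required variables that are unset or empty, in order."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

A test session could call `missing_athena_env()` once at start-up and skip the Athena tests, with a message listing the missing variables, when the returned list is not empty.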
Codecov Report
@@ Coverage Diff @@
## master #3044 +/- ##
==========================================
+ Coverage 67.64% 75.72% +8.08%
==========================================
Files 167 202 +35
Lines 14696 16776 +2080
==========================================
+ Hits 9941 12704 +2763
+ Misses 4755 4072 -683
/ok-to-test
sdk/python/feast/infra/offline_stores/contrib/athena_offline_store/athena_source.py
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adchia, toping4445 The full list of commands accepted by this bot can be found here. The pull request process is described here
It’d be good to have some documentation, but this is good for an initial cut. Thanks!
/lgtm
Is there a documentation guide? If you share it, I will refer to it and fill in the documentation when I have time.
* fixed bugs, cleaned code, added AthenaDataSourceCreator
  Signed-off-by: Youngkyu OH <[email protected]>
* fixed bugs, cleaned code, added some methods. test_universal_historical_retrieval - 100% passed
  Signed-off-by: Youngkyu OH <[email protected]>
* fixed bugs to pass test_validation
  Signed-off-by: Youngkyu OH <[email protected]>
* changed boolean data type mapping
  Signed-off-by: Youngkyu OH <[email protected]>
* 1.added test-python-universal-athena in Makefile 2.replaced database,bucket_name hardcoding to variable in AthenaDataSourceCreator
  Signed-off-by: Youngkyu OH <[email protected]>
* format,run lint
  Signed-off-by: Youngkyu OH <[email protected]>
* revert merge changes
  Signed-off-by: Danny Chiao <[email protected]>
* add entity_key_serialization
  Signed-off-by: Danny Chiao <[email protected]>
* restore deleted file
  Signed-off-by: Danny Chiao <[email protected]>
* modified confusing environment variable names, added how to use Athena
  Signed-off-by: Youngkyu OH <[email protected]>
* enforce AthenaSource to have a name
  Signed-off-by: Youngkyu OH <[email protected]>

Co-authored-by: toping4445 <[email protected]>
Co-authored-by: Danny Chiao <[email protected]>
Signed-off-by: Francisco Javier Arceo <[email protected]>
# [0.24.0](v0.23.0...v0.24.0) (2022-08-25)

### Bug Fixes

* Check if on_demand_feature_views is an empty list rather than None for snowflake provider ([#3046](#3046)) ([9b05e65](9b05e65))
* FeatureStore.apply applies BatchFeatureView correctly ([#3098](#3098)) ([41be511](41be511))
* Fix Feast Java inconsistency with int64 serialization vs python ([#3031](#3031)) ([4bba787](4bba787))
* Fix feature service inference logic ([#3089](#3089)) ([4310ed7](4310ed7))
* Fix field mapping logic during feature inference ([#3067](#3067)) ([cdfa761](cdfa761))
* Fix incorrect on demand feature view diffing and improve Java tests ([#3074](#3074)) ([0702310](0702310))
* Fix Java helm charts to work with refactored logic. Fix FTS image ([#3105](#3105)) ([2b493e0](2b493e0))
* Fix on demand feature view output in feast plan + Web UI crash ([#3057](#3057)) ([bfae6ac](bfae6ac))
* Fix release workflow to release 0.24.0 ([#3138](#3138)) ([a69aaae](a69aaae))
* Fix Spark offline store type conversion to arrow ([#3071](#3071)) ([b26566d](b26566d))
* Fixing Web UI, which fails for the SQL registry ([#3028](#3028)) ([64603b6](64603b6))
* Force Snowflake Session to Timezone UTC ([#3083](#3083)) ([9f221e6](9f221e6))
* Make infer dummy entity join key idempotent ([#3115](#3115)) ([1f5b1e0](1f5b1e0))
* More explicit error messages ([#2708](#2708)) ([e4d7afd](e4d7afd))
* Parse inline data sources ([#3036](#3036)) ([c7ba370](c7ba370))
* Prevent overwriting existing file during `persist` ([#3088](#3088)) ([69af21f](69af21f))
* Register BatchFeatureView in feature repos correctly ([#3092](#3092)) ([b8e39ea](b8e39ea))
* Return an empty infra object from sql registry when it doesn't exist ([#3022](#3022)) ([8ba87d1](8ba87d1))
* Teardown tables for Snowflake Materialization testing ([#3106](#3106)) ([0a0c974](0a0c974))
* UI error when saved dataset is present in registry. ([#3124](#3124)) ([83cf753](83cf753))
* Update sql.py ([#3096](#3096)) ([2646a86](2646a86))
* Updated snowflake template ([#3130](#3130)) ([f0594e1](f0594e1))

### Features

* Add authentication option for snowflake connector ([#3039](#3039)) ([74c75f1](74c75f1))
* Add Cassandra/AstraDB online store contribution ([#2873](#2873)) ([feb6cb8](feb6cb8))
* Add Snowflake materialization engine ([#2948](#2948)) ([f3b522b](f3b522b))
* Adding saved dataset capabilities for Postgres ([#3070](#3070)) ([d3253c3](d3253c3))
* Allow passing repo config path via flag ([#3077](#3077)) ([0d2d951](0d2d951))
* Contrib azure provider with synapse/mssql offline store and Azure registry store ([#3072](#3072)) ([9f7e557](9f7e557))
* Custom Docker image for Bytewax batch materialization ([#3099](#3099)) ([cdd1b07](cdd1b07))
* Feast AWS Athena offline store (again) ([#3044](#3044)) ([989ce08](989ce08))
* Implement spark offline store `offline_write_batch` method ([#3076](#3076)) ([5b0cc87](5b0cc87))
* Initial Bytewax materialization engine ([#2974](#2974)) ([55c61f9](55c61f9))
* Refactor feature server helm charts to allow passing feature_store.yaml in environment variables ([#3113](#3113)) ([85ee789](85ee789))
What this PR does / why we need it:
It enables Feast users to use S3+AWS Athena as an offline store.
The tests above failed in my environment because of compatibility issues between the MySQLdb package and Apple M1.
I didn't implement some methods related to feature writes. Tests that use a fixed S3 bucket name and tests related to GCP also failed.
However, all tests in test_universal_historical_retrieval.py and test_universal_types.py, which relate to feature extraction, have passed.
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?:
AWS users can choose between Redshift and S3+Athena for an offline store.
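To make the user-facing change concrete, here is a minimal sketch of what selecting Athena in the `offline_store` section of a `feature_store.yaml` might look like, expressed as a plain Python dict. The helper name is hypothetical, and the field names (`region`, `data_source`, `database`, `s3_staging_location`) are assumptions rather than something stated in this PR description, so verify them against the plugin's config class before relying on them.

```python
from typing import Dict


def athena_offline_store_section(
    region: str,
    database: str,
    s3_staging_location: str,
    data_source: str = "AwsDataCatalog",
) -> Dict[str, str]:
    """Build an `offline_store` section for Athena as a plain dict.

    Hypothetical helper; the key names are assumed, not confirmed by this PR.
    """
    if not s3_staging_location.startswith("s3://"):
        raise ValueError("s3_staging_location must be an s3:// URI")
    return {
        "type": "athena",
        "region": region,
        "data_source": data_source,
        "database": database,
        "s3_staging_location": s3_staging_location,
    }


# Placeholder values; replace them with your own AWS resources.
section = athena_offline_store_section(
    region="us-east-1",
    database="default",
    s3_staging_location="s3://my-bucket/athena-staging",
)
```

Dumping `section` under the `offline_store` key of a repo config would give the Athena equivalent of a Redshift offline store section, assuming the field names hold.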