-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add file write_to_offline_store functionality #2808
Conversation
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #2808 +/- ##
===========================================
- Coverage 80.44% 59.40% -21.04%
===========================================
Files 173 173
Lines 15271 15449 +178
===========================================
- Hits 12284 9177 -3107
- Misses 2987 6272 +3285
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
sdk/python/tests/integration/offline_store/test_offline_push.py
Outdated
Show resolved
Hide resolved
|
||
@pytest.mark.integration | ||
@pytest.mark.universal_online_stores | ||
def test_writing_incorrect_order_fails(environment, universal_data_sources): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain (and maybe add this explanation in a docstring for this test) why exactly we expect it to fail? the name suggests an incorrect order, but from inspecting the underlying data source it looks like the df might not just be out of order, but it's also missing some columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
@pytest.mark.integration | ||
@pytest.mark.universal_online_stores | ||
def test_writing_incorrect_schema_fails(environment, universal_data_sources): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a copy of the above test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This writes an incorrect schema. I just want to test both trigger the valuerror.
assert df["conv_rate"].isnull().all() | ||
assert df["avg_daily_trips"].isnull().all() | ||
|
||
first_df = pd.DataFrame.from_dict( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might leave a comment here indicating that these columns are arranged in exactly the same order as the underlying DS (whose schema can be found in driver_test_data.py
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sdk/python/tests/integration/offline_store/test_offline_push.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Kevin Zhang <[email protected]>
Signed-off-by: Kevin Zhang <[email protected]>
@@ -275,7 +275,7 @@ def write_logged_features( | |||
def offline_write_batch( | |||
config: RepoConfig, | |||
table: FeatureView, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this shouldn't be called table it should be feature_view
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was adhering to how we defined in online write batch but ill change it.
@@ -127,7 +127,7 @@ def ingest_df( | |||
pass | |||
|
|||
def ingest_df_to_offline_store( | |||
self, feature_view: FeatureView, df: pd.DataFrame, | |||
self, feature_view: FeatureView, df: pyarrow.Table, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this interface changing? can you add that to the PR description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure left some comments
Signed-off-by: Kevin Zhang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@@ -1371,7 +1371,7 @@ def write_to_online_store( | |||
provider.ingest_df(feature_view, entities, df) | |||
|
|||
@log_exceptions_and_usage | |||
def write_to_offline_store( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do in separate pr once every offline store is merged because I want to document the entire push api.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: achals, kevjumba The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
# [0.22.0](v0.21.0...v0.22.0) (2022-06-21) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5))
# [0.22.0](v0.21.0...v0.22.0) (2022-06-23) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5))
# [0.22.0](v0.21.0...v0.22.0) (2022-06-24) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Revert "chore(release): release 0.22.0" ([#2852](#2852)) ([e6a4636](e6a4636)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5))
# [0.22.0](v0.21.0...v0.22.0) (2022-06-24) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Correctly generate projects-list.json when calling feast ui and using postgres as a source ([#2845](#2845)) ([bee8076](bee8076)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Revert "chore(release): release 0.22.0" ([#2852](#2852)) ([e6a4636](e6a4636)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add feast repo-upgrade for automated repo upgrades ([#2733](#2733)) ([a3304d4](a3304d4)) * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5))
# [0.22.0](v0.21.0...v0.22.0) (2022-06-25) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Correctly generate projects-list.json when calling feast ui and using postgres as a source ([#2845](#2845)) ([bee8076](bee8076)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Revert "chore(release): release 0.22.0" ([#2852](#2852)) ([e6a4636](e6a4636)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add feast repo-upgrade for automated repo upgrades ([#2733](#2733)) ([a3304d4](a3304d4)) * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5))
# [0.22.0](v0.21.0...v0.22.0) (2022-06-28) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Correctly generate projects-list.json when calling feast ui and using postgres as a source ([#2845](#2845)) ([bee8076](bee8076)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Go install gopy instead using go mod tidy ([#2863](#2863)) ([2f2b519](2f2b519)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Revert "chore(release): release 0.22.0" ([#2852](#2852)) ([e6a4636](e6a4636)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add feast repo-upgrade for automated repo upgrades ([#2733](#2733)) ([a3304d4](a3304d4)) * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5)) ### Reverts * Revert "Create main.yml" (#2867) ([47922a4](47922a4)), closes [#2867](#2867)
# [0.22.0](v0.21.0...v0.22.0) (2022-06-28) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Correctly generate projects-list.json when calling feast ui and using postgres as a source ([#2845](#2845)) ([bee8076](bee8076)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fix workflow syntax error ([#2869](#2869)) ([fae45a1](fae45a1)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Go install gopy instead using go mod tidy ([#2863](#2863)) ([2f2b519](2f2b519)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Revert "chore(release): release 0.22.0" ([#2852](#2852)) ([e6a4636](e6a4636)) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Sync publish and build_wheels workflow to fix verify wheel error. ([#2871](#2871)) ([b0f050a](b0f050a)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add feast repo-upgrade for automated repo upgrades ([#2733](#2733)) ([a3304d4](a3304d4)) * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5)) ### Reverts * Revert "chore(release): release 0.22.0" (#2870) ([ffb0892](ffb0892)), closes [#2870](#2870) * Revert "Create main.yml" (#2867) ([47922a4](47922a4)), closes [#2867](#2867)
# [0.22.0](v0.21.0...v0.22.0) (2022-06-29) ### Bug Fixes * Add columns for user metadata in the tables ([#2760](#2760)) ([269055e](269055e)) * Add project columns in the SQL Registry ([#2784](#2784)) ([336fdd1](336fdd1)) * Add S3FS dependency (which Dask depends on for S3 files) ([#2701](#2701)) ([5d6fa94](5d6fa94)) * Bugfixes for how registry is loaded ([#2768](#2768)) ([ecb8b2a](ecb8b2a)) * Conversion of null timestamp from proto to python ([#2814](#2814)) ([cb23648](cb23648)) * Correct feature statuses during feature logging test ([#2709](#2709)) ([cebf609](cebf609)) * Correctly generate projects-list.json when calling feast ui and using postgres as a source ([#2845](#2845)) ([bee8076](bee8076)) * Dynamodb drops missing entities when batching ([#2802](#2802)) ([a2e9209](a2e9209)) * Enable faulthandler and disable flaky tests ([#2815](#2815)) ([4934d84](4934d84)) * Explicitly translate errors when instantiating the go fs ([#2842](#2842)) ([7a2c4cd](7a2c4cd)) * Fix broken roadmap links ([#2690](#2690)) ([b3ba8aa](b3ba8aa)) * Fix bugs in applying stream feature view and retrieving online features ([#2754](#2754)) ([d024e5e](d024e5e)) * Fix Feast UI failure with new way of specifying entities ([#2773](#2773)) ([0d1ac01](0d1ac01)) * Fix feature view __getitem__ for feature services ([#2769](#2769)) ([88cc47d](88cc47d)) * Fix issue when user specifies a port for feast ui ([#2692](#2692)) ([1c621fe](1c621fe)) * Fix macos wheel version for 310 and also checkout edited go files ([#2890](#2890)) ([bdf170f](bdf170f)) * Fix on demand feature view crash from inference when it uses df.apply ([#2713](#2713)) ([c5539fd](c5539fd)) * Fix SparkKafkaProcessor `query_timeout` parameter ([#2789](#2789)) ([a8d282d](a8d282d)) * Fix workflow syntax error ([#2869](#2869)) ([fae45a1](fae45a1)) * Fixed custom S3 endpoint read fail ([#2786](#2786)) ([6fec431](6fec431)) * Go install gopy instead using go mod tidy ([#2863](#2863)) ([2f2b519](2f2b519)) * Hydrate infra object in the sql registry proto() method ([#2782](#2782)) ([452dcd3](452dcd3)) * Implement apply_materialization and infra methods in sql registry ([#2775](#2775)) ([4ed107c](4ed107c)) * Minor refactor to format exception message ([#2764](#2764)) ([da763c6](da763c6)) * Prefer installing gopy from feast's fork as opposed to upstream ([#2839](#2839)) ([34c997d](34c997d)) * Python server is not correctly starting in integration tests ([#2706](#2706)) ([7583a0b](7583a0b)) * Random port allocation for python server in tests ([#2710](#2710)) ([dee8090](dee8090)) * Refactor test to reuse LocalRegistryFile ([#2763](#2763)) ([4339c0a](4339c0a)) * Revert "chore(release): release 0.22.0" ([#2852](#2852)) ([e6a4636](e6a4636)) * Stop running go mod tidy in setup.py ([#2877](#2877)) ([676ecbb](676ecbb)), closes [/github.com/pypa/cibuildwheel/issues/189#issuecomment-549933912](https://github.com//github.com/pypa/cibuildwheel/issues/189/issues/issuecomment-549933912) * Support push sources in stream feature views ([#2704](#2704)) ([0d60eaa](0d60eaa)) * Sync publish and build_wheels workflow to fix verify wheel error. ([#2871](#2871)) ([b0f050a](b0f050a)) * Update roadmap with stream feature view rfc ([#2824](#2824)) ([fc8f890](fc8f890)) * Update udf tests and add base functions to streaming fcos and fix some nonetype errors ([#2776](#2776)) ([331a214](331a214)) ### Features * Add feast repo-upgrade for automated repo upgrades ([#2733](#2733)) ([a3304d4](a3304d4)) * Add file write_to_offline_store functionality ([#2808](#2808)) ([c0e2ad7](c0e2ad7)) * Add http endpoint to the Go feature server ([#2658](#2658)) ([3347a57](3347a57)) * Add simple TLS support in Go RedisOnlineStore ([#2860](#2860)) ([521488d](521488d)) * Add StreamProcessor and SparkKafkaProcessor as contrib ([#2777](#2777)) ([83ab682](83ab682)) * Added Spark support for Delta and Avro ([#2757](#2757)) ([7d16516](7d16516)) * CLI interface for validation of logged features ([#2718](#2718)) ([c8b11b3](c8b11b3)) * Enable stream feature view materialization ([#2798](#2798)) ([a06700d](a06700d)) * Enable stream feature view materialization ([#2807](#2807)) ([7d57724](7d57724)) * Implement `offline_write_batch` for BigQuery and Snowflake ([#2840](#2840)) ([97444e4](97444e4)) * Offline push endpoint for pushing to offline stores ([#2837](#2837)) ([a88cd30](a88cd30)) * Push to Redshift batch source offline store directly ([#2819](#2819)) ([5748a8b](5748a8b)) * Scaffold for unified push api ([#2796](#2796)) ([1bd0930](1bd0930)) * SQLAlchemy Registry Support ([#2734](#2734)) ([b3fe39c](b3fe39c)) * Stream Feature View FCOS ([#2750](#2750)) ([0cf3c92](0cf3c92)) * Update stream fcos to have watermark and sliding interval ([#2765](#2765)) ([3256952](3256952)) * Validating logged features via Python SDK ([#2640](#2640)) ([2874fc5](2874fc5)) ### Reverts * Revert "chore(release): release 0.22.0" (#2891) ([e5abf58](e5abf58)), closes [#2891](#2891) * Revert "chore(release): release 0.22.0" (#2870) ([ffb0892](ffb0892)), closes [#2870](#2870) * Revert "Create main.yml" (#2867) ([47922a4](47922a4)), closes [#2867](#2867)
What this PR does / why we need it:
The interface for write_to_offline_store is changing(i.e the dataframe passed in is an arrow table instead of a pandas dataframe. The reason for this is because is we have to convet the pandas dataframe to pyarrow table for write anyways but some functions such as
_run_field_mapping
use pyarrow table so it makes more sense to not rewrite code and just convert the pyarrow table earlier.Which issue(s) this PR fixes:
Fixes #