Merge feast-snowflake plugin into main repo with documentation #2193

Merged
merged 21 commits into from
Jan 31, 2022
8 changes: 4 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -136,24 +136,24 @@ The list below contains the functionality that contributors are planning to deve
* Want to speak to a Feast contributor? We are more than happy to jump on a call. Please schedule a time using [Calendly](https://calendly.com/d/x2ry-g5bb/meet-with-feast-team).

* **Data Sources**
* [x] [Snowflake source](https://docs.feast.dev/reference/data-sources/snowflake)
* [x] [Redshift source](https://docs.feast.dev/reference/data-sources/redshift)
* [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery)
* [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file)
* [x] [Synapse source (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] Kafka source (with [push support into the online store](reference/alpha-stream-ingestion.md))
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [ ] HTTP source
* **Offline Stores**
* [x] [Snowflake](https://docs.feast.dev/reference/offline-stores/snowflake)
* [x] [Redshift](https://docs.feast.dev/reference/offline-stores/redshift)
* [x] [BigQuery](https://docs.feast.dev/reference/offline-stores/bigquery)
* [x] [Synapse (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] [In-memory / Pandas](https://docs.feast.dev/reference/offline-stores/file)
* [x] [Custom offline store support](https://docs.feast.dev/how-to-guides/adding-a-new-offline-store)
* [x] [Snowflake (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [x] [Trino (community plugin)](https://github.com/Shopify/feast-trino)
* **Online Stores**
* [x] [DynamoDB](https://docs.feast.dev/reference/online-stores/dynamodb)
Expand Down Expand Up @@ -208,7 +208,7 @@ The list below contains the functionality that contributors are planning to deve
Please refer to the official documentation at [Documentation](https://docs.feast.dev/)
* [Quickstart](https://docs.feast.dev/getting-started/quickstart)
* [Tutorials](https://docs.feast.dev/tutorials/tutorials-overview)
* [Running Feast with GCP/AWS](https://docs.feast.dev/how-to-guides/feast-gcp-aws)
* [Running Feast with Snowflake/GCP/AWS](https://docs.feast.dev/how-to-guides/feast-snowflake-gcp-aws)
* [Change Log](https://github.com/feast-dev/feast/blob/master/CHANGELOG.md)
* [Slack (#Feast)](https://slack.feast.dev/)

Expand All @@ -224,4 +224,4 @@ Thanks goes to these incredible people:

<a href="https://github.com/feast-dev/feast/graphs/contributors">
<img src="https://contrib.rocks/image?repo=feast-dev/feast" />
</a>
</a>
2 changes: 1 addition & 1 deletion docs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,6 @@ Explore the following resources to get started with Feast:
* [Concepts](getting-started/concepts/) describes all important Feast API concepts
* [Architecture](getting-started/architecture-and-components/) describes Feast's overall architecture.
* [Tutorials](tutorials/tutorials-overview.md) shows full examples of using Feast in machine learning applications.
* [Running Feast with GCP/AWS](how-to-guides/feast-gcp-aws/) provides a more in-depth guide to using Feast.
* [Running Feast with Snowflake/GCP/AWS](how-to-guides/feast-snowflake-gcp-aws/) provides a more in-depth guide to using Feast.
* [Reference](reference/feast-cli-commands.md) contains detailed API and design documents.
* [Contributing](project/contributing.md) contains resources for anyone who wants to contribute to Feast.
5 changes: 4 additions & 1 deletion docs/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,11 @@
* [Driver ranking](tutorials/driver-ranking-with-feast.md)
* [Fraud detection on GCP](tutorials/fraud-detection.md)
* [Real-time credit scoring on AWS](tutorials/real-time-credit-scoring-on-aws.md)
* [Driver Stats using Snowflake](tutorials/driver-stats-using-snowflake.md)

## How-to Guides

* [Running Feast with GCP/AWS](how-to-guides/feast-gcp-aws/README.md)
* [Running Feast with Snowflake/GCP/AWS](how-to-guides/feast-snowflake-gcp-aws/README.md)
* [Install Feast](how-to-guides/feast-gcp-aws/install-feast.md)
* [Create a feature repository](how-to-guides/feast-gcp-aws/create-a-feature-repository.md)
* [Deploy a feature store](how-to-guides/feast-gcp-aws/deploy-a-feature-store.md)
Expand All @@ -54,10 +55,12 @@

* [Data sources](reference/data-sources/README.md)
* [File](reference/data-sources/file.md)
* [Snowflake](reference/data-sources/snowflake.md)
* [BigQuery](reference/data-sources/bigquery.md)
* [Redshift](reference/data-sources/redshift.md)
* [Offline stores](reference/offline-stores/README.md)
* [File](reference/offline-stores/file.md)
* [Snowflake](reference/offline-stores/snowflake.md)
* [BigQuery](reference/offline-stores/bigquery.md)
* [Redshift](reference/offline-stores/redshift.md)
* [Online stores](reference/online-stores/README.md)
4 changes: 2 additions & 2 deletions docs/getting-started/third-party-integrations.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,26 +13,26 @@ Don't see your offline store or online store of choice here? Check out our guide

### **Data Sources**

* [x] [Snowflake source](https://docs.feast.dev/reference/data-sources/snowflake)
sfc-gh-madkins marked this conversation as resolved.
* [x] [Redshift source](https://docs.feast.dev/reference/data-sources/redshift)
* [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery)
* [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file)
* [x] [Synapse source (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] Kafka source (with [push support into the online store](https://docs.feast.dev/reference/alpha-stream-ingestion))
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [ ] HTTP source

### Offline Stores

* [x] [Snowflake](https://docs.feast.dev/reference/offline-stores/snowflake)
* [x] [Redshift](https://docs.feast.dev/reference/offline-stores/redshift)
* [x] [BigQuery](https://docs.feast.dev/reference/offline-stores/bigquery)
* [x] [Synapse (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] [In-memory / Pandas](https://docs.feast.dev/reference/offline-stores/file)
* [x] [Custom offline store support](https://docs.feast.dev/how-to-guides/adding-a-new-offline-store)
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [x] [Trino (community plugin)](https://github.com/Shopify/feast-trino)

### Online Stores
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,21 @@ Creating a new Feast repository in /<...>/tiny_pika.
```
{% endtab %}

{% tabs %}
{% tab title="Snowflake template" %}
```bash
feast init -t snowflake
Snowflake Deployment URL: ...
Snowflake User Name: ...
Snowflake Password: ...
Snowflake Role Name: ...
Snowflake Warehouse Name: ...
Snowflake Database Name: ...

Creating a new Feast repository in /<...>/tiny_pika.
```
{% endtab %}

{% tab title="GCP template" %}
```text
feast init -t gcp
Expand All @@ -30,7 +45,7 @@ Redshift Database Name: ...
Redshift User Name: ...
Redshift S3 Staging Location (s3://*): ...
Redshift IAM Role for S3 (arn:aws:iam::*:role/*): ...
Should I upload example data to Redshift (overwriting 'feast_driver_hourly_stats' table)? (Y/n):
Should I upload example data to Redshift (overwriting 'feast_driver_hourly_stats' table)? (Y/n):

Creating a new Feast repository in /<...>/tiny_pika.
```
Expand Down Expand Up @@ -63,4 +78,3 @@ You can now use this feature repository for development. You can try the followi
* Run `feast apply` to apply these definitions to Feast.
* Edit the example feature definitions in `example.py` and run `feast apply` again to change feature definitions.
* Initialize a git repository in the same directory and check the feature repository into version control.

Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,12 @@ Install Feast using [pip](https://pip.pypa.io):
pip install feast
```

Install Feast with Snowflake dependencies (required when using Snowflake):

```
pip install 'feast[snowflake]'
```
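
To confirm that the extra was installed, a quick check along these lines may help. This is a minimal sketch, not part of Feast: `has_module` is a hypothetical helper, and it assumes the `snowflake` extra installs the Snowflake Python connector, importable as `snowflake.connector`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `snowflake`) is missing.
        return False


if __name__ == "__main__":
    # Assumption: the Snowflake connector is importable as `snowflake.connector`.
    for module in ("feast", "snowflake.connector"):
        status = "found" if has_module(module) else "missing"
        print(f"{module}: {status}")
```

If `snowflake.connector` is reported as missing, re-running the `pip install` command above in the active environment is the first thing to try.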

Install Feast with GCP dependencies (required when using BigQuery or Firestore):

```
3 changes: 2 additions & 1 deletion docs/reference/data-sources/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@ Please see [Data Source](../../getting-started/concepts/feature-view.md#data-sou

{% page-ref page="file.md" %}

{% page-ref page="snowflake.md" %}

{% page-ref page="bigquery.md" %}

{% page-ref page="redshift.md" %}

44 changes: 44 additions & 0 deletions docs/reference/data-sources/snowflake.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
# Snowflake

## Description

Snowflake data sources allow for the retrieval of historical feature values from Snowflake for building training datasets as well as materializing features into an online store.

* Either a table reference or a SQL query can be provided.

## Examples

Using a table reference

```python
from feast import SnowflakeSource

my_snowflake_source = SnowflakeSource(
    database="FEAST",
    schema="PUBLIC",
    table="FEATURE_TABLE",
)
```

Using a query

```python
from feast import SnowflakeSource

my_snowflake_source = SnowflakeSource(
    query="""
    SELECT
        timestamp_column AS "ts",
        "created",
        "f1",
        "f2"
    FROM
        "FEAST"."PUBLIC"."FEATURE_TABLE"
    """,
)
```

Keep in mind how Snowflake resolves table and column names: unquoted identifiers are stored and resolved in uppercase, while double-quoted identifiers are case-sensitive.
You can read more about quoted identifiers [here](https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html).
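
As a rough illustration of the quoting rules, the sketch below mimics how a single identifier is resolved. `resolve_identifier` is a hypothetical helper written for this page, not a Feast or Snowflake API, and it ignores session settings that can change the behavior.

```python
def resolve_identifier(identifier: str) -> str:
    """Approximate how Snowflake resolves a single identifier.

    Unquoted identifiers are resolved in uppercase. Double-quoted identifiers
    keep their exact case, and a doubled quote ("") inside them stands for one
    literal quote character.
    """
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        return identifier[1:-1].replace('""', '"')
    return identifier.upper()


if __name__ == "__main__":
    for name in ("ts", '"ts"', "Feature_Table", '"Feature_Table"'):
        print(f"{name} -> {resolve_identifier(name)}")
```

In the query example above, `AS "ts"` therefore produces a lowercase column named `ts`, which must be referenced as `"ts"` (with quotes) in later SQL.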

Configuration options are available [here](https://rtd.feast.dev/en/latest/index.html#feast.data_source.SnowflakeSource).
3 changes: 2 additions & 1 deletion docs/reference/offline-stores/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@ Please see [Offline Store](../../getting-started/architecture-and-components/off

{% page-ref page="file.md" %}

{% page-ref page="snowflake.md" %}

{% page-ref page="bigquery.md" %}

{% page-ref page="redshift.md" %}

30 changes: 30 additions & 0 deletions docs/reference/offline-stores/snowflake.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
# Snowflake

## Description

The Snowflake offline store provides support for reading [SnowflakeSources](../data-sources/snowflake.md).

* Snowflake tables and views are allowed as sources.
* All joins happen within Snowflake.
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. Pandas dataframes are uploaded to Snowflake so that the join can run there.
* A [SnowflakeRetrievalJob](https://github.com/feast-dev/feast/blob/bf557bcb72c7878a16dccb48443bbbe9dc3efa49/sdk/python/feast/infra/offline_stores/snowflake.py#L185) is returned when calling `get_historical_features()`.

## Example

{% code title="feature_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
    type: snowflake.offline
    account: snowflake_deployment.us-east-1
    user: user_login
    password: user_password
    role: sysadmin
    warehouse: demo_wh
    database: FEAST
```
{% endcode %}
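
The Snowflake settings belong under the `offline_store` key rather than at the top level of the file. The sketch below checks a parsed configuration for that nesting. It is an illustration only: `missing_offline_store_keys` and `EXAMPLE_KEYS` are hypothetical names, the key list simply mirrors the example above (it is not a statement about which fields Feast requires), and the dictionary stands in for the result of loading the YAML.

```python
# Keys that appear under `offline_store` in the example above (assumption:
# this mirrors the example only, not Feast's own validation rules).
EXAMPLE_KEYS = ("type", "account", "user", "password", "role", "warehouse", "database")

# Stand-in for the parsed contents of feature_store.yaml.
example_config = {
    "project": "my_feature_repo",
    "registry": "data/registry.db",
    "provider": "local",
    "offline_store": {
        "type": "snowflake.offline",
        "account": "snowflake_deployment.us-east-1",
        "user": "user_login",
        "password": "user_password",
        "role": "sysadmin",
        "warehouse": "demo_wh",
        "database": "FEAST",
    },
}


def missing_offline_store_keys(config: dict) -> list:
    """Return the example keys that are absent from config['offline_store']."""
    offline_store = config.get("offline_store") or {}
    return [key for key in EXAMPLE_KEYS if key not in offline_store]


if __name__ == "__main__":
    print(missing_offline_store_keys(example_config))
```

A configuration whose Snowflake settings were written at the top level would report every key as missing, which is a common symptom of lost YAML indentation.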

Configuration options are available [here](https://github.com/feast-dev/feast/blob/bf557bcb72c7878a16dccb48443bbbe9dc3efa49/sdk/python/feast/infra/offline_stores/snowflake.py#L39).
26 changes: 0 additions & 26 deletions docs/reference/offline-stores/untitled.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/reference/online-stores/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,4 +9,3 @@ Please see [Online Store](../../getting-started/architecture-and-components/onli
{% page-ref page="datastore.md" %}

{% page-ref page="dynamodb.md" %}

1 change: 0 additions & 1 deletion docs/reference/providers/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,4 +7,3 @@ Please see [Provider](../../getting-started/architecture-and-components/provider
{% page-ref page="google-cloud-platform.md" %}

{% page-ref page="amazon-web-services.md" %}

2 changes: 2 additions & 0 deletions docs/roadmap.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ The list below contains the functionality that contributors are planning to deve
* Want to speak to a Feast contributor? We are more than happy to jump on a call. Please schedule a time using [Calendly](https://calendly.com/d/x2ry-g5bb/meet-with-feast-team).

* **Data Sources**
* [x] [Snowflake source](https://docs.feast.dev/reference/data-sources/snowflake)
* [x] [Redshift source](https://docs.feast.dev/reference/data-sources/redshift)
* [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery)
* [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file)
Expand All @@ -18,6 +19,7 @@ The list below contains the functionality that contributors are planning to deve
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [ ] HTTP source
* **Offline Stores**
* [x] [Snowflake](https://docs.feast.dev/reference/offline-stores/snowflake)
* [x] [Redshift](https://docs.feast.dev/reference/offline-stores/redshift)
* [x] [BigQuery](https://docs.feast.dev/reference/offline-stores/bigquery)
* [x] [Synapse (community plugin)](https://github.com/Azure/feast-azure)
22 changes: 18 additions & 4 deletions docs/specs/offline_store_format.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,8 @@ One of the design goals of Feast is being able to plug seamlessly into existing

Feast provides first class support for the following data warehouses (DWH) to store feature data offline out of the box:
* [BigQuery](https://cloud.google.com/bigquery)
* [Snowflake](https://www.snowflake.com/) (Coming Soon)
* [Redshift](https://aws.amazon.com/redshift/) (Coming Soon)
* [Snowflake](https://www.snowflake.com/)
* [Redshift](https://aws.amazon.com/redshift/)

The integration between Feast and the DWH is highly configurable, but at the same time there are some non-configurable implications and assumptions that Feast imposes on table schemas and mapping between database-native types and Feast type system. This is what this document is about.

Expand All @@ -28,14 +28,14 @@ Feature data is stored in tables in the DWH. There is one DWH table per Feast Fe
## Type mappings

#### Pandas types
Here's how Feast types map to Pandas types for Feast APIs that take in or return a Pandas dataframe:
Here's how Feast types map to Pandas types for Feast APIs that take in or return a Pandas dataframe:

| Feast Type | Pandas Type |
|-------------|--|
| Event Timestamp | `datetime64[ns]` |
| BYTES | `bytes` |
| STRING | `str` , `category`|
| INT32 | `int32`, `uint32` |
| INT32 | `int16`, `uint16`, `int32`, `uint32` |
| INT64 | `int64`, `uint64` |
| UNIX_TIMESTAMP | `datetime64[ns]`, `datetime64[ns, tz]` |
| DOUBLE | `float64` |
Expand Down Expand Up @@ -80,3 +80,17 @@ Here's how Feast types map to BigQuery types when using BigQuery for offline sto
| BOOL\_LIST | `ARRAY<BOOL>`|

Values that are not specified by the table above will cause an error on conversion.

#### Snowflake Types
Here's how Feast types map to Snowflake types when using Snowflake for offline storage.
The mapping follows the Snowflake Python connector's Snowflake-to-Pandas data mapping:
https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#snowflake-to-pandas-data-mapping

| Feast Type | Snowflake Python Type |
|-------------|--|
| Event Timestamp | `DATETIME64[NS]` |
| UNIX_TIMESTAMP | `DATETIME64[NS]` |
| STRING | `STR` |
| INT32 | `INT8 / UINT8 / INT16 / UINT16 / INT32 / UINT32` |
| INT64 | `INT64 / UINT64` |
| DOUBLE | `FLOAT64` |
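
The table above can be expressed as a small lookup. This is a sketch for illustration, not Feast's conversion code: `FEAST_TO_SNOWFLAKE_PANDAS` and `feast_type_for` are hypothetical names, and raising `ValueError` for unlisted dtypes is a choice made here rather than documented Feast behavior. The Event Timestamp row is left out because it describes a column's role and shares its dtype with `UNIX_TIMESTAMP`.

```python
# Feast value type -> dtype names listed in the table above (lowercased).
FEAST_TO_SNOWFLAKE_PANDAS = {
    "UNIX_TIMESTAMP": ("datetime64[ns]",),
    "STRING": ("str",),
    "INT32": ("int8", "uint8", "int16", "uint16", "int32", "uint32"),
    "INT64": ("int64", "uint64"),
    "DOUBLE": ("float64",),
}


def feast_type_for(dtype_name: str) -> str:
    """Return the Feast type whose table row lists `dtype_name`."""
    normalized = dtype_name.strip().lower()
    for feast_type, dtypes in FEAST_TO_SNOWFLAKE_PANDAS.items():
        if normalized in dtypes:
            return feast_type
    # Assumption: unlisted dtypes are treated as errors in this sketch.
    raise ValueError(f"No Feast type listed for dtype {dtype_name!r}")


if __name__ == "__main__":
    for dtype in ("INT16", "uint64", "FLOAT64"):
        print(f"{dtype} -> {feast_type_for(dtype)}")
```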