feast-dev/feast


Overview

Feast is an open source feature store for machine learning. Feast is the fastest path to productionizing analytic data for model training and online inference.

Please see our documentation for more information about the project.

πŸ“ Architecture

The above architecture is the minimal Feast deployment. Want to run the full Feast on Snowflake/GCP/AWS? Click here.

🐣 Getting Started

1. Install Feast

pip install feast

2. Create a feature repository

feast init my_feature_repo
cd my_feature_repo

3. Register your feature definitions and set up your feature store

feast apply

4. Explore your data in the web UI (experimental)

feast ui

5. Build a training dataset

from datetime import datetime

import pandas as pd
from feast import FeatureStore

# Entity dataframe: the driver IDs and event timestamps to fetch features for
entity_df = pd.DataFrame.from_dict({
    "driver_id": [1001, 1002, 1003, 1004],
    "event_timestamp": [
        datetime(2021, 4, 12, 10, 59, 42),
        datetime(2021, 4, 12, 8, 12, 10),
        datetime(2021, 4, 12, 16, 40, 26),
        datetime(2021, 4, 12, 15, 1, 12),
    ],
})

# Load the feature store from the feature repository in the current directory
store = FeatureStore(repo_path=".")

# Retrieve historical feature values for each entity row as a pandas DataFrame
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
).to_df()

print(training_df.head())

# Train model
# model = ml.fit(training_df)

Example output:

            event_timestamp  driver_id  conv_rate  acc_rate  avg_daily_trips
0 2021-04-12 08:12:10+00:00       1002   0.713465  0.597095              531
1 2021-04-12 10:59:42+00:00       1001   0.072752  0.044344               11
2 2021-04-12 15:01:12+00:00       1004   0.658182  0.079150              220
3 2021-04-12 16:40:26+00:00       1003   0.162092  0.309035              959
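
Each row of the training dataset pairs an entity and an event timestamp with feature values that were known at that time. The idea can be sketched in plain Python as an "as of" lookup. This is only an illustration of the concept, not Feast's implementation: the `feature_log` data and the `latest_value_as_of` helper are hypothetical, and details such as feature TTLs are ignored.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature log for a single driver: (timestamp, conv_rate),
# sorted by timestamp. These values are made up for illustration.
feature_log = [
    (datetime(2021, 4, 12, 7, 0, 0), 0.10),
    (datetime(2021, 4, 12, 9, 0, 0), 0.20),
    (datetime(2021, 4, 12, 11, 0, 0), 0.30),
]


def latest_value_as_of(log, event_timestamp):
    """Return the most recent value recorded at or before event_timestamp.

    Returns None if no value had been recorded yet at that time.
    """
    timestamps = [ts for ts, _ in log]
    # Index of the first entry strictly after event_timestamp
    idx = bisect_right(timestamps, event_timestamp)
    return log[idx - 1][1] if idx > 0 else None


# The value recorded at 11:00 is ignored because it lies in the future
# relative to the 10:59:42 event.
print(latest_value_as_of(feature_log, datetime(2021, 4, 12, 10, 59, 42)))
```

Looking values up this way keeps information from after the event out of the training row, which would otherwise leak future data into the model.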

6. Load feature values into your online store

CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental "$CURRENT_TIME"

Example output:

Materializing feature view driver_hourly_stats from 2021-04-14 to 2021-04-15 done!
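
If the timestamp needs to be produced from Python rather than the shell, for example inside a scheduled job, the same format can be built with the standard library. This is a minimal sketch; the helper name `current_utc_timestamp` is hypothetical and not part of Feast.

```python
from datetime import datetime, timezone


def current_utc_timestamp() -> str:
    """Format the current UTC time like `date -u +"%Y-%m-%dT%H:%M:%S"`."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


# Prints the current UTC time in the format passed to the CLI above
print(current_utc_timestamp())
```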

7. Read online features at low latency

from pprint import pprint

from feast import FeatureStore

# Load the feature store from the feature repository in the current directory
store = FeatureStore(repo_path=".")

# Fetch the latest feature values for driver 1001 from the online store
feature_vector = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()

pprint(feature_vector)

# Make prediction
# model.predict(feature_vector)

Example output:

{
    "driver_id": [1001],
    "driver_hourly_stats__conv_rate": [0.49274],
    "driver_hourly_stats__acc_rate": [0.92743],
    "driver_hourly_stats__avg_daily_trips": [72]
}
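
The returned dictionary is column-oriented: each key maps to a list with one value per entity row. Many prediction functions expect one record per entity instead, so a small reshaping step may help. The sketch below hard-codes the example output shown above; the `to_rows` helper is hypothetical and not part of Feast.

```python
# Example output copied from above (column-oriented: one list per key)
feature_vector = {
    "driver_id": [1001],
    "driver_hourly_stats__conv_rate": [0.49274],
    "driver_hourly_stats__acc_rate": [0.92743],
    "driver_hourly_stats__avg_daily_trips": [72],
}


def to_rows(columns):
    """Convert a dict of equal-length lists into a list of per-entity dicts."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = to_rows(feature_vector)
print(rows)
```

With several entries in `entity_rows`, each list holds one value per entity and `to_rows` yields one dictionary for each of them.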

📦 Functionality and Roadmap

The list below contains the functionality that contributors plan to develop for Feast.

🎓 Important Resources

Please refer to the official Feast documentation.

👋 Contributing

Feast is a community project and is still under active development. If you want to contribute, please have a look at our contributing and development guides.

✨ Contributors

Thanks go to these incredible people: