Draft changes to add remote online store to feast.
Signed-off-by: Lokesh Rangineni <[email protected]>

Adding the integration test and remote online creator class so that it will fit into existing integration testing framework.

Signed-off-by: Lokesh Rangineni <[email protected]>

Fix after rebase

Signed-off-by: Lokesh Rangineni <[email protected]>

Removing the RemoteOnlineStoreCreator and adding custom integration test case. Incorporating the code review comments.

Signed-off-by: Lokesh Rangineni <[email protected]>

reformatting the code, removing unnecessary braces.

Signed-off-by: Lokesh Rangineni <[email protected]>

Trying to fix the errors reported in make lint-python

Signed-off-by: Lokesh Rangineni <[email protected]>

Ran the command make format-python and trying to see if it fixes the lint errors.

Signed-off-by: Lokesh Rangineni <[email protected]>

increasing the server start timeout to see if it fixes the integration test cases.

Signed-off-by: Lokesh Rangineni <[email protected]>

checking changes after make format-python

Signed-off-by: Lokesh Rangineni <[email protected]>

trying to see if this fixes the PR integration test failure.
Signed-off-by: Lokesh Rangineni <[email protected]>

Signed-off-by: Lokesh Rangineni <[email protected]>

checking in the changes for make format-python

Signed-off-by: Lokesh Rangineni <[email protected]>

Upgrading python version to 3.11, adding support for 3.11 as well.

Signed-off-by: Lokesh Rangineni <[email protected]>

chore: Bump macOS runners to macos-13 (#4152)

bump macos runner to 13

Signed-off-by: tokoko <[email protected]>

Signed-off-by: Lokesh Rangineni <[email protected]>

chore: Use pixi to lock python dependencies in a single command (#4114)

use pixi to lock python dependencies in a single command

Signed-off-by: tokoko <[email protected]>

Signed-off-by: Lokesh Rangineni <[email protected]>

feat: List all feature views (#4256)

* feature: Adding type to base feature view

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fixed linter

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fixed type and meta

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding new listing

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated

Signed-off-by: Francisco Javier Arceo <[email protected]>

* cleaning up changes

Signed-off-by: Francisco Javier Arceo <[email protected]>

* reverting FV proto

Signed-off-by: Francisco Javier Arceo <[email protected]>

* doing simple way

Signed-off-by: Francisco Javier Arceo <[email protected]>

* added a test

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated to add warnings

Signed-off-by: Francisco Javier Arceo <[email protected]>

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
feat: Adding vector search for sqlite (#4176)

* feat: Adding vector search for sqlite

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding the sqlite_vss dependency

Signed-off-by: Francisco Javier Arceo <[email protected]>

* linter

Signed-off-by: Francisco Javier Arceo <[email protected]>

* latest progress

Signed-off-by: Francisco Javier Arceo <[email protected]>

* uploading latest progress

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated function

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding configuration

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding current progress

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updating requirements files

Signed-off-by: Francisco Javier Arceo <[email protected]>

* moving to sqlite-vec

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updating sqlite.py

Signed-off-by: Francisco Javier Arceo <[email protected]>

* checking in progress

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated test type

Signed-off-by: Francisco Javier Arceo <[email protected]>

* got the initialization working, nice

Signed-off-by: Francisco Javier Arceo <[email protected]>

* checking in progress from last night

Signed-off-by: Francisco Javier Arceo <[email protected]>

* removing unnecessary stuff

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fixing merge conflicts

Signed-off-by: Francisco Javier Arceo <[email protected]>

* removing files changed accidentally

Signed-off-by: Francisco Javier Arceo <[email protected]>

* uploading current progress...things run but need to update the virtual table insertion

Signed-off-by: Francisco Javier Arceo <[email protected]>

* linted

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding working notes

Signed-off-by: Francisco Javier Arceo <[email protected]>

* found a bug, original feature_store.py was only grabbing first feature view, adjusted

Signed-off-by: Francisco Javier Arceo <[email protected]>

* cant use a string have to verify it is a proper FeatureView object

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated got it working, need to fix some other stuff still

Signed-off-by: Francisco Javier Arceo <[email protected]>

* working

Signed-off-by: Francisco Javier Arceo <[email protected]>

* linter

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fixing some type issues

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fixed typing and lint issues

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated dependencies

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fix for pixi and updating requirements

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fixed type

Signed-off-by: Francisco Javier Arceo <[email protected]>

* linter

Signed-off-by: Francisco Javier Arceo <[email protected]>

* testing sqlite_vec import

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding minimal example test

Signed-off-by: Francisco Javier Arceo <[email protected]>

* lint

Signed-off-by: Francisco Javier Arceo <[email protected]>

* testing raw sqlite

Signed-off-by: Francisco Javier Arceo <[email protected]>

* Printing package version

* printing version

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated requirements

* rebuilding requirements

Signed-off-by: Francisco Javier Arceo <[email protected]>

* only going to run this on 3.10 for now

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated docs for sqlite caveats

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding reason

Signed-off-by: Francisco Javier Arceo <[email protected]>

* skipping

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated tests

Signed-off-by: Francisco Javier Arceo <[email protected]>

* removing print

Signed-off-by: Francisco Javier Arceo <[email protected]>

* added method call

Signed-off-by: Francisco Javier Arceo <[email protected]>

* added prubt

Signed-off-by: Francisco Javier Arceo <[email protected]>

* added print

Signed-off-by: Francisco Javier Arceo <[email protected]>

* removing print

Signed-off-by: Francisco Javier Arceo <[email protected]>

* adding check in sqlite

Signed-off-by: Francisco Javier Arceo <[email protected]>

* missed an =

Signed-off-by: Francisco Javier Arceo <[email protected]>

* still running on 3.11

Signed-off-by: Francisco Javier Arceo <[email protected]>

* typo

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fix

Signed-off-by: Francisco Javier Arceo <[email protected]>

* fix

Signed-off-by: Francisco Javier Arceo <[email protected]>

* updated setup and docs

Signed-off-by: Francisco Javier Arceo <[email protected]>

* renamed things

Signed-off-by: Francisco Javier Arceo <[email protected]>

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
squashing the last 15 commits to one.

Merge branch 'master' into feature/adding-remote-onlinestore-rebase

Adding documentation and incorporating code review comment.

Signed-off-by: Lokesh Rangineni <[email protected]>

Adding documentation and incorporating code review comment.

Signed-off-by: Lokesh Rangineni <[email protected]>

Merge remote-tracking branch 'fork/feature/adding-remote-onlinestore-rebase' into feature/adding-remote-onlinestore-rebase
lokeshrangineni committed Jun 12, 2024
1 parent 0d162e9 commit 5231bac
Showing 26 changed files with 1,465 additions and 421 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pr_integration_tests.yml
@@ -86,7 +86,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: [ "3.9", "3.10", "3.11" ]
python-version: [ "3.11" ]
os: [ ubuntu-latest ]
env:
OS: ${{ matrix.os }}
6 changes: 0 additions & 6 deletions Makefile
@@ -70,12 +70,6 @@ lock-python-dependencies-all:
pixi run --environment py311 --manifest-path infra/scripts/pixi/pixi.toml "uv pip compile --system --no-strip-extras setup.py --output-file sdk/python/requirements/py3.11-requirements.txt"
pixi run --environment py311 --manifest-path infra/scripts/pixi/pixi.toml "uv pip compile --system --no-strip-extras setup.py --extra ci --output-file sdk/python/requirements/py3.11-ci-requirements.txt"

lock-python-dependencies-all:
pixi run --environment py39 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --output-file sdk/python/requirements/py3.9-requirements.txt"
pixi run --environment py39 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --extra ci --output-file sdk/python/requirements/py3.9-ci-requirements.txt"
pixi run --environment py310 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --output-file sdk/python/requirements/py3.10-requirements.txt"
pixi run --environment py310 --manifest-path infra/scripts/pixi/pixi.toml "python -m piptools compile -U --extra ci --output-file sdk/python/requirements/py3.10-ci-requirements.txt"

benchmark-python:
IS_TEST=True python -m pytest --integration --benchmark --benchmark-autosave --benchmark-save-data sdk/python/tests

18 changes: 18 additions & 0 deletions docs/reference/alpha-vector-database.md
@@ -13,7 +13,9 @@ Below are supported vector databases and implemented features:
| Elasticsearch | [x] | [x] |
| Milvus | [ ] | [ ] |
| Faiss | [ ] | [ ] |
| SQLite | [x] | [ ] |

Note: SQLite support is in limited access and currently works only on Python 3.10. It will be updated as [sqlite_vec](https://github.com/asg017/sqlite-vec/) progresses.

## Example

@@ -108,4 +110,20 @@ def print_online_features(features):
print(key, " : ", value)

print_online_features(features)
```

### Configuration
We offer two Online Store options for Vector Databases: PGVector and SQLite.

#### Installation with SQLite
If you are using `pyenv` to manage your Python versions, you can install a Python build with loadable SQLite extension support using the following command:
```bash
PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
pyenv install 3.10.14
```
And you can install the Feast package via:
```bash
pip install feast[sqlite_vec]
```
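
Whether a given interpreter was actually built with loadable-extension support can be checked from Python before installing the extra. The sketch below is a minimal check that assumes only the standard-library `sqlite3` module; the helper name `supports_sqlite_extensions` is hypothetical and not part of Feast.

```python
import sqlite3


def supports_sqlite_extensions() -> bool:
    """Return True if this interpreter's sqlite3 module can load extensions."""
    # CPython only exposes Connection.enable_load_extension when it was
    # configured with --enable-loadable-sqlite-extensions.
    if not hasattr(sqlite3.Connection, "enable_load_extension"):
        return False
    conn = sqlite3.connect(":memory:")
    try:
        # Toggle extension loading on and off again; nothing is loaded here.
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.OperationalError, sqlite3.NotSupportedError):
        return False
    finally:
        conn.close()


print("loadable SQLite extensions:", supports_sqlite_extensions())
```

If this prints `False`, the Python build likely needs to be reinstalled with the configure option shown above before `sqlite_vec` can be loaded.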
4 changes: 4 additions & 0 deletions docs/reference/online-stores/README.md
@@ -61,3 +61,7 @@ Please see [Online Store](../../getting-started/architecture-and-components/onli
{% content-ref url="scylladb.md" %}
[scylladb.md](scylladb.md)
{% endcontent-ref %}

{% content-ref url="remote.md" %}
[remote.md](remote.md)
{% endcontent-ref %}
21 changes: 21 additions & 0 deletions docs/reference/online-stores/remote.md
@@ -0,0 +1,21 @@
# Remote online store

## Description

This remote online store lets you interact with a remote feature server. At the moment it supports only the read operation: you can use this online store to retrieve online features from the remote feature server with `store.get_online_features`.

## Examples

The registry points to the registry of the remote feature store. If that registry is not accessible, the configuration should use a remote registry instead.

{% code title="feature_store.yaml" %}
```yaml
project: my-local-project
registry: /remote/data/registry.db
provider: local
online_store:
    path: http://localhost:6566
    type: remote
entity_key_serialization_version: 2
```
{% endcode %}
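
With this configuration, reads are forwarded over HTTP to the feature server. As a rough sketch of the request involved, based on the `remote.py` implementation in this commit (a POST to `<path>/get-online-features` with `features` and `entities` fields), the snippet below builds such a request with the standard library only and does not send it. The helper name `build_online_read_request`, the feature view name `driver_hourly_stats`, the feature names, and the `driver_id` join key are hypothetical placeholders.

```python
import json
import urllib.request
from typing import Any, List


def build_online_read_request(
    base_url: str,
    feature_view: str,
    features: List[str],
    join_key: str,
    entity_values: List[Any],
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the feature server."""
    body = json.dumps(
        {
            # Feature references use the "<feature view>:<feature>" form.
            "features": [f"{feature_view}:{name}" for name in features],
            # One join key mapped to the list of entity values to look up.
            "entities": {join_key: list(entity_values)},
        }
    )
    return urllib.request.Request(
        url=f"{base_url}/get-online-features",
        data=body.encode("utf-8"),
        method="POST",
    )


request = build_online_read_request(
    "http://localhost:6566",   # matches `path` in feature_store.yaml above
    "driver_hourly_stats",     # hypothetical feature view
    ["conv_rate", "acc_rate"], # hypothetical features
    "driver_id",               # hypothetical join key
    [1001, 1002],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because it needs a running feature server.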
660 changes: 419 additions & 241 deletions infra/scripts/pixi/pixi.lock

Large diffs are not rendered by default.

10 changes: 7 additions & 3 deletions infra/scripts/pixi/pixi.toml
@@ -1,19 +1,23 @@
[project]
name = "pixi-feast"
channels = ["conda-forge"]
platforms = ["linux-64"]
platforms = ["linux-64", "osx-arm64"]

[tasks]

[dependencies]
pip-tools = ">=7.4.1,<7.5"
uv = ">=0.1.39,<0.2"

[feature.py39.dependencies]
python = "~=3.9.0"

[feature.py310.dependencies]
python = "~=3.10.0"

[feature.py311.dependencies]
python = "~=3.11.0"

[environments]
py39 = ["py39"]
py310 = ["py310"]
py310 = ["py310"]
py311 = ["py311"]
67 changes: 65 additions & 2 deletions sdk/python/feast/feature_store.py
@@ -13,6 +13,7 @@
# limitations under the License.
import copy
import itertools
import logging
import os
import warnings
from collections import Counter, defaultdict
@@ -247,6 +248,20 @@ def list_feature_services(self) -> List[FeatureService]:
"""
return self._registry.list_feature_services(self.project)

def list_all_feature_views(
self, allow_cache: bool = False
) -> List[Union[FeatureView, StreamFeatureView, OnDemandFeatureView]]:
"""
Retrieves the list of feature views from the registry.
Args:
allow_cache: Whether to allow returning entities from a cached registry.
Returns:
A list of feature views.
"""
return self._list_all_feature_views(allow_cache)

def list_feature_views(self, allow_cache: bool = False) -> List[FeatureView]:
"""
Retrieves the list of feature views from the registry.
@@ -257,12 +272,50 @@ def list_feature_views(self, allow_cache: bool = False) -> List[FeatureView]:
Returns:
A list of feature views.
"""
logging.warning(
"list_feature_views will make breaking changes. Please use list_batch_feature_views instead. "
"list_feature_views will behave like list_all_feature_views in the future."
)
return self._list_feature_views(allow_cache)

def _list_all_feature_views(
self,
allow_cache: bool = False,
) -> List[Union[FeatureView, StreamFeatureView, OnDemandFeatureView]]:
all_feature_views = (
self._list_feature_views(allow_cache)
+ self._list_stream_feature_views(allow_cache)
+ self.list_on_demand_feature_views(allow_cache)
)
return all_feature_views

def _list_feature_views(
self,
allow_cache: bool = False,
hide_dummy_entity: bool = True,
) -> List[FeatureView]:
logging.warning(
"_list_feature_views will make breaking changes. Please use _list_batch_feature_views instead. "
"_list_feature_views will behave like _list_all_feature_views in the future."
)
feature_views = []
for fv in self._registry.list_feature_views(
self.project, allow_cache=allow_cache
):
if (
hide_dummy_entity
and fv.entities
and fv.entities[0] == DUMMY_ENTITY_NAME
):
fv.entities = []
fv.entity_columns = []
feature_views.append(fv)
return feature_views

def _list_batch_feature_views(
self,
allow_cache: bool = False,
hide_dummy_entity: bool = True,
) -> List[FeatureView]:
feature_views = []
for fv in self._registry.list_feature_views(
@@ -1881,18 +1934,28 @@ def _retrieve_online_documents(
"Using embedding functionality is not supported for document retrieval. Please embed the query before calling retrieve_online_documents."
)
(
requested_feature_views,
available_feature_views,
_,
) = self._get_feature_views_to_use(
features=[feature], allow_cache=True, hide_dummy_entity=False
)
requested_feature_view_name = (
feature.split(":")[0] if isinstance(feature, str) else feature
)
for feature_view in available_feature_views:
if feature_view.name == requested_feature_view_name:
requested_feature_view = feature_view
if not requested_feature_view:
raise ValueError(
f"Feature view {requested_feature_view} not found in the registry."
)
requested_feature = (
feature.split(":")[1] if isinstance(feature, str) else feature
)
provider = self._get_provider()
document_features = self._retrieve_from_online_store(
provider,
requested_feature_views[0],
requested_feature_view,
requested_feature,
query,
top_k,
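
The lookup in this hunk splits a `view:feature` reference and scans the available feature views for a matching name. A minimal, self-contained sketch of that pattern follows; `FakeFeatureView`, `split_feature_ref`, and `find_feature_view` are hypothetical stand-ins rather than Feast APIs, and the sketch deliberately defaults the match to `None` so that a missing view always surfaces as a `ValueError`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FakeFeatureView:
    """Hypothetical stand-in for a feature view; only the name matters here."""

    name: str


def split_feature_ref(feature: str) -> Tuple[str, str]:
    """Split "view:feature" into its two parts, rejecting malformed input."""
    view_name, sep, feature_name = feature.partition(":")
    if not sep or not view_name or not feature_name:
        raise ValueError(f"Expected '<feature view>:<feature>', got {feature!r}")
    return view_name, feature_name


def find_feature_view(
    feature: str, available: List[FakeFeatureView]
) -> Tuple[FakeFeatureView, str]:
    """Return the matching view and the bare feature name."""
    view_name, feature_name = split_feature_ref(feature)
    # next(..., None) yields None when no view has the requested name.
    match = next((fv for fv in available if fv.name == view_name), None)
    if match is None:
        raise ValueError(f"Feature view {view_name} not found in the registry.")
    return match, feature_name


views = [FakeFeatureView("driver_stats"), FakeFeatureView("item_embeddings")]
view, feature_name = find_feature_view("item_embeddings:vector", views)
print(view.name, feature_name)
```

Reporting the requested name (not the unresolved view object) in the error keeps the message useful when nothing matches.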
174 changes: 174 additions & 0 deletions sdk/python/feast/infra/online_stores/remote.py
@@ -0,0 +1,174 @@
# Copyright 2021 The Feast Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import logging
from datetime import datetime
from typing import Any, Callable, Dict, List, Literal, Optional, Sequence, Tuple

import requests
from pydantic import StrictStr

from feast import Entity, FeatureView, RepoConfig
from feast.infra.online_stores.online_store import OnlineStore
from feast.protos.feast.types.EntityKey_pb2 import EntityKey as EntityKeyProto
from feast.protos.feast.types.Value_pb2 import Value as ValueProto
from feast.repo_config import FeastConfigBaseModel
from feast.type_map import python_values_to_proto_values
from feast.value_type import ValueType

logger = logging.getLogger(__name__)


class RemoteOnlineStoreConfig(FeastConfigBaseModel):
    """Remote Online store config for remote online store"""

    type: Literal["remote"] = "remote"
    """Online store type selector"""

    path: StrictStr = "http://localhost:6566"
    """ str: Path to metadata store.
    If type is 'remote', then this is a URL for registry server """


class RemoteOnlineStore(OnlineStore):
    """
    remote online store implementation wrapper to communicate with feast online server.
    """

    def online_write_batch(
        self,
        config: RepoConfig,
        table: FeatureView,
        data: List[
            Tuple[EntityKeyProto, Dict[str, ValueProto], datetime, Optional[datetime]]
        ],
        progress: Optional[Callable[[int], Any]],
    ) -> None:
        pass

    def online_read(
        self,
        config: RepoConfig,
        table: FeatureView,
        entity_keys: List[EntityKeyProto],
        requested_features: Optional[List[str]] = None,
    ) -> List[Tuple[Optional[datetime], Optional[Dict[str, ValueProto]]]]:
        assert isinstance(config.online_store, RemoteOnlineStoreConfig)
        config.online_store.__class__ = RemoteOnlineStoreConfig

        req_body = self._construct_online_read_api_json_request(
            entity_keys, table, requested_features
        )
        response = requests.post(
            f"{config.online_store.path}/get-online-features", data=req_body
        )
        if response.status_code == 200:
            logger.debug("Able to retrieve the online features from feature server.")
            response_json = json.loads(response.text)
            event_ts = self._get_event_ts(response_json)
            # Iterating over results and converting the API results in column format to row format.
            result_tuples: List[
                Tuple[Optional[datetime], Optional[Dict[str, ValueProto]]]
            ] = []
            for feature_value_index in range(len(entity_keys)):
                feature_values_dict: Dict[str, ValueProto] = dict()
                for index, feature_name in enumerate(
                    response_json["metadata"]["feature_names"]
                ):
                    if (
                        requested_features is not None
                        and feature_name in requested_features
                    ):
                        if (
                            response_json["results"][index]["statuses"][
                                feature_value_index
                            ]
                            == "PRESENT"
                        ):
                            message = python_values_to_proto_values(
                                [
                                    response_json["results"][index]["values"][
                                        feature_value_index
                                    ]
                                ],
                                ValueType.UNKNOWN,
                            )
                            feature_values_dict[feature_name] = message[0]
                        else:
                            feature_values_dict[feature_name] = ValueProto()

                result_tuples.append((event_ts, feature_values_dict))
            return result_tuples
        else:
            error_msg = f"Unable to retrieve the online store data using feature server API. Error_code={response.status_code}, error_message={response.reason}"
            logger.error(error_msg)
            raise RuntimeError(error_msg)

    def _construct_online_read_api_json_request(
        self,
        entity_keys: List[EntityKeyProto],
        table: FeatureView,
        requested_features: Optional[List[str]] = None,
    ):
        api_requested_features = []
        if requested_features is not None:
            for requested_feature in requested_features:
                api_requested_features.append(f"{table.name}:{requested_feature}")

        entity_values = []
        entity_key = ""
        for row in entity_keys:
            entity_key = row.join_keys[0]
            entity_values.append(
                getattr(row.entity_values[0], row.entity_values[0].WhichOneof("val"))
            )

        req_body = json.dumps(
            {
                "features": api_requested_features,
                "entities": {entity_key: entity_values},
            }
        )
        return req_body

    def _check_if_feature_requested(self, feature_name, requested_features):
        for requested_feature in requested_features:
            if feature_name in requested_feature:
                return True
        return False

    def _get_event_ts(self, response_json) -> datetime:
        event_ts = ""
        if len(response_json["results"]) > 1:
            event_ts = response_json["results"][1]["event_timestamps"][0]
        return datetime.fromisoformat(event_ts.replace("Z", "+00:00"))

    def update(
        self,
        config: RepoConfig,
        tables_to_delete: Sequence[FeatureView],
        tables_to_keep: Sequence[FeatureView],
        entities_to_delete: Sequence[Entity],
        entities_to_keep: Sequence[Entity],
        partial: bool,
    ):
        pass

    def teardown(
        self,
        config: RepoConfig,
        tables: Sequence[FeatureView],
        entities: Sequence[Entity],
    ):
        pass
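
The core of `online_read` is a pivot from the feature server's column-oriented response into one row per entity. The sketch below reproduces that pivot on plain Python values, assuming the response shape used in this file (`metadata.feature_names` plus a parallel `results` list carrying `values`, `statuses`, and `event_timestamps`). The names `columns_to_rows` and `parse_event_ts`, the sample payload, the `NOT_FOUND` status string, and the use of `None` where the store emits an empty `ValueProto` are illustrative assumptions.

```python
from datetime import datetime
from typing import Any, Dict, List


def parse_event_ts(raw: str) -> datetime:
    """Parse an ISO-8601 timestamp that may end in 'Z' (UTC)."""
    # Replacing 'Z' keeps this working on Python versions whose
    # datetime.fromisoformat does not accept the 'Z' suffix.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def columns_to_rows(
    response_json: Dict[str, Any],
    requested_features: List[str],
    num_entities: int,
) -> List[Dict[str, Any]]:
    """Turn per-feature columns into one dict of feature values per entity."""
    names = response_json["metadata"]["feature_names"]
    rows: List[Dict[str, Any]] = []
    for row_idx in range(num_entities):
        row: Dict[str, Any] = {}
        for col_idx, name in enumerate(names):
            if name not in requested_features:
                continue  # e.g. the join key column echoed back by the server
            column = response_json["results"][col_idx]
            present = column["statuses"][row_idx] == "PRESENT"
            row[name] = column["values"][row_idx] if present else None
        rows.append(row)
    return rows


# Hypothetical response for two entities and one requested feature.
sample = {
    "metadata": {"feature_names": ["driver_id", "conv_rate"]},
    "results": [
        {
            "values": [1001, 1002],
            "statuses": ["PRESENT", "PRESENT"],
            "event_timestamps": ["1970-01-01T00:00:00Z", "1970-01-01T00:00:00Z"],
        },
        {
            "values": [0.5, None],
            "statuses": ["PRESENT", "NOT_FOUND"],
            "event_timestamps": ["2024-06-12T10:00:00Z", "2024-06-12T10:00:00Z"],
        },
    ],
}

rows = columns_to_rows(sample, ["conv_rate"], num_entities=2)
event_ts = parse_event_ts(sample["results"][1]["event_timestamps"][0])
print(rows, event_ts.isoformat())
```

Keeping the pivot separate from the HTTP call makes it straightforward to unit test against canned responses without a running feature server.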