Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL Registry is broken due to caching #3445

Closed
jiewpeng opened this issue Jan 12, 2023 · 5 comments · Fixed by #3450
Closed

SQL Registry is broken due to caching #3445

jiewpeng opened this issue Jan 12, 2023 · 5 comments · Fixed by #3450

Comments

@jiewpeng
Copy link

Expected Behavior

SQL registry works

Current Behavior

When calling feature_store.get_online_features(), we should get back the features.

Steps to reproduce

Use any SQL registry and call feature_store.get_online_features().

Traceback (apologies for the REDACTED stuff, that has to do with Databricks redacting some stuff which it thinks are secrets:

/databricks/python/lib/python3.8/site-packages/feast/usage.py in wrapper(*args, **kwargs)
    297 
    298                 if traceback:
--> 299                     raise exc.with_traceback(traceback)
    300 
    301                 raise exc

/databricks/python/lib/python3.8/site-packages/feast/usage.py in wrapper(*args, **kwargs)
    286 
    287             try:
--> 288                 return func(*args, **kwargs)
    289             except Exception:
    290                 if ctx.exception:

/databricks/python/lib/python3.8/site-packages/feast/feature_store.py in get_online_features(self, features, entity_rows, full_feature_names)
   1586                     raise ValueError("All entity_rows must have the same keys.") from e
   1587 
-> 1588         return self._get_online_features(
   1589             features=features,
   1590             entity_values=columnar,

/databricks/python/lib/python3.8/site-packages/feast/feature_store.py in _get_online_features(self, features, entity_values, full_feature_names, native_entity_values)
   1654             requested_request_feature_views,
   1655             requested_on_demand_feature_views,
-> 1656         ) = self._get_feature_views_to_use(
   1657             features=features, allow_cache=True, hide_dummy_entity=False
   1658         )

/databricks/python/lib/python3.8/site-packages/feast/feature_store.py in _get_feature_views_to_use(self, features, allow_cache, hide_dummy_entity)
   2203             fv.name: fv
   2204             for fv in [
-> 2205                 *self._list_feature_views(allow_cache, hide_dummy_entity),
   2206                 *self._[REDACTED].list_stream_feature_views(
   2207                     project=self.project, allow_cache=allow_cache

/databricks/python/lib/python3.8/site-packages/feast/feature_store.py in _list_feature_views(self, allow_cache, hide_dummy_entity)
    287     ) -> List[FeatureView]:
    288         feature_views = []
--> 289         for fv in self._[REDACTED].list_feature_views(
    290             self.project, allow_cache=allow_cache
    291         ):

/databricks/python/lib/python3.8/site-packages/feast/infra/[REDACTED]/sql.py in list_feature_views(self, project, allow_cache)
    535     ) -> List[FeatureView]:
    536         if allow_cache:
--> 537             self._refresh_cached_[REDACTED]_if_necessary()
    538             return proto_[REDACTED]_utils.list_feature_views(
    539                 self.cached_[REDACTED]_proto, project

/databricks/python/lib/python3.8/site-packages/feast/infra/[REDACTED]/sql.py in _refresh_cached_[REDACTED]_if_necessary(self)
    232 
    233             if expired:
--> 234                 self.refresh()
    235 
    236     def get_stream_feature_view(

TypeError: refresh() missing 1 required positional argument: 'project'

This seems to be caused by triggering a refresh of the registry cache, which expects a project parameter, which type hint states that it is optional, but because the method spec does not define a default value, will require a value to be passed in. However, this value is not passed in, causing self.refresh() to fail.

Possible Solution

In https://github.com/feast-dev/feast/blob/master/sdk/python/feast/infra/registry/sql.py#L234, change self.refresh() to self.refresh(None). Alternatively, change the refresh method to have a default value for the project parameter (not sure why the parameter is required; it doesn't seem to be used).

@rajeshreddykundur
Copy link

@jiewpeng
There are 3 options to fix as you suggested.

  1. Call the refresh() method with None like self.refresh(None)
  2. adding a default value in the method definition makes def refresh(self, project: Optional[str] = None) so there are no changes while calling. I have seen other methods which defined None as default value https://github.com/feast-dev/feast/blob/master/sdk/python/feast/infra/registry/sql.py#L842
  3. project parameter is not used anywhere. We can remove it as well.

I'm leaning towards option 3 of removing the project parameter. Please let me know how do we want to proceed this.

I can take up this issue, this will be my first contribution to the feast and I would love to learn the process and contribute.

@jiewpeng
Copy link
Author

@rajeshreddykundur I have no preference as to which option is implemented, as long as it fixes the issue (all 3 methods will fix the issue). Option 3 makes sense to me as well, I'm just not sure whether the feast devs had something in mind when they designed the refresh method with that parameter (maybe it's to support something in their roadmap but not implemented yet).

@rajeshreddykundur
Copy link

is there a way to tag the CODEOWNERS and check if its okay to remove the project argument? If they need that argument in future, we can go option 2 adding default argument value = None.

@jiewpeng
Copy link
Author

I guess an easy way to check is just remove the project parameter and submit the PR, if they have any issues they will raise it?

@achals
Copy link
Member

achals commented Jan 18, 2023

Thanks for catching the issue @jiewpeng - I think using None as the default value in SqlRegistry is probably the best solution here. I'll also add the default value for BaseRegistry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants