Remove feature specs being able to declare their serving or warehouse stores #159
/hold in progress
OK, I think this covers everything, but it really depends on how much we trust our tests. I need to try some integration-level testing and check that things like the CLI and SDK still make sense.
… a single serving and warehouse storage spec
…t can retreive them
I've rebased this against master. @zhilingc @pradithya, please take a look.
@woop Can you chime in on this based on our discussion this morning? We talked about whether having just a single serving store is the right strategy, and I think we agreed that it is in the short term. We then talked about how in the future we might make it possible to have multiple serving stores again, but they would be configured with their features or models, rather than the features being configured with stores. So the short-term strategy is to prioritise simpler deployment, configuration and usability. Are we all aligned on this?
Just to be clear: We are proposing having only a single serving backend per deployment. We would still support clusters. We would still support different store types, just not within a single deployment of Feast. That may not be clear from the above phrasing. Anyway, I think it's the right move going forward. It would fix some of the complexity and provide abstraction. Part of the problem was that the feature specs have a reference to the store. This seems unnatural. Another unrelated issue is that in many cases you want to have centralized feature management, but decentralized deployments in a production environment. This could lead to issues if only a single serving store is supported, since clients might want to have their own customized setups. One proposal to alleviate this would be to have many serving deployments with their own store/backend, but with each deployment having only a single serving store and store type active at one time. Then we could have
Many different ways to go here, but I like the idea of the core/warehouse maintaining authority on features, with the serving layer free to subscribe to only the relevant features and use the appropriate stores.
100% agreed. This pull request is in alignment. The only quirk I see is that it still allows features to provide store-specific options, such as Redis expiry, regardless of what the store actually is. It just moves these out of the featureSpec.datastore.serving.options map into the featureSpec.options map and adds a prefix, e.g. "redis.expiry". I would be okay with dropping them completely, to be honest; I just didn't want to do too much at once.
/test integration-batch
/retest
Is this somehow related to #38?
Looks fine to me. I made a PR against yours updating the CLI to be consistent with your API changes.
Update CLI to remove references to stores
/cancel hold
/hold cancel
/retest
/test integration-batch
@pradithya @zhilingc All tests, including the integration tests, pass. Can you lgtm+approve? :)
@tims I see that we remove this command?
Actually, I find being able to see the actual storage location quite useful, especially if we still depend on the BigQuery UI for people to quickly discover and access warehouse data. I want to know the project ID and BigQuery dataset where my Feast deployment stores the warehouse data. What do you think?
I didn't remove that; Zhiling removed it. @zhilingc, what do you think?
I think it makes sense to remove the specific API ( Although maybe it might be better to add the method back in (in this PR), mark it as deprecated, and implement that alternative in another PR. I'll pick this up if you guys agree @tims @davidheryanto
Yup, I agree, that would be the easier way.
Sounds good :)
@zhilingc @davidheryanto you need to give it lgtm or approve, or it will just sit here forever :)
/retest
Add list storage functions back in, mark as deprecated
/approve
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: davidheryanto, zhilingc The full list of commands accepted by this bot can be found here. The pull request process is described here
/lgtm
Currently a Feature can declare its own data stores in the feature spec, and these must have been registered with Core beforehand. This adds a lot of complexity to declaring features and is very error-prone.
Instead, we should have the data stores dictated by Core, and feature specs should know nothing about them.
This means that a Feast deployment will now only be able to have 1 serving store and 1 warehouse store at a time.
Some things to note:
This changes the way features can configure some settings. For example, Redis expiry must now be set in FeatureSpec.options rather than FeatureSpec.datastores.serving.options.
So the option key has changed from "expiry" to "redis.expiry". It is still called "expiry" when overriding the default in StorageSpec.options, however.
We need to find a better way to document this.
I think I like the idea of having only one place to set options in a FeatureSpec. But whether an option applies to the underlying storage depends on whether that storage is actually being used, so it's not clear that these should be feature options at all.
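To make the option key change concrete, here is a rough Python sketch of the mapping described above. It is illustrative only and is not the code in this PR: the function names, the plain string-to-string dicts standing in for the spec option maps, and the example values are all assumptions. The precedence shown (a feature-level "redis.expiry" winning over the "expiry" default in StorageSpec.options) is one reading of the note above, not confirmed behaviour.

```python
from typing import Dict, Optional


def flatten_serving_options(
    serving_options: Dict[str, str],
    store_type: str = "redis",
    existing: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: move old per-store serving options into the
    single top-level options map, prefixing each key with the store type
    (for example "expiry" becomes "redis.expiry")."""
    merged = dict(existing or {})
    for key, value in serving_options.items():
        merged[f"{store_type}.{key}"] = value
    return merged


def effective_expiry(
    feature_options: Dict[str, str],
    storage_options: Dict[str, str],
    store_type: str = "redis",
) -> Optional[str]:
    """Hypothetical lookup: prefer the prefixed feature-level key and fall
    back to the unprefixed storage-level default (assumed precedence)."""
    prefixed_key = f"{store_type}.expiry"
    if prefixed_key in feature_options:
        return feature_options[prefixed_key]
    return storage_options.get("expiry")


if __name__ == "__main__":
    old_serving_options = {"expiry": "3600"}  # example value, not from the PR
    new_options = flatten_serving_options(old_serving_options)
    print(new_options)
    print(effective_expiry(new_options, {"expiry": "86400"}))
```

If the deployment's serving store is not Redis, the "redis.expiry" key in this sketch is simply never read, which is the ambiguity raised in the last note above.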