Increase efficiency of Registry updates #1698

Merged: 4 commits, Jul 16, 2021


felixwang9817
Collaborator

@felixwang9817 felixwang9817 commented Jul 8, 2021

Signed-off-by: Felix Wang [email protected]

What this PR does / why we need it: make Registry updates more efficient. Right now, Registry updates are done by looping through FeatureViews and Entities and applying them one at a time. Applying an update to the Registry forces the underlying RegistryStore to update the underlying RegistryProto and write it (either to disk, GCS, or S3, depending on the backend of choice). Thus, looped writes are highly inefficient.

Current design: the Registry currently has a cache to improve reading speed. Writes always go through to the backend, but not through the cache. Thus, the cache is not always up-to-date with the ground truth in the backend. The user is granted a lot of optionality around the cache: they may choose not to use the cache, thus reading from the backend directly, or they may force the cache to refresh.
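To make the current design concrete, here is a minimal sketch of a read cache that writes bypass. The class names, method names, and the `allow_cache` / `cache_ttl_seconds` parameters are hypothetical stand-ins for illustration, not the actual Feast `Registry` or `RegistryStore` API.

```python
import time
from typing import Dict, Optional


class InMemoryBackend:
    """Hypothetical stand-in for a registry file on disk, GCS, or S3."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.writes = 0

    def read(self) -> Dict[str, str]:
        return dict(self.data)

    def write(self, data: Dict[str, str]) -> None:
        self.data = dict(data)
        self.writes += 1


class ReadCachedRegistry:
    """Toy registry: writes bypass the cache, reads may be served from it."""

    def __init__(self, backend: InMemoryBackend, cache_ttl_seconds: float) -> None:
        self._backend = backend
        self._ttl = cache_ttl_seconds
        self._cache: Optional[Dict[str, str]] = None
        self._cached_at = 0.0

    def apply(self, name: str, spec: str) -> None:
        # Read-modify-write against the backend; the cache is not touched.
        data = self._backend.read()
        data[name] = spec
        self._backend.write(data)

    def refresh(self) -> None:
        # Force the cache to match the backend.
        self._cache = self._backend.read()
        self._cached_at = time.monotonic()

    def get(self, name: str, allow_cache: bool = True) -> Optional[str]:
        if not allow_cache:
            return self._backend.read().get(name)
        expired = time.monotonic() - self._cached_at > self._ttl
        if self._cache is None or expired:
            self.refresh()
        return self._cache.get(name)
```

In this sketch, a `get` that follows an `apply` can return stale data until the TTL expires or `refresh()` is called, which is the "cache is not always up-to-date" behavior described above.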

After this PR: the cache will be write-back, meaning writes always go to the cache, and only go to the backend when flushed. Then, looping updates to the Registry will be fast since all updates will be performed in-memory, and flushed once at the end. Note that this change only affects writing to the cache. Reading from the cache is unchanged: the refresh and cache_ttl mechanisms are still available to readers.
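A rough sketch of the write-back idea, assuming a `commit` flag on each write and an explicit `commit()` method that flushes to the backend. These names and the `CountingStore` class are assumptions for illustration; the sketch is not the code in this PR.

```python
from typing import Dict


class CountingStore:
    """Hypothetical backend stand-in; counts how many times it is written."""

    def __init__(self) -> None:
        self.persisted: Dict[str, str] = {}
        self.write_count = 0

    def load(self) -> Dict[str, str]:
        return dict(self.persisted)

    def save(self, data: Dict[str, str]) -> None:
        self.persisted = dict(data)
        self.write_count += 1


class WriteBackRegistry:
    """Toy registry: writes land in the cache and reach the store on commit."""

    def __init__(self, store: CountingStore) -> None:
        self._store = store
        self._cache = store.load()
        self._dirty = False

    def apply(self, name: str, spec: str, commit: bool = True) -> None:
        # Always update the in-memory copy first.
        self._cache[name] = spec
        self._dirty = True
        if commit:
            self.commit()

    def commit(self) -> None:
        # Flush only when there are unpersisted changes.
        if self._dirty:
            self._store.save(self._cache)
            self._dirty = False


store = CountingStore()
registry = WriteBackRegistry(store)
for name in ("driver", "customer", "order"):
    registry.apply(name, "v1", commit=False)  # in-memory only
writes_before_commit = store.write_count
registry.commit()  # single backend write for all three updates
```

With `commit=False` inside the loop, the three updates cost one backend write instead of three. The flip side is that anything applied but not yet committed is lost if the process exits first.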

Advantages: looping updates become fast. Arguably, this also makes the boundary between the Registry and RegistryStore classes clearer (right now the RegistryStore takes in an updater function, which means it is changing the proto itself, whereas in the new version the Registry would be changing the proto, which makes more sense since the RegistryStore should only be storing the proto, not changing it).
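The boundary argument can be illustrated by contrasting the two store shapes. Both classes below are hypothetical simplifications that use a plain dict in place of the RegistryProto; the method names are assumptions, not the real RegistryStore interface.

```python
from typing import Callable, Dict

RegistryData = Dict[str, str]  # stand-in for the RegistryProto


class UpdaterStyleStore:
    """Before: the store runs a caller-supplied function against its data."""

    def __init__(self) -> None:
        self._data: RegistryData = {}

    def update(self, updater: Callable[[RegistryData], RegistryData]) -> None:
        # The store reads, changes (via updater), and writes in one call.
        self._data = updater(dict(self._data))

    def get(self) -> RegistryData:
        return dict(self._data)


class StorageOnlyStore:
    """After: the store only loads and saves; it never changes the data."""

    def __init__(self) -> None:
        self._data: RegistryData = {}

    def get(self) -> RegistryData:
        return dict(self._data)

    def put(self, data: RegistryData) -> None:
        self._data = dict(data)


def add_entity(data: RegistryData, name: str) -> RegistryData:
    """Return a copy of the registry data with one more entity."""
    updated = dict(data)
    updated[name] = "entity"
    return updated


old_store = UpdaterStyleStore()
old_store.update(lambda data: add_entity(data, "driver"))

new_store = StorageOnlyStore()
new_store.put(add_entity(new_store.get(), "driver"))  # the caller owns the change
```

Both paths end with the same stored data; the difference is which class is responsible for the modification.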

Disadvantages: changes the Registry interface (but not in a breaking fashion). More importantly, introduces more complexity around managing concurrent access to a Registry. Also, theoretically leaves a greater window for errors to occur (e.g. a bunch of operations could occur on the in-memory cache, then the process could crash without persisting those changes).

Alternative solution: expose a new function such as apply_all_changes(feature_views: List[FeatureView], entities: List[Entity]) that applies all changes in memory and then persists it. This solution does not require changing the current usage of the cache, but changes the Registry interface with a method that seems somewhat hacky.
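For comparison, the alternative could look roughly like the sketch below. `Entity`, `FeatureView`, and `BatchRegistry` are minimal hypothetical stand-ins (not the Feast classes), and the backend is simulated with an in-memory dict and a write counter.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entity:  # hypothetical stand-in, not feast.Entity
    name: str


@dataclass(frozen=True)
class FeatureView:  # hypothetical stand-in, not feast.FeatureView
    name: str
    entity: str


class BatchRegistry:
    """Toy registry exposing a single batch method instead of a write-back cache."""

    def __init__(self) -> None:
        self.persisted: Dict[str, Dict[str, object]] = {
            "entities": {},
            "feature_views": {},
        }
        self.backend_writes = 0

    def _persist(self, state: Dict[str, Dict[str, object]]) -> None:
        # Simulated backend write.
        self.persisted = state
        self.backend_writes += 1

    def apply_all_changes(
        self, feature_views: List[FeatureView], entities: List[Entity]
    ) -> None:
        # Copy the persisted state, apply every change in memory, persist once.
        state = {kind: dict(items) for kind, items in self.persisted.items()}
        for entity in entities:
            state["entities"][entity.name] = entity
        for view in feature_views:
            state["feature_views"][view.name] = view
        self._persist(state)
```

This keeps the cache semantics untouched, at the cost of a method whose signature has to grow every time a new kind of registry object is added.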

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

NONE

@codecov-commenter

codecov-commenter commented Jul 8, 2021

Codecov Report

Merging #1698 (7ca4725) into master (20e0783) will increase coverage by 0.18%.
The diff coverage is 83.00%.


@@            Coverage Diff             @@
##           master    #1698      +/-   ##
==========================================
+ Coverage   84.26%   84.45%   +0.18%     
==========================================
  Files          78       79       +1     
  Lines        6910     7012     +102     
==========================================
+ Hits         5823     5922      +99     
- Misses       1087     1090       +3     
| Flag | Coverage Δ |
|---|---|
| integrationtests | 84.38% <83.00%> (+0.18%) ⬆️ |
| unittests | 69.32% <68.00%> (+0.20%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| sdk/python/feast/repo_operations.py | 31.06% <0.00%> (-0.16%) ⬇️ |
| sdk/python/feast/registry.py | 80.61% <71.59%> (-0.69%) ⬇️ |
| sdk/python/feast/feature_store.py | 94.37% <100.00%> (+0.02%) ⬆️ |
| sdk/python/tests/test_registry.py | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 20e0783...7ca4725.

Member

@achals achals left a comment


Looks good to me.

3 review threads on sdk/python/feast/registry.py (outdated, resolved)
@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: achals, felixwang9817


@achals
Member

achals commented Jul 8, 2021

@felixwang9817 the linter needs to be fixed, and I had some questions/nits.

1 review thread on sdk/python/feast/registry.py (outdated, resolved)
@felixwang9817
Collaborator Author

/retest

@felixwang9817 felixwang9817 changed the title from "Change registry cache to be write-back instead of write-through." to "Increase efficiency of Registry updates" on Jul 14, 2021
@woop
Member

woop commented Jul 14, 2021

> By changing the Registry's cache to be write-back instead of write-through

I think it's important that this PR have explicit "documentation" on the changes that are being introduced to the registry. @felixwang9817 can you please write out the exact changes (and trade offs) that we are making here? What is the before/after?

@felixwang9817
Collaborator Author

> By changing the Registry's cache to be write-back instead of write-through
>
> I think it's important that this PR have explicit "documentation" on the changes that are being introduced to the registry. @felixwang9817 can you please write out the exact changes (and trade offs) that we are making here? What is the before/after?

@woop I added some documentation to the PR, let me know if you want any further clarification.

1 review thread on sdk/python/tests/test_registry.py (outdated, resolved)
1 review thread on sdk/python/tests/test_registry.py (resolved)
3 review threads on sdk/python/feast/registry.py (outdated, resolved)
@woop
Member

woop commented Jul 14, 2021

> Signed-off-by: Felix Wang [email protected]
>
> What this PR does / why we need it: make Registry updates more efficient. Right now, Registry updates are done by looping through FeatureViews and Entities and applying them one at a time. Applying an update to the Registry forces the underlying RegistryStore to update the underlying RegistryProto and write it (either to disk, GCS, or S3, depending on the backend of choice). Thus, looped writes are highly inefficient.
>
> Current design: the Registry currently has a cache to improve reading speed. Writes always go through to the backend, but not through the cache. Thus, the cache is not always up-to-date with the ground truth in the backend. The user is granted a lot of optionality around the cache: they may choose not to use the cache, thus reading from the backend directly, or they may force the cache to refresh.
>
> After this PR: the cache will be write-back, meaning writes always go to the cache, and only go to the backend when flushed. Then, looping updates to the Registry will be fast since all updates will be performed in-memory, and flushed once at the end.
>
> Advantages: looping updates become fast. Also, leads to some refactoring opportunities (e.g. removing refresh_registry from FeatureStore, removing cache_ttl argument and the associated logic for Registry). Arguably, makes the boundary between the Registry and RegistryStore classes more clear (right now the RegistryStore takes in an updater function which means it is changing the proto itself, whereas in the new version the Registry would be changing the proto, which makes more sense since the RegistryStore should only be storing the proto, not changing it).
>
> Disadvantages: changes the Registry interface (but not in a breaking fashion). More importantly, introduces more complexity around managing concurrent access to a Registry. Also, theoretically leaves a greater window for errors to occur (e.g. a bunch of operations could occur on the in-memory cache, then the process could crash without persisting those changes).
>
> Alternative solution: expose a new function such as apply_all_changes(feature_views: List[FeatureView], entities: List[Entity]) that applies all changes in memory and then persists it. This solution does not require changing the current usage of the cache, but changes the Registry interface with a method that seems somewhat hacky.
>
> Which issue(s) this PR fixes:
>
> Fixes #
>
> Does this PR introduce a user-facing change?:
>
> NONE

One part that isn't clear: If I am a reader only (like a model serving application that has no write access to the registry and doesn't plan to run apply at all), are there any regressions/changes to the way I do things? How do I update the locally cached registry? Is it being done at get_online_features() time (which is lazy) or at FeatureStore() initialization? How do I keep the in-memory local registry updated based on TTL or through manual operations like refresh()? Do any of these changes affect me?

@felixwang9817 felixwang9817 force-pushed the registry_cache branch 2 times, most recently from 80a1814 to c252f47 Compare July 15, 2021 23:51
@felixwang9817
Collaborator Author

> One part that isn't clear: If I am a reader only (like a model serving application that has no write access to the registry and doesn't plan to run apply at all), are there any regressions/changes to the way I do things? How do I update the locally cached registry? Is it being done at get_online_features() time (which is lazy) or FeatureStore() initialization? How do I keep the in memory local registry updated based on TTL or through manual operations like refresh()? do any of these changes affect me?

@woop I revised my documentation, see above. tldr: there are no changes for a reader, only a writer.

1 review thread on sdk/python/feast/repo_operations.py (resolved)
1 review thread on sdk/python/tests/test_registry.py (outdated, resolved)
@achals achals removed the request for review from tsotnet July 16, 2021 21:40
@achals
Member

achals commented Jul 16, 2021

/lgtm

@feast-ci-bot feast-ci-bot merged commit 5e89ccf into feast-dev:master Jul 16, 2021
@felixwang9817 felixwang9817 deleted the registry_cache branch July 16, 2021 22:43
SourdoughCat pushed a commit to SourdoughCat/feast that referenced this pull request Jul 16, 2021
* Modify Registry to allow for delayed persistence of changes

Signed-off-by: Felix Wang <[email protected]>

* Increase efficiency of Registry updates

Signed-off-by: Felix Wang <[email protected]>

* Registry tests

Signed-off-by: Felix Wang <[email protected]>

* Change copyright date

Signed-off-by: Felix Wang <[email protected]>
Signed-off-by: CS <[email protected]>