-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add feature and feature set labels, for metadata #536
Add feature and feature set labels, for metadata #536
Conversation
Hi @imjuanleonard. Thanks for your PR. I'm waiting for a gojek member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
cced904
to
359fd3d
Compare
/ok-to-test |
Demo List Features from feast import Client,Entity, ValueType, FeatureSet, Feature
from pprint import pprint
client = Client(core_url="localhost:6565", serving_url="localhost6566")
client.set_project("test")
client_features = client.list_features()
pprint(vars(client_features['city'])) Output
|
Demo List Feature Set Requirements def __repr__(self):
return "<Test name:%s dtype:%s labels:%s>" % (self._name, self.dtype, self.labels) Create the python function to list feature set and print the features from feast import Client,Entity, ValueType, FeatureSet, Feature
client = Client(core_url="localhost:6565", serving_url="localhost6566")
client.set_project('juan')
test = client.list_feature_sets(project="juan",name="driver",version="1")
for featureset in test:
print(featureset.features) Result:
|
Example Set Labels from google.protobuf.duration_pb2 import Duration
from feast import Client,Entity, ValueType, FeatureSet, Feature
client = Client(core_url="localhost:6565", serving_url="localhost6566")
client.set_project('juan')
feature1=Feature(name='total_transaction4',dtype=ValueType.INT64, labels=dict())
feature1.set_label("test","tost")
feature1.set_label("tast","tust")
feature1.set_label("bost","bust")
customer_fs = FeatureSet("customer_transactions",
entities=[Entity(name='customer_id4', dtype=ValueType.INT64, labels=dict())],
features=[feature1],
max_age=Duration(seconds=432000))
client.apply(customer_fs) Result
DB
|
/retest |
sdk/python/feast/field.py
Outdated
class Field: | ||
""" | ||
High level field type. This is the parent type to both entities and | ||
features. | ||
""" | ||
|
||
def __init__(self, name: str, dtype: ValueType): | ||
def __init__(self, name: str, dtype: ValueType, labels=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldnt labels
have a type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is of type dictionary it is handled on the initiation below, this is to make the labels
Optional so the function signature doesn't change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use Optional[Dict] there I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If what you mean is typing.Optional
I have tried to use it before, it is not behaving as optional parameter
instead according to the documentation it is behaving as optional value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not entirely clear what you mean. Can you please refer to this explanation of Optional type hints: https://stackoverflow.com/questions/51710037/how-should-i-use-the-optional-type-hint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I guess after python 3.5 they are using the Optional aforementioned for type hinting. Have added it
@imjuanleonard we absolutely need tests in order to sign this off. Can you please add them in the same PR? |
fa1f75d
to
33be193
Compare
The test has been added |
sdk/python/feast/field.py
Outdated
@@ -46,6 +50,14 @@ def dtype(self) -> ValueType: | |||
""" | |||
return self._dtype | |||
|
|||
@property | |||
def labels(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why some code is on the feature and some of it is on the field? Can we label entities? If not, why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am following the current discussion that the use case people need are for the feature currently. So I don't want to assume that people need for the entities as well, Do we need metadata for the entity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not so much about entity vs feature as it is feature vs field. Entities inherit from Fields. If the Field can have a label then an Entity can have a label. So we should either limit this entirely to features, or we should add labels to both Entities and Features, which may happen through the Field class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the .proto files on this branch, I can see that labels field is only applied to Features (not Entities or Fields). Not sure how to proceed with this one as I'm not familiar with requirements/use cases of the Python client. Any suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually Field
s are the carrier of (feature and entity) values—actual data, not the metadata of registry specifications. I think these changes to field.py
should probably be backed out.
From discussion Friday, it sounds like we're in agreement to leave entity labeling out of scope for now. Useful thing to have someday (perhaps when #405 gets focus), but it wasn't initially planned for Feast 0.5. So I think we should proceed with adding map<string, string> labels
to FeatureSetSpec
as we discussed and providing the API for that.
We might have to defer Python SDK at least in part to the Gojek team, we don't currently use it. It's the primary means of verifying end-to-end functional integration in the project, but maybe as a compromise we can demonstrate the implementation using grpc_cli
examples as a test plan, and Python SDK could be finished in a second PR.
@woop Does this sound reasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw @suwik have a look through tests/e2e
directory in the root of the project for examples of Python SDK, useful for learning the system. They're kicked off for CI by scripts in infra/scripts
(used to be in .prow/scripts
at the point of our internal fork).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for suggestion @ches. I'm thinking it would be good to add some logic testing the labels to the e2e tests (and later we could take that to our repo too). I tried to follow the installation instructions in the readme, but after running docker-compose up
none of the mentioned ports are open. Any suggestion how to get my environment running for this repo so that I'm able to run the e2e tests?
EDIT: ok, seems like I had COMPOSE_FILE set to our internal fork file and that's why it wasn't starting all the services that I was expecting to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the python SDK changes related to labels (implemented inside field.py). I did that in a separate commit so it should be easy to rollback in case we actually need them.
I was a bit misled by the PR title and trying to follow the history from #499—this PR is not only Python SDK, it's complete implementation of feature labels. Conflicts need to be fixed now (mostly generated files that have been removed from Git)—if in the process you could rebase and squash some of the false-start commits like |
84486a5
to
505d8f7
Compare
@@ -47,6 +48,10 @@ | |||
@Column(name = "project") | |||
private String project; | |||
|
|||
// Labels that this field belongs to | |||
@Column(name = "labels", columnDefinition = "text") | |||
private String labels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm so I guess we're going to want to move this to the Feature
entity class in #655… this was mentioned before but I guess I wasn't understanding the implications until I saw #655 reduced from #612.
Creates a bit of a conundrum for us since I thought the labels feature could be a straightforward backport (internal, assuming Feast isn't going to backport features). That's not going to be the case if this has to wait on #655 and substantial SQL schema migrations are going to be required for it.
/test test-end-to-end-batch |
Looks good to me. Will you let me know when its ready to be merged? |
Thanks for being so patient here @ches. I guess the conversation around Field, Entity, and Feature has temporarily been answered, although I personally don't think we have reached our destination yet. I do feel like the door is left more open with our current approach. |
Map<String, String> featureSetLabels = | ||
new HashMap<>() { | ||
{ | ||
put("description", "My precious feature set"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spot! ;)
tests/e2e/grpc-based-registration.py
Outdated
LABEL_KEY = "my" | ||
LABEL_VALUE = "label" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests look good, but possibly add a TODO
to migrate this over to the python sdk once that has been updated?
/test test-end-to-end |
return get_feature_set_response.feature_set | ||
|
||
@pytest.mark.timeout(45) | ||
@pytest.mark.run(order=9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @woop uses these to group similar tests together (10- - so maybe your tests can start from 50?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this stuff can be tweaked when changing the tests to use SDK in #663
/test test-end-to-end |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: imjuanleonard, woop The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
Backports feast-dev#536 to v0.3
* Add feature and feature set labels Backports #536 to v0.3 * Fix feature labels test for round trip It appears that this test was only exercising the test setup code, not covering that labels work when applied. This change should be forward-ported to master. Co-authored-by: Suwinski, Krzysztof (Agoda) <[email protected]>
What this PR does / why we need it:
-> Extension for #463
Update (@suwik):
Which issue(s) this PR fixes:
Fixes #463
Does this PR introduce a user-facing change?: