feat: Updating protos to separate transformation #4018

Merged
merged 18 commits into from
Mar 24, 2024
Changes from 8 commits
24 changes: 4 additions & 20 deletions protos/feast/core/OnDemandFeatureView.proto
@@ -27,6 +27,7 @@ import "feast/core/FeatureView.proto";
 import "feast/core/FeatureViewProjection.proto";
 import "feast/core/Feature.proto";
 import "feast/core/DataSource.proto";
+import "feast/core/Transformation.proto";

 message OnDemandFeatureView {
     // User-specified specifications of this feature view.
@@ -48,10 +49,8 @@ message OnDemandFeatureViewSpec {
     // Map of sources for this feature view.
     map<string, OnDemandSource> sources = 4;

-    oneof transformation {
-        UserDefinedFunction user_defined_function = 5;
-        OnDemandSubstraitTransformation on_demand_substrait_transformation = 9;
-    }
+    // Oneof with {user_defined_function, on_demand_substrait_transformation}
+    FeatureTransformation transformation = 5;

     // Description of the on demand feature view.
     string description = 6;
@@ -61,6 +60,7 @@ message OnDemandFeatureViewSpec {

     // Owner of the on demand feature view.
     string owner = 8;
+    string mode = 9;
Collaborator

What's the point of the mode field here, since this is already a breaking change anyway?

Member Author

For my team at Affirm :)

Collaborator

@tokoko tokoko Mar 18, 2024

I remember that, but I'm a little confused. If you care about proto encoding compatibility, this change will break your protos anyway, with or without the mode field. If all you care about is the user-facing API (in other words, the mode parameter in the on_demand_feature_view decorator), we can do that without a redundant mode field in the proto. The mode parameter in on_demand_feature_view would simply determine which of the oneof fields gets populated. Maybe I'm missing something...
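A minimal sketch of the encoding point above, assuming only the standard protobuf wire format (tag = field number shifted left by 3, OR'ed with the wire type; submessages are length-delimited). The helpers `varint`, `length_delimited`, and `parse_fields` are hypothetical and hand-rolled for illustration; they are not part of Feast or the protobuf library, and only the `name` field of the UDF is encoded.

```python
from typing import List, Tuple


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        low, n = n & 0x7F, n >> 7
        if n:
            out.append(low | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    return varint((field_number << 3) | 2) + varint(len(payload)) + payload


def parse_fields(buf: bytes) -> List[Tuple[int, bytes]]:
    """Decode a buffer holding only short length-delimited fields."""
    fields, i = [], 0
    while i < len(buf):
        tag, size = buf[i], buf[i + 1]  # single-byte varints suffice here
        fields.append((tag >> 3, buf[i + 2 : i + 2 + size]))
        i += 2 + size
    return fields


# UserDefinedFunction { name = 1 }
udf = length_delimited(1, b"my_udf")

# Before this PR: OnDemandFeatureViewSpec.user_defined_function = 5
old_spec = length_delimited(5, udf)
# After this PR: OnDemandFeatureViewSpec.transformation = 5, wrapping
# FeatureTransformation.user_defined_function = 5
new_spec = length_delimited(5, length_delimited(5, udf))

# Reading old bytes with the new schema: the payload of field 5 is now
# interpreted as a FeatureTransformation, which finds field 1, not field 5.
old_inner = parse_fields(parse_fields(old_spec)[0][1])
new_inner = parse_fields(parse_fields(new_spec)[0][1])
print(old_inner)  # [(1, b'my_udf')]
print(new_inner)
```

Under these assumptions, a parser using the new schema would be expected to treat that field 1 as unknown and leave the oneof unset, so previously serialized specs lose their UDF regardless of whether a `mode` field exists.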

Member Author

That's a fair point, but it does make the API interface similar to the stream feature view, which I prefer.

Collaborator

I'm not disputing the mode field in the API. I agree we should do that as part of the native Python PR. All I'm saying is that we don't need a mode field in the proto for that to be possible.
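One way to picture the suggestion above, as a sketch only: the user-facing `mode` picks which oneof member is populated, and the mode can be recovered from the populated member when reading. The dataclass below is a plain-Python stand-in for the proto message, and the mode strings `"pandas"` and `"substrait"` as well as the helper names are assumptions, not Feast's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureTransformation:
    """Stand-in for the proto message: at most one member is set."""

    user_defined_function: Optional[bytes] = None
    on_demand_substrait_transformation: Optional[bytes] = None

    def which_oneof(self) -> Optional[str]:
        # Mirrors the idea of protobuf's WhichOneof: name of the set member.
        if self.user_defined_function is not None:
            return "user_defined_function"
        if self.on_demand_substrait_transformation is not None:
            return "on_demand_substrait_transformation"
        return None


# Hypothetical mapping from the user-facing mode to the oneof member.
MODE_TO_FIELD = {
    "pandas": "user_defined_function",
    "substrait": "on_demand_substrait_transformation",
}
FIELD_TO_MODE = {field: mode for mode, field in MODE_TO_FIELD.items()}


def build_transformation(mode: str, payload: bytes) -> FeatureTransformation:
    """Populate exactly the oneof member that the mode selects."""
    if mode not in MODE_TO_FIELD:
        raise ValueError(f"unsupported mode: {mode!r}")
    return FeatureTransformation(**{MODE_TO_FIELD[mode]: payload})


def infer_mode(transformation: FeatureTransformation) -> Optional[str]:
    """Recover the mode from whichever member is populated."""
    return FIELD_TO_MODE.get(transformation.which_oneof())
```

With this shape, nothing about the mode needs to be stored separately: `infer_mode(build_transformation("substrait", plan))` gives back `"substrait"`.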

Collaborator

I guess @tokoko means to handle mode in Python only. Either way it works. Explicitly adding it in the proto does help understanding the Feature View schema.

Member Author

@franciscojavierarceo franciscojavierarceo Mar 19, 2024

> Explicitly adding it in the proto does help understanding the Feature View schema.

+1

 }

 message OnDemandFeatureViewMeta {
@@ -78,19 +78,3 @@ message OnDemandSource {
         DataSource request_data_source = 2;
     }
 }
-
-// Serialized representation of python function.
-message UserDefinedFunction {
-    // The function name
-    string name = 1;
-
-    // The python-syntax function body (serialized by dill)
-    bytes body = 2;
-
-    // The string representation of the udf
-    string body_text = 3;
-}
-
-message OnDemandSubstraitTransformation {
-    bytes substrait_plan = 1;
-}
5 changes: 5 additions & 0 deletions protos/feast/core/StreamFeatureView.proto
@@ -29,6 +29,7 @@ import "feast/core/FeatureView.proto";
 import "feast/core/Feature.proto";
 import "feast/core/DataSource.proto";
 import "feast/core/Aggregation.proto";
+import "feast/core/Transformation.proto";

 message StreamFeatureView {
     // User-specified specifications of this feature view.
@@ -79,6 +80,7 @@ message StreamFeatureViewSpec {
     // Serialized function that is encoded in the streamfeatureview
     UserDefinedFunction user_defined_function = 13;

+
     // Mode of execution
     string mode = 14;

@@ -87,5 +89,8 @@ message StreamFeatureViewSpec {

     // Timestamp field for aggregation
     string timestamp_field = 16;
+
+    // Oneof with {user_defined_function, on_demand_substrait_transformation}
+    FeatureTransformation transformation = 17;
 }

33 changes: 33 additions & 0 deletions protos/feast/core/Transformation.proto
@@ -0,0 +1,33 @@
+syntax = "proto3";
+package feast.core;
+
+option go_package = "github.com/feast-dev/feast/go/protos/feast/core";
+option java_outer_classname = "FeatureTransformationProto";
+option java_package = "feast.proto.core";
+
+import "google/protobuf/duration.proto";
+
+// Serialized representation of python function.
+message UserDefinedFunction {
+    // The function name
+    string name = 1;
+
+    // The python-syntax function body (serialized by dill)
+    bytes body = 2;
+
+    // The string representation of the udf
+    string body_text = 3;
+}
+
+// A feature transformation executed as a user-defined function
+message FeatureTransformation {
+    // Note this Transformation starts at 5 for backwards compatibility
+    oneof transformation {
+        UserDefinedFunction user_defined_function = 5;
+        OnDemandSubstraitTransformation on_demand_substrait_transformation = 6;
+    }
Collaborator

I think it's a good time to rethink the naming here. My first suggestion was to rename the field (not message type) to on_demand_pandas_transformation instead of user_defined_function. But on second thought, since we are also aiming to reuse this in StreamFeatureViews, I think protos should no longer be called OnDemand... What do you think? I'm thinking of something like this:

UserDefinedFunctionV2 pandas_transformation = 1;
SubstraitTransformationV2 substrait_transformation = 2;

Member Author

I was planning on doing that in a follow-up PR so as not to add too much complexity here.

+}
+
+message OnDemandSubstraitTransformation {
+    bytes substrait_plan = 1;
+}
6 changes: 3 additions & 3 deletions sdk/python/feast/diff/registry_diff.py
@@ -144,11 +144,11 @@ def diff_registry_objects(
             if _field.name in FIELDS_TO_IGNORE:
                 continue
             elif getattr(current_spec, _field.name) != getattr(new_spec, _field.name):
-                if _field.name == "user_defined_function":
+                if _field.name == "transformation":
                     current_spec = cast(OnDemandFeatureViewSpec, current_spec)
                     new_spec = cast(OnDemandFeatureViewSpec, new_spec)
-                    current_udf = current_spec.user_defined_function
-                    new_udf = new_spec.user_defined_function
+                    current_udf = current_spec.transformation.user_defined_function
+                    new_udf = new_spec.transformation.user_defined_function
                     for _udf_field in current_udf.DESCRIPTOR.fields:
                         if _udf_field.name == "body":
                             continue
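The field-by-field comparison in the hunk above can be sketched without protobuf. This is an illustration under assumptions, not the code in `registry_diff.py`: a dataclass stands in for the `UserDefinedFunction` proto, `dataclasses.fields` stands in for `DESCRIPTOR.fields`, and `diff_udf` with its `prefix` parameter is a hypothetical helper. The property names mirror those asserted in `test_registry_diff.py` further down.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Tuple


@dataclass
class UserDefinedFunction:
    """Stand-in for the proto message of the same name."""

    name: str
    body: bytes
    body_text: str


def diff_udf(
    current: UserDefinedFunction,
    new: UserDefinedFunction,
    prefix: str = "transformation",
) -> List[Tuple[str, Any, Any]]:
    """Return (property_name, old, new) for each changed field except body."""
    diffs = []
    for field in fields(current):
        if field.name == "body":
            continue  # the serialized bytes are skipped, as in the hunk
        old_value = getattr(current, field.name)
        new_value = getattr(new, field.name)
        if old_value != new_value:
            diffs.append((f"{prefix}.{field.name}", old_value, new_value))
    return diffs


current = UserDefinedFunction("pre_changed", b"\x00", "def pre_changed(): ...")
new = UserDefinedFunction("post_changed", b"\x01", "def post_changed(): ...")
changed = [name for name, _, _ in diff_udf(current, new)]
print(changed)  # ['transformation.name', 'transformation.body_text']
```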
2 changes: 1 addition & 1 deletion sdk/python/feast/infra/registry/base_registry.py
@@ -663,7 +663,7 @@ def to_dict(self, project: str) -> Dict[str, List[Any]]:
         ):
             odfv_dict = self._message_to_sorted_dict(on_demand_feature_view.to_proto())

-            odfv_dict["spec"]["userDefinedFunction"][
+            odfv_dict["spec"]["transformation"]["userDefinedFunction"][
                 "body"
             ] = on_demand_feature_view.transformation.udf_string
             registry_dict["onDemandFeatureViews"].append(odfv_dict)
30 changes: 20 additions & 10 deletions sdk/python/feast/on_demand_feature_view.py
@@ -27,6 +27,9 @@
     OnDemandFeatureViewSpec,
     OnDemandSource,
 )
+from feast.protos.feast.core.Transformation_pb2 import (
+    FeatureTransformation as FeatureTransformationProto,
+)
 from feast.type_map import (
     feast_value_type_to_pandas_type,
     python_type_to_feast_value_type,
@@ -205,16 +208,19 @@ def to_proto(self) -> OnDemandFeatureViewProto:
                 request_data_source=request_sources.to_proto()
             )

-        spec = OnDemandFeatureViewSpec(
-            name=self.name,
-            features=[feature.to_proto() for feature in self.features],
-            sources=sources,
+        feature_transformation = FeatureTransformationProto(
             user_defined_function=self.transformation.to_proto()
             if type(self.transformation) == OnDemandPandasTransformation
             else None,
-            on_demand_substrait_transformation=self.transformation.to_proto()  # type: ignore
+            on_demand_substrait_transformation=self.transformation.to_proto()
             if type(self.transformation) == OnDemandSubstraitTransformation
-            else None,
+            else None,  # type: ignore
+        )
+        spec = OnDemandFeatureViewSpec(
+            name=self.name,
+            features=[feature.to_proto() for feature in self.features],
+            sources=sources,
+            transformation=feature_transformation,
             description=self.description,
             tags=self.tags,
             owner=self.owner,
@@ -254,18 +260,22 @@ def from_proto(cls, on_demand_feature_view_proto: OnDemandFeatureViewProto):
                 )

         if (
-            on_demand_feature_view_proto.spec.WhichOneof("transformation")
+            on_demand_feature_view_proto.spec.transformation.WhichOneof(
+                "transformation"
+            )
             == "user_defined_function"
         ):
             transformation = OnDemandPandasTransformation.from_proto(
-                on_demand_feature_view_proto.spec.user_defined_function
+                on_demand_feature_view_proto.spec.transformation.user_defined_function
             )
         elif (
-            on_demand_feature_view_proto.spec.WhichOneof("transformation")
+            on_demand_feature_view_proto.spec.transformation.WhichOneof(
+                "transformation"
+            )
             == "on_demand_substrait_transformation"
         ):
             transformation = OnDemandSubstraitTransformation.from_proto(
-                on_demand_feature_view_proto.spec.on_demand_substrait_transformation
+                on_demand_feature_view_proto.spec.transformation.on_demand_substrait_transformation
             )
         else:
             raise Exception("At least one transformation type needs to be provided")
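The `from_proto` branch above dispatches on the name of the populated oneof member and raises when none is set. A table-driven variant of the same idea might look like the sketch below. The decoder functions are hypothetical stand-ins for `OnDemandPandasTransformation.from_proto` and `OnDemandSubstraitTransformation.from_proto`, and `transformation_from_fields` is an assumed helper name, not something this PR adds.

```python
from typing import Callable, Dict, Optional


def decode_pandas(payload: bytes) -> str:
    # Stand-in for OnDemandPandasTransformation.from_proto.
    return f"pandas:{payload.decode()}"


def decode_substrait(payload: bytes) -> str:
    # Stand-in for OnDemandSubstraitTransformation.from_proto.
    return f"substrait:{payload.decode()}"


# Keys are the oneof member names from Transformation.proto.
DECODERS: Dict[str, Callable[[bytes], str]] = {
    "user_defined_function": decode_pandas,
    "on_demand_substrait_transformation": decode_substrait,
}


def transformation_from_fields(which: Optional[str], payload: bytes) -> str:
    """Dispatch on the populated oneof member; reject an unset oneof."""
    decoder = DECODERS.get(which) if which is not None else None
    if decoder is None:
        raise ValueError("At least one transformation type needs to be provided")
    return decoder(payload)
```

The design trade-off is small at two members; a table like this mainly pays off if more transformation types are added later, since each new type becomes one more entry rather than one more `elif`.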
2 changes: 1 addition & 1 deletion sdk/python/feast/on_demand_pandas_transformation.py
@@ -3,7 +3,7 @@
 import dill
 import pandas as pd

-from feast.protos.feast.core.OnDemandFeatureView_pb2 import (
+from feast.protos.feast.core.Transformation_pb2 import (
     UserDefinedFunction as UserDefinedFunctionProto,
 )

2 changes: 1 addition & 1 deletion sdk/python/feast/on_demand_substrait_transformation.py
@@ -2,7 +2,7 @@
 import pyarrow
 import pyarrow.substrait as substrait  # type: ignore # noqa

-from feast.protos.feast.core.OnDemandFeatureView_pb2 import (
+from feast.protos.feast.core.Transformation_pb2 import (
     OnDemandSubstraitTransformation as OnDemandSubstraitTransformationProto,
 )

6 changes: 3 additions & 3 deletions sdk/python/feast/stream_feature_view.py
@@ -16,15 +16,15 @@
 from feast.feature_view import FeatureView
 from feast.field import Field
 from feast.protos.feast.core.DataSource_pb2 import DataSource as DataSourceProto
-from feast.protos.feast.core.OnDemandFeatureView_pb2 import (
-    UserDefinedFunction as UserDefinedFunctionProto,
-)
 from feast.protos.feast.core.StreamFeatureView_pb2 import (
     StreamFeatureView as StreamFeatureViewProto,
 )
 from feast.protos.feast.core.StreamFeatureView_pb2 import (
     StreamFeatureViewSpec as StreamFeatureViewSpecProto,
 )
+from feast.protos.feast.core.Transformation_pb2 import (
+    UserDefinedFunction as UserDefinedFunctionProto,
+)

 warnings.simplefilter("once", RuntimeWarning)

4 changes: 2 additions & 2 deletions sdk/python/tests/unit/diff/test_registry_diff.py
@@ -139,11 +139,11 @@ def post_changed(inputs: pd.DataFrame) -> pd.DataFrame:
     assert feast_object_diffs.feast_object_property_diffs[0].property_name == "name"
     assert (
         feast_object_diffs.feast_object_property_diffs[1].property_name
-        == "user_defined_function.name"
+        == "transformation.name"
     )
     assert (
         feast_object_diffs.feast_object_property_diffs[2].property_name
-        == "user_defined_function.body_text"
+        == "transformation.body_text"
     )


@@ -57,7 +57,7 @@ const OnDemandFeatureViewOverviewTab = ({
             </EuiTitle>
             <EuiHorizontalRule margin="xs" />
             <EuiCodeBlock language="py" fontSize="m" paddingSize="m">
-              {data?.spec?.userDefinedFunction?.bodyText}
+              {data?.spec?.transformation?.userDefinedFunction?.bodyText}
             </EuiCodeBlock>
           </EuiPanel>
         </EuiFlexItem>