-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add proto definition for simulator synthetic data specification #1037
Conversation
782cc12
to
da04efd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @jcorilla and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 34 at r1 (raw file):
int64 begin = 1; int64 end = 2; }
nit: maybe take a cue from Kotlin and use the same naming pattern as ClosedRange
Suggestion:
message VidRange {
int64 start = 1;
int64 end_inclusive = 2;
}
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 40 at r1 (raw file):
VidInterval vid_interval = 1; // For Age, Gender, Social Grade ...
The summary fragment should actually describe the field. e.g. "Set of fields that come from population data."
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 43 at r1 (raw file):
// These need to be assigned at the VID level here. When generating a // SimulatorSyntheticData you need to import the necessary EventTemplate. repeated google.protobuf.FieldDescriptorProto population_fields = 2;
Why did we go with FieldDescriptorProto instead of CEL expression (or basically any string syntax for field traversal, e.g. person.age_group)
? I don't think there was a tradeoff discussion.
With FieldDescriptorProto, you have the benefit of being really specific and typed when it comes to access and you don't need to do any parsing/evaluating the expression. The downside is that you're sticking likely repeated data into the API. You still need to do some kind of reflective traversal to find the field. There's also no way to distinguish multiple instances of a field (e.g. if you have more than one field using the same message type and you're trying to target a subfield of that message). The other benefit to using a string is that you can use a Map for the values with the string as the key (rather than correlated lists).
It's fine if we're okay with these tradeoffs, I just want to call them out. Note that I wouldn't bother using the CEL library if we were using string expressions since the parsing of an expression that just evaluates to a field value is trivial.
Code quote:
google.protobuf.FieldDescriptorProto
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 64 at r1 (raw file):
// Each Synthetic event group spec indicates which VID intervals are reached. message SyntheticEventGroupSpec { string property_under_measurement = 1;
What is this supposed to be? A DataProvider
resource name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @jcorilla, @SanjayVas, and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 25 at r1 (raw file):
option java_multiple_files = true; message SimulatorSyntheticDataSpec {
Please add a comment that describes this structure.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 34 at r1 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
nit: maybe take a cue from Kotlin and use the same naming pattern as ClosedRange
Though I'd much prefer Interval, it seems we've been using Range for this.
We should use half-open ranges [begin, end).
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 36 at r1 (raw file):
} // The specification of the entire VID space.
Maybe say: "The specification of a synthetic virtual population"
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 43 at r1 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
Why did we go with FieldDescriptorProto instead of CEL expression (or basically any string syntax for field traversal, e.g.
person.age_group)
? I don't think there was a tradeoff discussion.With FieldDescriptorProto, you have the benefit of being really specific and typed when it comes to access and you don't need to do any parsing/evaluating the expression. The downside is that you're sticking likely repeated data into the API. You still need to do some kind of reflective traversal to find the field. There's also no way to distinguish multiple instances of a field (e.g. if you have more than one field using the same message type and you're trying to target a subfield of that message). The other benefit to using a string is that you can use a Map for the values with the string as the key (rather than correlated lists).
It's fine if we're okay with these tradeoffs, I just want to call them out. Note that I wouldn't bother using the CEL library if we were using string expressions since the parsing of an expression that just evaluates to a field value is trivial.
From the point of view of the Simulator, having the Field Descriptors will force the proto to be compiled in, however from the point of view of a downstream EG client it doesn't help much. And maybe it's worse because you can't readily tie a field descriptor back to it's origin (afaik).
It seems like it would be better to go with the strings and make it clear that anyone who wants to work with this structure is expected to agree out of band on the event template messages.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 45 at r1 (raw file):
repeated google.protobuf.FieldDescriptorProto population_fields = 2; // For Device, Location, Duration ...
Same issue here as with population fields that Sanjay mentioned.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 64 at r1 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
What is this supposed to be? A
DataProvider
resource name?
It's basically a short description string that identifies the fake media property we are measuring. Dynamically populating it with the simulator name could be reasonable.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 67 at r1 (raw file):
string description = 2; // The VIDs reached with their frequency non-population attributes.
This doesn't make sense.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 83 at r1 (raw file):
} message DateSpec {
Please add a description for this.
da04efd
to
052c4eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 25 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
Please add a comment that describes this structure.
Done.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 34 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
Though I'd much prefer Interval, it seems we've been using Range for this.
We should use half-open ranges [begin, end).
Switching to using half open range.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 36 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
Maybe say: "The specification of a synthetic virtual population"
Done.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 40 at r1 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
The summary fragment should actually describe the field. e.g. "Set of fields that come from population data."
Done.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 43 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
From the point of view of the Simulator, having the Field Descriptors will force the proto to be compiled in, however from the point of view of a downstream EG client it doesn't help much. And maybe it's worse because you can't readily tie a field descriptor back to it's origin (afaik).
It seems like it would be better to go with the strings and make it clear that anyone who wants to work with this structure is expected to agree out of band on the event template messages.
Switching to using Strings instead of Field Descriptors.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 45 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
Same issue here as with population fields that Sanjay mentioned.
Done.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 67 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
This doesn't make sense.
Done.
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 83 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
Please add a description for this.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @jcorilla, @kungfucraig, and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 34 at r1 (raw file):
Previously, jcorilla wrote…
Switching to using half open range.
I'd still recommend using end_exclusive
for the name of the end
field to be explicit.
See Kotlin's OpenEndRange
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 64 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
It's basically a short description string that identifies the fake media property we are measuring. Dynamically populating it with the simulator name could be reasonable.
I'm not sure I'm seeing a value in this given that this whole spec will go inside of an EventGroup which is a child of a DataProvider. (Each EDP simulator instance is bound to a single DataProvider).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @jcorilla, @SanjayVas, and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 64 at r1 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
I'm not sure I'm seeing a value in this given that this whole spec will go inside of an EventGroup which is a child of a DataProvider. (Each EDP simulator instance is bound to a single DataProvider).
I'm good dropping it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 64 at r1 (raw file):
Previously, kungfucraig (Craig Wright) wrote…
I'm good dropping it.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @stevenwarejones)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jcorilla)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 89 at r4 (raw file):
message DateSpec { // Date the VIDs were reached. google.type.Date date = 1;
can we make this a date range?
9f2e384
to
f54f107
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @SanjayVas and @stevenwarejones)
src/main/proto/wfa/measurement/api/v2alpha/event_group_metadata/testing/simulator_synthetic_data_spec.proto
line 89 at r4 (raw file):
Previously, stevenwarejones (Steven Ware Jones) wrote…
can we make this a date range?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jcorilla)
f54f107
to
b915ed7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r3, 1 of 1 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jcorilla)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kungfucraig Could you take another look at the change in date spec from singular date -> date range? I believe you also have a block on the VidInterval discussion.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jcorilla)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jcorilla)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r3.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jcorilla)
b915ed7
to
3f78075
Compare
No description provided.