Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visit "Any" types in payload visitor #177

Merged
merged 1 commit into from
Sep 5, 2024
Merged

Conversation

cretz
Copy link
Member

@cretz cretz commented Sep 5, 2024

What changed?

  • Added WellKnownAnyVisitor to VisitPayloadsOptions with default for visiting any
  • Updated generator to make sure "Any" types are visited

Closes temporalio/sdk-go#1622

@cretz cretz requested review from a team as code owners September 5, 2024 13:25
@@ -193,15 +215,15 @@ func visitPayloads(
if o == nil { continue }
no, err := visitPayload(ctx, options, parent, o)
if err != nil { return err }
objs[i] = no
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file has some whitespace changes (this was using spaces which was looking bad even though the final generated source was always go fmt'd anyways)

Comment on lines +170 to +172
// We choose to visit and re-marshal always instead of cloning, visiting,
// and checking if anything changed before re-marshaling. It is assumed the
// clone + equality check is not much cheaper than re-marshal.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does everyone agree with this assumption?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we benchmark that assumption? I'd like to see numbers personally

Copy link
Member Author

@cretz cretz Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get numbers. Can you define the parameters of the benchmarks for the numbers you are wanting to see? Do you want the only way we use "any" today (update requests that have barely anything in them) or benchmarking this with more possibilities in "any"? And do you have recommend sizes of payloads? Meaning do you want to benchmark an update with 4K byte inputs or just 5 bytes (or a bunch of other values)?

Also, it should be noted that this utility is mostly used for payload manipulation which almost always mutates the payloads no matter what making this check-never-mutated piece moot. So that has to be factored in as well. Knowing that, can you clarify if you have concerns about the always-marshal approach since it may be a almost-always-marshal approach anyways?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the context around this assumption, then we should encode that in the code so it's obvious next time looks at an undocumented assumption :)

Copy link
Member Author

@cretz cretz Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not meant to be the context around the assumption. I definitely made the assumption regardless of that context and don't really want to choose one or the other based too much on expected frequency/usage. I just added that context in case you had concerns with the expected frequency/usage. I can provide benchmark numbers, I just need some context around which numbers/scenarios you want to see.

Copy link
Member Author

@cretz cretz Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting results here. So here are some basic benchmarks:

func Benchmark_AnyCloneEqualCheck(b *testing.B) {
	anyMsg, _ := anypb.New(&update.Request{Input: &update.Input{Args: &common.Payloads{
		Payloads: []*common.Payload{{Data: []byte("orig-val")}},
	}}})
	for i := 0; i < b.N; i++ {
		child, _ := anyMsg.UnmarshalNew()
		childOrig := proto.Clone(child)
		// Do visitor stuff...
		proto.Equal(childOrig, child)
	}
}

func Benchmark_AnyMarshalReMarshal(b *testing.B) {
	anyMsg, _ := anypb.New(&update.Request{Input: &update.Input{Args: &common.Payloads{
		Payloads: []*common.Payload{{Data: []byte("orig-val")}},
	}}})
	for i := 0; i < b.N; i++ {
		child, err := anyMsg.UnmarshalNew()
		require.NoError(b, err)
		// Do visitor stuff...
		_ = anyMsg.MarshalFrom(child)
	}
}

Running this (on Windows at least) shows that the current unmarshal-remarshal way is at least 3x faster every time I try:

Benchmark_AnyCloneEqualCheck
Benchmark_AnyCloneEqualCheck-20           464596              2538 ns/op
Benchmark_AnyMarshalReMarshal
Benchmark_AnyMarshalReMarshal-20         1717684               810.9 ns/op

(similar results even if I move anymsg creation into the loop)

Gonna merge based on this info...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! Thank you for doing this; this is all I wanted to see: analysis to support our assumptions 😄

@@ -159,6 +162,25 @@ func NewFailureVisitorInterceptor(options FailureVisitorInterceptorOptions) (grp
}, nil
}

func (o *VisitPayloadsOptions) defaultWellKnownAnyVisitor(ctx *VisitPayloadsContext, p *anypb.Any) error {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be a need in the future to expose this for users to delegate to, but we will wait until that need to expose it

@cretz cretz merged commit 280c4f7 into temporalio:master Sep 5, 2024
3 checks passed
@cretz cretz deleted the visit-any branch September 5, 2024 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API Go's payload visitor not visiting Any for payloads inside
3 participants