changefeedccl: Improve avro encoder performance #63829

Merged
merged 2 commits into from
Apr 21, 2021

Conversation

miretskiy
Contributor

@miretskiy miretskiy commented Apr 18, 2021

Avoid expensive allocations (maps) when encoding datums.
Improve encoder performance by ~40%, and significantly reduce
memory allocations per op.

BenchmarkEncodeInt-16                    1834214               665.4 ns/op            73 B/op          5 allocs/op
BenchmarkEncodeBool-16                   1975244               597.8 ns/op            33 B/op          3 allocs/op
BenchmarkEncodeFloat-16                  1773226               661.6 ns/op            73 B/op          5 allocs/op
BenchmarkEncodeBox2D-16                   628884              1740 ns/op             579 B/op         18 allocs/op
BenchmarkEncodeGeography-16              1734722               713.3 ns/op           233 B/op          5 allocs/op
BenchmarkEncodeGeometry-16               1495227              1208 ns/op            2737 B/op          5 allocs/op
BenchmarkEncodeBytes-16                  2171725               698.4 ns/op            64 B/op          5 allocs/op
BenchmarkEncodeString-16                 1847884               696.0 ns/op            49 B/op          4 allocs/op
BenchmarkEncodeDate-16                   2159253               701.6 ns/op            64 B/op          5 allocs/op
BenchmarkEncodeTime-16                   1857284               682.9 ns/op            81 B/op          6 allocs/op
BenchmarkEncodeTimeTZ-16                  833163              1405 ns/op             402 B/op         14 allocs/op
BenchmarkEncodeTimestamp-16              1623998               720.5 ns/op            97 B/op          6 allocs/op
BenchmarkEncodeTimestampTZ-16            1614201               719.0 ns/op            97 B/op          6 allocs/op
BenchmarkEncodeDecimal-16                 790902              1473 ns/op             490 B/op         23 allocs/op
BenchmarkEncodeUUID-16                   2216424               783.0 ns/op           176 B/op          6 allocs/op
BenchmarkEncodeINet-16                   1545225               817.6 ns/op           113 B/op          8 allocs/op
BenchmarkEncodeJSON-16                   2146824              1731 ns/op             728 B/op         21 allocs/op

Release Notes: None
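
For readers who want to see where the allocs/op difference comes from, here is a minimal sketch that compares building a fresh map per row with refilling one cached map. It is not the changefeedccl code: `fields`, `encodeFresh`, `encoder`, `encodeReused`, and `measure` are hypothetical names, and the rows are plain `int64` slices rather than datums. It uses `testing.AllocsPerRun` from the standard library to count heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// fields is a hypothetical, fixed set of column names shared by every row.
var fields = []string{"id", "balance", "version"}

// sink keeps results reachable so the maps must live on the heap.
var sink map[string]interface{}

// encodeFresh allocates a new map for every row.
func encodeFresh(row []int64) map[string]interface{} {
	m := make(map[string]interface{}, len(fields))
	for i, name := range fields {
		m[name] = row[i]
	}
	return m
}

// encoder owns a single map and refills it for every row.
type encoder struct {
	native map[string]interface{}
}

// encodeReused lazily creates the map once, then overwrites every key.
func (e *encoder) encodeReused(row []int64) map[string]interface{} {
	if e.native == nil {
		e.native = make(map[string]interface{}, len(fields))
	}
	for i, name := range fields {
		e.native[name] = row[i]
	}
	return e.native
}

// measure returns the average heap allocations per call for both strategies.
func measure() (fresh, reused float64) {
	row := []int64{1000, 2000, 3000}
	e := &encoder{}
	fresh = testing.AllocsPerRun(1000, func() { sink = encodeFresh(row) })
	reused = testing.AllocsPerRun(1000, func() { sink = e.encodeReused(row) })
	return fresh, reused
}

func main() {
	fresh, reused := measure()
	fmt.Printf("fresh map: %.0f allocs/op, reused map: %.0f allocs/op\n", fresh, reused)
}
```

Both variants still pay for boxing each value into an `interface{}`; only the map allocation goes away in the reused case. The exact counts depend on the Go version, so the sketch prints them rather than assuming them.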

@miretskiy miretskiy requested a review from ajwerner April 18, 2021 21:25
@cockroach-teamcity
Member

This change is Reviewable

@miretskiy
Contributor Author

@ajwerner I realized that after reverting an accidental merge, I never re-reverted this performance improvement PR.

@miretskiy miretskiy requested a review from stevendanna April 19, 2021 12:23
Contributor

@ajwerner ajwerner left a comment


Having never read this code before, I'm not having an easy time reviewing it without understanding the surrounding context. Can you clarify when the maps are being allocated now vs. before and explain how it's safe?

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @stevendanna)


pkg/ccl/changefeedccl/avro.go, line 105 at r2 (raw file):

	typ      *types.T
	native   map[string]interface{}

native is not very descriptive to me, want to add some commentary here and in avroDataRecord


pkg/ccl/changefeedccl/avro_test.go, line 714 at r2 (raw file):

}

// BenchmarkEncodeInt-16                    2006964               666.0 ns/op            73 B/op          5 allocs/op

This is going to rot very fast. Rather than checking this into the code, I think it'd be better to have it in the commit messages.

Contributor Author

@miretskiy miretskiy left a comment


Ack. Added comments.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)


pkg/ccl/changefeedccl/avro.go, line 105 at r2 (raw file):

Previously, ajwerner wrote…

native is not very descriptive to me, want to add some commentary here and in avroDataRecord

I agree. I used native because that's what goavro uses. Adding a comment here is helpful.
Let me know if that answers your question above re "maps".


pkg/ccl/changefeedccl/avro_test.go, line 714 at r2 (raw file):

Previously, ajwerner wrote…

This is going to rot very fast. Rather than checking this into the code, I think it'd be better to have it in the commit messages.

Okay.

Contributor

@ajwerner ajwerner left a comment


:lgtm:

Reviewed 1 of 2 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @miretskiy and @stevendanna)


pkg/ccl/changefeedccl/avro.go, line 139 at r4 (raw file):

	colIdxByFieldIdx map[int]int
	fieldIdxByName   map[string]int
	native           map[string]interface{}

nit: comment to indicate that this field exists for reuse to avoid allocating a new map.


pkg/ccl/changefeedccl/avro.go, line 592 at r4 (raw file):

func (r *avroDataRecord) nativeFromRow(row rowenc.EncDatumRow) (interface{}, error) {
	if r.native == nil {
		r.native = make(map[string]interface{}, len(r.Fields))

note something like:

// Note that it's safe to reuse r.native without clearing it because all records will
// contain the same complete set of fields
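
To make the safety argument concrete, here is a minimal sketch of the reuse pattern under discussion. It is not the code from avro.go: `record` is a hypothetical stand-in for avroDataRecord, rows are plain `[]interface{}` slices instead of rowenc.EncDatumRow, and the length check is an assumption added for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// record is a hypothetical stand-in for a schema with a fixed list of fields.
type record struct {
	fields []string
	// native is kept between calls so that each row reuses the same map
	// instead of allocating a new one.
	native map[string]interface{}
}

// nativeFromRow fills the cached map with one value per field and returns it.
func (r *record) nativeFromRow(row []interface{}) (map[string]interface{}, error) {
	if len(row) != len(r.fields) {
		return nil, errors.New("row does not match the record's fields")
	}
	if r.native == nil {
		r.native = make(map[string]interface{}, len(r.fields))
	}
	// Every key is overwritten on every call, so no value from the previous
	// row can survive and the map never needs to be cleared.
	for i, name := range r.fields {
		r.native[name] = row[i]
	}
	return r.native, nil
}

func main() {
	r := &record{fields: []string{"id", "name"}}
	first, _ := r.nativeFromRow([]interface{}{1, "a"})
	fmt.Println(first["id"], first["name"])
	second, _ := r.nativeFromRow([]interface{}{2, "b"})
	// first and second are the same map, so first now reflects the second row.
	fmt.Println(first["id"], second["id"], len(second))
}
```

One consequence in this sketch is that the returned map is shared across calls, so a caller has to finish with one row's result before asking for the next.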

@miretskiy miretskiy requested a review from ajwerner April 19, 2021 18:16
Contributor Author

@miretskiy miretskiy left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)


pkg/ccl/changefeedccl/avro.go, line 139 at r4 (raw file):

Previously, ajwerner wrote…

nit: comment to indicate that this field exists for reuse to avoid allocating a new map.

Done.


pkg/ccl/changefeedccl/avro.go, line 592 at r4 (raw file):

Previously, ajwerner wrote…

note something like:

// Note that it's safe to reuse r.native without clearing it because all records will
// contain the same complete set of fields

Ooops -- I even forgot I did this one too.
Thanks. Comment added.

@miretskiy
Contributor Author

tftr
bors r=ajwerner

@miretskiy
Contributor Author

bors r-

@craig
Contributor

craig bot commented Apr 19, 2021

Canceled.

@miretskiy
Contributor Author

tftr
bors r=ajwerner

@craig
Contributor

craig bot commented Apr 20, 2021

Build failed (retrying...):

@craig
Contributor

craig bot commented Apr 20, 2021

Build failed (retrying...):

@craig
Contributor

craig bot commented Apr 20, 2021

Build failed:

Yevgeniy Miretskiy added 2 commits April 20, 2021 17:51
This reverts commit 5d1b141.

Add avro type encoding benchmarks.

Release Notes: None

This reverts commit 6a1b739.

Avoid expensive allocations (maps) when encoding datums.
Improve encoder performance by ~40%, and significantly reduce
memory allocations per op.

Release Notes: None

@miretskiy
Contributor Author

bors r+

@craig
Contributor

craig bot commented Apr 21, 2021

Build succeeded:

@craig craig bot merged commit 075aa5d into cockroachdb:master Apr 21, 2021
3 participants