Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bench: arrow batch benchmarks broken #71886

Closed
Tracked by #71820
cucaroach opened this issue Oct 22, 2021 · 2 comments · Fixed by #71892
Closed
Tracked by #71820

bench: arrow batch benchmarks broken #71886

cucaroach opened this issue Oct 22, 2021 · 2 comments · Fixed by #71892
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@cucaroach
Copy link
Contributor

After this commit (0d580ad) we crash on nil pointers.

@cucaroach cucaroach added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team labels Oct 22, 2021
@cucaroach cucaroach self-assigned this Oct 22, 2021
@yuzefovich
Copy link
Member

Oh yeah, that makes sense. I think we simply want to do something like

diff --git a/pkg/col/colserde/arrowbatchconverter_test.go b/pkg/col/colserde/arrowbatchconverter_test.go
index 06a8cee258..aa3e017be4 100644
--- a/pkg/col/colserde/arrowbatchconverter_test.go
+++ b/pkg/col/colserde/arrowbatchconverter_test.go
@@ -224,15 +224,19 @@ func BenchmarkArrowBatchConverter(b *testing.B) {
                for _, nullFraction := range nullFractions {
                        setNullFraction(batch, nullFraction)
                        data, err := c.BatchToArrow(batch)
+                       dataCopy := make([]*array.Data, len(data))
                        require.NoError(b, err)
                        testPrefix := fmt.Sprintf("%s/nullFraction=%0.2f", typ.String(), nullFraction)
                        result := testAllocator.NewMemBatchWithMaxCapacity([]*types.T{typ})
                        b.Run(testPrefix+"/ArrowToBatch", func(b *testing.B) {
                                b.SetBytes(numBytes[typIdx])
                                for i := 0; i < b.N; i++ {
+                                       // Since ArrowToBatch eagerly nils things out, we have to
+                                       // make a shallow copy each time.
+                                       copy(dataCopy, data)
                                        // Using require.NoError here causes large enough allocations to
                                        // affect the result.
-                                       if err := c.ArrowToBatch(data, batch.Length(), result); err != nil {
+                                       if err := c.ArrowToBatch(dataCopy, batch.Length(), result); err != nil {
                                                b.Fatal(err)
                                        }
                                        if result.Width() != 1 {

cucaroach added a commit to cucaroach/cockroach that referenced this issue Oct 22, 2021
In commit 0d580ad we optimized these
functions to clear the data buffer to help out the GC but we broke these
benchmarks.   Parameterize the clearing behavior so the old behavior is
preserved for benchmark continuity purposes.

Fixes: cockroachdb#71886

Release note: None
@cucaroach
Copy link
Contributor Author

I went in a different direction, I think that fix would make us look bad performance wise.

cucaroach added a commit to cucaroach/cockroach that referenced this issue Oct 22, 2021
In commit 0d580ad we optimized these
functions to clear the data buffer to help out the GC but we broke these
benchmarks.   Parameterize the clearing behavior so the old behavior is
preserved for benchmark continuity purposes.

Fixes: cockroachdb#71886

Release note: None
cucaroach added a commit to cucaroach/cockroach that referenced this issue Oct 25, 2021
And optimization to aid the Go GC broke benchmarks of Serialize
and ArrowToBench, fix them by restoring the nil'd out arrays using a
copy.

Fixes: cockroachdb#71886

Release note: None
craig bot pushed a commit that referenced this issue Oct 25, 2021
71892: bench: fix colserde benchmarks r=cucaroach a=cucaroach

And optimization to aid the Go GC broke benchmarks of Serialize
and ArrowToBench, fix them by restoring the nil'd out arrays using a
copy.

Fixes: #71886

Release note: None


Co-authored-by: Tommy Reilly <[email protected]>
@craig craig bot closed this as completed in 757e898 Oct 25, 2021
blathers-crl bot pushed a commit that referenced this issue Mar 29, 2022
And optimization to aid the Go GC broke benchmarks of Serialize
and ArrowToBench, fix them by restoring the nil'd out arrays using a
copy.

Fixes: #71886

Release note: None
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants