Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WORKSPACE: enable unified GOEXPERIMENT while we're still on go1.19 #93977

Closed
ajwerner opened this issue Dec 20, 2022 · 0 comments
Closed

WORKSPACE: enable unified GOEXPERIMENT while we're still on go1.19 #93977

ajwerner opened this issue Dec 20, 2022 · 0 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@ajwerner
Copy link
Contributor

ajwerner commented Dec 20, 2022

Is your feature request related to a problem? Please describe.
See https://go-review.googlesource.com/c/go/+/422235

The unified experiment leads to better inlining of generic code. It will be on by default in go1.20. We should build cockroach with it enabled.

Additional context
The rules_go bazel package has made this easier lately. We need to update to pick up that convenience.

Jira issue: CRDB-22633

@ajwerner ajwerner added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Dec 20, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Dec 20, 2022
This commit introduces `util/btree` which offers a generic, sorted `Map` and
`Set` implementation. Additionally, `util/btree/interval` and
`util/btree/orderstat` also offer parallel data structures but with additional
features.

 * `interval` provides a generic interval tree which offers support for
   iterating overlapping intervals.
 * `orderstat` provides a generic order statistic tree which offers support
   for indexing and looking up the rank of an element in a set or map.

All of these btrees are implemented in terms of an internal `abstract` package
which provides a generic augmented btree implemenetation. The interval tree
and the order statistic tree utilize augmentations to implement their extra
functionality.

By structuring the logic this way, we're able to utilize the same core tree
logic for all of our tree-based data structures.

The logic for this PR was heavily inspired by the existing generic-templated
interval btree. This implementation is more generic than that one in that it
does not rely on intervals having byte keys or an ID value; the user supplies
the relevant comparison functions.

By targeting the same API, this package also ports the tests from that
implementation, and its benchmarks. The interval tree is competitive on the
benchmarks. Note that all benchmarks were run using `GOEXPERIMENT=unified`.
This is now the default in the go tip. Its usage is tracked in cockroachdb#93977.

```
name                                                       old time/op    new time/op    delta
BTreeInsert/count=16-16                                      46.7ns ± 3%    60.9ns ± 2%  +30.29%  (p=0.000 n=10+9)
BTreeInsert/count=128-16                                     78.9ns ± 1%    93.4ns ± 5%  +18.33%  (p=0.000 n=6+10)
BTreeInsert/count=1024-16                                     158ns ± 1%     171ns ± 2%   +8.21%  (p=0.000 n=10+10)
BTreeInsert/count=8192-16                                     238ns ± 1%     251ns ± 1%   +5.64%  (p=0.000 n=10+10)
BTreeInsert/count=65536-16                                    405ns ± 2%     420ns ± 3%   +3.59%  (p=0.000 n=10+10)
BTreeDelete/count=16-16                                      67.1ns ±10%    64.9ns ±11%     ~     (p=0.315 n=10+9)
BTreeDelete/count=128-16                                      103ns ±21%     123ns ±14%  +19.84%  (p=0.001 n=10+9)
BTreeDelete/count=1024-16                                     210ns ± 5%     223ns ± 4%   +6.52%  (p=0.000 n=10+10)
BTreeDelete/count=8192-16                                     320ns ± 2%     334ns ± 2%   +4.42%  (p=0.000 n=10+10)
BTreeDelete/count=65536-16                                    556ns ± 2%     588ns ± 2%   +5.85%  (p=0.000 n=10+9)
BTreeDeleteInsert/count=16-16                                86.0ns ± 1%    97.1ns ± 1%  +12.99%  (p=0.000 n=10+10)
BTreeDeleteInsert/count=128-16                                217ns ± 2%     232ns ± 2%   +6.51%  (p=0.000 n=9+10)
BTreeDeleteInsert/count=1024-16                               393ns ± 4%     420ns ± 3%   +6.99%  (p=0.000 n=9+10)
BTreeDeleteInsert/count=8192-16                               468ns ± 2%     489ns ± 1%   +4.58%  (p=0.000 n=10+10)
BTreeDeleteInsert/count=65536-16                             1.00µs ±27%    1.03µs ±17%     ~     (p=0.684 n=10+10)
BTreeDeleteInsertCloneOnce/count=16-16                       87.6ns ± 0%    97.3ns ± 0%  +11.05%  (p=0.000 n=9+8)
BTreeDeleteInsertCloneOnce/count=128-16                       218ns ± 3%     233ns ± 3%   +6.86%  (p=0.000 n=10+10)
BTreeDeleteInsertCloneOnce/count=1024-16                      393ns ± 2%     413ns ± 8%   +5.08%  (p=0.028 n=9+10)
BTreeDeleteInsertCloneOnce/count=8192-16                      470ns ± 0%     491ns ± 1%   +4.46%  (p=0.000 n=8+10)
BTreeDeleteInsertCloneOnce/count=65536-16                     979ns ±38%    1015ns ±32%     ~     (p=0.631 n=10+10)
BTreeDeleteInsertCloneEachTime/count=16/reset=false-16        306ns ± 0%     313ns ± 0%   +2.28%  (p=0.000 n=10+9)
BTreeDeleteInsertCloneEachTime/count=16/reset=true-16         131ns ± 1%     136ns ± 1%   +4.24%  (p=0.000 n=9+10)
BTreeDeleteInsertCloneEachTime/count=128/reset=false-16       754ns ± 2%     744ns ± 2%   -1.31%  (p=0.035 n=10+10)
BTreeDeleteInsertCloneEachTime/count=128/reset=true-16        374ns ± 3%     381ns ± 2%   +2.00%  (p=0.043 n=9+10)
BTreeDeleteInsertCloneEachTime/count=1024/reset=false-16     1.38µs ± 1%    1.32µs ± 3%   -4.00%  (p=0.000 n=8+10)
BTreeDeleteInsertCloneEachTime/count=1024/reset=true-16       864ns ± 3%     877ns ± 3%   +1.47%  (p=0.050 n=8+8)
BTreeDeleteInsertCloneEachTime/count=8192/reset=false-16     1.56µs ± 0%    1.66µs ± 1%   +6.22%  (p=0.000 n=10+10)
BTreeDeleteInsertCloneEachTime/count=8192/reset=true-16      1.09µs ± 1%    1.14µs ± 0%   +4.50%  (p=0.000 n=9+9)
BTreeDeleteInsertCloneEachTime/count=65536/reset=false-16    2.68µs ±20%    2.99µs ±21%  +11.82%  (p=0.050 n=10+10)
BTreeDeleteInsertCloneEachTime/count=65536/reset=true-16     1.94µs ±21%    2.05µs ±23%     ~     (p=0.382 n=10+10)
BTreeMakeIter-16                                             5.03ns ± 0%    5.03ns ± 0%     ~     (p=0.895 n=10+10)
BTreeIterSeekGE/count=16-16                                  53.2ns ± 0%    52.6ns ± 0%   -1.03%  (p=0.000 n=10+9)
BTreeIterSeekGE/count=128-16                                 90.7ns ± 0%    92.7ns ± 0%   +2.15%  (p=0.000 n=10+10)
BTreeIterSeekGE/count=1024-16                                 148ns ± 1%     153ns ± 1%   +3.34%  (p=0.000 n=9+10)
BTreeIterSeekGE/count=8192-16                                 226ns ± 1%     233ns ± 1%   +3.24%  (p=0.000 n=10+10)
BTreeIterSeekGE/count=65536-16                                459ns ± 4%     476ns ± 2%   +3.61%  (p=0.000 n=10+10)
BTreeIterSeekLT/count=16-16                                  54.6ns ± 0%    53.0ns ± 0%   -2.83%  (p=0.000 n=9+10)
BTreeIterSeekLT/count=128-16                                 95.5ns ± 1%    93.9ns ± 0%   -1.65%  (p=0.000 n=10+10)
BTreeIterSeekLT/count=1024-16                                 152ns ± 0%     156ns ± 1%   +2.15%  (p=0.000 n=9+10)
BTreeIterSeekLT/count=8192-16                                 229ns ± 1%     236ns ± 1%   +3.20%  (p=0.000 n=10+10)
BTreeIterSeekLT/count=65536-16                                473ns ± 2%     476ns ± 3%     ~     (p=0.148 n=10+10)
BTreeIterFirstOverlap/count=16-16                             149ns ± 0%     161ns ± 0%   +8.14%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=128-16                            266ns ± 1%     278ns ± 1%   +4.54%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=1024-16                           481ns ± 1%     497ns ± 0%   +3.40%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=8192-16                           761ns ± 1%     782ns ± 1%   +2.88%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=65536-16                         1.14µs ± 2%    1.16µs ± 2%   +2.16%  (p=0.000 n=10+10)
BTreeIterNext-16                                             2.55ns ± 0%    2.66ns ± 0%   +4.39%  (p=0.000 n=10+9)
BTreeIterPrev-16                                             15.0ns ± 1%    16.0ns ± 1%   +6.17%  (p=0.000 n=10+10)
BTreeIterNextOverlap-16                                      4.12ns ± 0%    4.37ns ± 1%   +5.87%  (p=0.000 n=10+8)
BTreeIterOverlapScan-16                                      11.0µs ± 1%    11.9µs ± 3%   +9.00%  (p=0.000 n=10+10)
```

In order to gain yet more testing and more validation of the implementation,
the `util/btree/gbtree` package provides an adapter of `util/btree.Set` to
`github.com/google/btree.BTreeG`, the generic btree from google's package.
This implementation generally dominates that one:

```
name                           old time/op    new time/op    delta
InsertG-16                        162ns ± 0%     137ns ± 1%   -15.87%  (p=0.000 n=9+10)
SeekG-16                          124ns ± 1%      90ns ± 1%   -27.61%  (p=0.000 n=10+10)
DeleteInsertG-16                  319ns ± 2%     258ns ± 2%   -19.06%  (p=0.000 n=10+10)
DeleteInsertCloneOnceG-16         318ns ± 2%     259ns ± 1%   -18.63%  (p=0.000 n=10+10)
DeleteInsertCloneEachTimeG-16     976ns ± 1%    1139ns ± 1%   +16.71%  (p=0.000 n=9+9)
DeleteG-16                        178ns ± 1%     146ns ± 2%   -18.31%  (p=0.000 n=10+10)
GetG-16                           138ns ± 0%     128ns ± 1%    -6.93%  (p=0.000 n=9+10)
GetCloneEachTimeG-16              220ns ± 1%     162ns ± 1%   -26.40%  (p=0.000 n=10+10)
AscendG-16                       49.2µs ± 1%    50.3µs ± 1%    +2.21%  (p=0.000 n=8+9)
DescendG-16                      45.8µs ± 1%    48.4µs ± 1%    +5.84%  (p=0.000 n=10+9)
AscendRangeG-16                  77.7µs ± 1%    61.9µs ± 0%   -20.32%  (p=0.000 n=9+9)
DescendRangeG-16                 95.4µs ± 0%    59.6µs ± 0%   -37.51%  (p=0.000 n=10+10)
AscendGreaterOrEqualG-16         56.5µs ± 1%    51.6µs ± 1%    -8.71%  (p=0.000 n=10+10)
DescendLessOrEqualG-16           78.1µs ± 1%    49.8µs ± 0%   -36.23%  (p=0.000 n=10+10)
DeleteAndRestoreG/Copy-16        4.48ms ± 0%    3.46ms ± 0%   -22.71%  (p=0.000 n=8+8)
DeleteAndRestoreG/Clear-16       2.84ms ± 0%    2.26ms ± 0%   -20.29%  (p=0.000 n=9+10)
```

Epic: None

Release note: None
ajwerner added a commit to ajwerner/cockroach that referenced this issue Dec 23, 2022
This commit introduces util/btree which offers a generic, sorted Map and
Set implementations.

In order to reduce overhead, and to allow the containers to be used without any
initialization, many configurations exist which exploit properties of the key,
such as the `WithKey` configuration which use a `Compare` method on the key,
`WithBytes` which uses byte-slice like types, and `WithOrdered` which utilize
natively ordered types.

The package also supports fully custom and stateful comparators. The package
provides convenient support for using a function to perform comparison. While
handy, this can incur allocation due to pointers escaping if the key is a
pointer and read operations other than complete iteration are performed.

In addition to the `Set` and `Map` implementations in the `btree` package,
there is a parallel pair of data structures in the `btree/interval` package.
These data structures require an expanded configuration to store intervals
of data. Their iterator offers the capability to perform overlapping scan
queries.

A helper package `util/btree/ordered` provides definitions of interfaces
relating to comparing ordered data.

All of these btrees are implemented in terms of an internal/aug package
which provides a generic augmented btree implementation. The interval tree
and the order statistic tree utilize augmentations to implement their extra
functionality. By structuring the logic this way, we're able to utilize the
same core tree logic for all of our tree-based data structures.

The logic for this PR was directly derived from the existing generic-templated
interval btree. This implementation is more generic than that one in that it
does not rely on intervals having byte keys or an ID value; the user supplies
the relevant comparison functions.

By targeting the same API, this package also ports the tests from that
implementation, and its benchmarks. The interval tree is competitive on the
benchmarks. Note that all benchmarks were run using `GOEXPERIMENT=unified`.
This is now the default in the go tip. Its usage is tracked in cockroachdb#93977.

Here are the benchmark results of the interval tree:

```
name                                                       old time/op    new time/op    delta
BTreeInsert/count=16-16                                      46.7ns ± 3%    53.2ns ± 3%  +13.93%  (p=0.000 n=10+9)
BTreeInsert/count=128-16                                     78.9ns ± 1%    81.5ns ± 2%   +3.25%  (p=0.000 n=6+9)
BTreeInsert/count=1024-16                                     158ns ± 1%     168ns ± 6%   +6.14%  (p=0.000 n=10+10)
BTreeInsert/count=8192-16                                     238ns ± 1%     245ns ± 1%   +2.97%  (p=0.000 n=10+10)
BTreeInsert/count=65536-16                                    405ns ± 2%     418ns ± 2%   +3.08%  (p=0.000 n=10+10)
BTreeDelete/count=16-16                                      67.1ns ±10%    66.8ns ±15%     ~     (p=0.986 n=10+10)
BTreeDelete/count=128-16                                      103ns ±21%      99ns ±13%     ~     (p=0.549 n=10+9)
BTreeDelete/count=1024-16                                     210ns ± 5%     220ns ± 3%   +4.70%  (p=0.001 n=10+9)
BTreeDelete/count=8192-16                                     320ns ± 2%     330ns ± 2%   +3.37%  (p=0.000 n=10+9)
BTreeDelete/count=65536-16                                    556ns ± 2%     571ns ± 2%   +2.67%  (p=0.003 n=10+10)
BTreeDeleteInsert/count=16-16                                86.0ns ± 1%    94.3ns ± 0%   +9.72%  (p=0.000 n=10+9)
BTreeDeleteInsert/count=128-16                                217ns ± 2%     228ns ± 2%   +4.87%  (p=0.000 n=9+10)
BTreeDeleteInsert/count=1024-16                               393ns ± 4%     405ns ± 3%   +3.05%  (p=0.008 n=9+8)
BTreeDeleteInsert/count=8192-16                               468ns ± 2%     480ns ± 1%   +2.75%  (p=0.000 n=10+10)
BTreeDeleteInsert/count=65536-16                             1.00µs ±27%    0.93µs ±19%     ~     (p=0.684 n=10+10)
BTreeDeleteInsertCloneOnce/count=16-16                       87.6ns ± 0%    94.1ns ± 0%   +7.43%  (p=0.000 n=9+8)
BTreeDeleteInsertCloneOnce/count=128-16                       218ns ± 3%     226ns ± 2%   +3.96%  (p=0.000 n=10+10)
BTreeDeleteInsertCloneOnce/count=1024-16                      393ns ± 2%     406ns ± 2%   +3.30%  (p=0.001 n=9+10)
BTreeDeleteInsertCloneOnce/count=8192-16                      470ns ± 0%     479ns ± 1%   +1.91%  (p=0.000 n=8+10)
BTreeDeleteInsertCloneOnce/count=65536-16                     979ns ±38%    1117ns ±24%     ~     (p=0.089 n=10+10)
BTreeDeleteInsertCloneEachTime/count=16/reset=false-16        306ns ± 0%     302ns ± 0%   -1.35%  (p=0.000 n=10+9)
BTreeDeleteInsertCloneEachTime/count=16/reset=true-16         131ns ± 1%     143ns ± 1%   +9.75%  (p=0.000 n=9+10)
BTreeDeleteInsertCloneEachTime/count=128/reset=false-16       754ns ± 2%     718ns ± 2%   -4.71%  (p=0.000 n=10+10)
BTreeDeleteInsertCloneEachTime/count=128/reset=true-16        374ns ± 3%     394ns ± 1%   +5.42%  (p=0.000 n=9+8)
BTreeDeleteInsertCloneEachTime/count=1024/reset=false-16     1.38µs ± 1%    1.28µs ± 3%   -7.00%  (p=0.000 n=8+10)
BTreeDeleteInsertCloneEachTime/count=1024/reset=true-16       864ns ± 3%     888ns ± 5%   +2.79%  (p=0.016 n=8+10)
BTreeDeleteInsertCloneEachTime/count=8192/reset=false-16     1.56µs ± 0%    1.62µs ± 1%   +3.87%  (p=0.000 n=10+10)
BTreeDeleteInsertCloneEachTime/count=8192/reset=true-16      1.09µs ± 1%    1.15µs ± 1%   +5.52%  (p=0.000 n=9+10)
BTreeDeleteInsertCloneEachTime/count=65536/reset=false-16    2.68µs ±20%    2.80µs ±20%     ~     (p=0.617 n=10+10)
BTreeDeleteInsertCloneEachTime/count=65536/reset=true-16     1.94µs ±21%    2.20µs ±21%     ~     (p=0.072 n=10+10)
BTreeMakeIter-16                                             5.03ns ± 0%    4.73ns ± 0%   -5.88%  (p=0.000 n=10+10)
BTreeIterSeekGE/count=16-16                                  53.2ns ± 0%    60.8ns ± 1%  +14.32%  (p=0.000 n=10+10)
BTreeIterSeekGE/count=128-16                                 90.7ns ± 0%   101.0ns ± 0%  +11.32%  (p=0.000 n=10+9)
BTreeIterSeekGE/count=1024-16                                 148ns ± 1%     159ns ± 1%   +7.53%  (p=0.000 n=9+10)
BTreeIterSeekGE/count=8192-16                                 226ns ± 1%     240ns ± 1%   +6.38%  (p=0.000 n=10+10)
BTreeIterSeekGE/count=65536-16                                459ns ± 4%     481ns ± 3%   +4.69%  (p=0.000 n=10+10)
BTreeIterSeekLT/count=16-16                                  54.6ns ± 0%    63.4ns ± 2%  +16.17%  (p=0.000 n=9+10)
BTreeIterSeekLT/count=128-16                                 95.5ns ± 1%   105.2ns ± 2%  +10.23%  (p=0.000 n=10+10)
BTreeIterSeekLT/count=1024-16                                 152ns ± 0%     166ns ± 1%   +9.04%  (p=0.000 n=9+10)
BTreeIterSeekLT/count=8192-16                                 229ns ± 1%     247ns ± 1%   +7.85%  (p=0.000 n=10+10)
BTreeIterSeekLT/count=65536-16                                473ns ± 2%     483ns ± 4%   +2.26%  (p=0.015 n=10+10)
BTreeIterFirstOverlap/count=16-16                             149ns ± 0%     158ns ± 1%   +5.77%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=128-16                            266ns ± 1%     267ns ± 0%   +0.59%  (p=0.006 n=10+9)
BTreeIterFirstOverlap/count=1024-16                           481ns ± 1%     486ns ± 1%   +1.00%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=8192-16                           761ns ± 1%     772ns ± 1%   +1.45%  (p=0.000 n=10+10)
BTreeIterFirstOverlap/count=65536-16                         1.14µs ± 2%    1.17µs ± 4%   +2.72%  (p=0.015 n=10+10)
BTreeIterNext-16                                             2.55ns ± 0%    2.67ns ± 0%   +4.65%  (p=0.000 n=10+9)
BTreeIterPrev-16                                             15.0ns ± 1%    14.9ns ± 1%   -1.17%  (p=0.000 n=10+9)
BTreeIterNextOverlap-16                                      4.12ns ± 0%    4.23ns ± 0%   +2.54%  (p=0.000 n=10+9)
BTreeIterOverlapScan-16                                      11.0µs ± 1%    11.7µs ± 1%   +7.03%  (p=0.000 n=10+10)
```

In order to gain yet more testing and more validation of the implementation,
the `util/btree/gbtree` package provides an adapter of `util/btree.Set` to
`github.com/google/btree.BTreeG`, the generic btree from google's package.
This implementation generally does well against that one.

```
name                           old time/op    new time/op    delta
InsertG-16                        162ns ± 0%     138ns ± 4%   -14.71%  (p=0.000 n=9+10)
SeekG-16                          124ns ± 1%      89ns ± 2%   -28.60%  (p=0.000 n=10+10)
DeleteInsertG-16                  319ns ± 2%     255ns ± 1%   -20.03%  (p=0.000 n=10+8)
DeleteInsertCloneOnceG-16         318ns ± 2%     255ns ± 0%   -20.01%  (p=0.000 n=10+8)
DeleteInsertCloneEachTimeG-16     976ns ± 1%    1322ns ± 3%   +35.46%  (p=0.000 n=9+8)
DeleteG-16                        178ns ± 1%     149ns ± 5%   -16.32%  (p=0.000 n=10+10)
GetG-16                           138ns ± 0%     126ns ± 3%    -8.82%  (p=0.000 n=9+10)
GetCloneEachTimeG-16              220ns ± 1%     171ns ± 3%   -22.18%  (p=0.000 n=10+10)
AscendG-16                       49.2µs ± 1%    51.2µs ± 1%    +3.92%  (p=0.000 n=8+8)
DescendG-16                      45.8µs ± 1%    50.1µs ± 2%    +9.51%  (p=0.000 n=10+9)
AscendRangeG-16                  77.7µs ± 1%    70.9µs ±29%    -8.85%  (p=0.028 n=9+10)
DescendRangeG-16                 95.4µs ± 0%    64.3µs ±14%   -32.67%  (p=0.000 n=10+10)
AscendGreaterOrEqualG-16         56.5µs ± 1%    55.9µs ±27%    -0.99%  (p=0.042 n=10+9)
DescendLessOrEqualG-16           78.1µs ± 1%    51.5µs ± 3%   -33.99%  (p=0.000 n=10+10)
DeleteAndRestoreG/Copy-16        4.48ms ± 0%    3.53ms ± 2%   -21.08%  (p=0.000 n=8+10)
DeleteAndRestoreG/Clear-16       2.84ms ± 0%    2.24ms ± 2%   -21.16%  (p=0.000 n=9+10)
```

Epic: None

Release note: None
@ajwerner ajwerner closed this as not planned Won't fix, can't repro, duplicate, stale Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

No branches or pull requests

1 participant