-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Starlark: reuse positional array in native calls where possible #19708
Conversation
} | ||
|
||
@Test | ||
public void testNamedOnlyArgument() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this test to document why I think that #12782 (comment) isn't a safe optimization, but I may have misunderstood the original comment.
@brandjon Just curious: With progressing Starlarkification, is there a plan to reevaluate the "exectree" representation of Starlark modules at some point? |
@brandjon Friendly ping |
Based on a naive reading of the code, we will now skip the special-cased StringModule behavior below (lines 175-179 in the original version of BuiltinFunction.java), and therefore Starlark string object methods that don't take a Starlark thread (e.g. "foo".upper()`) may misbehave. Can you verify whether or not this is the case, and if it's not, add a comment why things nevertheless work? Do we have sufficient test coverage for this? |
This is a resubmission of bazelbuild#12782, with this description: ==== For certain calls like `type(x)` or `str(x)` or `l.append(x)` we can reuse `positional` array argument of `fastcall` to pass it to the `MethodDescriptor.call` and skip all runtime checks in `BuiltinFunction`. This optimization cannot be applied if * a function has extra positional or named arguments * has arguments guarded by flag * accepts extra `StarlarkThread` parameter * has unfilled arguments with default values So this optimation won't be used for calls `list(x)` or `bool()`. For `type(1)` called in loop, the result is 15% speedup: ``` A: n=30 mean=5.705 std=0.387 se=0.071 min=5.131 med=5.827 B: n=30 mean=4.860 std=0.345 se=0.064 min=4.396 med=4.946 B/A: 0.852 0.820..0.884 (95% conf) ``` For `bool()` (no argument) called in loop, it results in 1% slowdown: ``` A: n=29 mean=9.045 std=0.603 se=0.113 min=8.096 med=9.400 B: n=29 mean=9.134 std=0.520 se=0.098 min=8.248 med=9.448 B/A: 1.010 0.976..1.045 (95% conf) ``` ==== Beyond the original PR, this includes benchmarks, a test case addressing a review comment and small adaptions to the implementation to make it work with recent changes. Benchmark results with JDK 21 and `--seconds 5`: ``` before: File src/test/java/net/starlark/java/eval/testdata/bench_call.star: benchmark ops cpu/op wall/op steps/op alloc/op bench_call_bool 67108863 100ns 100ns 3 87B bench_call_dict_get 33554431 159ns 158ns 5 143B bench_call_dict_get_none 33554431 180ns 179ns 6 143B bench_call_list_append 33554431 170ns 169ns 5 207B bench_call_type 67108863 139ns 138ns 4 111B after: File src/test/java/net/starlark/java/eval/testdata/bench_call.star: benchmark ops cpu/op wall/op steps/op alloc/op bench_call_bool 67108863 100ns 100ns 3 87B bench_call_dict_get 33554431 155ns 154ns 5 143B bench_call_dict_get_none 33554431 174ns 174ns 6 143B bench_call_list_append 67108863 141ns 141ns 5 183B bench_call_type 67108863 115ns 114ns 4 87B ``` Co-authored-by: Fabian Meumertzheim <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable. Verified that the benchmark seems to show a small improvement.
@bazel-io fork 7.1.0 |
This is a resubmission of @stepancheg's bazelbuild#12782, which became stale, with this original description: ==== For certain calls like `type(x)` or `str(x)` or `l.append(x)` we can reuse `positional` array argument of `fastcall` to pass it to the `MethodDescriptor.call` and skip all runtime checks in `BuiltinFunction`. This optimization cannot be applied if * a function has extra positional or named arguments * has arguments guarded by flag * accepts extra `StarlarkThread` parameter * has unfilled arguments with default values So this optimation won't be used for calls `list(x)` or `bool()`. For `type(1)` called in loop, the result is 15% speedup: ``` A: n=30 mean=5.705 std=0.387 se=0.071 min=5.131 med=5.827 B: n=30 mean=4.860 std=0.345 se=0.064 min=4.396 med=4.946 B/A: 0.852 0.820..0.884 (95% conf) ``` For `bool()` (no argument) called in loop, it results in 1% slowdown: ``` A: n=29 mean=9.045 std=0.603 se=0.113 min=8.096 med=9.400 B: n=29 mean=9.134 std=0.520 se=0.098 min=8.248 med=9.448 B/A: 1.010 0.976..1.045 (95% conf) ``` ==== Beyond the original PR, this includes benchmarks, a test case addressing a review comment and small adaptions to the implementation to make it work with recent changes. Benchmark results with JDK 21 and `--seconds 5`: ``` before: File src/test/java/net/starlark/java/eval/testdata/bench_call.star: benchmark ops cpu/op wall/op steps/op alloc/op bench_call_bool 67108863 100ns 100ns 3 87B bench_call_dict_get 33554431 159ns 158ns 5 143B bench_call_dict_get_none 33554431 180ns 179ns 6 143B bench_call_list_append 33554431 170ns 169ns 5 207B bench_call_type 67108863 139ns 138ns 4 111B after: File src/test/java/net/starlark/java/eval/testdata/bench_call.star: benchmark ops cpu/op wall/op steps/op alloc/op bench_call_bool 67108863 100ns 100ns 3 87B bench_call_dict_get 33554431 155ns 154ns 5 143B bench_call_dict_get_none 33554431 174ns 174ns 6 143B bench_call_list_append 67108863 141ns 141ns 5 183B bench_call_type 67108863 115ns 114ns 4 87B ``` Closes bazelbuild#19708. PiperOrigin-RevId: 602821608 Change-Id: If40e2eb1e09ec43d56b36ba13987aa5312fd47ec
…le (#21144) This is a resubmission of @stepancheg's #12782, which became stale, with this original description: ==== For certain calls like `type(x)` or `str(x)` or `l.append(x)` we can reuse `positional` array argument of `fastcall` to pass it to the `MethodDescriptor.call` and skip all runtime checks in `BuiltinFunction`. This optimization cannot be applied if * a function has extra positional or named arguments * has arguments guarded by flag * accepts extra `StarlarkThread` parameter * has unfilled arguments with default values So this optimation won't be used for calls `list(x)` or `bool()`. For `type(1)` called in loop, the result is 15% speedup: ``` A: n=30 mean=5.705 std=0.387 se=0.071 min=5.131 med=5.827 B: n=30 mean=4.860 std=0.345 se=0.064 min=4.396 med=4.946 B/A: 0.852 0.820..0.884 (95% conf) ``` For `bool()` (no argument) called in loop, it results in 1% slowdown: ``` A: n=29 mean=9.045 std=0.603 se=0.113 min=8.096 med=9.400 B: n=29 mean=9.134 std=0.520 se=0.098 min=8.248 med=9.448 B/A: 1.010 0.976..1.045 (95% conf) ``` ==== Beyond the original PR, this includes benchmarks, a test case addressing a review comment and small adaptions to the implementation to make it work with recent changes. Benchmark results with JDK 21 and `--seconds 5`: ``` before: File src/test/java/net/starlark/java/eval/testdata/bench_call.star: benchmark ops cpu/op wall/op steps/op alloc/op bench_call_bool 67108863 100ns 100ns 3 87B bench_call_dict_get 33554431 159ns 158ns 5 143B bench_call_dict_get_none 33554431 180ns 179ns 6 143B bench_call_list_append 33554431 170ns 169ns 5 207B bench_call_type 67108863 139ns 138ns 4 111B after: File src/test/java/net/starlark/java/eval/testdata/bench_call.star: benchmark ops cpu/op wall/op steps/op alloc/op bench_call_bool 67108863 100ns 100ns 3 87B bench_call_dict_get 33554431 155ns 154ns 5 143B bench_call_dict_get_none 33554431 174ns 174ns 6 143B bench_call_list_append 67108863 141ns 141ns 5 183B bench_call_type 67108863 115ns 114ns 4 87B ``` Closes #19708. Commit aded06b PiperOrigin-RevId: 602821608 Change-Id: If40e2eb1e09ec43d56b36ba13987aa5312fd47ec Co-authored-by: Fabian Meumertzheim <[email protected]>
The changes in this PR have been included in Bazel 7.1.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=last_rc. |
This is a resubmission of @stepancheg's #12782, which became stale, with this original description:
====
For certain calls like
type(x)
orstr(x)
orl.append(x)
we can reusepositional
array argument offastcall
to pass it to theMethodDescriptor.call
and skip all runtime checks inBuiltinFunction
.This optimization cannot be applied if
StarlarkThread
parameterSo this optimation won't be used for calls
list(x)
orbool()
.For
type(1)
called in loop, the result is 15% speedup:For
bool()
(no argument) called in loop, it results in 1% slowdown:====
Beyond the original PR, this includes benchmarks, a test case addressing a review comment and small adaptions to the implementation to make it work with recent changes.
Benchmark results with JDK 21 and
--seconds 5
: