Optimize try_eval_bits to avoid layout queries #64673

Merged
merged 1 commit into from
Sep 30, 2019

Mark-Simulacrum
Member

@Mark-Simulacrum Mark-Simulacrum commented Sep 21, 2019

This specifically targets match checking, but is possibly more widely
useful as well. In code with large, single-value match statements, we
were previously spending a lot of time running layout_of for the
primitive types (integers, chars) -- which is essentially useless. This
optimizes the code to avoid those query calls by directly obtaining the
size for these types, when possible.

It may be worth considering adding a size_of query in the future which
might be far faster, especially if specialized for "const" cases --
match arms being the most obvious example. It's possible such a function
would benefit from not being a query as well, since it's trivially
evaluable from the sty in many cases, whereas a query needs to hash
the input and such.
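
A minimal sketch of the fast-path idea described above, written as standalone Rust rather than actual rustc code. `Ty`, `LayoutCache`, `primitive_size`, and `size_of` are hypothetical names chosen for illustration; the real change lives in `try_eval_bits` and operates on the compiler's own types. The sketch only shows the shape of the optimization: answer directly for integers and chars, and fall back to the memoized, hash-keyed lookup for everything else.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the compiler's type representation.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Ty {
    Char,
    Int { bits: u8 },
    Uint { bits: u8 },
    /// Something that needs a real layout computation.
    Struct { fields: u32 },
}

/// Hypothetical stand-in for a memoized, `layout_of`-style query.
pub struct LayoutCache {
    cache: HashMap<Ty, u64>,
    /// Counts how often the slow path was taken.
    pub lookups: usize,
}

impl LayoutCache {
    pub fn new() -> Self {
        LayoutCache { cache: HashMap::new(), lookups: 0 }
    }

    /// Slow path: every call hashes the key and consults the cache.
    pub fn size_via_query(&mut self, ty: Ty) -> u64 {
        self.lookups += 1;
        *self.cache.entry(ty).or_insert_with(|| compute_layout_size(ty))
    }
}

/// Placeholder for an expensive layout computation (sizes in bytes).
fn compute_layout_size(ty: Ty) -> u64 {
    match ty {
        Ty::Char => 4,
        Ty::Int { bits } | Ty::Uint { bits } => u64::from(bits) / 8,
        Ty::Struct { fields } => u64::from(fields) * 8,
    }
}

/// Fast path: the size of a primitive is read off the type itself.
pub fn primitive_size(ty: Ty) -> Option<u64> {
    match ty {
        Ty::Char => Some(4),
        Ty::Int { bits } | Ty::Uint { bits } => Some(u64::from(bits) / 8),
        _ => None,
    }
}

/// Try the fast path first; fall back to the query otherwise.
pub fn size_of(cache: &mut LayoutCache, ty: Ty) -> u64 {
    primitive_size(ty).unwrap_or_else(|| cache.size_via_query(ty))
}
```

In this sketch, calling `size_of` on `Ty::Char` or `Ty::Uint { bits: 32 }` never touches the cache, so `lookups` stays at zero until a non-primitive type such as `Ty::Struct { .. }` is queried.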

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 21, 2019
@Mark-Simulacrum
Member Author

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion

@bors
Contributor

bors commented Sep 21, 2019

⌛ Trying commit 511352c with merge 9b2e58c...

bors added a commit that referenced this pull request Sep 21, 2019
Optimize match checking to avoid layout queries

In code with large, single-value match statements, we were previously
spending a lot of time running layout_of for the primitive types
(integers, chars) -- which is essentially useless. This optimizes the
code to avoid those query calls by directly obtaining the size for these
types, when possible.

We fall back to the (slower) previous code if that fails, so this is not
a behavior change.

r? @Centril who I believe knows this code enough, but if not feel free to re-assign
@Centril
Contributor

Centril commented Sep 21, 2019

r? @oli-obk cc @varkor @arielb1

@rust-highfive rust-highfive assigned oli-obk and unassigned Centril Sep 21, 2019
@bors
Contributor

bors commented Sep 22, 2019

☀️ Try build successful - checks-azure
Build commit: 9b2e58c

@rust-timer
Collaborator

Queued 9b2e58c with parent ed8b708, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking try commit 9b2e58c, comparison URL.

@bjorn3
Member

bjorn3 commented Sep 22, 2019

unicode_normalization is ~30% faster! 3 other benchmarks got ~2% slower. The rest are stable.

@Mark-Simulacrum
Member Author

Okay, moved into the try_eval_bits function -- locally, unicode_normalization is a bit slower (~0.05s) after doing so, but that might be noise (although I get pretty consistent results).

Let's get some official results though @bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion

@bors
Contributor

bors commented Sep 22, 2019

⌛ Trying commit 9c62117 with merge 64687bb...

bors added a commit that referenced this pull request Sep 22, 2019
Optimize match checking to avoid layout queries
@Mark-Simulacrum Mark-Simulacrum changed the title Optimize match checking to avoid layout queries Optimize try_eval_bits to avoid layout queries Sep 22, 2019
@bors
Contributor

bors commented Sep 22, 2019

☀️ Try build successful - checks-azure
Build commit: 64687bb

@rust-timer
Collaborator

Queued 64687bb with parent 4ff32c0, future comparison URL.

@bjorn3
Member

bjorn3 commented Sep 22, 2019

Compilation of four crates failed during benchmarking:

thread 'rustc' panicked at 'assertion failed: pos.checked_add(num_bytes).unwrap() <= self.mapped_file.len()', /cargo/registry/src/github.com-1ecc6299db9ec823/measureme-0.3.0/src/mmap_serialization_sink.rs:38:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.39.0-nightly (ed8b708c1 2019-09-21) running on x86_64-unknown-linux-gnu

note: compiler flags: -Z self-profile=/tmp/.tmp1D64N5/self-profile-output -Z self-profile-events=all -C debuginfo=2 --crate-type lib

note: some of the compiler flags provided by cargo are hidden

thread 'rustc' panicked at 'index 1073741840 out of range for slice of length 1073741824', src/libcore/slice/mod.rs:2583:5
stack backtrace:
   0:     0x7fd1648a2b04 - backtrace::backtrace::libunwind::trace::h61b15987b9420dc8
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/libunwind.rs:88
   1:     0x7fd1648a2b04 - backtrace::backtrace::trace_unsynchronized::h944547918bca7d09
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/mod.rs:66
   2:     0x7fd1648a2b04 - std::sys_common::backtrace::_print_fmt::hf631db7a19c7ecfe
                               at src/libstd/sys_common/backtrace.rs:76
   3:     0x7fd1648a2b04 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h356b821d79ead967
                               at src/libstd/sys_common/backtrace.rs:60
   4:     0x7fd1648db14c - core::fmt::write::haa7725ecee710b81
                               at src/libcore/fmt/mod.rs:1030
   5:     0x7fd164896d27 - std::io::Write::write_fmt::h677b5c4b9e48abad
                               at src/libstd/io/mod.rs:1412
   6:     0x7fd1648a7335 - std::sys_common::backtrace::_print::hdc27c79deedd181f
                               at src/libstd/sys_common/backtrace.rs:64
   7:     0x7fd1648a7335 - std::sys_common::backtrace::print::h0bb3a218c68a1b38
                               at src/libstd/sys_common/backtrace.rs:49
   8:     0x7fd1648a7335 - std::panicking::default_hook::{{closure}}::h9160d687734b4c2f
                               at src/libstd/panicking.rs:196
   9:     0x7fd1648a7026 - std::panicking::default_hook::h298b832ea14df44f
                               at src/libstd/panicking.rs:210
  10:     0x7fd164ddcf63 - rustc_driver::report_ice::hce2a6b74528a3743
  11:     0x7fd1648a7b1c - std::panicking::rust_panic_with_hook::h7c6406c2637b219f
                               at src/libstd/panicking.rs:477
  12:     0x7fd1648a75d2 - std::panicking::continue_panic_fmt::h8e5e175fd262b206
                               at src/libstd/panicking.rs:380
  13:     0x7fd1648a74c6 - rust_begin_unwind
                               at src/libstd/panicking.rs:307
  14:     0x7fd1648d4ada - core::panicking::panic_fmt::h864c751c34920017
                               at src/libcore/panicking.rs:85
  15:     0x7fd1648d5216 - core::slice::slice_index_len_fail::h5579666bd5db7c44
                               at src/libcore/slice/mod.rs:2583
  16:     0x7fd166856244 - <measureme::mmap_serialization_sink::MmapSerializationSink as core::ops::drop::Drop>::drop::hfdb5ca3a69a780c5
  17:     0x7fd164de1948 - alloc::sync::Arc<T>::drop_slow::h1e9b9b3bcf38be1c
  18:     0x7fd164de1c3e - alloc::sync::Arc<T>::drop_slow::h2d5c08f8d08065e0
  19:     0x7fd164df4f15 - <alloc::rc::Rc<T> as core::ops::drop::Drop>::drop::h226d778706bf29d2
  20:     0x7fd164db01bc - core::ptr::real_drop_in_place::h78ba5d2f1d1836e8
  21:     0x7fd164daa59c - rustc_interface::interface::run_compiler_in_existing_thread_pool::hf9ebc0f8adadba07
[...]

@rust-timer
Collaborator

Finished benchmarking try commit 64687bb, comparison URL.

@Mark-Simulacrum
Member Author

Failures are unrelated to this PR; they're due to newly introduced self-profile functionality for rustc.

@Mark-Simulacrum
Member Author

Looks like ~no difference, which is good -- more general code is probably better. This would more likely have made a difference if const generics were used in the set of crate benchmarks.

@oli-obk I believe this should be ready to merge.

@oli-obk
Contributor

oli-obk commented Sep 23, 2019

@bors r+

@bors
Contributor

bors commented Sep 23, 2019

📌 Commit 9c62117 has been approved by oli-obk

@bors bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 23, 2019
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 23, 2019
@Centril
Contributor

Centril commented Sep 23, 2019

@bors rollup=never

@Centril
Contributor

Centril commented Sep 28, 2019

@bors p=3

@bors
Contributor

bors commented Sep 29, 2019

⌛ Testing commit 9c62117 with merge eea282a8233db92be72b4de43c575023a00f07f8...

@rust-highfive
Collaborator

The job x86_64-gnu-debug of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2019-09-29T13:50:20.2613831Z == clock drift check ==
2019-09-29T13:50:20.2613979Z   local time: Sun Sep 29 13:50:19 UTC 2019
2019-09-29T13:50:20.2614140Z   network time: Sun, 29 Sep 2019 13:50:19 GMT
2019-09-29T13:50:20.2614297Z == end clock drift check ==
2019-09-29T13:50:20.7406115Z ##[error]Bash exited with code '1'.
2019-09-29T13:50:20.7458169Z ##[section]Starting: Upload CPU usage statistics
2019-09-29T13:50:20.7461403Z ==============================================================================
2019-09-29T13:50:20.7461479Z Task         : Bash
2019-09-29T13:50:20.7461552Z Description  : Run a Bash script on macOS, Linux, or Windows

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bors
Contributor

bors commented Sep 29, 2019

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Sep 29, 2019
@Mark-Simulacrum
Member Author

@bors r=oli-obk

@bors
Contributor

bors commented Sep 29, 2019

📌 Commit 06c6e75 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 29, 2019
@bors
Contributor

bors commented Sep 29, 2019

⌛ Testing commit 06c6e75 with merge d16ee89...

bors added a commit that referenced this pull request Sep 29, 2019
Optimize try_eval_bits to avoid layout queries
@bors
Contributor

bors commented Sep 30, 2019

☀️ Test successful - checks-azure
Approved by: oli-obk
Pushing d16ee89 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 30, 2019
@bors bors merged commit 06c6e75 into rust-lang:master Sep 30, 2019
nnethercote added a commit to nnethercote/rust that referenced this pull request Oct 4, 2019
The `if let Some(val) = value.try_eval_bits(...)` branch in `from_const()` is
very hot for the `unicode_normalization` benchmark.

This commit introduces a special-case alternative for scalars that avoids
`try_eval_bits()` and all the functions it calls (`Const::eval()`,
`ConstValue::try_to_bits()`, `ConstValue::try_to_scalar()`, and
`Scalar::to_bits()`), instead extracting the result immediately.

The type and value checking done by `Scalar::to_bits()` is replicated by moving
it into a new function `Scalar::check_raw()` and using that new function in the
special case.

PR rust-lang#64673 introduced some special-case handling of scalar types in
`Const::try_eval_bits()`. This handling is now moved out of that function into
the new `IntRange::integral_size_and_signed_bias` function.

This commit reduces the instruction count for
`unicode_normalization-check-clean` by about 10%.
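
The commit message above names `IntRange::integral_size_and_signed_bias` without showing it. As a rough illustration only, the standalone sketch below shows one common way a "size plus signed bias" pair can be used: XOR-ing the raw bits of a signed value with the sign bit makes plain unsigned comparison agree with signed ordering. `IntegralTy` and `to_ordered_bits` are hypothetical names, the function here is a free function over a toy enum, and the sketch is an assumption about the general technique, not a copy of the rustc implementation.

```rust
/// Hypothetical miniature of the types the special case would cover.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IntegralTy {
    Char,
    Int { bits: u32 },
    Uint { bits: u32 },
    /// Any type the fast path does not handle.
    Other,
}

/// Returns `(size in bits, bias)` for integral types, `None` otherwise.
/// For signed integers the bias is the sign bit; for the rest it is zero.
pub fn integral_size_and_signed_bias(ty: IntegralTy) -> Option<(u32, u128)> {
    match ty {
        IntegralTy::Char => Some((32, 0)),
        IntegralTy::Int { bits } => Some((bits, 1u128 << (bits - 1))),
        IntegralTy::Uint { bits } => Some((bits, 0)),
        IntegralTy::Other => None,
    }
}

/// Maps the raw bit pattern of a value to an encoding whose unsigned
/// order matches the natural order of the original type.
pub fn to_ordered_bits(raw_bits: u128, ty: IntegralTy) -> Option<u128> {
    let (_size, bias) = integral_size_and_signed_bias(ty)?;
    Some(raw_bits ^ bias)
}
```

For an 8-bit signed integer the bias is `0x80`, so the raw pattern `0xFF` (that is, `-1`) maps to `127` and `0x00` maps to `128`, which preserves `-1 < 0` under unsigned comparison.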
@Mark-Simulacrum Mark-Simulacrum deleted the opt-match-ck branch October 8, 2019 21:52