Use ZST for fmt unsafety #89139

camsteffen · 2021-09-21T06:03:04Z

as suggested here - #83302 (comment).

rust-highfive · 2021-09-21T06:03:07Z

Some changes occurred in src/tools/clippy.

cc @rust-lang/clippy

rust-highfive · 2021-09-21T06:03:08Z

r? @scottmcm

(rust-highfive has picked a reviewer for you, use r? to override)

camsteffen · 2021-09-21T06:03:14Z

r? @Mark-Simulacrum

camsteffen · 2021-09-21T06:03:36Z

@bors try @rust-timer queue

rust-timer · 2021-09-21T06:03:37Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-09-21T06:03:43Z

⌛ Trying commit dbfddbabc542aa6700738dc908423dfa924589b9 with merge f6eea4b87f48729f8ba6d9bc07003e4763f92b75...

bors · 2021-09-21T07:27:54Z

☀️ Try build successful - checks-actions
Build commit: f6eea4b87f48729f8ba6d9bc07003e4763f92b75 (f6eea4b87f48729f8ba6d9bc07003e4763f92b75)

rust-timer · 2021-09-21T07:27:55Z

Queued f6eea4b87f48729f8ba6d9bc07003e4763f92b75 with parent e7958d3, future comparison URL.

rust-timer · 2021-09-21T12:44:31Z

Finished benchmarking commit (f6eea4b87f48729f8ba6d9bc07003e4763f92b75): comparison url.

Summary: This change led to moderate relevant improvements 🎉 in compiler performance.

Moderate improvement in instruction counts (up to -1.5% on full builds of cranelift-codegen)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

library/core/src/fmt/mod.rs

Mark-Simulacrum · 2021-09-21T13:35:44Z

r=me with an expanded comment on the UnsafeArg constructor. I think we should basically say something like "must be used for at most one call to new_v1_formatted, and created only if the other arguments meet the preconditions", with ideally those preconditions documented somewhere but for now we can just put in a FIXME for that.

This allows the format_args! macro to keep the pre-expansion code out of the unsafe block without doing gymnastics with nested `match` expressions. This reduces codegen.

camsteffen · 2021-09-21T15:05:18Z

@bors r=Mark-Simulacrum

bors · 2021-09-21T15:05:20Z

📌 Commit 09b37d7 has been approved by Mark-Simulacrum

rust-log-analyzer · 2021-09-22T17:27:15Z

The job x86_64-gnu-tools failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

  IMAGE: x86_64-gnu-tools
##[endgroup]
From https://github.com/rust-lang/rust
 * branch              master     -> FETCH_HEAD
Searching for toolstate changes between cfff31bc833070a00578bd6178160aeed56f28ba and 6ebf85007fc8e7fb4581c7f2fb201a6aaae6d937
Clippy or rustfmt subtrees were updated
##[group]Run src/ci/scripts/verify-channel.sh
src/ci/scripts/verify-channel.sh
shell: /bin/bash --noprofile --norc -e -o pipefail {0}
env:
---
......... (40/42)
..         (42/42)


An exception occured: Failed to launch the browser process!
Inconsistency detected by ld.so: dl-tls.c: 493: _dl_allocate_tls_init: Assertion `listp->slotinfo[cnt].gen <= GL(dl_tls_generation)' failed!


TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md
== STACKTRACE ==
Error
Error
    at innerRunTestCode (/node-v14.4.0-linux-x64/lib/node_modules/browser-ui-test/src/index.js:442:16)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)


command did not execute successfully: "/node-v14.4.0-linux-x64/bin/node" "/checkout/src/tools/rustdoc-gui/tester.js" "--jobs" "16" "--doc-folder" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/doc" "--tests-folder" "/checkout/src/test/rustdoc-gui"


Build completed unsuccessfully in 0:00:15

bors · 2021-09-23T02:10:29Z

⌛ Testing commit 2efa9d7 with merge 67365d6...

bors · 2021-09-23T04:51:06Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 67365d6 to master...

rust-timer · 2021-09-23T06:54:49Z

Finished benchmarking commit (67365d6): comparison url.

Summary: This change led to large relevant mixed results 🤷 in compiler performance.

Large improvement in instruction counts (up to -1.6% on full builds of cranelift-codegen)
Moderate regression in instruction counts (up to 0.7% on full builds of ctfe-stress-4)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

Mark-Simulacrum · 2021-09-25T22:49:15Z

Regressions in deeply-nested-async appear to be due to extra instructions in the type privacy visitor, presumably caused by the println! expanding to include an additional type after this PR. That then creates extra work for the visitor.

Regressions in ctfe-stress have no such obvious cause, and the test doesn't actually contain any println!s. The self-profile view shows this as increasing the number of specializes queries, which is somewhat surprising; it looks like all specializes queries on this benchmark are coming from typeck > evaluation_obligation > specializes based on self-profile results.

My best guess is that the code here is the cause of the regression: presumably, we're trying to eliminate a newly added impl when checking some ImplCandidate. I compiled locally with before/after and -Zself-profile-events=all to get the inputs to both queries which produces some pretty weird results.

For the record, this was done with rust-lang/rustc-perf#1031 and then using:

./target/release/collector diff_local self-profile +30278d3cf92b581550933546370443a5d5700002 +67365d64bcdfeae1334bf2ff49587c27d1c973f0 --include ctfe-stress-4 --builds Check --runs Full
mmview ./results/Zsp-30278d3cf92b581550933546370443a5d5700002-ctfe-stress-4-Check-Full/Zsp.mm_profdata | rg 'label: specializes' -A1 | rg additional_data > before
mmview ./results/Zsp-67365d64bcdfeae1334bf2ff49587c27d1c973f0-ctfe-stress-4-Check-Full/Zsp.mm_profdata | rg 'label: specializes' -A1 | rg additional_data > after
diff -u <(sort ./base) <(sort ./after) > query.diff

I put up the diff here, but it shows some pretty weird inconsistencies that I wouldn't have expected from this PR -- my expectation would be that the specializes() query here is invoked deterministically, but that doesn't seem to be the case, or these indices are off. That said, all the changes in invocation are from core::ops::bit, which is not changed by this PR (right?) -- not even impls should be getting added in there.

There's some discussion on this particular topic on Zulip - https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/defpath.20printing

Mark-Simulacrum · 2021-09-25T23:00:26Z

Zulip discussion indicates that the shifts in indices are not unexpected. I still don't have a good explanation for why there's extra queries though -- that seems like it shouldn't happen for the changes in this PR, perhaps suggesting that trait resolution is somehow being affected here :/

library/core/src/fmt/mod.rs

Mark-Simulacrum · 2021-09-27T14:12:01Z

Okay, upon rebuilding before/after locally with debug logging enabled and some investigation, I'm pretty sure that the delta in query counts here was due to this landing before #89016 which fixed some "noise" in the iteration order here. That PR also reduces the amount of specializes queries (though it's not expected that this has any direct relationship to this PR); we are just seeing that there is now a hopefully more stable order.

Regardless, this source of noise is now understood and we can mark this regression as triaged; we don't expect any fixes to the ctfe-stress to be possible. The previous comment already triaged the await-call-tree regression as something we're also not expecting to fix and consider acceptable.

@rustbot label +perf-regression-triaged

(cc @lcnr in case you wanted some additional evidence for why that was important...)

Use ZST for fmt unsafety as suggested here - rust-lang#83302 (comment).

…-Simulacrum Add private arg to fmt::UnsafeArg As discussed [here](rust-lang#89139 (comment)) r? `@Mark-Simulacrum`

rust-highfive assigned scottmcm Sep 21, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 21, 2021

rust-highfive assigned Mark-Simulacrum and unassigned scottmcm Sep 21, 2021

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 21, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 21, 2021

Mark-Simulacrum reviewed Sep 21, 2021

View reviewed changes

library/core/src/fmt/mod.rs Show resolved Hide resolved

camsteffen commented Sep 21, 2021

View reviewed changes

library/core/src/fmt/mod.rs Show resolved Hide resolved

camsteffen force-pushed the write-perf branch from dbfddba to 45a3f49 Compare September 21, 2021 15:03

Use ZST for fmt unsafety

09b37d7

This allows the format_args! macro to keep the pre-expansion code out of the unsafe block without doing gymnastics with nested `match` expressions. This reduces codegen.

camsteffen force-pushed the write-perf branch from 45a3f49 to 09b37d7 Compare September 21, 2021 15:04

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 21, 2021

the8472 added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 21, 2021

camsteffen mentioned this pull request Sep 21, 2021

Use UnsafeArg in Arguments::new_v1 #89155

Closed

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 22, 2021

camsteffen closed this Sep 22, 2021

camsteffen reopened this Sep 22, 2021

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 23, 2021

bors merged commit 67365d6 into rust-lang:master Sep 23, 2021

rustbot added this to the 1.57.0 milestone Sep 23, 2021

camsteffen deleted the write-perf branch September 23, 2021 04:58

rustbot added the perf-regression Performance regression. label Sep 23, 2021

cjgillot reviewed Sep 25, 2021

View reviewed changes

library/core/src/fmt/mod.rs Show resolved Hide resolved

rustbot added the perf-regression-triaged The performance regression has been triaged. label Sep 27, 2021

flip1995 pushed a commit to flip1995/rust that referenced this pull request Sep 28, 2021

Auto merge of rust-lang#89139 - camsteffen:write-perf, r=Mark-Simulacrum

edaeacf

Use ZST for fmt unsafety as suggested here - rust-lang#83302 (comment).

camsteffen mentioned this pull request Sep 30, 2021

Add private arg to fmt::UnsafeArg #89403

Merged

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 1, 2021

Auto merge of rust-lang#89403 - camsteffen:fmt-unsafe-private, r=Mark…

ed93759

…-Simulacrum Add private arg to fmt::UnsafeArg As discussed [here](rust-lang#89139 (comment)) r? `@Mark-Simulacrum`

Mark-Simulacrum mentioned this pull request Oct 4, 2021

Get piece unchecked in write #83302

Merged

camsteffen mentioned this pull request Oct 4, 2021

Perf experiment for "Get piece unchecked in write" #89521

Closed

Mark-Simulacrum mentioned this pull request Oct 13, 2021

Add a strange test for unsafe_code lint. #89821

Merged

davidlattimore mentioned this pull request Jul 8, 2022

Remove unnecessary unsafe from format_args expansion rust-lang/rust-analyzer#12719

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use ZST for fmt unsafety #89139

Use ZST for fmt unsafety #89139

camsteffen commented Sep 21, 2021

rust-highfive commented Sep 21, 2021

rust-highfive commented Sep 21, 2021

camsteffen commented Sep 21, 2021

camsteffen commented Sep 21, 2021

rust-timer commented Sep 21, 2021

bors commented Sep 21, 2021

bors commented Sep 21, 2021

rust-timer commented Sep 21, 2021

rust-timer commented Sep 21, 2021

Mark-Simulacrum commented Sep 21, 2021

camsteffen commented Sep 21, 2021

bors commented Sep 21, 2021

rust-log-analyzer commented Sep 22, 2021

bors commented Sep 23, 2021

bors commented Sep 23, 2021

rust-timer commented Sep 23, 2021

Mark-Simulacrum commented Sep 25, 2021

Mark-Simulacrum commented Sep 25, 2021

Mark-Simulacrum commented Sep 27, 2021

Use ZST for fmt unsafety #89139

Use ZST for fmt unsafety #89139

Conversation

camsteffen commented Sep 21, 2021

rust-highfive commented Sep 21, 2021

rust-highfive commented Sep 21, 2021

camsteffen commented Sep 21, 2021

camsteffen commented Sep 21, 2021

rust-timer commented Sep 21, 2021

bors commented Sep 21, 2021

bors commented Sep 21, 2021

rust-timer commented Sep 21, 2021

rust-timer commented Sep 21, 2021

Mark-Simulacrum commented Sep 21, 2021

camsteffen commented Sep 21, 2021

bors commented Sep 21, 2021

rust-log-analyzer commented Sep 22, 2021

bors commented Sep 23, 2021

bors commented Sep 23, 2021

rust-timer commented Sep 23, 2021

Mark-Simulacrum commented Sep 25, 2021

Mark-Simulacrum commented Sep 25, 2021

Mark-Simulacrum commented Sep 27, 2021