Rollup of 9 pull requests #94333

Dylan-DPC · 2022-02-24T20:42:20Z

Successful merges:

resolve/metadata: Stop encoding macros as reexports #91795
better ObligationCause for normalization errors in can_type_implement_copy #93714
Improve --check-cfg implementation #94175
Stop manually SIMDing in swap_nonoverlapping #94212
properly handle fat pointers to uninhabitable types #94242
Normalize main return type during mono item collection & codegen #94308
update auto trait lint for PhantomData #94315
Improve string literal unescaping #94316
Avoid emitting full macro body into JSON errors #94327

Failed merges:

r? @ghost
@rustbot modify labels: rollup

Like I previously did for `reverse`, this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have. It does still need logic to type-erase where appropriate, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`. As a bonus, this also means one no longer gets the spurious `memcpy`(s?) at the end of swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>

`scan_escape` currently has a fast path (for when the first char isn't '\\') and a slow path. This commit changes `scan_escape` so it only handles the slow path, i.e. the actual escaping code. The fast path is inlined into the two call sites. This change makes the code faster, because there is no function call overhead on the fast path. (`scan_escape` is a big function and doesn't get inlined.) This change also improves readability, because it removes a bunch of mode checks on the fast path.
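The shape of that change can be sketched roughly as follows. This is a minimal illustration, not the code from `rustc_lexer`: the names `scan_escape_slow` and `unescape` are hypothetical, and only a handful of single-character escapes are handled. The point is only that the caller tests for the backslash itself and calls the out-of-line function for real escapes alone.

```rust
use std::str::Chars;

/// Slow path (hypothetical, simplified): decode the escape that follows a
/// backslash. The backslash itself has already been consumed by the caller.
fn scan_escape_slow(chars: &mut Chars<'_>) -> Result<char, String> {
    match chars.next() {
        Some('n') => Ok('\n'),
        Some('t') => Ok('\t'),
        Some('\\') => Ok('\\'),
        Some('"') => Ok('"'),
        Some(other) => Err(format!("unknown escape: \\{}", other)),
        None => Err("lone backslash at end of input".to_string()),
    }
}

/// Call site with the fast path written inline: ordinary characters are
/// copied without any function call, and only a backslash reaches the
/// slow path.
fn unescape(src: &str) -> Result<String, String> {
    let mut out = String::with_capacity(src.len());
    let mut chars = src.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Fast path: nothing to decode.
            out.push(c);
        } else {
            // Slow path: the actual escaping code.
            out.push(scan_escape_slow(&mut chars)?);
        }
    }
    Ok(out)
}
```

With this split, `unescape("a\\nb")` yields the three characters `a`, newline, `b`, while an input with no backslashes never leaves the loop body.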

resolve/metadata: Stop encoding macros as reexports

Supersedes rust-lang#88335.

r? `@cjgillot`

…error-span, r=jackh726

better ObligationCause for normalization errors in `can_type_implement_copy`

Some logic is needed so we can point to the field when given totally nonsense types like `struct Foo(<u32 as Iterator>::Item);`

Fixes rust-lang#93687

…rochenkov

Improve `--check-cfg` implementation

This pull-request is a mix of improvements regarding the `--check-cfg` implementation:

- Simpler internal representation (usage of `Option` instead of separate bool)
- Add --check-cfg to the unstable book (based on the RFC)
- Improved diagnostics:
  * List possible values when the value is unexpected
  * Suggest if possible a name or value that is similar
- Add more tests (well known names, mix of combinations, ...)

r? `@petrochenkov`
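To make the first and third bullets concrete, here is a minimal sketch of an `Option`-based table. Everything in it is an assumption made for illustration: the `ExpectedCfg` type, its fields, and the message wording are hypothetical and are not rustc's actual `CheckCfg` type or diagnostics. It shows how `None` can stand for "not checked" without a separate bool, and how the expected values can be listed when a value is unexpected.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, simplified stand-in for a check-cfg table.
struct ExpectedCfg {
    /// `None`: condition names are not checked at all.
    /// `Some(set)`: only the names in `set` are expected.
    names: Option<BTreeSet<String>>,
    /// Expected values per name; a name absent from the map has
    /// unchecked values.
    values: HashMap<String, BTreeSet<String>>,
}

impl ExpectedCfg {
    /// Returns a message if `name = "value"` is unexpected, else `None`.
    fn check(&self, name: &str, value: &str) -> Option<String> {
        if let Some(names) = &self.names {
            if !names.contains(name) {
                return Some(format!("unexpected condition name `{}`", name));
            }
        }
        // Values for this name are unchecked: nothing to report.
        let expected = self.values.get(name)?;
        if expected.contains(value) {
            return None;
        }
        // List the possible values, in sorted order, in the message.
        let listed: Vec<&str> = expected.iter().map(String::as_str).collect();
        Some(format!(
            "unexpected value `{}` for `{}`; expected values: {}",
            value,
            name,
            listed.join(", ")
        ))
    }
}
```

Folding the "is checking enabled" state into the `Option` means there is no way to hold a populated set alongside a disabled flag, which is presumably part of what makes the representation simpler.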

Stop manually SIMDing in `swap_nonoverlapping`

Like I previously did for `reverse` (rust-lang#90821), this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have.

A variety of codegen tests are included to confirm that the various cases are still being vectorized.

It does still need logic to type-erase in some cases, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.

As a bonus, this change also means one no longer gets the spurious `memcpy`(s?) at the end of swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>

<details>
<summary>ASM for this example</summary>

## Before (from godbolt)

note the `push`/`pop`s and `memcpy`

```x86
swap_m256_slice:
        push    r15
        push    r14
        push    r13
        push    r12
        push    rbx
        sub     rsp, 32
        cmp     rsi, rcx
        jne     .LBB0_6
        mov     r14, rsi
        shl     r14, 5
        je      .LBB0_6
        mov     r15, rdx
        mov     rbx, rdi
        xor     eax, eax
.LBB0_3:
        mov     rcx, rax
        vmovaps ymm0, ymmword ptr [rbx + rax]
        vmovaps ymm1, ymmword ptr [r15 + rax]
        vmovaps ymmword ptr [rbx + rax], ymm1
        vmovaps ymmword ptr [r15 + rax], ymm0
        add     rax, 32
        add     rcx, 64
        cmp     rcx, r14
        jbe     .LBB0_3
        sub     r14, rax
        jbe     .LBB0_6
        add     rbx, rax
        add     r15, rax
        mov     r12, rsp
        mov     r13, qword ptr [rip + memcpy@GOTPCREL]
        mov     rdi, r12
        mov     rsi, rbx
        mov     rdx, r14
        vzeroupper
        call    r13
        mov     rdi, rbx
        mov     rsi, r15
        mov     rdx, r14
        call    r13
        mov     rdi, r15
        mov     rsi, r12
        mov     rdx, r14
        call    r13
.LBB0_6:
        add     rsp, 32
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        vzeroupper
        ret
```

## After (from my machine)

Note no `rsp` manipulation, sorry for different ASM syntax

```x86
swap_m256_slice:
        cmpq    %r9, %rdx
        jne     .LBB1_6
        testq   %rdx, %rdx
        je      .LBB1_6
        cmpq    $1, %rdx
        jne     .LBB1_7
        xorl    %r10d, %r10d
        jmp     .LBB1_4
.LBB1_7:
        movq    %rdx, %r9
        andq    $-2, %r9
        movl    $32, %eax
        xorl    %r10d, %r10d
        .p2align 4, 0x90
.LBB1_8:
        vmovaps -32(%rcx,%rax), %ymm0
        vmovaps -32(%r8,%rax), %ymm1
        vmovaps %ymm1, -32(%rcx,%rax)
        vmovaps %ymm0, -32(%r8,%rax)
        vmovaps (%rcx,%rax), %ymm0
        vmovaps (%r8,%rax), %ymm1
        vmovaps %ymm1, (%rcx,%rax)
        vmovaps %ymm0, (%r8,%rax)
        addq    $2, %r10
        addq    $64, %rax
        cmpq    %r10, %r9
        jne     .LBB1_8
.LBB1_4:
        testb   $1, %dl
        je      .LBB1_6
        shlq    $5, %r10
        vmovaps (%rcx,%r10), %ymm0
        vmovaps (%r8,%r10), %ymm1
        vmovaps %ymm1, (%rcx,%r10)
        vmovaps %ymm0, (%r8,%r10)
.LBB1_6:
        vzeroupper
        retq
```

</details>

This does all its copying operations as either the original type or as `MaybeUninit`s, so as far as I know there should be no potential abstract machine issues with reading padding bytes as integers.

<details>
<summary>Perf is essentially unchanged</summary>

Though perhaps with more target features this would help more, if it could pick bigger chunks

## Before

```
running 10 tests
test slice::swap_with_slice_4x_usize_30 ... bench: 894 ns/iter (+/- 11)
test slice::swap_with_slice_4x_usize_3000 ... bench: 99,476 ns/iter (+/- 2,784)
test slice::swap_with_slice_5x_usize_30 ... bench: 1,257 ns/iter (+/- 7)
test slice::swap_with_slice_5x_usize_3000 ... bench: 139,922 ns/iter (+/- 959)
test slice::swap_with_slice_rgb_30 ... bench: 328 ns/iter (+/- 27)
test slice::swap_with_slice_rgb_3000 ... bench: 16,215 ns/iter (+/- 176)
test slice::swap_with_slice_u8_30 ... bench: 312 ns/iter (+/- 9)
test slice::swap_with_slice_u8_3000 ... bench: 5,401 ns/iter (+/- 123)
test slice::swap_with_slice_usize_30 ... bench: 368 ns/iter (+/- 3)
test slice::swap_with_slice_usize_3000 ... bench: 28,472 ns/iter (+/- 3,913)
```

## After

```
running 10 tests
test slice::swap_with_slice_4x_usize_30 ... bench: 868 ns/iter (+/- 36)
test slice::swap_with_slice_4x_usize_3000 ... bench: 99,642 ns/iter (+/- 1,507)
test slice::swap_with_slice_5x_usize_30 ... bench: 1,194 ns/iter (+/- 11)
test slice::swap_with_slice_5x_usize_3000 ... bench: 139,761 ns/iter (+/- 5,018)
test slice::swap_with_slice_rgb_30 ... bench: 324 ns/iter (+/- 6)
test slice::swap_with_slice_rgb_3000 ... bench: 15,962 ns/iter (+/- 287)
test slice::swap_with_slice_u8_30 ... bench: 281 ns/iter (+/- 5)
test slice::swap_with_slice_u8_3000 ... bench: 5,324 ns/iter (+/- 40)
test slice::swap_with_slice_usize_30 ... bench: 275 ns/iter (+/- 5)
test slice::swap_with_slice_usize_3000 ... bench: 28,277 ns/iter (+/- 277)
```

</details>
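For readers who want the idea in source form, the following is a rough sketch of "a plain loop that the optimizer is free to vectorize". It is not the standard library's implementation: `swap_nonoverlapping_simple` and `swap_slices` are hypothetical names, and the type-erasing logic the description mentions for cases like `[u8; 3]` is left out. The elements are moved as `MaybeUninit<T>`, so padding bytes are never read as integers. Whether the loop actually vectorizes depends on the element type and the target.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical sketch: swap `count` elements one by one, with no manual
/// chunking, leaving the choice of vector width to the optimizer.
///
/// # Safety
/// `x` and `y` must each be valid for reads and writes of `count`
/// elements, properly aligned, and the two regions must not overlap.
unsafe fn swap_nonoverlapping_simple<T>(x: *mut T, y: *mut T, count: usize) {
    // Copy as `MaybeUninit<T>` so padding is moved without being inspected.
    let x = x.cast::<MaybeUninit<T>>();
    let y = y.cast::<MaybeUninit<T>>();
    for i in 0..count {
        // SAFETY: the caller guarantees both regions are valid for
        // `count` elements and do not overlap.
        unsafe {
            let a = ptr::read(x.add(i));
            let b = ptr::read(y.add(i));
            ptr::write(x.add(i), b);
            ptr::write(y.add(i), a);
        }
    }
}

/// Safe wrapper: two distinct `&mut` slices cannot overlap.
fn swap_slices<T>(a: &mut [T], b: &mut [T]) {
    assert_eq!(a.len(), b.len(), "slices must have the same length");
    // SAFETY: both pointers come from live, disjoint mutable slices of
    // the same length.
    unsafe { swap_nonoverlapping_simple(a.as_mut_ptr(), b.as_mut_ptr(), a.len()) }
}
```

Swapping two `[[u8; 3]; 2]` arrays with `swap_slices` exchanges their contents element by element. In real code, `<[T]>::swap_with_slice` or `ptr::swap_nonoverlapping` is the tool to reach for.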

…ointer, r=michaelwoerister

properly handle fat pointers to uninhabitable types

Calculate the pointee metadata size by using `tcx.struct_tail_erasing_lifetimes` instead of duplicating the logic in `fat_pointer_kind`. Open to alternative suggestions on how to fix this.

Fixes rust-lang#94149

r? `@michaelwoerister` since you touched this code last, I think!

…i-obk

Normalize main return type during mono item collection & codegen

The issue can be observed with `-Zprint-mono-items=lazy` in:

```rust
#![feature(termination_trait_lib)]

fn main() -> impl std::process::Termination { }
```

```
BEFORE: MONO_ITEM fn std::rt::lang_start::<impl std::process::Termination> @@ t.93933fa2-cgu.2[External]
AFTER: MONO_ITEM fn std::rt::lang_start::<()> @@ t.df56e625-cgu.1[External]
```
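The same distinction between an opaque `impl Trait` signature and the concrete type behind it can be observed from ordinary code. The sketch below is only an analogy on stable Rust, not the compiler change itself: `opaque_unit` and `name_of` are hypothetical helpers, and `std::any::type_name` is documented as a best-effort diagnostic string whose exact output is not guaranteed across compiler versions.

```rust
use std::any::type_name;
use std::fmt::Debug;

/// Hypothetical helper: the signature hides the return type behind
/// `impl Debug`, but the body returns the unit value `()`.
fn opaque_unit() -> impl Debug {}

/// Reports the name of the type this function was instantiated with.
fn name_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}
```

By the time `name_of` is instantiated, the opaque type has been resolved, so `name_of(&opaque_unit())` is expected to report `()` rather than `impl Debug`. That is the same resolution the `AFTER` line shows for `lang_start::<()>`.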

update auto trait lint for `PhantomData`

cc rust-lang#93367 (comment)

…unescaping, r=petrochenkov

Improve string literal unescaping

Some easy wins that affect a few popular crates.

r? `@matklad`

…etrochenkov

Avoid emitting full macro body into JSON errors

While investigating rust-lang#94322, it was noted that currently the JSON diagnostics for macro backtraces include the full def_site span -- the whole macro body.

It seems like this shouldn't be necessary, so this PR adjusts the span to just be the "guessed head", typically the macro name. It doesn't look like we keep enough information to synthesize a nicer span here at this time.

Atop rust-lang#92123, this reduces output for the src/test/ui/suggestions/missing-lifetime-specifier.rs test from 660 KB to 156 KB locally.

Dylan-DPC · 2022-02-24T20:44:03Z

@bors r+ rollup=never p=5

bors · 2022-02-24T20:44:04Z

📌 Commit 3bd163f has been approved by Dylan-DPC

bors · 2022-02-24T22:29:17Z

⌛ Testing commit 3bd163f with merge 4e82f35...

bors · 2022-02-25T00:45:59Z

☀️ Test successful - checks-actions
Approved by: Dylan-DPC
Pushing 4e82f35 to master...

rust-timer · 2022-02-25T02:17:22Z

Finished benchmarking commit (4e82f35): comparison url.

Summary: This benchmark run shows 14 relevant improvements 🎉 but 9 relevant regressions 😿 to instruction counts.

Arithmetic mean of relevant regressions: 0.8%
Arithmetic mean of relevant improvements: -5.9%
Arithmetic mean of all relevant changes: -3.3%
Largest improvement in instruction counts: -14.4% on incr-unchanged builds of encoding check
Largest regression in instruction counts: 1.1% on incr-patched: println builds of cargo opt

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

RalfJung · 2022-02-25T18:06:31Z

This rollup also had surprising consequences in Miri: starting with this PR, Miri complains about UB in the libtest harness (see rust-lang/miri#1986).

EDIT: The culprit appears to be #94212

scottmcm and others added 27 commits February 21, 2022 00:54

Improve CheckCfg internal representation

da896d3

Add test for well known names defined by rustc

fbe1c15

Improve diagnostic of the unexpected_cfgs lint

3d23477

Continue improvements on the --check-cfg implementation

8d3de56

- Test the combinations of --check-cfg with partial values() and --cfg
- Test that we detect unexpected value when none are expected

Add compiler flag --check-cfg to the unstable book

a556a2a

properly handle fat pointers to uninhabitable types

c73a2f8

Normalize main return type during mono item collection & codegen

f047af2

Inline a hot closure in from_lit_token.

44308dc

The change looks big because `rustfmt` rearranges things, but the only real change is the inlining annotation.

update auto trait lint

70018c1

Avoid emitting full macro body into JSON

34319ff

better ObligationCause for normalization errors in can_type_implement_copy

8ba7436

restore spans for issue-50480

ee98dc8

resolve: Fix incorrect results of opt_def_kind query for some built-in macros

17b1afd

Previously it always returned `MacroKind::Bang` while some of those macros are actually attributes and derives

metadata: Tweak the way in which declarative macros are encoded

50568b8

To make the `macro_rules` flag more readily available without decoding everything else

resolve/metadata: Stop encoding macros as reexports

179ce18

Update clippy tests

b91ec30

Rollup merge of rust-lang#91795 - petrochenkov:nomacreexport, r=cjgillot

6ba167a

resolve/metadata: Stop encoding macros as reexports

Supersedes rust-lang#88335.

r? `@cjgillot`

Rollup merge of rust-lang#94315 - lcnr:auto-trait-lint-update, r=oli-obk

9e7131a

update auto trait lint for `PhantomData`

cc rust-lang#93367 (comment)

Rollup merge of rust-lang#94316 - nnethercote:improve-string-literal-unescaping, r=petrochenkov

ec44d48

Improve string literal unescaping

Some easy wins that affect a few popular crates.

r? `@matklad`

rustbot added the T-compiler, T-rustdoc, and rollup labels Feb 24, 2022

bors added the S-waiting-on-bors label Feb 24, 2022

bors added the merged-by-bors label Feb 25, 2022

bors merged commit 4e82f35 into rust-lang:master Feb 25, 2022

rustbot added this to the 1.61.0 milestone Feb 25, 2022

This was referenced Feb 25, 2022

Make AST->HIR lowering incremental #88186

Closed

Add debug assertions to some unsafe functions #92686

Merged

Include source file hash in crate_hash. #94301

Closed

Dylan-DPC deleted the rollup-7yxtywp branch February 25, 2022 01:59

rustbot added the perf-regression label Feb 25, 2022

RalfJung mentioned this pull request Feb 25, 2022

rustup rust-lang/miri#1986

Closed
