Tracking Issue for allowing zero-sized memory accesses and offsets #117945

Closed
5 of 6 tasks
Tracked by #671
RalfJung opened this issue Nov 15, 2023 · 11 comments · Fixed by rust-lang/reference#1541
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature.

Comments

@RalfJung
Member

RalfJung commented Nov 15, 2023

This issue tracks implementing the t-opsem decision in rust-lang/unsafe-code-guidelines#472. This will require adjustments in many places (codegen, Miri, library docs, reference, ...). The intention is to track here what needs to be done until the transition is complete.

Implementation history

RalfJung added the C-tracking-issue label Nov 15, 2023
@bjorn3
Member

bjorn3 commented Nov 16, 2023

cg_clif accepts ZST memory accesses and pointer offsets already. Pointer offsets are implemented as integer addition which doesn't have UB and ZST memory accesses never get turned into loads and stores in cranelift ir as there is no instruction that does so.

@RalfJung
Member Author

> memory accesses never get turned into loads and stores in cranelift ir as there is no instruction that does so.

Besides direct accesses, the other concerns are the copy, write_bytes, compare_bytes intrinsics. Those must be implemented in a way that they are not UB when elem_count*elem_size is 0.

@bjorn3
Member

bjorn3 commented Nov 16, 2023

They are implemented by calling the respective libc functions which LLVM already expects to accept 0-sized accesses, right?

@RalfJung
Member Author

GCC codegen might also need updating, Cc @antoyo @GuillaumeGomez

@RalfJung
Member Author

> They are implemented by calling the respective libc functions which LLVM already expects to accept 0-sized accesses, right?

Well what LLVM assumes doesn't matter for the cranelift backend, does it? ;) But more importantly, Rust explicitly assumes this itself as documented here.

@GuillaumeGomez
Member

No problem. Please ping us when we need to update our part and thanks for the ping!

@RalfJung
Member Author

Well I'm asking you if you need to update anything. :) You need to make sure that the Offset MIR binop is compiled in a way that offset by 0 bytes is always Defined Behavior even if the pointer operand is null or dangling or out of bounds or whatever.
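
As a rough illustration of the requirement above (a sketch, not code from any backend), the helper below offsets an arbitrary pointer by zero elements. The name `offset_by_zero` is made up for this example, and the safety comment assumes a toolchain on which the t-opsem decision is already in effect.

```rust
/// Offsets `p` by zero elements and returns the result.
///
/// Assumption: zero-element offsets are Defined Behavior for every pointer
/// value (null, dangling, out of bounds), as decided in
/// rust-lang/unsafe-code-guidelines#472.
fn offset_by_zero<T>(p: *const T) -> *const T {
    // SAFETY (under the assumption above): the offset is 0 bytes, so no
    // allocation bounds are involved and the address cannot overflow.
    unsafe { p.add(0) }
}
```

Calling it with `std::ptr::null::<u32>()` or with a dangling pointer should simply hand back the same address.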

I think zero-sized memory accesses disappear in the SSA codegen infrastructure before your backend even sees them so they should be fine.

And finally the copy, copy_nonoverlapping, write_bytes, compare_bytes intrinsics need to be lowered in a way that they are Defined Behavior when the size is 0, even if the pointers are null or dangling or whatever.
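
A minimal sketch of what "size 0" means from the library side, using the stable `std::ptr` wrappers around these intrinsics. The function name `zero_sized_ops` is hypothetical. The example sticks to aligned, non-null pointers, which every current toolchain accepts; the null case is deliberately left out here.

```rust
use std::ptr;

/// Performs zero-length copies and a zero-length fill through `p`.
///
/// # Safety
/// `p` must be aligned for `u32`. It may be dangling, because no byte is
/// ever read or written when the count is 0.
unsafe fn zero_sized_ops(p: *mut u32) {
    unsafe {
        // memmove-like copy of 0 elements
        ptr::copy(p as *const u32, p, 0);
        // memcpy-like copy of 0 elements (empty ranges never overlap)
        ptr::copy_nonoverlapping(p as *const u32, p, 0);
        // memset-like fill of 0 elements
        ptr::write_bytes(p, 0xAB, 0);
    }
}
```

Passing `NonNull::<u32>::dangling().as_ptr()` exercises the dangling case; passing a pointer to a live `u32` leaves its value unchanged.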

@antoyo
Contributor

antoyo commented Nov 16, 2023

These intrinsics are implemented by calling the GCC builtin functions: memcmp, memset, memcpy, memmove.
I'll double-check, but it seems fine to have a count of zero, but not NULL pointers.

@RalfJung
Member Author

RalfJung commented Nov 16, 2023

Okay, something needs to change then in the backend because we'll allow null pointers for the Rust intrinsics.
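
One possible shape for such a change, sketched in plain Rust rather than in backend code: skip the call entirely when the length is zero, so possibly-null pointers never reach a `memcpy`-style routine. The helper `lowered_copy_nonoverlapping` is hypothetical, and `ptr::copy_nonoverlapping` merely stands in for the builtin the backend would call; the thread does not say how the GCC backend actually addressed this.

```rust
use std::ptr;

/// Hypothetical lowering helper: forwards to a memcpy-like routine only
/// when there is something to copy.
///
/// # Safety
/// If `len > 0`, `src` and `dst` must be valid for `len` bytes and must not
/// overlap. If `len == 0`, any pointer values are accepted.
unsafe fn lowered_copy_nonoverlapping(dst: *mut u8, src: *const u8, len: usize) {
    if len == 0 {
        // Nothing to copy: return before the pointers are used at all.
        return;
    }
    // Stand-in for the builtin call that requires non-null pointers.
    unsafe { ptr::copy_nonoverlapping(src, dst, len) }
}
```

With this guard, a call with two null pointers and a length of 0 is a no-op, while non-empty copies behave as before.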

@RalfJung
Member Author

RalfJung commented Feb 19, 2024

I updated #117329 to make it ready to land.

@rust-lang/opsem to address this concern, I made offset_from have library UB but not language UB on two pointers with the same address but provenance for different allocations. Please let me know what you think.
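
For readers unfamiliar with `offset_from`, here is a small sketch of the uncontroversial case: both pointers are derived from the same slice, including the situation where they share an address. The helper `distance` is made up for illustration. The case discussed above, equal addresses with provenance for different allocations, is intentionally not exercised.

```rust
/// Distance in elements from index `from` to index `to` within `slice`.
fn distance(slice: &[u16], from: usize, to: usize) -> isize {
    assert!(from <= slice.len() && to <= slice.len());
    let base = slice.as_ptr();
    // SAFETY: both pointers are derived from `slice` and are in bounds or
    // one past the end, so they belong to the same allocation.
    unsafe { base.add(to).offset_from(base.add(from)) }
}
```

For an empty slice, `distance(&[], 0, 0)` compares a pointer with itself, which is the "same address" case.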

joshlf added a commit to joshlf/rust that referenced this issue May 11, 2024
Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue May 12, 2024
Update reference safety requirements

Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language.

While we're here, we also require that references be non-null.

cc `@RalfJung`
rust-timer added a commit to rust-lang-ci/rust that referenced this issue May 12, 2024
Rollup merge of rust-lang#125021 - joshlf:patch-11, r=RalfJung

Update reference safety requirements

Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language.

While we're here, we also require that references be non-null.

cc ``@RalfJung``
Billy-Sheppard added a commit to Billy-Sheppard/rust that referenced this issue May 13, 2024
reorganised attrs

removed OsStr impls

added backticks

Add note about possible allocation-sharing to Arc/Rc<str/[T]/CStr>::default.

Use shared statics for the ArcInner for Arc<str, CStr>::default, and for Arc<[T]>::default where alignof(T) <= 16.

fixed unsafe block

Revert "fixed unsafe block"

This reverts commit 6eb6aee.

Return coherent description for boolean instead of panicking

Improve check-cfg CLI errors with more structured diagnostics

Move various stdlib tests to library/std/tests

Run tidy on tests

Rename test for issue 21058

Implement `edition` method on `Rustdoc` type as well

Migrate `run-make/doctests-runtool` to rmake

Rename `run-make-support` library `output` method to `command_output`

Add new `output` method to `Rustc` and `Rustdoc` types

Migrate `run-make/rustdoc-error-lines` to `rmake.rs`

add f16 associated constants

NaN and infinity are not included as they require arithmetic.

add f128 associated constants

NaN and infinity are not included as they require arithmetic.

add constants in std::f16::consts

add constants in std::f128::consts

update error messages in ui tests

Document that `create_dir_all` calls `mkdir`/`CreateDirW` multiple times

Also mention that there might be leftover directories in the error case.

Prefer lower vtable candidates in select in new solver

Don't consider candidates with no failing where clauses

Use super_fold in RegionsToStatic visitor

Make check-cfg docs more user-friendly

Record impl args in the `InspectCandidate` rather than rematching during select

Use correct ImplSource for alias bounds

BorrowckInferCtxt: infcx by value

borrowck: more eagerly prepopulate opaques

switch new solver to directly inject opaque types

Update books

Adjust dbg.value/dbg.declare checks for LLVM update

llvm/llvm-project#89799 changes llvm.dbg.value/declare intrinsics to be in a different, out-of-instruction-line representation. For example
  call void @llvm.dbg.declare(...)
becomes
  #dbg_declare(...)

Update tests accordingly to work with both the old and new way.

Adjust 64-bit ARM data layouts for LLVM update

LLVM has updated data layouts to specify `Fn32` on 64-bit ARM to avoid
C++ accidentally underaligning functions when trying to comply with
member function ABIs.

This should only affect Rust in cases where we had a similar bug (I
don't believe we have one), but our data layout must match to generate
code.

As a compatibility adaptation, if LLVM is not version 19 yet, `Fn32`
gets voided from the data layout.

See llvm/llvm-project#90415

Update version of cc crate to v1.0.97

Reason:

In order to build the Windows version of the Rust toolchain for the Android platform, the following patch to the cc crate is required to avoid incorrectly determining that we are building with the Android NDK: rust-lang/cc-rs@57853c4

This patch is present in version 1.0.80 and newer versions of the cc crate. The rustc source distribution currently has 3 different versions of cc in the vendor directory, only one of which has the necessary fix.

We (the Android Rust toolchain) are currently maintaining local patches to upgrade the cc crate dependency versions, which we would like to upstream.

Furthermore, beyond the specific reason, the cc crate in bootstrap is currently pinned at an old version due to problems in the past when trying to update it. It is worthwhile to figure out and resolve these problems so we can keep the dependency up-to-date.

Other fixes:

As of cc v1.0.78, object files are prefixed with a 16-character hash.
Update src/bootstrap/src/core/build_steps/llvm.rs to account for this to
avoid failures when building libunwind and libcrt. Note that while the hash
prefix was introduced in v1.0.78, in order to determine the names of the
object files without scanning the directory, we rely on the compile_intermediates
method, which was introduced in cc v1.0.86

As of cc v1.0.86, compilation on MacOS uses the -mmacosx-version-min flag.
A long-standing bug in the CMake rules for compiler-rt causes compilation
to fail when this flag is specified. So we add a workaround to suppress this
flag.

Updating to cc v1.0.91 and newer requires fixes to bootstrap unit tests.
The unit tests use targets named "A", "B", etc., which fail a validation
check introduced in 1.0.91 of the cc crate.

Implement lldb formatter for "clang encoded" enums (LLDB 18.1+)
Summary:
I landed a fix last year to enable `DW_TAG_variant_part` encoding in LLDBs (https://reviews.llvm.org/D149213). This PR is a corresponding fix in synthetic formatters to decode that information.
This is in no way a perfect implementation, but at least it improves the status quo. Most types of enums will be visible and debuggable in some way.
I've also updated most of the existing tests that touch enums and re-enabled test cases based on LLDB for enums.

Test Plan:
ran tests `./x test tests/debuginfo/`. Also tested manually in LLDB CLI and LLDB VSCode

Other Thoughts
A better approach would probably be adopting [formatters from codelldb](https://github.com/vadimcn/codelldb/blob/master/formatters/rust.py). There is some neat hack that hooks up summary provider via synthetic provider which can ultimately fix more display issues for Rust types and enums too. But getting it to work well might take more time than I have right now.

f16::is_sign_{positive,negative} were feature-gated on f128

Correct the const stabilization of `last_chunk` for slices

`<[T]>::last_chunk` should have become const stable as part of
<rust-lang#117561>. Update the const
stability gate to reflect this.

Add tests

Lower never patterns to Unreachable in mir

rustdoc: dedup search form HTML

This change constructs the search form HTML using JavaScript, instead of plain HTML. It uses a custom element because

- the [parser]'s insert algorithm runs the connected callback synchronously, so we won't get layout jank
- it requires very little HTML, so it's a real win in size

[parser]: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token

This shrinks the standard library by about 60MiB, by my test.

rustdoc: allow custom element rustdoc-search

generalize hr alias: avoid unconstrainable infer vars

narrow down visibilities in `rustc_parse::lexer`

replace another Option<Span> by DUMMY_SP

Don't ICE when we cannot eval a const to a valtree in the new solver

Do not ICE on `AnonConst`s in `diagnostic_hir_wf_check`

coverage: Add branch coverage support for let-else

coverage: Add branch coverage support for if-let and let-chains

Do not ICE on foreign malformed `diagnostic::on_unimplemented`

Fix rust-lang#124651.

Add test for rust-lang#124651

Update cargo

compiler: Privatize `Parser::current_closure`

This was added as pub in 2021 and remains only privately used in 2024!

compiler: derive Debug in parser

It's annoying to debug the parser if you have to stop every five seconds
to add a Debug impl.

compiler: add `Parser::debug_lookahead`

I tried debugging a parser-related issue but found it annoying to not be
able to easily peek into the Parser's token stream.

Add a convenience fn that offers an opinionated view into the parser,
but one that is useful for answering basic questions about parser state.

Fuchsia test runner: fixup script

This commit fixes several issues in the fuchsia-test-runner.py script:

1. Migrate from `pm` to `ffx` for package management, as `pm` is now
deprecated. Furthermore, the `pm` calls used in this script no longer
work at Fuchsia's HEAD. This is the largest change in this commit, and
impacts all steps around repository management (creation and
registration of the repo, as well as package publishing).

2. Allow for `libtest` to be either statically or dynamically linked.
The script assumed it was dynamically linked, but the current Rust
behavior at HEAD is to statically link it.

3. Minor cleanup to use `ffx --machine json` rather than string parsing.

4. Minor cleanup to the docs around the script.

std::net: Socket::new_raw set to SO_NOSIGPIPE on freebsd/netbsd/dragonfly.

add note about `AlreadyExists` to `create_new`

Apply suggestions from code review

Co-authored-by: Jubilee <[email protected]>

iOS/tvOS/watchOS/visionOS: Default to kernel-defined backlog in listen

This behavior is defined in general for the XNU kernel, not just macOS:
https://github.com/apple-oss-distributions/xnu/blob/rel/xnu-10002/bsd/kern/uipc_socket.c

iOS/tvOS/watchOS/visionOS: Set the main thread name

Tested in the iOS simulator that the thread name is not set by default,
and that setting it improves the debugging experience in lldb / Xcode.

iOS/tvOS/watchOS: Fix alloc w. large alignment on older versions

Tested on an old MacBook and the iOS simulator.

iOS/tvOS/watchOS/visionOS: Fix reading large files

Tested in the iOS simulator with something like:
```
let mut buf = vec![0; c_int::MAX as usize - 1 + 2];
let read_bytes = f.read(&mut buf).unwrap();
```

iOS/tvOS/watchOS/visionOS: Improve File Debug impl

This uses `libc::fcntl`, which, while not explicitly marked as available
in the headers, is already used by `File::sync_all` and `File::sync_data`
on these platforms, so should be fine to use here as well.

next_power_of_two: add a doctest to show what happens on 0
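
To make the zero case concrete, a tiny sketch built on the standard `checked_next_power_of_two` method, which reports overflow as `None`. The wrapper name `round_up_capacity` is invented for this example.

```rust
/// Rounds a requested capacity up to the next power of two.
/// A request of 0 rounds up to 1; requests too large to round up give `None`.
fn round_up_capacity(requested: usize) -> Option<usize> {
    requested.checked_next_power_of_two()
}
```

So `round_up_capacity(0)` gives `Some(1)`, `round_up_capacity(5)` gives `Some(8)`, and `round_up_capacity(usize::MAX)` gives `None`.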

rustc: Change LLVM target for the wasm32-wasip2 Rust target

This commit changes the LLVM target for the Rust `wasm32-wasip2`
target to `wasm32-wasip2` as well. LLVM does a bit of detection on the
target string to know when to call `wasm-component-ld` vs `wasm-ld` so
otherwise clang is invoking the wrong linker.

rustc: Don't pass `-fuse-ld=lld` on wasm targets

This argument isn't necessary for WebAssembly targets since `wasm-ld` is
the only linker for the targets. Passing it otherwise interferes with
Clang's linker selection on `wasm32-wasip2` so avoid it altogether.

rustc: Change wasm32-wasip2 to PIC-by-default

This commit changes the new `wasm32-wasip2` target to being PIC by
default rather than the previous non-PIC by default. This change is
intended to make it easier for the standard library to be used in a
shared object in its precompiled form. This comes with a hypothetical
modest slowdown but it's expected that this is quite minor in most use
cases or otherwise wasm compilers and/or optimizing runtimes can elide
the cost.

Handle normalization failure in `struct_tail_erasing_lifetimes`

Fixes an ICE that occurred when the struct in question has an error

Fix insufficient logic when searching for the underlying allocation

in the `invalid_reference_casting` lint, when trying to lint on
bigger memory layout casts.

rustdoc: use stability, instead of features, to decide what to show

To decide if internal items should be inlined in a doc page,
check if the crate is itself internal, rather than if it has
the rustc_private feature flag. The standard library uses
internal items, but is not itself internal and should not show
internal items on its docs pages.

Avoid a cast in `ptr::slice_from_raw_parts(_mut)`

Casting to `*const ()` or `*mut ()` just bloats the MIR, so let's not.

If ACP#362 goes through we can keep calling `ptr::from_raw_parts(_mut)` in these also without the cast, but that hasn't had any libs-api attention yet, so I'm not waiting on it.

add enum variant field names to make the code clearer

remove redundant flat vs nested distinction to simplify enum

turn all_nested_unused into used_childs

store the span of the nested part of the use tree in the ast

remove braces when fixing a nested use tree into a single use

Use generic `NonZero` in examples.

Simplify `clippy` lint.

Simplify suggestion.

Use generic `NonZero`.

crashes: add latest batch of crash tests

Make sure we don't deny macro vars w keyword names

Simplify `use crate::rustc_foo::bar` occurrences.

They can just be written as `use rustc_foo::bar`, which is far more
standard. (I didn't even know that a `crate::` prefix was valid.)

Update cc crate to v1.0.97

Ignore empty RUSTC_WRAPPER in bootstrap

This change ignores the RUSTC_WRAPPER_REAL environment variable if it's
set to the empty string. This matches cargo behaviour and allows users
to easily shadow a globally set RUSTC_WRAPPER (which they might have set
for non-rustc projects).

Handle normalization failure in `struct_tail_erasing_lifetimes`

Fixes an ICE that occurred when the struct in question has an error

Implement `as_chunks` with `split_at_unchecked`

Remove `macro_use` from `stable_hasher`.

Normal `use` items are nicer.

Reorder top-level crate items.

- `use` before `mod`
- `pub` before non-`pub`
- Alphabetical order within sections

Remove `extern crate tracing`.

`use` is a nicer way of doing things.

Document `Pu128`.

And move the `repr` line after the `derive` line, where it's harder to
overlook. (I overlooked it initially, and didn't understand how this
type worked.)

Remove `TinyList`.

It is optimized for lists with a single element, avoiding the need for
an allocation in that case. But `SmallVec<[T; 1]>` also avoids the
allocation, and is better in general: more standard, log2 number of
allocations if the list exceeds one item, and a much more capable API.

This commit removes `TinyList` and converts the two uses to
`SmallVec<[T; 1]>`. It also reorders the `use` items in the relevant
file so they are in just two sections (`pub` and non-`pub`), ordered
alphabetically, instead of many sections. (This is a relevant part of
the change because I had to decide where to add a `use` item for
`SmallVec`.)

Remove `vec_linked_list`.

It provides a way to effectively embed a linked list within an
`IndexVec` and also iterate over that list. It's written in a very
generic way, involving two traits `Links` and `LinkElem`. But the
`Links` trait is only impl'd for `IndexVec` and `&IndexVec`, and the
whole thing is only used in one module within `rustc_borrowck`. So I
think it's over-engineered and hard to read. Plus it has no comments.

This commit removes it, and adds a (non-generic) local iterator for the
use within `rustc_borrowck`. Much simpler.

Remove `enum_from_u32`.

It's a macro that just creates an enum with a `from_u32` method. It has
two arms. One is unused and the other has a single use.

This commit inlines that single use and removes the whole macro. This
increases readability because we don't have two different macros
interacting (`enum_from_u32` and `language_item_table`).

Update Tests

Fix Error Messages for `break` Inside Coroutines

Previously, `break` inside `gen` blocks and functions
were incorrectly identified to be enclosed by a closure.

This PR fixes it by displaying an appropriate error message
for async blocks, async closures, async functions, gen blocks,
gen closures, gen functions, async gen blocks, async gen closures
and async gen functions.

Note: gen closure and async gen closure are not supported by the
compiler yet but I have added an error message here assuming that
they might be implemented in the future.

Also, fixes grammar in a few places by replacing
`inside of a $coroutine` with `inside a $coroutine`.

Migrate `run-make/rustdoc-map-file` to rmake

Add more ICEs due to malformed diagnostic::on_unimplemented

Fix ICEs in diagnostic::on_unimplemented

Handle field projections like slice indexing in invalid_reference_casting

Fix typos

Do not add leading asterisk in the `PartialEq`

Adding leading asterisk can cause compilation failure for
the _types_ that don't implement the `Copy`.

Use sum type for `WorkflowRunType`

Parse try build CI job name from commit message

Make the regex more robust

Address review comments

CI: fix auto builds and make sure that we always have at least a single CI job

Include the line number in tidy's `iter_header`

Tidy check for test revisions that are mentioned but not declared

If a `[revision]` name appears in a test header directive or error annotation,
but isn't declared in the `//@ revisions:` header, that is almost always a
mistake.

In cases where a revision needs to be temporarily disabled, adding it to an
`//@ unused-revision-names:` header will suppress these checks for that name.

Adding the wildcard name `*` to the unused list will suppress these checks for
the entire file.

Fix test problems discovered by the revision check

Most of these changes either add revision names that were apparently missing,
or explicitly mark a revision name as currently unused.

fix rust-lang#124714 str.to_lowercase sigma handling

Make a minimal amount of region APIs public

Add `ErrorGuaranteed` to `Recovered::Yes` and use it more.

The starting point for this was identical comments on two different
fields, in `ast::VariantData::Struct` and `hir::VariantData::Struct`:
```
    // FIXME: investigate making this a `Option<ErrorGuaranteed>`
    recovered: bool
```
I tried that, and then found that I needed to add an `ErrorGuaranteed`
to `Recovered::Yes`. Then I ended up using `Recovered` instead of
`Option<ErrorGuaranteed>` for these two places and elsewhere, which
required moving `ErrorGuaranteed` from `rustc_parse` to `rustc_ast`.

This makes things more consistent, because `Recovered` is used in more
places, and there are fewer uses of `bool` and
`Option<ErrorGuaranteed>`. And safer, because it's difficult/impossible
to set `recovered` to `Recovered::Yes` without having emitted an error.

interpret/miri: better errors on failing offset_from

chore: remove repetitive words

Make `#![feature]` suggestion MaybeIncorrect

Update Makefiles with explanatory comments

correct comments

add FIXME

Upgrade the version of Clang used in the build, move MSVC builds to Server 2022

Rename Generics::params to Generics::own_params

Add benchmarks for `impl Debug for str`

In order to inform future perf improvements and prevent regressions,
let's add some benchmarks that stress `impl Debug for str`.

Remove unused `step_trait` feature.

Also sort the features.

Remove unused `LinkSelfContainedDefault::is_linker_enabled` method.

Correct a comment.

I tried simplifying `RegionCtxt`, which led me to finding that the
fields are printed in `sccs_info`.

Fix up `DescriptionCtx::new`.

The comment mentions that `ReBound` and `ReVar` aren't expected here.
Experimentation with the full test suite indicates this is true, and
that `ReErased` also doesn't occur. So the commit introduces `bug!` for
those cases. (If any of them show up later on, at least we'll have a
test case.)

The commit also removes the first sentence in the comment.
`RePlaceholder` is now handled in the match arm above this comment and
nothing is printed for it, so that sentence is just wrong. Furthermore,
issue rust-lang#13998 was closed some time ago.

Fix out-of-date comment.

The type name has changed.

Remove `TyCtxt::try_normalize_erasing_late_bound_regions`.

It's unused.

Remove out-of-date comment.

The use of `Binder` was removed in the recent rust-lang#123900, but the comment
wasn't removed at the same time.

De-tuple two `vtable_trait_first_method_offset` args.

Thus eliminating a `FIXME` comment.

opt-dist: use xz2 instead of xz crate

The xz crate consists of a simple reexport of the xz2 crate. Why? Idk.

analyse visitor: build proof tree in probe

update crashes

always use `GenericArgsRef`

Inline and remove unused methods.

`InferCtxt::next_{ty,const,int,float}_var_id` each have a single call
site, in `InferCtxt::next_{ty,const,int,float}_var` respectively.

The only remaining method that creates a var_id is
`InferCtxt::next_ty_var_id_in_universe`, which has one use outside the
crate.

Use fewer origins when creating type variables.

`InferCtxt::next_{ty,const}_var*` all take an origin, but the
`param_def_id` is almost always `None`. This commit changes them to just
take a `Span` and build the origin within the method, and adds new
methods for the rare cases where `param_def_id` might not be `None`.
This avoids a lot of tedious origin building.

Specifically:
- next_ty_var{,_id_in_universe,_in_universe}: now take `Span` instead of
  `TypeVariableOrigin`
- next_ty_var_with_origin: added

- next_const_var{,_in_universe}: takes Span instead of ConstVariableOrigin
- next_const_var_with_origin: added

- next_region_var, next_region_var_in_universe: these are unchanged,
  still take RegionVariableOrigin

The API inconsistency (ty/const vs region) seems worth it for the
large conciseness improvements.

print walltime benchmarks with subnanosecond precision

example results when benchmarking 1-4 serialized ADD instructions

```
running 4 tests
test add  ... bench:           0.24 ns/iter (+/- 0.00)
test add2 ... bench:           0.48 ns/iter (+/- 0.01)
test add3 ... bench:           0.72 ns/iter (+/- 0.01)
test add4 ... bench:           0.96 ns/iter (+/- 0.01)
```

emit fractional benchmark nanoseconds in libtest's JSON output format

bootstrap should also render fractional nanoseconds for benchmarks

from_str_radix: outline only the panic function

codegen: memmove/memset cannot be non-temporal

coverage: Separately compute the set of BCBs with counter mappings

coverage: Make the special case for async functions exit early

coverage: Don't recompute the number of test vector bitmap bytes

The code in `extract_mcdc_mappings` that allocates these bytes already knows
how many are needed in total, so there's no need to immediately recompute that
value in the calling function.

coverage: Destructure the mappings struct to make sure we don't miss any

coverage: Rename `CoverageSpans` to `ExtractedMappings`

coverage: Tidy imports in `rustc_mir_transform::coverage`

Fix parse error message for meta items

Refactor float `Primitive`s to a separate `Float` type

Migrate `run-make/rustdoc-output-path` to rmake

Make builtin_deref just return a Ty

rename some variants in FulfillmentErrorCode

Remove glob imports for ObligationCauseCode

Rename some ObligationCauseCode variants

More rename fallout

Name tweaks

Add a codegen test for transparent aggregates

Aggregating arrays can always take the place path

Make SSA aggregates without needing an alloca

Lift `Lift`

Lift `TraitRef` into `rustc_type_ir`

Also debug

Apply nits, make some bounds into supertraits on inherent traits

Add `-lmingwex` second time in `mingw_libs`

Upcoming mingw-w64 releases will contain a small refactor of the math functions that moves their implementation around.
As a result functions like `lgamma`
now depend on libraries in this order:
`libmingwex.a` -> `libmsvcrt.a` -> `libmingwex.a`.

Fixes rust-lang#124221

ignore generics args in attribute paths

bootstrap: add comments for the automatic dry run

fix typo

Co-authored-by: jyn <[email protected]>

reachable computation: extend explanation of what this does, and why

Make sure we consume a generic arg when checking mistyped turbofish

Update cargo

std::rand: adding solaris/illumos for getrandom support.

To help solarish support for miri https://rust-lang/miri/issues/3567

Update ena to 0.14.3

Fix typo in ManuallyDrop's documentation

Add @saethlin to some triagebot groups

Refactor Apple `target_abi`

This was bundled together with `Arch`, which complicated a few code
paths and meant we had to do more string matching than necessary.

Match ergonomics 2024: let `&` patterns eat `&mut`

Various fixes:

- Only show error when move-check would not be triggered
- Add structured suggestion

Fix spans when macros are involved

Comments and fixes

Rename `explicit_ba`

No more `Option<Option<>>`

Remove redundant comment

Move all ref pat logic into `check_pat_ref`

Add comment on `cap_to_weakly_not`

Co-authored-by: Guillaume Boisseau <[email protected]>

Stabilize `byte_slice_trim_ascii` for `&[u8]`/`&str`

Remove feature from documentation examples
Add rustc_const_stable attribute to stabilized functions
Update intra-doc link for `u8::is_ascii_whitespace` on `&[u8]` functions
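
A short sketch of how the stabilized methods might be used, assuming a toolchain recent enough to have them on stable. The helper names `header_value` and `clean_label` are hypothetical.

```rust
/// Strips leading and trailing ASCII whitespace from raw bytes.
fn header_value(raw: &[u8]) -> &[u8] {
    raw.trim_ascii()
}

/// The same operation on a string slice.
fn clean_label(raw: &str) -> &str {
    raw.trim_ascii()
}
```

Unlike `str::trim`, these only consider ASCII whitespace (as defined by `u8::is_ascii_whitespace`), and the byte-slice version works on data that is not valid UTF-8.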

Document proper usage of `fmt::Error` and `fmt()`'s `Result`.

Documentation of these properties previously existed in a lone paragraph
in the `fmt` module's documentation:
<https://doc.rust-lang.org/1.78.0/std/fmt/index.html#formatting-traits>
However, users looking to implement a formatting trait won't necessarily
look there. Therefore, let's add the critical information (that
formatting per se is infallible) to all the involved items.
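
As an illustration of that guidance, a minimal `Display` implementation that only propagates errors coming from the underlying formatter and never constructs `fmt::Error` on its own. The `Point` type is made up for this sketch.

```rust
use std::fmt;

/// Hypothetical example type.
struct Point {
    x: i32,
    y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Formatting itself cannot fail here; any `Err` returned is one
        // passed through from writing to `f`.
        write!(f, "({}, {})", self.x, self.y)
    }
}
```

Because the implementation never fails on its own, `Point { x: 1, y: -2 }.to_string()` can rely on writing into a `String` succeeding.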

check if `x test tests` missing any test directory

Signed-off-by: onur-ozkan <[email protected]>

remap missing path `tests/crashes` to `tests`

Signed-off-by: onur-ozkan <[email protected]>

add "tidy-alphabetical" check on "tests" remap list

Signed-off-by: onur-ozkan <[email protected]>

Handle Deref expressions in invalid_reference_casting

unix/fs: a bit of cleanup around host-specific code

solaris support start.

reduce tokio features

remove rand test

the actual target-specific things we want to test are all in getrandom,
and rand already tests miri itself

getrandom: test with and without isolation

also add some comments for why we keep certain old obscure APIs supported

avoid code duplication between realloc and malloc

Implement wcslen

organize libc tests into a proper folder, and run some of them on Windows

README: update introduction

remove problems that I do not think we have seen in a while

io::Error handling: keep around the full io::Error for longer so we can give better errors

Implement non-null pointer for malloc(0)

Allow test targets to be set via CLI args

Update CI script for the miri-script test changes

Update documentation for miri-script test changes

minor tweaks

make MIRI_TEST_TARGET entirely an internal thing

make RUSTC_BLESS entirely an internal thing

do not run symlink tests on Windows hosts

rename 'extern-so' to 'native-lib'

Preparing for merge from rustc

alloc: update comments around malloc() alignment

separate windows heap functions from C heap shims

Add windows_i686_gnullvm to the list

Pin libc back to 0.2.153

Update Cargo.lock

fix a few typos in filecheck annotations

Consolidate obligation cause codes for where clauses

Clean up users of rust_dbg_call

Enable profiler for armv7-unknown-linux-gnueabihf.

Always hide private fields in aliased type

Migrate `run-make/rustdoc-shared-flags` to rmake

Relax allocator requirements on some Rc APIs.

* Remove A: Clone bound from Rc::assume_init, Rc::downcast, and Rc::downcast_unchecked.
* Make From<Rc<[T; N]>> for Rc<[T]> allocator-aware.

Internal changes:

* Made Arc::internal_into_inner_with_allocator method into Arc::into_inner_with_allocator associated fn.
* Add private Rc::into_inner_with_allocator (to match Arc), so other fns don't have to juggle ManuallyDrop.

Relax A: Clone requirement on Rc/Arc::unwrap_or_clone.

Add test for rust-lang#122775

Refactoring after the `PlaceValue` addition

I added `PlaceValue` in 123775, but kept that one line-by-line simple because it touched so many places.

This goes through to add more helpers & docs, and change some `PlaceRef` to `PlaceValue` where the type didn't need to be included.

No behaviour changes.

Make it possible to derive Lift/TypeVisitable/TypeFoldable in rustc_type_ir

Uplift `TraitPredicate`

Uplift `ExistentialTraitRef`, `ExistentialProjection`, `ProjectionPredicate`

Uplift `NormalizesTo`, `CoercePredicate`, and `SubtypePredicate`

Apply nits, uplift ExistentialPredicate too

And `ImplPolarity` too

Expand on expr_requires_semi_to_be_stmt documentation

Mark expr_requires_semi_to_be_stmt call sites

For each of these, we need to decide whether they need to be using
`expr_requires_semi_to_be_stmt`, or `expr_requires_comma_to_be_match_arm`,
which are supposed to be 2 different behaviors. Previously they were
conflated into one, causing either too much or too little
parenthesization.

Macro call with braces does not require semicolon to be statement

This commit by itself is supposed to have no effect on behavior. All of
the call sites are updated to preserve their previous behavior.

The behavior changes are in the commits that follow.

Add ExprKind::MacCall statement boundary tests

Fix pretty printer statement boundaries after braced macro call

Delete MacCall case from pretty-printing semicolon after StmtKind::Expr

I didn't figure out how to reach this condition with `expr` containing
`ExprKind::MacCall`. All the approaches I tried ended up with the macro
call ending up in the `StmtKind::MacCall` case below instead.

In any case, from visual inspection this is a bugfix. If we do end up
with a `StmtKind::Expr` containing `ExprKind::MacCall` with brace
delimiter, it would not need ";" printed after it.

Add test of unused_parens lint involving macro calls

Document the situation with unused_parens lint and braced macro calls

Add parser tests for statement boundary insertion

Mark Parser::expr_is_complete call sites

Document MacCall special case in Parser::expr_is_complete

Document MacCall special case in Parser::parse_arm

Add macro calls to else-no-if parser test

Remove MacCall special case from recovery after missing 'if' after 'else'

The change to the test is a little goofy because the compiler was
guessing "correctly" before that `falsy! {}` is the condition as opposed
to the else body. But I believe this change is fundamentally correct.
Braced macro invocations in statement position are most often item-like
(`thread_local! {...}`) as opposed to parenthesized macro invocations
which are condition-like (`cfg!(...)`).

Remove MacCall special cases from Parser::parse_full_stmt

It is impossible for expr here to be a braced macro call. Expr comes
from `parse_stmt_without_recovery`, in which macro calls are parsed by
`parse_stmt_mac`. See this part:

    let kind = if (style == MacStmtStyle::Braces
        && self.token != token::Dot
        && self.token != token::Question)
        || self.token == token::Semi
        || self.token == token::Eof
    {
        StmtKind::MacCall(P(MacCallStmt { mac, style, attrs, tokens: None }))
    } else {
        // Since none of the above applied, this is an expression statement macro.
        let e = self.mk_expr(lo.to(hi), ExprKind::MacCall(mac));
        let e = self.maybe_recover_from_bad_qpath(e)?;
        let e = self.parse_expr_dot_or_call_with(e, lo, attrs)?;
        let e = self.parse_expr_assoc_with(
            0,
            LhsExpr::AlreadyParsed { expr: e, starts_statement: false },
        )?;
        StmtKind::Expr(e)
    };

A braced macro call at the head of a statement is always either extended
into ExprKind::Field / MethodCall / Await / Try / Binary, or else
returned as StmtKind::MacCall. We can never get a StmtKind::Expr
containing ExprKind::MacCall containing brace delimiter.

Add classify::expr_is_complete

Fix redundant parens around braced macro call in match arms

use key-value format in stage0 file

We are currently working on removing Python from bootstrap, which means
we have to extract some data from the stage0 file using shell scripts. However,
parsing values from the stage0.json file is painful because shell scripts don't
have a built-in way to parse JSON files.

This change simplifies the stage0 file format to key-value pairs, which makes
it easily readable from any environment.

Signed-off-by: onur-ozkan <[email protected]>
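
As an illustration of why a key-value file is easy to read from any environment, here is a minimal sketch of a parser for such a format. It is not the parser from `build_helper`. The function name `parse_key_value`, the `#` comment convention, and the sample keys are assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// Parse `key=value` lines into a map.
///
/// Assumptions for this sketch (not taken from the real stage0 parser):
/// blank lines and lines starting with `#` are skipped, and lines
/// without an `=` are ignored.
fn parse_key_value(input: &str) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first `=` only, so values may themselves contain `=`.
        if let Some((key, value)) = line.split_once('=') {
            map.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    map
}
```

Calling `parse_key_value("name=demo\n# note\nurl=https://example.com/?a=b")` would yield a map with the two keys `name` and `url`. The same format can be read by a one-line `awk` or `grep` call in a shell script, which is the point of the change.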

awk stage0 file on CI

Signed-off-by: onur-ozkan <[email protected]>

use stage0 file in `bootstrap.py`

Signed-off-by: onur-ozkan <[email protected]>

use shared stage0 parser from `build_helper`

Signed-off-by: onur-ozkan <[email protected]>

remove outdated stage0.json parts

Signed-off-by: onur-ozkan <[email protected]>

move comments position in `src/stage0`

Signed-off-by: onur-ozkan <[email protected]>

io::Write::write_fmt: panic if the formatter fails when the stream does not fail

std::alloc: using posix_memalign instead of memalign on solarish.

simpler code path since small alignments are already taken care of.
close rust-langGH-124787

Relax slice safety requirements

Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language.

References must also be non-null

Add `crate_type` method to `Rustdoc`

Add `crate_name` method to `Rustdoc` and `Rustc`

Add `python_command` and `source_path` functions

Add `extern_` method to `Rustdoc`

Migrate `rustdoc-scrape-examples-ordering` to `rmake`

Fix some minor issues from the ui-test auto-porting

solve: replace all `debug` with `trace`

structurally important functions to `debug`

fix hidden title in command-line-arguments docs

Assert that MemCategorizationVisitor actually errors when it bails ungracefully

Inline MemCategorization into ExprUseVisitor

Remove unnecessary mut ref

Introduce TypeInformationCtxt to abstract over LateCtxt/FnCtxt

Make LateCtxt be a type info delegate for EUV for clippy

Try structurally resolve

Apply nits

Propagate errors rather than using return_if_err

Match ergonomics 2024: migration lint

Unfortunately, we can't always offer a machine-applicable suggestion when there are subpatterns from macro expansion.

Co-Authored-By: Guillaume Boisseau <[email protected]>

Add AST pretty-printer tests for let-else

Pretty-print let-else with added parenthesization when needed

rename

Billy-Sheppard added a commit to Billy-Sheppard/rust that referenced this issue May 13, 2024
reorganised attrs

removed OsStr impls

added backticks

Add note about possible allocation-sharing to Arc/Rc<str/[T]/CStr>::default.

Use shared statics for the ArcInner for Arc<str, CStr>::default, and for Arc<[T]>::default where alignof(T) <= 16.

fixed unsafe block

Revert "fixed unsafe block"

This reverts commit 6eb6aee.

Return coherent description for boolean instead of panicking

Improve check-cfg CLI errors with more structured diagnostics

Move various stdlib tests to library/std/tests

Run tidy on tests

Rename test for issue 21058

Implement `edition` method on `Rustdoc` type as well

Migrate `run-make/doctests-runtool` to rmake

Rename `run-make-support` library `output` method to `command_output`

Add new `output` method to `Rustc` and `Rustdoc` types

Migrate `run-make/rustdoc-error-lines` to `rmake.rs`

add f16 associated constants

NaN and infinity are not included as they require arithmetic.

add f128 associated constants

NaN and infinity are not included as they require arithmetic.

add constants in std::f16::consts

add constants in std::f128::consts

update error messages in ui tests

Document that `create_dir_all` calls `mkdir`/`CreateDirW` multiple times

Also mention that there might be leftover directories in the error case.

Prefer lower vtable candidates in select in new solver

Don't consider candidates with no failing where clauses

Use super_fold in RegionsToStatic visitor

Make check-cfg docs more user-friendly

Record impl args in the InspectCandidate rather than rematching during select

Use correct ImplSource for alias bounds

BorrowckInferCtxt: infcx by value

borrowck: more eagerly prepopulate opaques

switch new solver to directly inject opaque types

Update books

Adjust dbg.value/dbg.declare checks for LLVM update

llvm/llvm-project#89799 changes llvm.dbg.value/declare intrinsics to be in a different, out-of-instruction-line representation. For example
  call void @llvm.dbg.declare(...)
becomes
  #dbg_declare(...)

Update tests accordingly to work with both the old and new way.

Adjust 64-bit ARM data layouts for LLVM update

LLVM has updated data layouts to specify `Fn32` on 64-bit ARM to avoid
C++ accidentally underaligning functions when trying to comply with
member function ABIs.

This should only affect Rust in cases where we had a similar bug (I
don't believe we have one), but our data layout must match to generate
code.

As a compatibility adaptation, if LLVM is not version 19 yet, `Fn32`
gets voided from the data layout.

See llvm/llvm-project#90415

Update version of cc crate to v1.0.97

Reason:

In order to build the Windows version of the Rust toolchain for the Android platform, the following patch to the cc crate is required to avoid incorrectly determining that we are building with the Android NDK: rust-lang/cc-rs@57853c4

This patch is present in version 1.0.80 and newer versions of the cc crate. The rustc source distribution currently has 3 different versions of cc in the vendor directory, only one of which has the necessary fix.

We (the Android Rust toolchain) are currently maintaining local patches to upgrade the cc crate dependency versions, which we would like to upstream.

Furthermore, beyond the specific reason, the cc crate in bootstrap is currently pinned at an old version due to problems in the past when trying to update it. It is worthwhile to figure out and resolve these problems so we can keep the dependency up-to-date.

Other fixes:

As of cc v1.0.78, object files are prefixed with a 16-character hash.
Update src/bootstrap/src/core/build_steps/llvm.rs to account for this to
avoid failures when building libunwind and libcrt. Note that while the hash
prefix was introduced in v1.0.78, in order to determine the names of the
object files without scanning the directory, we rely on the compile_intermediates
method, which was introduced in cc v1.0.86

As of cc v1.0.86, compilation on MacOS uses the -mmacosx-version-min flag.
A long-standing bug in the CMake rules for compiler-rt causes compilation
to fail when this flag is specified. So we add a workaround to suppress this
flag.

Updating to cc v1.0.91 and newer requires fixes to bootstrap unit tests.
The unit tests use targets named "A", "B", etc., which fail a validation
check introduced in 1.0.91 of the cc crate.

Implement lldb formatter for "clang encoded" enums (LLDB 18.1+)
Summary:
I landed a fix last year to enable `DW_TAG_variant_part` encoding in LLDBs (https://reviews.llvm.org/D149213). This PR is a corresponding fix in synthetic formatters to decode that information.
This is in no way a perfect implementation, but at least it improves the status quo: most types of enums will be visible and debuggable in some way.
I've also updated most of the existing tests that touch enums and re-enabled test cases based on LLDB for enums.

Test Plan:
ran tests `./x test tests/debuginfo/`. Also tested manually in LLDB CLI and LLDB VSCode

Other Thoughts
A better approach would probably be adopting [formatters from codelldb](https://github.com/vadimcn/codelldb/blob/master/formatters/rust.py). There is some neat hack that hooks up summary provider via synthetic provider which can ultimately fix more display issues for Rust types and enums too. But getting it to work well might take more time than I have right now.

f16::is_sign_{positive,negative} were feature-gated on f128

Correct the const stabilization of `last_chunk` for slices

`<[T]>::last_chunk` should have become const stable as part of
<rust-lang#117561>. Update the const
stability gate to reflect this.

Add tests

Lower never patterns to Unreachable in mir

rustdoc: dedup search form HTML

This change constructs the search form HTML using JavaScript, instead of plain HTML. It uses a custom element because

- the [parser]'s insert algorithm runs the connected callback synchronously, so we won't get layout jank
- it requires very little HTML, so it's a real win in size

[parser]: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token

This shrinks the standard library by about 60MiB, by my test.

rustdoc: allow custom element rustdoc-search

generalize hr alias: avoid unconstrainable infer vars

narrow down visibilities in `rustc_parse::lexer`

replace another Option<Span> by DUMMY_SP

Don't ICE when we cannot eval a const to a valtree in the new solver

Do not ICE on `AnonConst`s in `diagnostic_hir_wf_check`

coverage: Add branch coverage support for let-else

coverage: Add branch coverage support for if-let and let-chains

Do not ICE on foreign malformed `diagnostic::on_unimplemented`

Fix rust-lang#124651.

Add test for rust-lang#124651

Update cargo

compiler: Privatize `Parser::current_closure`

This was added as pub in 2021 and remains only privately used in 2024!

compiler: derive Debug in parser

It's annoying to debug the parser if you have to stop every five seconds
to add a Debug impl.

compiler: add `Parser::debug_lookahead`

I tried debugging a parser-related issue but found it annoying to not be
able to easily peek into the Parser's token stream.

Add a convenience fn that offers an opinionated view into the parser,
but one that is useful for answering basic questions about parser state.

Fuchsia test runner: fixup script

This commit fixes several issues in the fuchsia-test-runner.py script:

1. Migrate from `pm` to `ffx` for package management, as `pm` is now
deprecated. Furthermore, the `pm` calls used in this script no longer
work at Fuchsia's HEAD. This is the largest change in this commit, and
impacts all steps around repository management (creation and
registration of the repo, as well as package publishing).

2. Allow for `libtest` to be either statically or dynamically linked.
The script assumed it was dynamically linked, but the current Rust
behavior at HEAD is to statically link it.

3. Minor cleanup to use `ffx --machine json` rather than string parsing.

4. Minor cleanup to the docs around the script.

std::net: Socket::new_raw set to SO_NOSIGPIPE on freebsd/netbsd/dragonfly.

add note about `AlreadyExists` to `create_new`

Apply suggestions from code review

Co-authored-by: Jubilee <[email protected]>

iOS/tvOS/watchOS/visionOS: Default to kernel-defined backlog in listen

This behavior is defined in general for the XNU kernel, not just macOS:
https://github.com/apple-oss-distributions/xnu/blob/rel/xnu-10002/bsd/kern/uipc_socket.c

iOS/tvOS/watchOS/visionOS: Set the main thread name

Tested in the iOS simulator that the thread name is not set by default,
and that setting it improves the debugging experience in lldb / Xcode.

iOS/tvOS/watchOS: Fix alloc w. large alignment on older versions

Tested on an old MacBook and the iOS simulator.

iOS/tvOS/watchOS/visionOS: Fix reading large files

Tested in the iOS simulator with something like:
```
let mut buf = vec![0; c_int::MAX as usize - 1 + 2];
let read_bytes = f.read(&mut buf).unwrap();
```

iOS/tvOS/watchOS/visionOS: Improve File Debug impl

This uses `libc::fcntl`, which, while not explicitly marked as available
in the headers, is already used by `File::sync_all` and `File::sync_data`
on these platforms, so should be fine to use here as well.

next_power_of_two: add a doctest to show what happens on 0
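
For reference, a small sketch of the behaviour in question: `next_power_of_two` returns the smallest power of two greater than or equal to the value, and for `0` that is `1`. The wrapper `round_up_pow2` is a hypothetical helper introduced only for this example.

```rust
/// Hypothetical helper: round a requested size up to a power of two.
/// For an input of 0 the result is 1, because 1 (that is, 2^0) is the
/// smallest power of two that is >= 0.
fn round_up_pow2(n: u32) -> u32 {
    n.next_power_of_two()
}
```

With this helper, `round_up_pow2(0)` and `round_up_pow2(1)` both give `1`, while `round_up_pow2(5)` gives `8`. When the result might not fit in the type, `checked_next_power_of_two` returns `None` instead.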

rustc: Change LLVM target for the wasm32-wasip2 Rust target

This commit changes the LLVM target for the Rust `wasm32-wasip2`
target to `wasm32-wasip2` as well. LLVM does a bit of detection on the
target string to know when to call `wasm-component-ld` vs `wasm-ld` so
otherwise clang is invoking the wrong linker.

rustc: Don't pass `-fuse-ld=lld` on wasm targets

This argument isn't necessary for WebAssembly targets since `wasm-ld` is
the only linker for the targets. Passing it otherwise interferes with
Clang's linker selection on `wasm32-wasip2` so avoid it altogether.

rustc: Change wasm32-wasip2 to PIC-by-default

This commit changes the new `wasm32-wasip2` target to being PIC by
default rather than the previous non-PIC by default. This change is
intended to make it easier for the standard library to be used in a
shared object in its precompiled form. This comes with a hypothetical
modest slowdown but it's expected that this is quite minor in most use
cases or otherwise wasm compilers and/or optimizing runtimes can elide
the cost.

Handle normalization failure in `struct_tail_erasing_lifetimes`

Fixes an ICE that occurred when the struct in question has an error

Fix insufficient logic when searching for the underlying allocation

in the `invalid_reference_casting` lint, when trying to lint on
bigger memory layout casts.

rustdoc: use stability, instead of features, to decide what to show

To decide if internal items should be inlined in a doc page,
check if the crate is itself internal, rather than if it has
the rustc_private feature flag. The standard library uses
internal items, but is not itself internal and should not show
internal items on its docs pages.

Avoid a cast in `ptr::slice_from_raw_parts(_mut)`

Casting to `*const ()` or `*mut ()` just bloats the MIR, so let's not.

If ACP#362 goes through we can keep calling `ptr::from_raw_parts(_mut)` in these also without the cast, but that hasn't had any libs-api attention yet, so I'm not waiting on it.

add enum variant field names to make the code clearer

remove redundant flat vs nested distinction to simplify enum

turn all_nested_unused into used_childs

store the span of the nested part of the use tree in the ast

remove braces when fixing a nested use tree into a single use

Use generic `NonZero` in examples.

Simplify `clippy` lint.

Simplify suggestion.

Use generic `NonZero`.

crashes: add latest batch of crash tests

Make sure we don't deny macro vars w keyword names

Simplify `use crate::rustc_foo::bar` occurrences.

They can just be written as `use rustc_foo::bar`, which is far more
standard. (I didn't even know that a `crate::` prefix was valid.)

Update cc crate to v1.0.97

Ignore empty RUSTC_WRAPPER in bootstrap

This change ignores the RUSTC_WRAPPER_REAL environment variable if it's
set to the empty string. This matches cargo behaviour and allows users
to easily shadow a globally set RUSTC_WRAPPER (which they might have set
for non-rustc projects).

Handle normalization failure in `struct_tail_erasing_lifetimes`

Fixes an ICE that occurred when the struct in question has an error

Implement `as_chunks` with `split_at_unchecked`

Remove `macro_use` from `stable_hasher`.

Normal `use` items are nicer.

Reorder top-level crate items.

- `use` before `mod`
- `pub` before `non-pub`
- Alphabetical order within sections

Remove `extern crate tracing`.

`use` is a nicer way of doing things.

Document `Pu128`.

And move the `repr` line after the `derive` line, where it's harder to
overlook. (I overlooked it initially, and didn't understand how this
type worked.)

Remove `TinyList`.

It is optimized for lists with a single element, avoiding the need for
an allocation in that case. But `SmallVec<[T; 1]>` also avoids the
allocation, and is better in general: more standard, log2 number of
allocations if the list exceeds one item, and a much more capable API.

This commit removes `TinyList` and converts the two uses to
`SmallVec<[T; 1]>`. It also reorders the `use` items in the relevant
file so they are in just two sections (`pub` and non-`pub`), ordered
alphabetically, instead of many sections. (This is a relevant part of
the change because I had to decide where to add a `use` item for
`SmallVec`.)

Remove `vec_linked_list`.

It provides a way to effectively embed a linked list within an
`IndexVec` and also iterate over that list. It's written in a very
generic way, involving two traits `Links` and `LinkElem`. But the
`Links` trait is only impl'd for `IndexVec` and `&IndexVec`, and the
whole thing is only used in one module within `rustc_borrowck`. So I
think it's over-engineered and hard to read. Plus it has no comments.

This commit removes it, and adds a (non-generic) local iterator for the
use within `rustc_borrowck`. Much simpler.

Remove `enum_from_u32`.

It's a macro that just creates an enum with a `from_u32` method. It has
two arms. One is unused and the other has a single use.

This commit inlines that single use and removes the whole macro. This
increases readability because we don't have two different macros
interacting (`enum_from_u32` and `language_item_table`).
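
To make the description concrete, this is roughly what an enum with a hand-written `from_u32` method looks like once no macro is involved. It is a sketch only: the enum `Color` and its variants are invented for illustration and are not the type touched by the commit.

```rust
/// Hypothetical enum standing in for the one the macro used to generate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl Color {
    /// Convert a raw discriminant back into the enum.
    /// Returns `None` for values that do not name a variant.
    fn from_u32(value: u32) -> Option<Color> {
        match value {
            0 => Some(Color::Red),
            1 => Some(Color::Green),
            2 => Some(Color::Blue),
            _ => None,
        }
    }
}
```

Here `Color::from_u32(1)` gives `Some(Color::Green)` and `Color::from_u32(7)` gives `None`. Written out like this, the conversion is visible at the definition site rather than hidden behind a second macro.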

Update Tests

Fix Error Messages for `break` Inside Coroutines

Previously, `break` inside `gen` blocks and functions
were incorrectly identified to be enclosed by a closure.

This PR fixes it by displaying an appropriate error message
for async blocks, async closures, async functions, gen blocks,
gen closures, gen functions, async gen blocks, async gen closures
and async gen functions.

Note: gen closure and async gen closure are not supported by the
compiler yet but I have added an error message here assuming that
they might be implemented in the future.

Also, fixes grammar in a few places by replacing
`inside of a $coroutine` with `inside a $coroutine`.

Migrate `run-make/rustdoc-map-file` to rmake

Add more ICEs due to malformed diagnostic::on_unimplemented

Fix ICEs in diagnostic::on_unimplemented

Handle field projections like slice indexing in invalid_reference_casting

Fix typos

Do not add leading asterisk in the `PartialEq`

Adding leading asterisk can cause compilation failure for
the _types_ that don't implement the `Copy`.

Use sum type for `WorkflowRunType`

Parse try build CI job name from commit message

Make the regex more robust

Address review comments

CI: fix auto builds and make sure that we always have at least a single CI job

Include the line number in tidy's `iter_header`

Tidy check for test revisions that are mentioned but not declared

If a `[revision]` name appears in a test header directive or error annotation,
but isn't declared in the `//@ revisions:` header, that is almost always a
mistake.

In cases where a revision needs to be temporarily disabled, adding it to an
`//@ unused-revision-names:` header will suppress these checks for that name.

Adding the wildcard name `*` to the unused list will suppress these checks for
the entire file.

Fix test problems discovered by the revision check

Most of these changes either add revision names that were apparently missing,
or explicitly mark a revision name as currently unused.

fix rust-lang#124714 str.to_lowercase sigma handling

Make a minimal amount of region APIs public

Add `ErrorGuaranteed` to `Recovered::Yes` and use it more.

The starting point for this was identical comments on two different
fields, in `ast::VariantData::Struct` and `hir::VariantData::Struct`:
```
    // FIXME: investigate making this a `Option<ErrorGuaranteed>`
    recovered: bool
```
I tried that, and then found that I needed to add an `ErrorGuaranteed`
to `Recovered::Yes`. Then I ended up using `Recovered` instead of
`Option<ErrorGuaranteed>` for these two places and elsewhere, which
required moving `ErrorGuaranteed` from `rustc_parse` to `rustc_ast`.

This makes things more consistent, because `Recovered` is used in more
places, and there are fewer uses of `bool` and
`Option<ErrorGuaranteed>`. And safer, because it's difficult/impossible
to set `recovered` to `Recovered::Yes` without having emitted an error.

interpret/miri: better errors on failing offset_from

chore: remove repetitive words

Make `#![feature]` suggestion MaybeIncorrect

Update Makefiles with explanatory comments

correct comments

add FIXME

Upgrade the version of Clang used in the build, move MSVC builds to Server 2022

Rename Generics::params to Generics::own_params

Add benchmarks for `impl Debug for str`

In order to inform future perf improvements and prevent regressions,
let's add some benchmarks that stress `impl Debug for str`.

Remove unused `step_trait` feature.

Also sort the features.

Remove unused `LinkSelfContainedDefault::is_linker_enabled` method.

Correct a comment.

I tried simplifying `RegionCtxt`, which led me to finding that the
fields are printed in `sccs_info`.

Fix up `DescriptionCtx::new`.

The comment mentions that `ReBound` and `ReVar` aren't expected here.
Experimentation with the full test suite indicates this is true, and
that `ReErased` also doesn't occur. So the commit introduces `bug!` for
those cases. (If any of them show up later on, at least we'll have a
test case.)

The commit also removes the first sentence in the comment.
`RePlaceholder` is now handled in the match arm above this comment and
nothing is printed for it, so that sentence is just wrong. Furthermore,
issue rust-lang#13998 was closed some time ago.

Fix out-of-date comment.

The type name has changed.

Remove `TyCtxt::try_normalize_erasing_late_bound_regions`.

It's unused.

Remove out-of-date comment.

The use of `Binder` was removed in the recent rust-lang#123900, but the comment
wasn't removed at the same time.

De-tuple two `vtable_trait_first_method_offset` args.

Thus eliminating a `FIXME` comment.

opt-dist: use xz2 instead of xz crate

The xz crate consists of a simple re-export of the xz2 crate. Why? Idk.

analyse visitor: build proof tree in probe

update crashes

always use `GenericArgsRef`

Inline and remove unused methods.

`InferCtxt::next_{ty,const,int,float}_var_id` each have a single call
site, in `InferCtxt::next_{ty,const,int,float}_var` respectively.

The only remaining method that creates a var_id is
`InferCtxt::next_ty_var_id_in_universe`, which has one use outside the
crate.

Use fewer origins when creating type variables.

`InferCtxt::next_{ty,const}_var*` all take an origin, but the
`param_def_id` is almost always `None`. This commit changes them to just
take a `Span` and build the origin within the method, and adds new
methods for the rare cases where `param_def_id` might not be `None`.
This avoids a lot of tedious origin building.

Specifically:
- next_ty_var{,_id_in_universe,_in_universe}: now take `Span` instead of
  `TypeVariableOrigin`
- next_ty_var_with_origin: added

- next_const_var{,_in_universe}: takes Span instead of ConstVariableOrigin
- next_const_var_with_origin: added

- next_region_var, next_region_var_in_universe: these are unchanged,
  still take RegionVariableOrigin

The API inconsistency (ty/const vs region) seems worth it for the
large conciseness improvements.

print walltime benchmarks with subnanosecond precision

example results when benchmarking 1-4 serialized ADD instructions

```
running 4 tests
test add  ... bench:           0.24 ns/iter (+/- 0.00)
test add2 ... bench:           0.48 ns/iter (+/- 0.01)
test add3 ... bench:           0.72 ns/iter (+/- 0.01)
test add4 ... bench:           0.96 ns/iter (+/- 0.01)
```
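
A rough sketch of how a fractional per-iteration time like the ones above can be computed and rendered. This is not libtest's actual code. The function names `ns_per_iter` and `render` and the two-decimal format are assumptions chosen to match the sample output.

```rust
use std::time::Duration;

/// Average time per iteration in nanoseconds, kept as a float so that
/// sub-nanosecond values are not truncated to zero.
fn ns_per_iter(total: Duration, iters: u64) -> f64 {
    total.as_nanos() as f64 / iters as f64
}

/// Render with two decimal places, in the style of the output above.
fn render(ns: f64) -> String {
    format!("{:.2} ns/iter", ns)
}
```

For example, 240 ns spread over 1000 iterations would render as `0.24 ns/iter`, whereas integer division would have reported `0`.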

emit fractional benchmark nanoseconds in libtest's JSON output format

bootstrap should also render fractional nanoseconds for benchmarks

from_str_radix: outline only the panic function

codegen: memmove/memset cannot be non-temporal

coverage: Separately compute the set of BCBs with counter mappings

coverage: Make the special case for async functions exit early

coverage: Don't recompute the number of test vector bitmap bytes

The code in `extract_mcdc_mappings` that allocates these bytes already knows
how many are needed in total, so there's no need to immediately recompute that
value in the calling function.

coverage: Destructure the mappings struct to make sure we don't miss any

coverage: Rename `CoverageSpans` to `ExtractedMappings`

coverage: Tidy imports in `rustc_mir_transform::coverage`

Fix parse error message for meta items

Refactor float `Primitive`s to a separate `Float` type

Migrate `run-make/rustdoc-output-path` to rmake

Make builtin_deref just return a Ty

rename some variants in FulfillmentErrorCode

Remove glob imports for ObligationCauseCode

Rename some ObligationCauseCode variants

More rename fallout

Name tweaks

Add a codegen test for transparent aggregates

Aggregating arrays can always take the place path

Make SSA aggregates without needing an alloca

Lift `Lift`

Lift `TraitRef` into `rustc_type_ir`

Also debug

Apply nits, make some bounds into supertraits on inherent traits

Add `-lmingwex` second time in `mingw_libs`

Upcoming mingw-w64 releases will contain small math functions refactor which moved implementation around.
As a result functions like `lgamma`
now depend on libraries in this order:
`libmingwex.a` -> `libmsvcrt.a` -> `libmingwex.a`.

Fixes rust-lang#124221

ignore generics args in attribute paths

bootstrap: add comments for the automatic dry run

fix typo

Co-authored-by: jyn <[email protected]>

reachable computation: extend explanation of what this does, and why

Make sure we consume a generic arg when checking mistyped turbofish

Update cargo

std::rand: adding solaris/illumos for getrandom support.

To help solarish support for miri https://github.com/rust-lang/miri/issues/3567

Update ena to 0.14.3

Fix typo in ManuallyDrop's documentation

Add @saethlin to some triagebot groups

Refactor Apple `target_abi`

This was bundled together with `Arch`, which complicated a few code
paths and meant we had to do more string matching than necessary.

Match ergonomics 2024: let `&` patterns eat `&mut`

Various fixes:

- Only show error when move-check would not be triggered
- Add structured suggestion

Fix spans when macros are involved

Comments and fixes

Rename `explicit_ba`

No more `Option<Option<>>`

Remove redundant comment

Move all ref pat logic into `check_pat_ref`

Add comment on `cap_to_weakly_not`

Co-authored-by: Guillaume Boisseau <[email protected]>

Stabilize `byte_slice_trim_ascii` for `&[u8]`/`&str`

Remove feature from documentation examples
Add rustc_const_stable attribute to stabilized functions
Update intra-doc link for `u8::is_ascii_whitespace` on `&[u8]` functions

Document proper usage of `fmt::Error` and `fmt()`'s `Result`.

Documentation of these properties previously existed in a lone paragraph
in the `fmt` module's documentation:
<https://doc.rust-lang.org/1.78.0/std/fmt/index.html#formatting-traits>
However, users looking to implement a formatting trait won't necessarily
look there. Therefore, let's add the critical information (that
formatting per se is infallible) to all the involved items.
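
A minimal sketch of what this means for someone implementing a formatting trait: the `fmt` method should only pass along errors that come from the underlying `Formatter`, and should not return `fmt::Error` to signal a problem with the value itself. The `Point` type is hypothetical and exists only for this example.

```rust
use std::fmt;

/// Hypothetical type used only to illustrate a `Display` impl.
struct Point {
    x: i32,
    y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The only source of `Err` here is the formatter `f` (for example
        // a failing underlying writer); the value itself never causes one.
        write!(f, "({}, {})", self.x, self.y)
    }
}
```

Because writing into a `String` cannot fail, `Point { x: 1, y: 2 }.to_string()` gives `"(1, 2)"` without needing any error handling from the caller.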

check if `x test tests` missing any test directory

Signed-off-by: onur-ozkan <[email protected]>

remap missing path `tests/crashes` to `tests`

Signed-off-by: onur-ozkan <[email protected]>

add "tidy-alphabetical" check on "tests" remap list

Signed-off-by: onur-ozkan <[email protected]>

Handle Deref expressions in invalid_reference_casting

unix/fs: a bit of cleanup around host-specific code

solaris support start.

reduce tokio features

remove rand test

the actual target-specific things we want to test are all in getrandom,
and rand already tests miri itself

getrandom: test with and without isolation

also add some comments for why we keep certain old obscure APIs supported

avoid code duplication between realloc and malloc

Implement wcslen

organize libc tests into a proper folder, and run some of them on Windows

README: update introduction

remove problems that I do not think we have seen in a while

io::Error handling: keep around the full io::Error for longer so we can give better errors

Implement non-null pointer for malloc(0)

Allow test targets to be set via CLI args

Update CI script for the miri-script test changes

Update documentation for miri-script test changes

minor tweaks

make MIRI_TEST_TARGET entirely an internal thing

make RUSTC_BLESS entirely an internal thing

do not run symlink tests on Windows hosts

rename 'extern-so' to 'native-lib'

Preparing for merge from rustc

alloc: update comments around malloc() alignment

separate windows heap functions from C heap shims

Add windows_i686_gnullvm to the list

Pin libc back to 0.2.153

Update Cargo.lock

fix a few typos in filecheck annotations

Consolidate obligation cause codes for where clauses

Clean up users of rust_dbg_call

Enable profiler for armv7-unknown-linux-gnueabihf.

Always hide private fields in aliased type

Migrate `run-make/rustdoc-shared-flags` to rmake

Relax allocator requirements on some Rc APIs.

* Remove A: Clone bound from Rc::assume_init, Rc::downcast, and Rc::downcast_unchecked.
* Make From<Rc<[T; N]>> for Rc<[T]> allocator-aware.

Internal changes:

* Made Arc::internal_into_inner_with_allocator method into Arc::into_inner_with_allocator associated fn.
* Add private Rc::into_inner_with_allocator (to match Arc), so other fns don't have to juggle ManuallyDrop.

Relax A: Clone requirement on Rc/Arc::unwrap_or_clone.

Add test for rust-lang#122775

Refactoring after the `PlaceValue` addition

I added `PlaceValue` in 123775, but kept that one line-by-line simple because it touched so many places.

This goes through to add more helpers & docs, and change some `PlaceRef` to `PlaceValue` where the type didn't need to be included.

No behaviour changes.

Make it possible to derive Lift/TypeVisitable/TypeFoldable in rustc_type_ir

Uplift `TraitPredicate`

Uplift `ExistentialTraitRef`, `ExistentialProjection`, `ProjectionPredicate`

Uplift `NormalizesTo`, `CoercePredicate`, and `SubtypePredicate`

Apply nits, uplift ExistentialPredicate too

And `ImplPolarity` too

Expand on expr_requires_semi_to_be_stmt documentation

Mark expr_requires_semi_to_be_stmt call sites

For each of these, we need to decide whether they need to be using
`expr_requires_semi_to_be_stmt`, or `expr_requires_comma_to_be_match_arm`,
which are supposed to be 2 different behaviors. Previously they were
conflated into one, causing either too much or too little
parenthesization.

Macro call with braces does not require semicolon to be statement

This commit by itself is supposed to have no effect on behavior. All of
the call sites are updated to preserve their previous behavior.

The behavior changes are in the commits that follow.

Add ExprKind::MacCall statement boundary tests

Fix pretty printer statement boundaries after braced macro call

Delete MacCall case from pretty-printing semicolon after StmtKind::Expr

I didn't figure out how to reach this condition with `expr` containing
`ExprKind::MacCall`. All the approaches I tried ended up with the macro
call ending up in the `StmtKind::MacCall` case below instead.

In any case, from visual inspection this is a bugfix. If we do end up
with a `StmtKind::Expr` containing `ExprKind::MacCall` with brace
delimiter, it would not need ";" printed after it.

Add test of unused_parens lint involving macro calls

Document the situation with unused_parens lint and braced macro calls

Add parser tests for statement boundary insertion

Mark Parser::expr_is_complete call sites

Document MacCall special case in Parser::expr_is_complete

Document MacCall special case in Parser::parse_arm

Add macro calls to else-no-if parser test

Remove MacCall special case from recovery after missing 'if' after 'else'

The change to the test is a little goofy because the compiler was
guessing "correctly" before that `falsy! {}` is the condition as opposed
to the else body. But I believe this change is fundamentally correct.
Braced macro invocations in statement position are most often item-like
(`thread_local! {...}`) as opposed to parenthesized macro invocations
which are condition-like (`cfg!(...)`).

Remove MacCall special cases from Parser::parse_full_stmt

It is impossible for expr here to be a braced macro call. Expr comes
from `parse_stmt_without_recovery`, in which macro calls are parsed by
`parse_stmt_mac`. See this part:

    let kind = if (style == MacStmtStyle::Braces
        && self.token != token::Dot
        && self.token != token::Question)
        || self.token == token::Semi
        || self.token == token::Eof
    {
        StmtKind::MacCall(P(MacCallStmt { mac, style, attrs, tokens: None }))
    } else {
        // Since none of the above applied, this is an expression statement macro.
        let e = self.mk_expr(lo.to(hi), ExprKind::MacCall(mac));
        let e = self.maybe_recover_from_bad_qpath(e)?;
        let e = self.parse_expr_dot_or_call_with(e, lo, attrs)?;
        let e = self.parse_expr_assoc_with(
            0,
            LhsExpr::AlreadyParsed { expr: e, starts_statement: false },
        )?;
        StmtKind::Expr(e)
    };

A braced macro call at the head of a statement is always either extended
into ExprKind::Field / MethodCall / Await / Try / Binary, or else
returned as StmtKind::MacCall. We can never get a StmtKind::Expr
containing ExprKind::MacCall containing brace delimiter.

Add classify::expr_is_complete

Fix redundant parens around braced macro call in match arms

use key-value format in stage0 file

Currently, we are working on the Python removal task in bootstrap, which means
we have to extract some data from the stage0 file using shell scripts. However,
parsing values from the stage0.json file is painful because shell scripts don't
have a built-in way to parse JSON files.

This change simplifies the stage0 file format to key-value pairs, which makes
it easily readable from any environment.

Signed-off-by: onur-ozkan <[email protected]>

awk stage0 file on CI

Signed-off-by: onur-ozkan <[email protected]>

use stage0 file in `bootstrap.py`

Signed-off-by: onur-ozkan <[email protected]>

use shared stage0 parser from `build_helper`

Signed-off-by: onur-ozkan <[email protected]>

remove outdated stage0.json parts

Signed-off-by: onur-ozkan <[email protected]>

move comments position in `src/stage0`

Signed-off-by: onur-ozkan <[email protected]>

io::Write::write_fmt: panic if the formatter fails when the stream does not fail

std::alloc: using posix_memalign instead of memalign on solarish.

simpler code path since small alignments are already taken care of.
close rust-langGH-124787

Relax slice safety requirements

Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language.

References must also be non-null

Add `crate_type` method to `Rustdoc`

Add `crate_name` method to `Rustdoc` and `Rustc`

Add `python_command` and `source_path` functions

Add `extern_` method to `Rustdoc`

Migrate `rustdoc-scrape-examples-ordering` to `rmake`

Fix some minor issues from the ui-test auto-porting

solve: replace all `debug` with `trace`

structurally important functions to `debug`

fix hidden title in command-line-arguments docs

Assert that MemCategorizationVisitor actually errors when it bails ungracefully

Inline MemCategorization into ExprUseVisitor

Remove unnecessary mut ref

Introduce TypeInformationCtxt to abstract over LateCtxt/FnCtxt

Make LateCtxt be a type info delegate for EUV for clippy

Try structurally resolve

Apply nits

Propagate errors rather than using return_if_err

Match ergonomics 2024: migration lint

Unfortunately, we can't always offer a machine-applicable suggestion when there are subpatterns from macro expansion.

Co-Authored-By: Guillaume Boisseau <[email protected]>

Add AST pretty-printer tests for let-else

Pretty-print let-else with added parenthesization when needed

rename
bors added a commit to rust-lang-ci/rust that referenced this issue May 22, 2024
…cottmcm

offset: allow zero-byte offset on arbitrary pointers

As per prior `@rust-lang/opsem` [discussion](rust-lang/opsem-team#10) and [FCP](rust-lang/unsafe-code-guidelines#472 (comment)):

- Zero-sized reads and writes are allowed on all sufficiently aligned pointers, including the null pointer
- Inbounds-offset-by-zero is allowed on all pointers, including the null pointer
- `offset_from` on two pointers derived from the same allocation is always allowed when they have the same address
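
The first two rules above can be illustrated with a minimal sketch. It assumes a toolchain that already includes this change; `Marker`, `zero_sized_roundtrip`, and `offset_by_zero` are hypothetical names made up for illustration and are not part of the PR.

```rust
use std::ptr::{self, NonNull};

/// Hypothetical zero-sized type used for the zero-sized accesses below.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Marker;

/// Writes and reads a zero-sized value through a dangling pointer.
fn zero_sized_roundtrip() -> Marker {
    // Dangling but sufficiently aligned: it does not point into any allocation.
    let p: *mut Marker = NonNull::<Marker>::dangling().as_ptr();
    unsafe {
        // Zero-sized write followed by a zero-sized read; no bytes are touched.
        p.write(Marker);
        p.read()
    }
}

/// Offsets a pointer by zero elements; under the rules above this is
/// allowed for every pointer, including the null pointer.
fn offset_by_zero(p: *const u8) -> *const u8 {
    unsafe { p.add(0) }
}

fn main() {
    assert_eq!(zero_sized_roundtrip(), Marker);

    // Null pointer: offsetting by zero yields null again.
    assert!(offset_by_zero(ptr::null()).is_null());

    // A pointer made from an integer: the address is unchanged.
    let dangling = 0x1000 as *const u8;
    assert_eq!(offset_by_zero(dangling), dangling);
}
```

On toolchains that predate this change, the null-pointer case would still be UB according to the older documentation, even if it happens to run without error.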

This removes surprising UB (in particular, even C++ allows "nullptr + 0", which we currently disallow), and it brings us one step closer to an important theoretical property for our semantics ("provenance monotonicity": if operations are valid on bytes without provenance, then adding provenance can't make them invalid).

The minimum LLVM we require (v17) includes https://reviews.llvm.org/D154051, so we can finally implement this.

The `offset_from` change is needed to maintain the equivalence with `offset`: if `let ptr2 = ptr1.offset(N)` is well-defined, then `ptr2.offset_from(ptr1)` should be well-defined and return N. Now consider the case where N is 0 and `ptr1` dangles: we want to still allow offset_from here.
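
As a rough sketch of that equivalence (assuming a toolchain with this change; `roundtrip_zero` is a hypothetical helper name), the round trip through `offset` and `offset_from` with N = 0 looks like this for a dangling pointer, next to the ordinary in-bounds case:

```rust
use std::ptr::NonNull;

/// Offsets `ptr1` by zero and measures the distance back to `ptr1`.
/// Under the rules described above this is well-defined even when
/// `ptr1` dangles.
fn roundtrip_zero(ptr1: *const u32) -> isize {
    unsafe {
        let ptr2 = ptr1.offset(0);
        ptr2.offset_from(ptr1)
    }
}

fn main() {
    // A dangling, aligned pointer that belongs to no allocation.
    let dangling: *const u32 = NonNull::<u32>::dangling().as_ptr();
    assert_eq!(roundtrip_zero(dangling), 0);

    // The ordinary case inside one allocation, for comparison.
    let data = [1u32, 2, 3];
    let base = data.as_ptr();
    let n = unsafe { base.offset(2).offset_from(base) };
    assert_eq!(n, 2);
}
```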

I think we should change offset_from further, but that's a separate discussion.

Fixes rust-lang#65108
[Tracking issue](rust-lang#117945) | [T-lang summary](rust-lang#117329 (comment))

Cc `@nikic`
github-actions bot pushed a commit to rust-lang/miri that referenced this issue May 23, 2024
offset: allow zero-byte offset on arbitrary pointers

flip1995 pushed a commit to flip1995/rust-clippy that referenced this issue May 24, 2024
offset: allow zero-byte offset on arbitrary pointers

bors added a commit to rust-lang/rust-analyzer that referenced this issue Jun 20, 2024
offset: allow zero-byte offset on arbitrary pointers

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jul 15, 2024
…oli-obk

offset_from: always allow pointers to point to the same address

This PR implements the last remaining part of the t-opsem consensus in rust-lang/unsafe-code-guidelines#472: always permit `offset_from` when both pointers have the same address, no matter how they are computed. This is required to achieve *provenance monotonicity*.

Tracking issue: rust-lang#117945

### What is provenance monotonicity and why does it matter?

Provenance monotonicity is the property that adding arbitrary provenance to any no-provenance pointer must never make the program UB. More specifically, in the program state, data in memory is stored as a sequence of [abstract bytes](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#abstract-byte), where each byte can optionally carry provenance. When a pointer is stored in memory, all of the bytes it is stored in carry that provenance. Provenance monotonicity means: if we take some byte that does not have provenance, and give it some arbitrary provenance, then that cannot change program behavior or introduce UB into a UB-free program.

We care about provenance monotonicity because we want to allow the optimizer to remove provenance-stripping operations. Removing a provenance-stripping operation effectively means the program after the optimization has provenance where the program before the optimization did not -- since the provenance removal does not happen in the optimized program. In other words, the compiler transformation added provenance to previously provenance-free bytes. This is exactly what provenance monotonicity lets us do.

We care about removing provenance-stripping operations because `*ptr = *ptr` is, in general, (likely) a provenance-stripping operation. Specifically, consider `ptr: *mut usize` (or any integer type), and imagine the data at `*ptr` is actually a pointer (i.e., we are type-punning between pointers and integers). Then `*ptr` on the right-hand side evaluates to the data in memory *without* any provenance (because [integers do not have provenance](https://rust-lang.github.io/rfcs/3559-rust-has-provenance.html#integers-do-not-have-provenance)). Storing that back to `*ptr` means that the abstract bytes `ptr` points to are the same as before, except their provenance is now gone. This makes `*ptr = *ptr` a provenance-stripping operation. (Here we assume `*ptr` is fully initialized. If it is not initialized, evaluating `*ptr` to a value is UB, so removing `*ptr = *ptr` is trivially correct.)
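
A minimal sketch of that type-punning scenario follows. `rewrite_as_integer` is a hypothetical name, and the claim that the copy strips provenance is a statement about the abstract machine as described above; the compiled program only shows that the address is unchanged.

```rust
/// Re-stores the contents of a pointer-typed slot through an integer-typed
/// view of the same memory, i.e. performs `*ptr = *ptr` with `ptr: *mut usize`.
/// Returns the address stored in the slot afterwards.
fn rewrite_as_integer(slot: &mut *const u8) -> usize {
    // View the slot (which holds a pointer) as if it held a `usize`.
    let as_int: *mut usize = (&mut *slot as *mut *const u8).cast();
    unsafe {
        // The load happens at integer type, so per the text above the value
        // carries no provenance; the store writes those bytes back.
        *as_int = *as_int;
    }
    // Only the address is inspected here. The pointer is deliberately not
    // dereferenced, since in the abstract machine it may no longer carry
    // provenance.
    *slot as usize
}

fn main() {
    let x = 42u8;
    let mut slot: *const u8 = &x;
    let before = slot as usize;
    let after = rewrite_as_integer(&mut slot);
    // The bytes in memory are the same as before.
    assert_eq!(before, after);
}
```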

### What does `offset_from` have to do with provenance monotonicity?

With `ptr = without_provenance(N)`, `ptr.offset_from(ptr)` is always well-defined and returns 0. By provenance monotonicity, I can now add provenance to the two arguments of `offset_from` and it must still be well-defined. Crucially, I can add *different* provenance to the two arguments, and it must still be well-defined. In other words, this must always be allowed: `ptr1.with_addr(N).offset_from(ptr2.with_addr(N))` (and it returns 0). But the current spec for `offset_from` says that the two pointers must either both be derived from an integer or both be derived from the same allocation, which is not in general true for arbitrary `ptr1`, `ptr2`.

To obtain provenance monotonicity, this PR hence changes the spec for offset_from to say that if both pointers have the same address, the function is always well-defined.
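
The "same address, different provenance" case can be sketched as follows, assuming a toolchain with this PR's rule. To stay independent of the strict-provenance API, the sketch uses a hypothetical helper `with_addr_u8` built from `wrapping_offset` as a stand-in for `with_addr`; both helper names are made up for illustration.

```rust
/// Hypothetical stand-in for `with_addr` on `*const u8`: keeps whatever
/// provenance `p` has and moves it to address `addr` using wrapping arithmetic.
fn with_addr_u8(p: *const u8, addr: usize) -> *const u8 {
    p.wrapping_offset(addr.wrapping_sub(p as usize) as isize)
}

/// Distance between two pointers that have the same address but are derived
/// from (possibly) different allocations.
fn same_address_distance(p1: *const u8, p2: *const u8, addr: usize) -> isize {
    let a = with_addr_u8(p1, addr);
    let b = with_addr_u8(p2, addr);
    // With the spec change described above, this is well-defined because
    // `a` and `b` have the same address.
    unsafe { a.offset_from(b) }
}

fn main() {
    let x = 1u8;
    let y = 2u8;
    assert_eq!(same_address_distance(&x, &y, 0x1000), 0);
}
```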

### What further consequences does this have?

It means the compiler can no longer transform `end2 = begin.offset(end.offset_from(begin))` into `end2 = end`. However, it can still be transformed into `end2 = begin.with_addr(end.addr())`, which later parts of the backend (when provenance has been erased) can trivially turn into `end2 = end`.
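
To make the two forms concrete, here is a small sketch for pointers into a single array, where both yield the same address. The function names are hypothetical, and `via_address` emulates `begin.with_addr(end.addr())` with wrapping byte arithmetic rather than calling the strict-provenance methods.

```rust
/// `begin.offset(end.offset_from(begin))`: the original expression.
fn via_offset_from(begin: *const u32, end: *const u32) -> *const u32 {
    unsafe { begin.offset(end.offset_from(begin)) }
}

/// Emulation of `begin.with_addr(end.addr())`: keep `begin`'s provenance,
/// take `end`'s address.
fn via_address(begin: *const u32, end: *const u32) -> *const u32 {
    let delta = (end as usize).wrapping_sub(begin as usize) as isize;
    begin.wrapping_byte_offset(delta)
}

fn main() {
    let data = [10u32, 20, 30];
    let begin = data.as_ptr();
    // One-past-the-end pointer of the same allocation.
    let end = begin.wrapping_add(data.len());

    assert_eq!(via_offset_from(begin, end), end);
    assert_eq!(via_address(begin, end), end);
}
```

The comparison only checks addresses; the difference the text describes is about which provenance the resulting pointer carries, which is not observable in this test.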

The only alternative I am aware of is a fundamentally different handling of zero-sized accesses, where a "no provenance" pointer is not allowed to do zero-sized accesses and instead we have a special provenance that indicates "may be used for zero-sized accesses (and nothing else)". `offset` and `offset_from` would then always be UB on a "no provenance" pointer, and permit zero-sized offsets on a "zero-sized provenance" pointer. This achieves provenance monotonicity. That is, however, a breaking change as it contradicts what we landed in rust-lang#117329. It's also a whole bunch of extra UB, which doesn't seem worth it just to achieve that transformation.

### What about the backend?

LLVM currently doesn't have an intrinsic for pointer difference, so we cast to integers and subtract there anyway. That's never UB, so it is compatible with any relaxation we may want to apply.
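
As a source-level model of that lowering (a sketch only, not the actual codegen; `offset_from_via_integers` is a hypothetical name, and the division by the element size is an assumption about how the byte difference becomes an element count):

```rust
use std::mem::size_of;

/// Models pointer difference as integer arithmetic: cast both pointers to
/// integers, subtract, and divide by the element size.
fn offset_from_via_integers<T>(a: *const T, b: *const T) -> isize {
    let size = size_of::<T>() as isize;
    assert!(size != 0, "element type must not be zero-sized");
    (a as isize).wrapping_sub(b as isize) / size
}

fn main() {
    let data = [0u64; 8];
    let base = data.as_ptr();
    let third = base.wrapping_add(3);

    // Agrees with `offset_from` inside one allocation.
    assert_eq!(offset_from_via_integers(third, base), 3);
    assert_eq!(offset_from_via_integers(third, base), unsafe {
        third.offset_from(base)
    });

    // Equal addresses always give 0, whatever the pointers were derived from.
    assert_eq!(offset_from_via_integers(base, base), 0);
}
```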

If LLVM gets a `ptrsub` in the future, then plausibly it will be consistent with `ptradd` and [consider two equal pointers to be inbounds](rust-lang#124921 (comment)).
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jul 15, 2024
Rollup merge of rust-lang#124921 - RalfJung:offset-from-same-addr, r=oli-obk

offset_from: always allow pointers to point to the same address

This PR implements the last remaining part of the t-opsem consensus in rust-lang/unsafe-code-guidelines#472: always permit `offset_from` when both pointers have the same address, no matter how they are computed. This is required to achieve *provenance monotonicity*.

Tracking issue: rust-lang#117945

### What is provenance monotonicity and why does it matter?

Provenance monotonicity is the property that adding arbitrary provenance to any no-provenance pointer must never make the program UB. More specifically, in the program state, data in memory is stored as a sequence of [abstract bytes](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#abstract-byte), where each byte can optionally carry provenance. When a pointer is stored in memory, all of the bytes it is stored in carry that provenance. Provenance monotonicity means: if we take some byte that does not have provenance, and give it some arbitrary provenance, then that cannot change program behavior or introduce UB into a UB-free program.

We care about provenance monotonicity because we want to allow the optimizer to remove provenance-stripping operations. Removing a provenance-stripping operation effectively means the program after the optimization has provenance where the program before the optimization did not -- since the provenance removal does not happen in the optimized program. IOW, the compiler transformation added provenance to previously provenance-free bytes. This is exactly what provenance monotonicity lets us do.

We care about removing provenance-stripping operations because `*ptr = *ptr` is, in general, (likely) a provenance-stripping operation. Specifically, consider `ptr: *mut usize` (or any integer type), and imagine the data at `*ptr` is actually a pointer (i.e., we are type-punning between pointers and integers). Then `*ptr` on the right-hand side evaluates to the data in memory *without* any provenance (because [integers do not have provenance](https://rust-lang.github.io/rfcs/3559-rust-has-provenance.html#integers-do-not-have-provenance)). Storing that back to `*ptr` means that the abstract bytes `ptr` points to are the same as before, except their provenance is now gone. This makes `*ptr = *ptr` a provenance-stripping operation. (Here we assume `*ptr` is fully initialized. If it is not initialized, evaluating `*ptr` to a value is UB, so removing `*ptr = *ptr` is trivially correct.)
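A minimal sketch of that type-punning scenario, assuming Rust 1.84 or later for `addr()`; the helper name `rewrite_as_integer` is made up for illustration. At run time the stored address is unchanged, which is why an optimizer would like to delete the assignment; in the abstract machine the slot has (likely) lost its provenance, so the sketch only inspects the address afterwards and never dereferences the pointer.

```rust
use std::mem::size_of;

/// Performs `*ptr = *ptr` at integer type on memory that holds a pointer.
fn rewrite_as_integer(slot: &mut *const u8) {
    // The pun below relies on pointers and `usize` having the same size.
    assert_eq!(size_of::<*const u8>(), size_of::<usize>());
    let ptr = (slot as *mut *const u8).cast::<usize>();
    // SAFETY: `ptr` comes from a live `&mut`, so it is valid, aligned, and
    // initialized for a `usize`-sized read and write.
    unsafe { *ptr = *ptr };
}

fn main() {
    let x = 42u8;
    let mut slot: *const u8 = &x;
    let before = slot.addr();

    rewrite_as_integer(&mut slot);

    // The address bits are untouched. `slot` is deliberately not
    // dereferenced here, since its provenance may have been stripped.
    assert_eq!(slot.addr(), before);
    println!("address unchanged: {}", slot.addr() == before);
}
```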

### What does `offset_from` have to do with provenance monotonicity?

With `ptr = without_provenance(N)`, `ptr.offset_from(ptr)` is always well-defined and returns 0. By provenance monotonicity, I can now add provenance to the two arguments of `offset_from` and it must still be well-defined. Crucially, I can add *different* provenance to the two arguments, and it must still be well-defined. In other words, this must always be allowed: `ptr1.with_addr(N).offset_from(ptr2.with_addr(N))` (and it returns 0). But the current spec for `offset_from` says that the two pointers must either both be derived from an integer or both be derived from the same allocation, which is not in general true for arbitrary `ptr1`, `ptr2`.
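The two cases can be written out as follows. This is a sketch under the rule this PR introduces, assuming Rust 1.84 or later for `std::ptr::without_provenance`, `with_addr`, and `addr`; the helper names `bare_distance` and `distance_at_addr` are hypothetical.

```rust
use std::ptr;

/// `offset_from` on a provenance-free pointer and itself.
fn bare_distance(n: usize) -> isize {
    let p = ptr::without_provenance::<u8>(n);
    // SAFETY: both arguments are the same pointer, so they have the same address.
    unsafe { p.offset_from(p) }
}

/// `offset_from` on two pointers with (possibly) different provenance
/// that have been moved to the same address `n`.
fn distance_at_addr(ptr1: *const u8, ptr2: *const u8, n: usize) -> isize {
    let q1 = ptr1.with_addr(n);
    let q2 = ptr2.with_addr(n);
    // SAFETY: relies on the rule described above, namely that two pointers
    // with the same address are always valid arguments for `offset_from`.
    unsafe { q1.offset_from(q2) }
}

fn main() {
    let a = [1u8, 2, 3, 4];
    let b = [5u8, 6, 7, 8];
    let n = 0x1000;

    assert_eq!(bare_distance(n), 0);
    // `a` and `b` are different allocations, so the provenance differs.
    assert_eq!(distance_at_addr(a.as_ptr(), b.as_ptr(), n), 0);
    println!("both distances are 0");
}
```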

To obtain provenance monotonicity, this PR therefore changes the spec for `offset_from` to say that if both pointers have the same address, the function is always well-defined.

### What further consequences does this have?

It means the compiler can no longer transform `end2 = begin.offset(end.offset_from(begin))` into `end2 = end`. However, it can still be transformed into `end2 = begin.with_addr(end.addr())`, which later parts of the backend (when provenance has been erased) can trivially turn into `end2 = end`.
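For illustration, here are the original expression and the still-permitted rewrite side by side, assuming Rust 1.84 or later for `with_addr` and `addr`. The function names are invented for this sketch. Both forms yield a pointer with `begin`'s provenance and `end`'s address; what the compiler may no longer assume is that the result also has `end`'s provenance.

```rust
/// The original form: walk from `begin` by the measured distance.
///
/// # Safety
/// `begin` and `end` must satisfy the requirements of `offset_from`
/// and the resulting offset must satisfy the requirements of `offset`.
unsafe fn end_via_offset(begin: *const u32, end: *const u32) -> *const u32 {
    unsafe { begin.offset(end.offset_from(begin)) }
}

/// The rewrite that remains valid: keep `begin`'s provenance, take `end`'s address.
fn end_via_with_addr(begin: *const u32, end: *const u32) -> *const u32 {
    begin.with_addr(end.addr())
}

fn main() {
    let data = [10u32, 20, 30, 40];
    let begin = data.as_ptr();
    // SAFETY: one-past-the-end of the same array is in bounds.
    let end = unsafe { begin.add(data.len()) };

    // SAFETY: both pointers are derived from `data` and are in bounds.
    let a = unsafe { end_via_offset(begin, end) };
    let b = end_via_with_addr(begin, end);

    // Raw pointer equality compares addresses only.
    assert_eq!(a, b);
    assert_eq!(b, end);
    println!("same address: {}", a == end);
}
```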

The only alternative I am aware of is a fundamentally different handling of zero-sized accesses, where a "no provenance" pointer is not allowed to do zero-sized accesses and instead we have a special provenance that indicates "may be used for zero-sized accesses (and nothing else)". `offset` and `offset_from` would then always be UB on a "no provenance" pointer, and permit zero-sized offsets on a "zero-sized provenance" pointer. This achieves provenance monotonicity. That is, however, a breaking change as it contradicts what we landed in rust-lang#117329. It's also a whole bunch of extra UB, which doesn't seem worth it just to achieve that transformation.

### What about the backend?

LLVM currently doesn't have an intrinsic for pointer difference, so we cast to integers and subtract there anyway. That's never UB, so it is compatible with any relaxation we may want to apply.
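The lowering described here can be approximated in source form as below. This is a rough sketch of the idea, not the code rustc actually emits: `ptr_diff_via_integers` is a hypothetical helper, and `addr()` assumes Rust 1.84 or later. Because it only does integer arithmetic, it is defined for any pair of pointers, whatever their provenance.

```rust
use std::mem::size_of;

/// Pointer difference in units of `T`, computed purely on integer addresses.
fn ptr_diff_via_integers<T>(a: *const T, b: *const T) -> isize {
    let size = size_of::<T>() as isize;
    assert!(size != 0, "element size must be nonzero");
    // Subtract the addresses as integers, then scale by the element size.
    let byte_diff = (a.addr() as isize).wrapping_sub(b.addr() as isize);
    byte_diff / size
}

fn main() {
    let data = [1u64, 2, 3, 4, 5];
    let begin = data.as_ptr();
    // SAFETY: one-past-the-end of the same array is in bounds.
    let end = unsafe { begin.add(data.len()) };

    // SAFETY: both pointers are derived from `data`.
    let expected = unsafe { end.offset_from(begin) };
    assert_eq!(ptr_diff_via_integers(end, begin), expected);
    assert_eq!(ptr_diff_via_integers(begin, begin), 0);
    println!("{}", ptr_diff_via_integers(end, begin));
}
```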

If LLVM gets a `ptrsub` in the future, then plausibly it will be consistent with `ptradd` and [consider two equal pointers to be inbounds](rust-lang#124921 (comment)).
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Jul 16, 2024
offset_from: always allow pointers to point to the same address
@RalfJung
Member Author

I created a reference PR at rust-lang/reference#1541. That should complete the implementation of this feature. :) (Aside from the GCC backend, which is tracked separately at rust-lang/rustc_codegen_gcc#516.)

joshlf added a commit to google/zerocopy that referenced this issue Sep 7, 2024
Now that [1] is completed, zero-sized accesses no longer require
provenance. Per [2], zero-sized references are no longer required to be
dereferenceable, and so might not carry provenance.

This commit updates `Ptr`'s invariants to not require provenance or a
valid allocation when its referent is zero-sized.

[1] rust-lang/rust#117945
[2] rust-lang/rust#125021
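As a sketch of what the completed feature permits (this is not zerocopy code), the following performs zero-sized reads, writes, and a zero offset through a pointer that has no provenance and points to no allocation. It assumes Rust 1.84 or later for `std::ptr::without_provenance_mut` and `addr`; `zst_roundtrip` is a hypothetical helper name.

```rust
use std::mem::align_of;
use std::ptr;

/// Does a zero-sized write, read, and offset through a pointer that has
/// no provenance, and returns the resulting address.
fn zst_roundtrip(addr: usize) -> usize {
    // Keep the pointer non-null and aligned for `[u32; 0]`.
    assert!(addr != 0 && addr % align_of::<[u32; 0]>() == 0);
    let p = ptr::without_provenance_mut::<[u32; 0]>(addr);
    // SAFETY: every access below is zero-sized, and `p` is non-null and
    // aligned; under the rules tracked in this issue that is sufficient.
    unsafe {
        p.write([]);
        let value: [u32; 0] = p.read();
        assert!(value.is_empty());
        // Offsetting by zero elements is likewise allowed on any pointer.
        p.cast::<u32>().add(0).addr()
    }
}

fn main() {
    assert_eq!(zst_roundtrip(0x1000), 0x1000);
    println!("zero-sized accesses succeeded");
}
```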
github-merge-queue bot pushed a commit to google/zerocopy that referenced this issue Sep 7, 2024
Closes #874
mattheww added a commit to mattheww/nomicon that referenced this issue Oct 15, 2024
The new rules were tracked in
rust-lang/rust#117945

The corresponding update to the Reference was
rust-lang/reference#1541