Tracking Issue for allowing zero-sized memory accesses and offsets #117945

RalfJung · 2023-11-15T18:04:41Z

This issue tracks implementing the t-opsem decision in rust-lang/unsafe-code-guidelines#472. This will require adjustments in many places (codegen, Miri, library docs, reference, ...). The intention is to track here what needs to be done until the transition is complete.

update LLVM codegen
update cranelift codegen (not needed)
update GCC codegen: deferred to rustc_codegen_gcc#516 (Make sure memcpy/memmove/memset with size 0 behave correctly)
update Miri
update library docs
update the reference: reference#1541 (update 'dangling pointers' to new zero-sized rules)

Implementation history

bjorn3 · 2023-11-16T15:47:17Z

cg_clif already accepts ZST memory accesses and pointer offsets. Pointer offsets are implemented as integer addition, which doesn't have UB, and ZST memory accesses never get turned into loads and stores in Cranelift IR, as there is no instruction that does so.
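To illustrate why an offset lowered to integer addition cannot be UB, here is a minimal model in plain Rust. It is only a sketch: `offset_addr` and its parameters are hypothetical names chosen for this example, not part of cg_clif or Cranelift.

```rust
/// Hypothetical model of a pointer offset lowered to integer addition.
/// `addr` is the pointer's address, `count` the element count, and
/// `elem_size` the element size in bytes (all names are illustrative).
fn offset_addr(addr: usize, count: isize, elem_size: usize) -> usize {
    // Wrapping arithmetic never traps and has no undefined behavior,
    // so an offset of 0 is fine for any address, including 0 (null).
    let delta = count.wrapping_mul(elem_size as isize);
    addr.wrapping_add(delta as usize)
}
```

In this model, `offset_addr(0, 0, 4)` returns `0` and `offset_addr(0x1000, 2, 4)` returns `0x1008`; no input can trigger UB because only wrapping integer operations are involved.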

RalfJung · 2023-11-16T16:00:15Z

> memory accesses never get turned into loads and stores in cranelift ir as there is no instruction that does so.

Besides direct accesses, the other concerns are the copy, write_bytes, compare_bytes intrinsics. Those must be implemented in a way that they are not UB when elem_count*elem_size is 0.

bjorn3 · 2023-11-16T16:02:43Z

They are implemented by calling the respective libc functions which LLVM already expects to accept 0-sized accesses, right?

RalfJung · 2023-11-16T16:04:39Z

GCC codegen might also need updating, Cc @antoyo @GuillaumeGomez

RalfJung · 2023-11-16T16:05:19Z

> They are implemented by calling the respective libc functions which LLVM already expects to accept 0-sized accesses, right?

Well what LLVM assumes doesn't matter for the cranelift backend, does it? ;) But more importantly, Rust explicitly assumes this itself as documented here.

GuillaumeGomez · 2023-11-16T19:32:23Z

No problem. Please ping us when we need to update our part and thanks for the ping!

RalfJung · 2023-11-16T20:28:55Z

Well I'm asking you if you need to update anything. :) You need to make sure that the Offset MIR binop is compiled in a way that offset by 0 bytes is always Defined Behavior even if the pointer operand is null or dangling or out of bounds or whatever.

I think zero-sized memory accesses disappear in the SSA codegen infrastructure before your backend even sees them so they should be fine.

And finally the copy, copy_nonoverlapping, write_bytes, compare_bytes intrinsics need to be lowered in a way that they are Defined Behavior when the size is 0, even if the pointers are null or dangling or whatever.
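For reference, the zero-sized cases described above look roughly like this from the library side. This is a sketch, not a test from the Rust repository: `zero_sized_ops` is a hypothetical name, and the example deliberately uses a dangling but non-null, aligned pointer, since whether null is accepted depends on how far the transition tracked here has progressed in a given toolchain. `compare_bytes` is an internal intrinsic and is left out.

```rust
use std::mem::size_of;
use std::ptr::{self, NonNull};

/// Runs zero-sized forms of the operations on a pointer that refers to no
/// allocation, and returns the number of bytes that were accessed.
fn zero_sized_ops() -> usize {
    // Dangling, but non-null and well-aligned for `u32`.
    let p: *mut u32 = NonNull::<u32>::dangling().as_ptr();
    let count = 0usize;
    unsafe {
        // Offsetting by 0 elements yields the same pointer.
        let q = p.add(0);
        // Each call covers `count * size_of::<u32>() == 0` bytes.
        ptr::copy(p as *const u32, q, count);
        ptr::copy_nonoverlapping(p as *const u32, q, count);
        ptr::write_bytes(q, 0xAB, count);
        // Reading a value of a zero-sized type is a zero-sized access.
        let _unit: () = ptr::read(NonNull::<()>::dangling().as_ptr());
    }
    count * size_of::<u32>()
}
```

Calling `zero_sized_ops()` returns `0`, because no byte of memory is read or written.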

antoyo · 2023-11-16T21:17:51Z

These intrinsics are implemented by calling the GCC builtin functions: memcmp, memset, memcpy, memmove.
I'll double-check, but a count of zero seems to be fine, whereas NULL pointers are not.

RalfJung · 2023-11-16T21:24:46Z

Okay, something needs to change then in the backend because we'll allow null pointers for the Rust intrinsics.
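One possible shape for such a change is to skip the underlying call entirely when the size is 0, so that the pointers are never inspected. The sketch below shows the idea in plain Rust; `guarded_copy` is a hypothetical helper written for this illustration and is not how rustc_codegen_gcc actually lowers the intrinsics.

```rust
use std::ptr;

/// Hypothetical helper: copies `len` bytes from `src` to `dst` and returns
/// `true`, or returns `false` without touching either pointer if `len == 0`.
///
/// # Safety
/// For `len > 0`, `src` and `dst` must be valid for `len` bytes and must
/// not overlap. For `len == 0`, any pointers are accepted, including null.
unsafe fn guarded_copy(dst: *mut u8, src: *const u8, len: usize) -> bool {
    if len == 0 {
        // Zero-sized copy: nothing to do, so never hand the (possibly
        // null or dangling) pointers to the underlying copy routine.
        return false;
    }
    // SAFETY: the caller guarantees validity and non-overlap for `len > 0`.
    unsafe { ptr::copy_nonoverlapping(src, dst, len) };
    true
}

/// Exercises both paths: a zero-sized copy with null pointers, then a
/// real copy between two arrays.
fn demo() -> (bool, bool, [u8; 3]) {
    let src = [1u8, 2, 3];
    let mut dst = [0u8; 3];
    let skipped = unsafe { guarded_copy(ptr::null_mut(), ptr::null(), 0) };
    let copied = unsafe { guarded_copy(dst.as_mut_ptr(), src.as_ptr(), src.len()) };
    (skipped, copied, dst)
}
```

The branch costs one comparison per call; a backend could presumably omit it whenever the size is a known non-zero constant.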

RalfJung · 2024-02-19T09:01:05Z

I updated #117329 to make it ready to land.

@rust-lang/opsem to address this concern, I made offset_from have library UB but not language UB on two pointers with the same address but provenance for different allocations. Please let me know what you think.

Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language.

Rollup merge of rust-lang#125021 - joshlf:patch-11, r=RalfJung: Update reference safety requirements. Per rust-lang#116677 (comment), the language as written promises too much. This PR relaxes the language to be consistent with current semantics. If and when rust-lang#117945 is implemented, we can revert to the old language. While we're here, we also require that references be non-null. cc `@RalfJung`

If and when rust-lang#117945 is implemented, we can revert to the old language. References must also be non-null Add `crate_type` method to `Rustdoc` Add `crate_name` method to `Rustdoc` and `Rustc` Add `python_command` and `source_path` functions Add `extern_` method to `Rustdoc` Migrate `rustdoc-scrape-examples-ordering` to `rmake` Fix some minor issues from the ui-test auto-porting solve: replace all `debug` with `trace` structurally important functions to `debug` fix hidden title in command-line-arguments docs Assert that MemCategorizationVisitor actually errors when it bails ungracefully Inline MemCategorization into ExprUseVisitor Remove unncessary mut ref Introduce TypeInformationCtxt to abstract over LateCtxt/FnCtxt Make LateCtxt be a type info delegate for EUV for clippy Try structurally resolve Apply nits Propagate errors rather than using return_if_err Match ergonomics 2024: migration lint Unfortunately, we can't always offer a machine-applicable suggestion when there are subpatterns from macro expansion. Co-Authored-By: Guillaume Boisseau <[email protected]> Add AST pretty-printer tests for let-else Pretty-print let-else with added parenthesization when needed rename

@saethlin

reorganised attrs removed OsStr impls added backticks Add note about possible allocation-sharing to Arc/Rc<str/[T]/CStr>::default. Use shared statics for the ArcInner for Arc<str, CStr>::default, and for Arc<[T]>::default where alignof(T) <= 16. fixed unsafe block Revert "fixed unsafe block" This reverts commit 6eb6aee. Return coherent description for boolean instead of panicking Improve check-cfg CLI errors with more structured diagnostics Move various stdlib tests to library/std/tests Run tidy on tests Rename test for issue 21058 Implement `edition` method on `Rustdoc` type as well Migrate `run-make/doctests-runtool` to rmake Rename `run-make-support` library `output` method to `command_output` Add new `output` method to `Rustc` and `Rustdoc` types Migrate `run-make/rustdoc-error-lines` to `rmake.rs` add f16 associated constants NaN and infinity are not included as they require arithmetic. add f128 associated constants NaN and infinity are not included as they require arithmetic. add constants in std::f16::consts add constants in std::f128::consts update error messages in ui tests Document that `create_dir_all` calls `mkdir`/`CreateDirW` multiple times Also mention that there might be leftover directories in the error case. Prefer lower vtable candidates in select in new solver Don't consider candidates with no failing where clauses Use super_fold in RegionsToStatic visitor Make check-cfg docs more user-friendly Record impl args in the InsepctCandiate rather than rematching during select Use correct ImplSource for alias bounds BorrowckInferCtxt: infcx by value borrowck: more eagerly prepopulate opaques switch new solver to directly inject opaque types Update books Adjust dbg.value/dbg.declare checks for LLVM update llvm/llvm-project#89799 changes llvm.dbg.value/declare intrinsics to be in a different, out-of-instruction-line representation. For example call void @llvm.dbg.declare(...) becomes #dbg_declare(...) 
Update tests accordingly to work with both the old and new way. Adjust 64-bit ARM data layouts for LLVM update LLVM has updated data layouts to specify `Fn32` on 64-bit ARM to avoid C++ accidentally underaligning functions when trying to comply with member function ABIs. This should only affect Rust in cases where we had a similar bug (I don't believe we have one), but our data layout must match to generate code. As a compatibility adaptatation, if LLVM is not version 19 yet, `Fn32` gets voided from the data layout. See llvm/llvm-project#90415 Update version of cc crate to v1.0.97 Reason: In order to build the Windows version of the Rust toolchain for the Android platform, the following patch to the cc is crate is required to avoid incorrectly determining that we are building with the Android NDK: rust-lang/cc-rs@57853c4 This patch is present in version 1.0.80 and newer versions of the cc crate. The rustc source distribution currently has 3 different versions of cc in the vendor directory, only one of which has the necessary fix. We (the Android Rust toolchain) are currently maintaining local patches to upgrade the cc crate dependency versions, which we would like to upstream. Furthermore, beyond the specific reason, the cc crate in bootstrap is currently pinned at an old version due to problems in the past when trying to update it. It is worthwhile to figure out and resolve these problems so we can keep the dependency up-to-date. Other fixes: As of cc v1.0.78, object files are prefixed with a 16-character hash. Update src/bootstrap/src/core/build_steps/llvm.rs to account for this to avoid failures when building libunwind and libcrt. Note that while the hash prefix was introduced in v1.0.78, in order to determine the names of the object files without scanning the directory, we rely on the compile_intermediates method, which was introduced in cc v1.0.86 As of cc v1.0.86, compilation on MacOS uses the -mmacosx-version-min flag. 
A long-standing bug in the CMake rules for compiler-rt causes compilation to fail when this flag is specified. So we add a workaround to suppress this flag. Updating to cc v1.0.91 and newer requires fixes to bootstrap unit tests. The unit tests use targets named "A", "B", etc., which fail a validation check introduced in 1.0.91 of the cc crate. Implement lldb formattter for "clang encoded" enums (LLDB 18.1+) Summary: I landed a fix last year to enable `DW_TAG_variant_part` encoding in LLDBs (https://reviews.llvm.org/D149213). This PR is a corresponding fix in synthetic formatters to decode that information. This is in no way perfect implementation but at least it improves the status quo. But most types of enums will be visible and debuggable in some way. I've also updated most of the existing tests that touch enums and re-enabled test cases based on LLDB for enums. Test Plan: ran tests `./x test tests/debuginfo/`. Also tested manually in LLDB CLI and LLDB VSCode Other Thoughs A better approach would probably be adopting [formatters from codelldb](https://github.com/vadimcn/codelldb/blob/master/formatters/rust.py). There is some neat hack that hooks up summary provider via synthetic provider which can ultimately fix more display issues for Rust types and enums too. But getting it to work well might take more time that I have right now. f16::is_sign_{positive,negative} were feature-gated on f128 Correct the const stabilization of `last_chunk` for slices `<[T]>::last_chunk` should have become const stable as part of <rust-lang#117561>. Update the const stability gate to reflect this. Add tests Lower never patterns to Unreachable in mir rustdoc: dedup search form HTML This change constructs the search form HTML using JavaScript, instead of plain HTML. 
It uses a custom element because - the [parser]'s insert algorithm runs the connected callback synchronously, so we won't get layout jank - it requires very little HTML, so it's a real win in size [parser]: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token This shrinks the standard library by about 60MiB, by my test. rustdoc: allow custom element rustdoc-search generalize hr alias: avoid unconstrainable infer vars narrow down visibilities in `rustc_parse::lexer` replace another Option<Span> by DUMMY_SP Don't ICE when we cannot eval a const to a valtree in the new solver Do not ICE on `AnonConst`s in `diagnostic_hir_wf_check` coverage: Add branch coverage support for let-else coverage: Add branch coverage support for if-let and let-chains Do not ICE on foreign malformed `diagnostic::on_unimplemented` Fix rust-lang#124651. Add test for rust-lang#124651 Update cargo compiler: Privatize `Parser::current_closure` This was added as pub in 2021 and remains only privately used in 2024! compiler: derive Debug in parser It's annoying to debug the parser if you have to stop every five seconds to add a Debug impl. compiler: add `Parser::debug_lookahead` I tried debugging a parser-related issue but found it annoying to not be able to easily peek into the Parser's token stream. Add a convenience fn that offers an opinionated view into the parser, but one that is useful for answering basic questions about parser state. Fuchsia test runner: fixup script This commit fixes several issues in the fuchsia-test-runner.py script: 1. Migrate from `pm` to `ffx` for package management, as `pm` is now deprecated. Furthermore, the `pm` calls used in this script no longer work at Fuchsia's HEAD. This is the largest change in this commit, and impacts all steps around repository management (creation and registration of the repo, as well as package publishing). 2. Allow for `libtest` to be either statically or dynamically linked. 
The script assumed it was dynamically linked, but the current Rust behavior at HEAD is to statically link it. 3. Minor cleanup to use `ffx --machine json` rather than string parsing. 4. Minor cleanup to the docs around the script. std::net: Socket::new_raw set to SO_NOSIGPIPE on freebsd/netbsd/dragonfly. add note about `AlreadyExists` to `create_new` Apply suggestions from code review Co-authored-by: Jubilee <[email protected]> iOS/tvOS/watchOS/visionOS: Default to kernel-defined backlog in listen This behavior is defined in general for the XNU kernel, not just macOS: https://github.com/apple-oss-distributions/xnu/blob/rel/xnu-10002/bsd/kern/uipc_socket.c iOS/tvOS/watchOS/visionOS: Set the main thread name Tested in the iOS simulator that the thread name is not set by default, and that setting it improves the debugging experience in lldb / Xcode. iOS/tvOS/watchOS: Fix alloc w. large alignment on older versions Tested on an old MacBook and the iOS simulator. iOS/tvOS/watchOS/visionOS: Fix reading large files Tested in the iOS simulator with something like: ``` let mut buf = vec![0; c_int::MAX as usize - 1 + 2]; let read_bytes = f.read(&mut buf).unwrap(); ``` iOS/tvOS/watchOS/visionOS: Improve File Debug impl This uses `libc::fcntl`, which, while not explicitly marked as available in the headers, is already used by `File::sync_all` and `File::sync_data` on these platforms, so should be fine to use here as well. next_power_of_two: add a doctest to show what happens on 0 rustc: Change LLVM target for the wasm32-wasip2 Rust target This commit changes the LLVM target of for the Rust `wasm32-wasip2` target to `wasm32-wasip2` as well. LLVM does a bit of detection on the target string to know when to call `wasm-component-ld` vs `wasm-ld` so otherwise clang is invoking the wrong linker. rustc: Don't pass `-fuse-ld=lld` on wasm targets This argument isn't necessary for WebAssembly targets since `wasm-ld` is the only linker for the targets. 
Passing it otherwise interferes with Clang's linker selection on `wasm32-wasip2` so avoid it altogether. rustc: Change wasm32-wasip2 to PIC-by-default This commit changes the new `wasm32-wasip2` target to being PIC by default rather than the previous non-PIC by default. This change is intended to make it easier for the standard library to be used in a shared object in its precompiled form. This comes with a hypothetical modest slowdown but it's expected that this is quite minor in most use cases or otherwise wasm compilers and/or optimizing runtimes can elide the cost. Handle normalization failure in `struct_tail_erasing_lifetimes` Fixes an ICE that occurred when the struct in question has an error Fix insufficient logic when searching for the underlying allocation in the `invalid_reference_casting` lint, when trying to lint on bigger memory layout casts. rustdoc: use stability, instead of features, to decide what to show To decide if internal items should be inlined in a doc page, check if the crate is itself internal, rather than if it has the rustc_private feature flag. The standard library uses internal items, but is not itself internal and should not show internal items on its docs pages. Avoid a cast in `ptr::slice_from_raw_parts(_mut)` Casting to `*const ()` or `*mut ()` just bloats the MIR, so let's not. If ACP#362 goes through we can keep calling `ptr::from_raw_parts(_mut)` in these also without the cast, but that hasn't had any libs-api attention yet, so I'm not waiting on it. add enum variant field names to make the code clearer remove redundant flat vs nested distinction to simplify enum turn all_nested_unused into used_childs store the span of the nested part of the use tree in the ast remove braces when fixing a nested use tree into a single use Use generic `NonZero` in examples. Simplify `clippy` lint. Simplify suggestion. Use generic `NonZero`. 
crashes: add lastest batch of crash tests Make sure we don't deny macro vars w keyword names Simplify `use crate::rustc_foo::bar` occurrences. They can just be written as `use rustc_foo::bar`, which is far more standard. (I didn't even know that a `crate::` prefix was valid.) Update cc crate to v1.0.97 Ignore empty RUSTC_WRAPPER in bootstrap This change ignores the RUSTC_WRAPPER_REAL environment variable if it's set to the empty string. This matches cargo behaviour and allows users to easily shadow a globally set RUSTC_WRAPPER (which they might have set for non-rustc projects). Handle normalization failure in `struct_tail_erasing_lifetimes` Fixes an ICE that occurred when the struct in question has an error Implement `as_chunks` with `split_at_unchecked` Remove `macro_use` from `stable_hasher`. Normal `use` items are nicer. Reorder top-level crate items. - `use` before `mod` - `pub` before `non-pub` - Alphabetical order within sections Remove `extern crate tracing`. `use` is a nicer way of doing things. Document `Pu128`. And move the `repr` line after the `derive` line, where it's harder to overlook. (I overlooked it initially, and didn't understand how this type worked.) Remove `TinyList`. It is optimized for lists with a single element, avoiding the need for an allocation in that case. But `SmallVec<[T; 1]>` also avoids the allocation, and is better in general: more standard, log2 number of allocations if the list exceeds one item, and a much more capable API. This commit removes `TinyList` and converts the two uses to `SmallVec<[T; 1]>`. It also reorders the `use` items in the relevant file so they are in just two sections (`pub` and non-`pub`), ordered alphabetically, instead of many sections. (This is a relevant part of the change because I had to decide where to add a `use` item for `SmallVec`.) Remove `vec_linked_list`. It provides a way to effectively embed a linked list within an `IndexVec` and also iterate over that list. 
It's written in a very generic way, involving two traits `Links` and `LinkElem`. But the `Links` trait is only impl'd for `IndexVec` and `&IndexVec`, and the whole thing is only used in one module within `rustc_borrowck`. So I think it's over-engineered and hard to read. Plus it has no comments. This commit removes it, and adds a (non-generic) local iterator for the use within `rustc_borrowck`. Much simpler. Remove `enum_from_u32`. It's a macro that just creates an enum with a `from_u32` method. It has two arms. One is unused and the other has a single use. This commit inlines that single use and removes the whole macro. This increases readability because we don't have two different macros interacting (`enum_from_u32` and `language_item_table`). Update Tests Fix Error Messages for `break` Inside Coroutines Previously, `break` inside `gen` blocks and functions were incorrectly identified to be enclosed by a closure. This PR fixes it by displaying an appropriate error message for async blocks, async closures, async functions, gen blocks, gen closures, gen functions, async gen blocks, async gen closures and async gen functions. Note: gen closure and async gen closure are not supported by the compiler yet but I have added an error message here assuming that they might be implemented in the future. Also, fixes grammar in a few places by replacing `inside of a $coroutine` with `inside a $coroutine`. Migrate `run-make/rustdoc-map-file` to rmake Add more ICEs due to malformed diagnostic::on_unimplemented Fix ICEs in diagnostic::on_unimplemented Handle field projections like slice indexing in invalid_reference_casting Fix typos Do not add leading asterisk in the `PartialEq` Adding leading asterisk can cause compilation failure for the _types_ that don't implement the `Copy`. 
Use sum type for `WorkflowRunType` Parse try build CI job name from commit message Make the regex more robust Address review comments CI: fix auto builds and make sure that we always have at least a single CI job Include the line number in tidy's `iter_header` Tidy check for test revisions that are mentioned but not declared If a `[revision]` name appears in a test header directive or error annotation, but isn't declared in the `//@ revisions:` header, that is almost always a mistake. In cases where a revision needs to be temporarily disabled, adding it to an `//@ unused-revision-names:` header will suppress these checks for that name. Adding the wildcard name `*` to the unused list will suppress these checks for the entire file. Fix test problems discovered by the revision check Most of these changes either add revision names that were apparently missing, or explicitly mark a revision name as currently unused. fix rust-lang#124714 str.to_lowercase sigma handling Make a minimal amount of region APIs public Add `ErrorGuaranteed` to `Recovered::Yes` and use it more. The starting point for this was identical comments on two different fields, in `ast::VariantData::Struct` and `hir::VariantData::Struct`: ``` // FIXME: investigate making this a `Option<ErrorGuaranteed>` recovered: bool ``` I tried that, and then found that I needed to add an `ErrorGuaranteed` to `Recovered::Yes`. Then I ended up using `Recovered` instead of `Option<ErrorGuaranteed>` for these two places and elsewhere, which required moving `ErrorGuaranteed` from `rustc_parse` to `rustc_ast`. This makes things more consistent, because `Recovered` is used in more places, and there are fewer uses of `bool` and `Option<ErrorGuaranteed>`. And safer, because it's difficult/impossible to set `recovered` to `Recovered::Yes` without having emitted an error. 
interpret/miri: better errors on failing offset_from chore: remove repetitive words Make `#![feature]` suggestion MaybeIncorrect Update Makefiles with explanatory comments correct comments add FIXME Upgrade the version of Clang used in the build, move MSVC builds to Server 2022 Rename Generics::params to Generics::own_params Add benchmarks for `impl Debug for str` In order to inform future perf improvements and prevent regressions, lets add some benchmarks that stress `impl Debug for str`. Remove unused `step_trait` feature. Also sort the features. Remove unused `LinkSelfContainedDefault::is_linker_enabled` method. Correct a comment. I tried simplifying `RegionCtxt`, which led me to finding that the fields are printed in `sccs_info`. Fix up `DescriptionCtx::new`. The comment mentions that `ReBound` and `ReVar` aren't expected here. Experimentation with the full test suite indicates this is true, and that `ReErased` also doesn't occur. So the commit introduces `bug!` for those cases. (If any of them show up later on, at least we'll have a test case.) The commit also remove the first sentence in the comment. `RePlaceholder` is now handled in the match arm above this comment and nothing is printed for it, so that sentence is just wrong. Furthermore, issue rust-lang#13998 was closed some time ago. Fix out-of-date comment. The type name has changed. Remove `TyCtxt::try_normalize_erasing_late_bound_regions`. It's unused. Remove out-of-date comment. The use of `Binder` was removed in the recent rust-lang#123900, but the comment wasn't removed at the same time. De-tuple two `vtable_trait_first_method_offset` args. Thus eliminating a `FIXME` comment. opt-dist: use xz2 instead of xz crate xz crate consist of simple reexport of xz2 crate. Why? Idk. analyse visitor: build proof tree in probe update crashes always use `GenericArgsRef` Inline and remove unused methods. 
`InferCtxt::next_{ty,const,int,float}_var_id` each have a single call site, in `InferCtt::next_{ty,const,int,float}_var` respectively. The only remaining method that creates a var_id is `InferCtxt::next_ty_var_id_in_universe`, which has one use outside the crate. Use fewer origins when creating type variables. `InferCtxt::next_{ty,const}_var*` all take an origin, but the `param_def_id` is almost always `None`. This commit changes them to just take a `Span` and build the origin within the method, and adds new methods for the rare cases where `param_def_id` might not be `None`. This avoids a lot of tedious origin building. Specifically: - next_ty_var{,_id_in_universe,_in_universe}: now take `Span` instead of `TypeVariableOrigin` - next_ty_var_with_origin: added - next_const_var{,_in_universe}: takes Span instead of ConstVariableOrigin - next_const_var_with_origin: added - next_region_var, next_region_var_in_universe: these are unchanged, still take RegionVariableOrigin The API inconsistency (ty/const vs region) seems worth it for the large conciseness improvements. print walltime benchmarks with subnanosecond precision example results when benchmarking 1-4 serialized ADD instructions ``` running 4 tests test add ... bench: 0.24 ns/iter (+/- 0.00) test add2 ... bench: 0.48 ns/iter (+/- 0.01) test add3 ... bench: 0.72 ns/iter (+/- 0.01) test add4 ... 
bench: 0.96 ns/iter (+/- 0.01) ``` emit fractional benchmark nanoseconds in libtest's JSON output format bootstrap should also render fractional nanoseconds for benchmarks from_str_radix: outline only the panic function codegen: memmove/memset cannot be non-temporal coverage: Separately compute the set of BCBs with counter mappings coverage: Make the special case for async functions exit early coverage: Don't recompute the number of test vector bitmap bytes The code in `extract_mcdc_mappings` that allocates these bytes already knows how many are needed in total, so there's no need to immediately recompute that value in the calling function. coverage: Destructure the mappings struct to make sure we don't miss any coverage: Rename `CoverageSpans` to `ExtractedMappings` coverage: Tidy imports in `rustc_mir_transform::coverage` Fix parse error message for meta items Refactor float `Primitive`s to a separate `Float` type Migrate `run-make/rustdoc-output-path` to rmake Make builtin_deref just return a Ty rename some variants in FulfillmentErrorCode Remove glob imports for ObligationCauseCode Rename some ObligationCauseCode variants More rename fallout Name tweaks Add a codegen test for transparent aggregates Aggregating arrays can always take the place path Make SSA aggregates without needing an alloca Lift `Lift` Lift `TraitRef` into `rustc_type_ir` Also debug Apply nits, make some bounds into supertraits on inherent traits Add `-lmingwex` second time in `mingw_libs` Upcoming mingw-w64 releases will contain small math functions refactor which moved implementation around. As a result functions like `lgamma` now depend on libraries in this order: `libmingwex.a` -> `libmsvcrt.a` -> `libmingwex.a`. 
Fixes rust-lang#124221 ignore generics args in attribute paths bootstrap: add comments for the automatic dry run fix typo Co-authored-by: jyn <[email protected]> reachable computation: extend explanation of what this does, and why Make sure we consume a generic arg when checking mistyped turbofish Update cargo std::rand: adding solaris/illumos for getrandom support. To help solarish support for miri https://rust-lang/miri/issues/3567 Update ena to 0.14.3 Fix typo in ManuallyDrop's documentation Add @saethlin to some triagebot groups Refactor Apple `target_abi` This was bundled together with `Arch`, which complicated a few code paths and meant we had to do more string matching than necessary. Match ergonomics 2024: let `&` patterns eat `&mut` Various fixes: - Only show error when move-check would not be triggered - Add structured suggestion Fix spans when macros are involved Comments and fixes Rename `explicit_ba` No more `Option<Option<>>` Remove redundant comment Move all ref pat logic into `check_pat_ref` Add comment on `cap_to_weakly_not` Co-authored-by: Guillaume Boisseau <[email protected]> Stabilize `byte_slice_trim_ascii` for `&[u8]`/`&str` Remove feature from documentation examples Add rustc_const_stable attribute to stabilized functions Update intra-doc link for `u8::is_ascii_whitespace` on `&[u8]` functions Document proper usage of `fmt::Error` and `fmt()`'s `Result`. Documentation of these properties previously existed in a lone paragraph in the `fmt` module's documentation: <https://doc.rust-lang.org/1.78.0/std/fmt/index.html#formatting-traits> However, users looking to implement a formatting trait won't necessarily look there. Therefore, let's add the critical information (that formatting per se is infallible) to all the involved items. 
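For context on the "Relax slice safety requirements" entry in the log above, which points back at this issue: the sketch below shows zero-sized operations that are already accepted when the pointer is non-null and aligned. The helper names (`empty_slice`, `copy_zero_elements`, `read_zst`) are made up for illustration and are not part of any of the referenced PRs. Cases involving null or otherwise invalid pointers are deliberately left out, since those are what this issue is about.

```rust
use std::ptr::{self, NonNull};

/// Hypothetical helper: builds an empty slice without a backing allocation.
fn empty_slice<T: 'static>() -> &'static [T] {
    // `NonNull::dangling()` is non-null and aligned for `T`.
    let p = NonNull::<T>::dangling().as_ptr();
    // SAFETY: the pointer is non-null and aligned, and with a length of 0
    // no memory is ever read through it.
    unsafe { std::slice::from_raw_parts(p, 0) }
}

/// Hypothetical helper: a copy of zero elements leaves the destination untouched.
fn copy_zero_elements() -> [u8; 3] {
    let src = [9u8, 9, 9];
    let mut dst = [1u8, 2, 3];
    // SAFETY: both pointers are non-null and aligned, the arrays do not
    // overlap, and a count of 0 means no bytes are copied.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), 0) };
    dst
}

/// Hypothetical helper: reads a zero-sized value through a dangling pointer.
fn read_zst() {
    let p: *const () = NonNull::<()>::dangling().as_ptr();
    // SAFETY: `()` is zero-sized, and the pointer is non-null and aligned.
    unsafe { p.read() }
}

fn main() {
    let s: &[u64] = empty_slice();
    assert!(s.is_empty());
    assert_eq!(copy_zero_elements(), [1, 2, 3]);
    read_zst();
}
```

Running this under Miri is a reasonable way to check that none of the three operations is flagged as undefined behavior.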
If and when rust-lang#117945 is implemented, we can revert to the old language. References must also be non-null Add `crate_type` method to `Rustdoc` Add `crate_name` method to `Rustdoc` and `Rustc` Add `python_command` and `source_path` functions Add `extern_` method to `Rustdoc` Migrate `rustdoc-scrape-examples-ordering` to `rmake` Fix some minor issues from the ui-test auto-porting solve: replace all `debug` with `trace` structurally important functions to `debug` fix hidden title in command-line-arguments docs Assert that MemCategorizationVisitor actually errors when it bails ungracefully Inline MemCategorization into ExprUseVisitor Remove unncessary mut ref Introduce TypeInformationCtxt to abstract over LateCtxt/FnCtxt Make LateCtxt be a type info delegate for EUV for clippy Try structurally resolve Apply nits Propagate errors rather than using return_if_err Match ergonomics 2024: migration lint Unfortunately, we can't always offer a machine-applicable suggestion when there are subpatterns from macro expansion. Co-Authored-By: Guillaume Boisseau <[email protected]> Add AST pretty-printer tests for let-else Pretty-print let-else with added parenthesization when needed rename

…cottmcm offset: allow zero-byte offset on arbitrary pointers

As per prior `@rust-lang/opsem` [discussion](rust-lang/opsem-team#10) and [FCP](rust-lang/unsafe-code-guidelines#472 (comment)):

- Zero-sized reads and writes are allowed on all sufficiently aligned pointers, including the null pointer
- Inbounds-offset-by-zero is allowed on all pointers, including the null pointer
- `offset_from` on two pointers derived from the same allocation is always allowed when they have the same address

This removes surprising UB (in particular, even C++ allows "nullptr + 0", which we currently disallow), and it brings us one step closer to an important theoretical property for our semantics ("provenance monotonicity": if operations are valid on bytes without provenance, then adding provenance can't make them invalid).

The minimum LLVM we require (v17) includes https://reviews.llvm.org/D154051, so we can finally implement this.

The `offset_from` change is needed to maintain the equivalence with `offset`: if `let ptr2 = ptr1.offset(N)` is well-defined, then `ptr2.offset_from(ptr1)` should be well-defined and return N. Now consider the case where N is 0 and `ptr1` dangles: we want to still allow `offset_from` here. I think we should change `offset_from` further, but that's a separate discussion.

Fixes rust-lang#65108

[Tracking issue](rust-lang#117945) | [T-lang summary](rust-lang#117329 (comment))

Cc `@nikic`
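The three rules listed in this PR description can be illustrated with a minimal sketch. It assumes a toolchain that already includes this change; the function names are hypothetical and exist only for this illustration.

```rust
use std::ptr::{self, NonNull};

/// Rule 2: an inbounds offset by zero is allowed on all pointers,
/// including the null pointer.
pub fn zero_offset_on_null() -> bool {
    let p: *const u8 = ptr::null();
    // SAFETY (under the new rules): the offset is zero bytes.
    let q = unsafe { p.add(0) };
    q.is_null()
}

/// Rule 1: zero-sized reads and writes are allowed on any sufficiently
/// aligned pointer, even one that points to no allocation.
pub fn zero_sized_access_on_dangling() -> [u8; 0] {
    // `NonNull::dangling()` is non-null and aligned, but not backed by memory.
    let p: *mut [u8; 0] = NonNull::<[u8; 0]>::dangling().as_ptr();
    // SAFETY (under the new rules): both accesses touch zero bytes.
    unsafe {
        p.write([]);
        p.read()
    }
}

/// Rule 3, for the case "N is 0 and `ptr1` dangles": `offset_from` is allowed
/// when both pointers have the same address.
pub fn offset_from_dangling() -> isize {
    let raw: *mut u32 = Box::into_raw(Box::new(7u32));
    // Free the allocation; `raw` keeps its address but now dangles.
    unsafe { drop(Box::from_raw(raw)) };
    // SAFETY (under the new rules): zero-byte offset, then equal addresses.
    let same = unsafe { raw.offset(0) };
    unsafe { same.offset_from(raw) }
}
```

Calling the three functions from `main` should yield `true`, an empty array, and `0` respectively. Before this change, the second and third functions would have been UB according to the documented rules, even though none of them touches any memory.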

…oli-obk offset_from: always allow pointers to point to the same address

This PR implements the last remaining part of the t-opsem consensus in rust-lang/unsafe-code-guidelines#472: always permit `offset_from` when both pointers have the same address, no matter how they are computed. This is required to achieve *provenance monotonicity*.

Tracking issue: rust-lang#117945

### What is provenance monotonicity and why does it matter?

Provenance monotonicity is the property that adding arbitrary provenance to any no-provenance pointer must never make the program UB. More specifically, in the program state, data in memory is stored as a sequence of [abstract bytes](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#abstract-byte), where each byte can optionally carry provenance. When a pointer is stored in memory, all of the bytes it is stored in carry that provenance. Provenance monotonicity means: if we take some byte that does not have provenance, and give it some arbitrary provenance, then that cannot change program behavior or introduce UB into a UB-free program.

We care about provenance monotonicity because we want to allow the optimizer to remove provenance-stripping operations. Removing a provenance-stripping operation effectively means the program after the optimization has provenance where the program before the optimization did not -- since the provenance removal does not happen in the optimized program. In other words, the compiler transformation added provenance to previously provenance-free bytes. This is exactly what provenance monotonicity lets us do.

We care about removing provenance-stripping operations because `*ptr = *ptr` is, in general, (likely) a provenance-stripping operation. Specifically, consider `ptr: *mut usize` (or any integer type), and imagine the data at `*ptr` is actually a pointer (i.e., we are type-punning between pointers and integers). Then `*ptr` on the right-hand side evaluates to the data in memory *without* any provenance (because [integers do not have provenance](https://rust-lang.github.io/rfcs/3559-rust-has-provenance.html#integers-do-not-have-provenance)). Storing that back to `*ptr` means that the abstract bytes `ptr` points to are the same as before, except their provenance is now gone. This makes `*ptr = *ptr` a provenance-stripping operation. (Here we assume `*ptr` is fully initialized. If it is not initialized, evaluating `*ptr` to a value is UB, so removing `*ptr = *ptr` is trivially correct.)

### What does `offset_from` have to do with provenance monotonicity?

With `ptr = without_provenance(N)`, `ptr.offset_from(ptr)` is always well-defined and returns 0. By provenance monotonicity, I can now add provenance to the two arguments of `offset_from` and it must still be well-defined. Crucially, I can add *different* provenance to the two arguments, and it must still be well-defined. In other words, this must always be allowed: `ptr1.with_addr(N).offset_from(ptr2.with_addr(N))` (and it returns 0). But the current spec for `offset_from` says that the two pointers must either both be derived from an integer or both be derived from the same allocation, which is not in general true for arbitrary `ptr1`, `ptr2`.

To obtain provenance monotonicity, this PR hence changes the spec for `offset_from` to say that if both pointers have the same address, the function is always well-defined.

### What further consequences does this have?

It means the compiler can no longer transform `end2 = begin.offset(end.offset_from(begin))` into `end2 = end`. However, it can still be transformed into `end2 = begin.with_addr(end.addr())`, which later parts of the backend (when provenance has been erased) can trivially turn into `end2 = end`.

The only alternative I am aware of is a fundamentally different handling of zero-sized accesses, where a "no provenance" pointer is not allowed to do zero-sized accesses and instead we have a special provenance that indicates "may be used for zero-sized accesses (and nothing else)". `offset` and `offset_from` would then always be UB on a "no provenance" pointer, and permit zero-sized offsets on a "zero-sized provenance" pointer. This achieves provenance monotonicity. That is, however, a breaking change as it contradicts what we landed in rust-lang#117329. It's also a whole bunch of extra UB, which doesn't seem worth it just to achieve that transformation.

### What about the backend?

LLVM currently doesn't have an intrinsic for pointer difference, so we cast to integer and subtract there anyway. That's never UB, so it is compatible with any relaxation we may want to apply. If LLVM gets a `ptrsub` in the future, then plausibly it will be consistent with `ptradd` and [consider two equal pointers to be inbounds](rust-lang#124921 (comment)).
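As a hedged illustration of the relaxed `offset_from` rule and of the `begin.with_addr(end.addr())` rewrite discussed in this PR description, the following sketch uses the strict-provenance methods `addr` and `with_addr` and assumes a toolchain where they are stable and where this PR has landed. The function names are hypothetical.

```rust
/// Two pointers with the same address but different provenance:
/// `offset_from` is well-defined under the relaxed rule and returns 0.
pub fn same_address_different_provenance() -> isize {
    let a = [1u8, 2, 3];
    let b = [4u8, 5, 6];
    let p1: *const u8 = a.as_ptr();
    // Same address as `p1`, but carrying the provenance of `b`.
    let p2: *const u8 = b.as_ptr().with_addr(p1.addr());
    // SAFETY (under the relaxed rule): both pointers have the same address.
    unsafe { p1.offset_from(p2) }
}

/// Compares the original expression `begin.offset(end.offset_from(begin))`
/// with the rewritten form `begin.with_addr(end.addr())`.
pub fn round_trip_matches(data: &[u32]) -> bool {
    let range = data.as_ptr_range();
    let (begin, end) = (range.start, range.end);
    // SAFETY: `begin` and `end` delimit the same slice, so the distance is
    // in bounds and the offset stays within (or one past) the allocation.
    let via_offset = unsafe { begin.offset(end.offset_from(begin)) };
    // Safe alternative: keeps the provenance of `begin`, takes the address of `end`.
    let via_addr = begin.with_addr(end.addr());
    // Raw pointer equality compares addresses only.
    via_offset == via_addr
}
```

The first function is the concrete form of `ptr1.with_addr(N).offset_from(ptr2.with_addr(N))` with `N` chosen as the address of `a`. The second only checks that both forms produce the same address; it says nothing about which rewrites an optimizer actually performs.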

Rollup merge of rust-lang#124921 - RalfJung:offset-from-same-addr, r=oli-obk

offset_from: always allow pointers to point to the same address

RalfJung · 2024-07-22T14:03:12Z

I created a reference PR at rust-lang/reference#1541. That should complete the implementation of this feature. :) (Aside from the GCC backend, which is tracked separately at rust-lang/rustc_codegen_gcc#516.)

Now that [1] is completed, zero-sized accesses no longer require provenance. Per [2], zero-sized references are no longer required to be dereferenceable, and so may not carry provenance. This commit updates `Ptr`'s invariants to not require provenance or a valid allocation when its referent is zero-sized.

[1] rust-lang/rust#117945
[2] rust-lang/rust#125021

Closes #874
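A small sketch of what this commit message describes, using only the standard library rather than the `Ptr` type it refers to: a reference to a zero-sized type can be created from a pointer that is merely non-null and aligned, with no allocation behind it. The function name is hypothetical.

```rust
use std::ptr::NonNull;

/// Builds a reference to a zero-sized array from a dangling pointer
/// and returns the array's length.
pub fn zst_ref_len() -> usize {
    // Non-null and aligned for `u64`, but not backed by any allocation.
    let p: NonNull<[u64; 0]> = NonNull::dangling();
    // SAFETY: the referent is zero-sized, so non-null and aligned suffices.
    let r: &[u64; 0] = unsafe { p.as_ref() };
    r.len()
}
```

The call returns `0`; no memory is read to produce it.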

RalfJung added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Nov 15, 2023

RalfJung mentioned this issue Nov 15, 2023

Decide on zero-sized offsets and memory accesses rust-lang/unsafe-code-guidelines#472

Closed

RalfJung mentioned this issue Nov 29, 2023

"Dangling" means multiple things rust-lang/unsafe-code-guidelines#478

Open

RalfJung mentioned this issue Jan 22, 2024

Optimize large array creation in const-eval #120069

Merged

This was referenced Feb 14, 2024

Provenance for zero-sized accesses? rust-lang/unsafe-code-guidelines#490

Closed

References refer to allocated objects #116677

Merged

offset: allow zero-byte offset on arbitrary pointers #117329

Merged

tautschnig mentioned this issue Apr 9, 2024

Do not assume that ZST-typed symbols refer to unique objects model-checking/kani#3134

Merged

RalfJung mentioned this issue Apr 17, 2024

What are the guarantees over ZST pointers rust-lang/unsafe-code-guidelines#503

Closed

RalfJung mentioned this issue May 9, 2024

offset_from: always allow pointers to point to the same address #124921

Merged

joshlf mentioned this issue May 11, 2024

Update reference safety requirements #125021

Merged

RalfJung mentioned this issue Jul 22, 2024

update 'dangling pointers' to new zero-sized rules rust-lang/reference#1541

Merged

traviscross closed this as completed in rust-lang/reference#1541 Jul 23, 2024

joshlf mentioned this issue Aug 9, 2024

0.8 Release Roadmap google/zerocopy#671

Open

87 tasks

RalfJung mentioned this issue Aug 29, 2024

Do ZST Boxes violate provenance monotonicity? rust-lang/unsafe-code-guidelines#529

Closed

workingjubilee mentioned this issue Sep 2, 2024

update the safety preconditions of from_raw_parts #129483

Open

joshlf mentioned this issue Sep 7, 2024

[pointer] Update requirements for zero-sized types google/zerocopy#1614

Merged

mattheww added a commit to mattheww/nomicon that referenced this issue Oct 15, 2024

Say that dereferencing a pointer to a ZST is no longer undefined

2d896fa

The new rules were tracked in rust-lang/rust#117945. The corresponding update to the Reference was rust-lang/reference#1541.

mattheww mentioned this issue Oct 15, 2024

Say that dereferencing a pointer to a ZST is no longer undefined rust-lang/nomicon#467

Open

RalfJung mentioned this issue Oct 19, 2024

zero-sized accesses are fine on null pointers #131919

Merged
