Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite docs for pointer methods #53783

Merged
merged 34 commits into from
Sep 24, 2018
Merged

Rewrite docs for pointer methods #53783

merged 34 commits into from
Sep 24, 2018

Conversation

RalfJung
Copy link
Member

This takes over #51016 by @ecstatic-morse. They did most of the work, I just did some editing.

However, I realized one problem: This updates the docs for the "free functions" in core::ptr, but it does not update the copies of these docs for the inherent methods of the *const T and *mut T types. These getting out-of-sync is certainly bad, but I also don't feel like copying all this stuff around. Instead, we should remove this redundancy. Any good ideas?

ecstatic-morse and others added 14 commits August 29, 2018 10:10
- Add links to the GNU libc docs for `memmove`, `memcpy`, and
  `memset`, as well as internally linking to other functions in `std::ptr`
- List invariants which, when violated, cause UB for all functions
- Add example to `ptr::drop_in_place` and compares it to `ptr::read`.
- Add examples which more closely mirror real world uses for the
  functions in `std::ptr`. Also, move the reimplementation of `mem::swap`
  to the examples of `ptr::read` and use a more interesting example for
  `copy_nonoverlapping`.
- Change module level description
- Define what constitutes a "valid" pointer.
- Centralize discussion of ownership of bitwise copies in `ptr::read` and
  provide an example.
This also removes the overlong link that failed tidy xD.
They closely mirror the docs for `copy_nonoverlapping`
The enumerated list of conditions is replaced by an explanation that
rust doesn't have a formal memory model. It does say that pointers
created directly from references are guaranteed to be valid, and links
to both the "Unsafe Code" section of the book and the "Undefined
Behavior" section of the reference.
Uses `x.offset(i)` must be valid for all `i` in `0..count`.
This splits "valid" into "valid for reads" and "valid for writes", and
also adds the concept of operation size to validity. Now functions which
operate on sequences state that e.g. pointer args must be "valid for reads of
size x".
Also rewrites the reads/writes section to be less reliant on `*const`,
`*mut`
@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 29, 2018
@RalfJung
Copy link
Member Author

RalfJung commented Aug 29, 2018

Cc @rust-lang/docs any advice on how to avoid duplicating all that information?

My own ideas revolve around either using a macro to define the two/three functions together, or picking one as the "canonical place" and making the others link there. In terms of implementation effort, I'd prefer the latter...

@RalfJung
Copy link
Member Author

Also Cc @rust-lang/wg-unsafe-code-guidelines @gnzlbg for the actual content of the docs. Notice that we deliberately do not fully define validity, so no decision is made -- but this also nicely gives us a single spot where to put such a decision eventually, and it does define how all the various memory access methods relate to the idea of "validity of a memory access". But I think nothing there is controversial, this just documents the footprint of these functions (i.e., which memory they will read/write).

src/libcore/intrinsics.rs Outdated Show resolved Hide resolved
src/libcore/intrinsics.rs Outdated Show resolved Hide resolved
src/libcore/intrinsics.rs Outdated Show resolved Hide resolved
src/libcore/intrinsics.rs Outdated Show resolved Hide resolved
src/libcore/intrinsics.rs Outdated Show resolved Hide resolved
src/libcore/intrinsics.rs Outdated Show resolved Hide resolved
src/libcore/ptr.rs Outdated Show resolved Hide resolved
src/libcore/ptr.rs Outdated Show resolved Hide resolved
bors added a commit that referenced this pull request Sep 16, 2018
fix some uses of pointer intrinsics with invalid pointers

[Found by miri](rust-lang/miri#446):

* `Vec::into_iter` calls `ptr::read` (and the underlying `copy_nonoverlapping`) with an unaligned pointer to a ZST. [According to LLVM devs](https://bugs.llvm.org/show_bug.cgi?id=38583), this is UB because it contradicts the metadata we are attaching to that pointer.
* `HashMap` creation calls `ptr:.write_bytes` on a NULL pointer with a count of 0. This is likely not currently UB *currently*, but it violates the rules we are setting in #53783, and we might want to exploit those rules later (e.g. with more `nonnull` attributes for LLVM).

    Probably what `HashMap` really should do is use `NonNull::dangling()` instead of 0 for the empty case, but that would require a more careful analysis of the code.

It seems like ideally, we should do a review of usage of such intrinsics all over libstd to ensure that they use valid pointers even when the size is 0. Is it worth opening an issue for that?
@RalfJung
Copy link
Member Author

Does this PR give any new guarantees wrt. the memory model?

It is intended not to.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 17, 2018

I resolved the minor nits there were brought up. Remaining open discussions:

  • "loopy memory" -- aka different virtual addresses mapping to the same physical address (I see no "loop" here so the term some use here confuses the heck out of me). I think this is a topic that needs broader consideration, should maybe be put onto the UCG WG agenda, but is too deep and too much new policy to be handled "on the side" in what should mostly be a PR clarifying existing policy.
  • Reads/writes from/to NULL. Same as above.
  • ptr::swap, and what to say about what happens when the ranges overlap.
  • The new example for write_bytes is quite confusing, and I am tempted to just remove it. Not sure what would be a better example though?

There is also one comment by @QuietMisdreavus that GitHub shows in this PR, but that I can neither reply to nor mark as resolved... no idea WTF is going on there. But anyway it is a duplicate of @arielb1's concern about ptr::swap.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 18, 2018

@RalfJung

ptr::swap, and what to say about what happens when the ranges overlap.

FWIW the C++ std requires that the ranges do not overlap:

[alg.swap]/1: "Requires: The two ranges [first1, last1) and [first2, first2 + (last1 - first1)) shall not overlap."

but I guess the whole point of ptr::swap is to support overlapping ranges?

@RalfJung
Copy link
Member Author

@gnzlbg existing docs indicate pretty clearly that that is a supported case, yes.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 18, 2018

The only idea I have would be to specify it as a sequential byte-wise swap:

let x = x as *mut u8;
let y = y as *mut u8;
for i in 0..mem::size_of::<T>() {
    let tmp = *x;
    *x = *y;
    *y = tmp;
}

but I don't know if one can construct a situation in which, if two valid values partially overlap, swapping them in this creates invalid values.

@RalfJung
Copy link
Member Author

I expanded fn swap docs to clarify which choice is made, and what that means.

@RalfJung
Copy link
Member Author

@arielb1 does the updated spawn documentation work for you?

@gnzlbg @ubsan any opinion on the write_bytes example?

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 20, 2018

@gnzlbg @ubsan any opinion on the write_bytes example?

LGTM.

@strega-nil
Copy link
Contributor

Me too!

@arielb1
Copy link
Contributor

arielb1 commented Sep 21, 2018

I'm OK with the new mem::swap doc

@RalfJung
Copy link
Member Author

Okay seems like we got consensus. @alexcrichton you are the officially assigned reviewer, how do we proceed?

@alexcrichton
Copy link
Member

Nice! This is quite a PR with quite the discussion! I'm happy though if all stakeholders involved are happy, so let's...

@bors: r+

@bors
Copy link
Contributor

bors commented Sep 24, 2018

📌 Commit c197dc4 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 24, 2018
@QuietMisdreavus
Copy link
Member

@RalfJung my comment that has no reply/resolve indicators is technically both part of that bulk review and this conversation, which hasn't been folded yet.

@bors
Copy link
Contributor

bors commented Sep 24, 2018

⌛ Testing commit c197dc4 with merge 70073ec...

bors added a commit that referenced this pull request Sep 24, 2018
Rewrite docs for pointer methods

This takes over #51016 by @ecstatic-morse. They did most of the work, I just did some editing.

However, I realized one problem: This updates the docs for the "free functions" in `core::ptr`, but it does not update the copies of these docs for the inherent methods of the `*const T` and `*mut T` types. These getting out-of-sync is certainly bad, but I also don't feel like copying all this stuff around. Instead, we should remove this redundancy. Any good ideas?
@bors
Copy link
Contributor

bors commented Sep 24, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 70073ec to master...

@bors bors merged commit c197dc4 into rust-lang:master Sep 24, 2018
@RalfJung RalfJung deleted the ptr-docs branch September 29, 2018 17:28
cuviper pushed a commit to cuviper/rayon-hash that referenced this pull request Nov 15, 2018
fix some uses of pointer intrinsics with invalid pointers

[Found by miri](rust-lang/miri#446):

* `Vec::into_iter` calls `ptr::read` (and the underlying `copy_nonoverlapping`) with an unaligned pointer to a ZST. [According to LLVM devs](https://bugs.llvm.org/show_bug.cgi?id=38583), this is UB because it contradicts the metadata we are attaching to that pointer.
* `HashMap` creation calls `ptr:.write_bytes` on a NULL pointer with a count of 0. This is likely not currently UB *currently*, but it violates the rules we are setting in rust-lang/rust#53783, and we might want to exploit those rules later (e.g. with more `nonnull` attributes for LLVM).

    Probably what `HashMap` really should do is use `NonNull::dangling()` instead of 0 for the empty case, but that would require a more careful analysis of the code.

It seems like ideally, we should do a review of usage of such intrinsics all over libstd to ensure that they use valid pointers even when the size is 0. Is it worth opening an issue for that?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.