Unsafe fields #381
I think there is a use for such unsafe fields even when they are private, see https://internals.rust-lang.org/t/pre-rfc-unsafe-types/3073.
Honestly speaking, none of these RFCs really convinced me we want unsafe public fields. The further away an access is from where the invariant "is", the more likely it is for that access to be done without enough consideration of the invariant. I'm already worried about modules growing big, I don't think there should ever be direct access to fields carrying invariants from outside the module. If something like this is needed, just provide a public, unsafe setter function. This provides a good place to add precise documentation about when this function is safe to use (thus also very explicitly making this part of the promised, stable API), and the name of the function may give some additional hints to callers.
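A minimal sketch of the setter alternative suggested above (the type and names are mine, not from the thread): the invariant lives on a private field, and the only public write access is an `unsafe fn` whose doc comment states exactly what callers must uphold.

```rust
pub struct Ascii {
    // Invariant: every byte is < 128 (so the buffer is valid UTF-8).
    bytes: Vec<u8>,
}

impl Ascii {
    pub fn new() -> Self {
        Ascii { bytes: Vec::new() }
    }

    /// # Safety
    /// Callers must ensure every byte in `bytes` is < 128; the unsafe
    /// code in `as_str` relies on this to skip UTF-8 validation.
    pub unsafe fn set_bytes(&mut self, bytes: Vec<u8>) {
        self.bytes = bytes;
    }

    pub fn as_str(&self) -> &str {
        // Sound only because of the invariant documented on `set_bytes`.
        unsafe { std::str::from_utf8_unchecked(&self.bytes) }
    }
}
```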
@RalfJung if I'm reading your two comments correctly, it sounds like you do want this feature, but you don't want it for public fields? (Seems odd to add it solely for non-public fields.)

Update: Oh, whoops, I didn't read the linked pre-RFC or even look carefully at its title. After skimming the proposal there, I can see how that model means that …
I'm not sure I want that feature (unsafe private fields); I am uncertain. But I thought it was worth bringing it up for discussion. I do not want unsafe public fields. I think it is a good idea to keep the intuitive notion of abstraction boundary and "how far out do I have to care about this invariant" aligned.
(just for proper cross-referencing purposes: there is an excellent series of comments regarding unsafe fields that unfortunately occurred on an RFC issue other than this one.)
In that RFC issue, glaebhoerl mentions that safe code can break unsafe code's assumptions in at least two ways: (A) by mutating data that the unsafe code relies on, and (B) by being functionally incorrect when unsafe code relies on its results. glaebhoerl, tsion and RalfJung seem about half convinced that the existence of B invalidates the value of unsafe fields, but I think B is much less terrible than A.

In a nutshell, eliminating A is like limiting the use of global mutable state in software. Unsafe fields would create a wonderful invariant that "every memory-safety critical code is an explicit dependency of some unsafe code". This makes inspecting code for safety much easier and more intuitive than the current situation, where starting from unsafe code we need to propagate through every variable it relies on to all code that may access that variable (a much more implicit rule).

The rule "only code in unsafe blocks can contribute to any memory unsafety" is not a good goal IMO; unsafe code X relying on the functional correctness of code Y should not mean we need to increase the proof burden around Y by turning off some compiler checks, right? I think the explicit dependency invariant, if we can get it, is about as good as it gets.
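A small sketch of hazard (A) in today's Rust (module and names invented for illustration): the function that breaks the invariant contains no `unsafe` at all, so nothing flags it for review. With unsafe fields, `init` would be marked `unsafe` and `grow_claim` would no longer compile without an unsafe block.

```rust
mod buffer {
    pub struct Buffer {
        data: Vec<u8>,
        // Invariant relied on by `read`: init <= data.len().
        init: usize,
    }

    impl Buffer {
        pub fn new(data: Vec<u8>) -> Self {
            let init = data.len();
            Buffer { data, init }
        }

        // Entirely safe code, yet it silently breaks the invariant above.
        pub fn grow_claim(&mut self) {
            self.init += 1;
        }

        pub fn read(&self, i: usize) -> u8 {
            assert!(i < self.init);
            // UB once `init > data.len()`: this unsafe block is correct
            // only under an invariant maintained elsewhere in the module.
            unsafe { *self.data.get_unchecked(i) }
        }
    }
}
```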
Some quick musings in response to @daniel-vainsencher's comment:

In the absence of an actual RFC, it's hard to make verifiable absolute statements about what a given feature would provide. Will it be legal for unsafe code to first borrow an unsafe field into a usual `&mut` borrow, and then pass that borrow to safe code?
(I am just trying to say that there is a non-trivial burden of proof here, even when it comes to supporting seemingly obvious claims such as A above.)

Update: One might argue that safe code won't be able to do anything harmful with the …

Update 2: And of course my argument above is a bit of a straw-man, in the sense that buggy unsafe code could also pass a borrowed reference to private data to clients who should not have any access to it. But I'm not claiming that privacy prevents such things. I'm just wary of what claims are made of a proposed feature.
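To make the borrow scenario concrete, here is a sketch in current Rust (mine, not from the thread; `Vec`'s private length stands in for an unsafe field): once unsafe code lends out a `&mut` to invariant-carrying data, a safe callback can produce a bad value without ever typing `unsafe`.

```rust
// Unsound sketch, for illustration only.
fn tweak_len(v: &mut Vec<u8>, f: impl FnOnce(&mut usize)) {
    let mut len = v.len();
    f(&mut len); // the safe callback mutates the "field" freely
    // The unsafe code then trusts a value that safe code controlled:
    unsafe { v.set_len(len) } // UB if `len` exceeds the initialized length
}
```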
@daniel-vainsencher Also how do (or don't) indirect calls factor into that scenario?
To clarify (to myself first of all): I am referring to RFC #80 and believe that the proposed feature is that safe code by itself cannot access unsafe fields. Your scenario of unsafe code passing references to unsafe fields into safe code is allowed. This could require annotating the parameter in the safe function as unsafe (which makes some sense), but I am not assuming this (allowing more uses for the unsafe keyword is a bigger change). Note that unsafe code may access both safe and unsafe fields, but should be written so that safety depends only on values of unsafe fields. I was not at all clear on this in my previous comment. @pnkfelix: …
I am not advocating for making unsafe fields potentially public (or against it). I can more easily imagine uses for private unsafe fields (restricting access even more than now, hence reducing the inspection burden), but system-level abstractions may exist where public makes sense. Do we know of examples of such?
@daniel-vainsencher What I was thinking of is that "every memory-safety critical code is an explicit dependency of some unsafe code" seems to break down the moment that the unsafe code makes an indirect call -- suddenly the range of code which may be implicated becomes open-ended again. It may be that this isn't relevant in practice because … FWIW, if …
@glaebhoerl sorry, I thought by indirect you meant merely transitive callees; now I understand you mean calls to functions (or methods of traits) passed into the unsafe code as parameters. I am still calling that an explicit dependency in terms of a human inspector, even if the call is more or less opaque to the compiler: when inspecting unsafe code with a function parameter, you are in general liable to find out what values (code) those parameters might have and reason about their functional correctness. The cost incurred is proportional to the power being used (in contrast, a field-based invariant requires inspection of the whole module), so it seems ok to me.
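To make the indirect-call hazard concrete, here is a sketch (mine, not from the thread): the unsafe block temporarily breaks `Vec`'s length invariant and then calls back into arbitrary caller-supplied safe code, so reasoning about this one unsafe block must extend to every possible `f`.

```rust
pub fn map_in_place<T, F: FnMut(T) -> T>(v: &mut Vec<T>, mut f: F) {
    unsafe {
        let n = v.len();
        // Invariant temporarily broken: the elements still exist, but the
        // Vec no longer claims them (guards against double-drop if `f` panics).
        v.set_len(0);
        let p = v.as_mut_ptr();
        for i in 0..n {
            let x = std::ptr::read(p.add(i));
            // Indirect call into opaque safe code. If `f` panics here,
            // elements `i+1..n` leak -- safe, but it shows how far the
            // reasoning about this unsafe block has to reach.
            std::ptr::write(p.add(i), f(x));
        }
        v.set_len(n);
    }
}
```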
@glaebhoerl I am not sure it is very meaningful as a contract to require unsafe code to be "safe for all possible inputs" when in the current situation, it is often not safe for every value of fields that can be accessed by safe code anywhere in the module. Feels a bit like requiring iron bars on the window when the door is missing... ;) I may well be missing ways to screw things up, we'll see.
Wondering what @gankro thinks about this proposal, having just read his thesis.
I have frequently argued against unsafe fields and unsafe types. I essentially introduce the problem they're supposed to solve, and then reject them, in this section of the nomicon. I don't really have the energy to put in a detailed argument, but basically I don't believe it's well-motivated. unsafe-fn is, unsafe-trait is. Pub unsafe fields could be argued for as a useful ergonomic boon for exposing an unsafe piece of data (see: mutating statics, accessing repr(packed) fields, derefing raw ptrs). Priv-unsafe fields are basically pointless. I don't think I've ever seen an error in the wild that could have been prevented, and I've seen several that wouldn't have.
I just read that section (all of chapter two, actually); unsafe fields are …

The argument I have put above for unsafe private fields is twofold: …

Isn't this the place to argue for/against this feature?

> I have frequently argued against unsafe fields and unsafe types. …

> I don't really have the energy to put in a detailed argument, but basically —
This is not actually true. First of all, this relies on nobody forgetting to mark any fields unsafe. Secondly, the code that has to be inspected is still "any function that contains an unsafe block", since there will usually be local variables that we rely on to have certain invariants. Finally, I'd be surprised if, after properly marking all fields as unsafe, there would be a significant number of functions left that have no unsafe block.

I disagree, and think "pub unsafe" is a contradiction in itself. Privacy is tied to abstraction; safety is tied to hiding unsafe behind an abstraction boundary. I think it would just be confusing to loosen this connection.
I once opened an RFC for this, but after talking to @gankro I am convinced this feature doesn't offer much. The number of lines of code in unsafe blocks is not at all a good proxy for the "unsafety" of the code. A single line of unsafe code can do basically anything and affect any other code (usually within the privacy boundary), so you don't shorten review time by typing unsafe a smaller number of times.
If fields have additional invariants, they should be private. If you make them public, you actually make your invariants part of the public API, and you can never ever change the invariants associated with that field again, because that could break in combination with client code that still assumes (and establishes!) the old invariants. This is just really bad API design, and IMHO confusing. If you really want this, go ahead and add a getter/setter, which doesn't even incur any overhead after inlining. But at least you had to make some conscious effort to decide that this should be part of your public API.
The API concern isn't restricted to unsafe fields.
Sure. And the answer to this problem is to make the field private, not to make it unsafe.
That was also my argument the first time this feature came around - I don't believe this feature pulls its weight. OTOH, privacy is primarily intended for abstraction (preventing users from depending on incidental details), not for protection (ensuring that invariants always hold). The fact that it can be used for protection is basically a happy accident. To clarify the difference: C strings have no abstraction whatever - they are a raw pointer to memory. However, they do have an invariant - they must point to a valid NUL-terminated string. Every place that constructs such a string must ensure it is valid, and every place that consumes it can rely on it. OTOH, a safe, say, buffered reader needs abstraction but doesn't need protection - it does not hold any critical invariant, but may want to change its internal representation. On the other 2 corners, a tuple has no abstraction and no protection - it is just its contents, while an unsafe field would have protection but no abstraction.
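To illustrate the "protection without abstraction" corner with the C-string example (a sketch; the wrapper names are mine): the representation is plainly a raw pointer, yet an invariant must hold everywhere it flows.

```rust
use std::os::raw::c_char;

// No abstraction: everyone can see this is just a raw pointer.
// Protection needed: it must point to a valid, NUL-terminated buffer.
type CStrPtr = *const c_char;

/// # Safety
/// `s` must point to a valid, NUL-terminated buffer.
unsafe fn c_strlen(mut s: CStrPtr) -> usize {
    let mut n = 0;
    unsafe {
        // Relies entirely on the invariant; nothing in the type enforces it.
        while *s != 0 {
            n += 1;
            s = s.add(1);
        }
    }
    n
}
```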
I disagree, that's not an accident. Abstraction is exactly what allows code to rely on local invariants to be maintained, no matter what well-typed code runs using our data structure. Besides type safety, establishing abstraction is the key feature of type systems. Now, Rust has this mechanism to pretty much embed untyped code within typed code (I know that …)
Here we go again.
@reem I am also not convinced that the feature would pull its weight. But part of your response confused me:

I didn't think this feature leads to fewer unsafe blocks. If anything, there will be more. I thought the reason some claim it shortens review times is that, in some cases, you get to focus your attention solely on the unsafe blocks. That's quite different, no?
Maybe "accidentally" wasn't the right word. Still, the existence of (public) |
Moderator note: Just a gentle reminder to all to keep comments constructive. Thanks!
@Kimundi: actually my fix does not quite address your concern: creating the new value is unsafe, so the common case of … This means that the proposed library only adds a speed bump, not a guarantee. So it is not a reasonable replacement for the language feature, and I am not sure it is even worth using in a project: it will probably prevent some bugs, but also make maintainers over-confident. Blech. Great catch!

In short, yes, unsafe fields seem to require a language feature.
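For reference, the library approach under discussion might look roughly like this (a sketch under my own naming; the thread does not spell it out): a wrapper whose constructor and mutable accessor are `unsafe fn`. The hole noted above remains: safe code can still, for example, `std::mem::swap` two wrappers carrying different invariants, which is why it is a speed bump rather than a guarantee.

```rust
pub struct Unsafe<T>(T);

impl<T> Unsafe<T> {
    /// # Safety
    /// The caller asserts `value` satisfies the field's invariant.
    pub unsafe fn new(value: T) -> Self {
        Unsafe(value)
    }

    /// # Safety
    /// The caller must uphold the field's invariant across this access.
    pub unsafe fn get_mut(&mut self) -> &mut T {
        &mut self.0
    }

    /// Shared reads are assumed harmless here (see the debate below).
    pub fn get(&self) -> &T {
        &self.0
    }
}
```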
@ticki Unsynchronized reading during a write from another thread can be unsafe, because it may produce an invalid value. If a public field is declared as safely readable and there's some kind of unsafe (non-enforced) synchronization strategy, unsafe code can't guard against reading from safe code, and UB may result.

No, that's not possible.

Why not? Am I missing something?
I am guessing that @ticki is relying on "mutable xor shared" and @golddranks is talking about a module that breaks it internally, e.g. to provide an efficient concurrent data structure. I think that my assumption that only writing is unsafe needs discarding. Unsafe fields should be unsafe to access, period. And I think that fields not subject to "mutable xor shared" should never be public, even if unsafe fields in general are allowed to be (which I don't know about).
If unsafe fields are unsafe to read, they don't have the same power. Everyone would prefer to write the respective getter function instead of using the field directly, sacrificing performance.
How important is the performance benefit of shaving off an accessor in debug builds? I think the main argument for private unsafe fields is preserving memory safety at a small maintenance expense. In this case, accessing those fields through a small set of functions that ensure written and read values are indeed valid is probably a best practice anyway.
Well, considering how many getters there are, I think that some gain will be there.

But the argument for local guarantees for invariants is much more compelling.
My thoughts on this: …
@Kimundi About safety of dropping: To my understanding your concern is that any implementation of the Drop trait that accesses the unsafe fields has to include an unsafe block. Or are you concerned about memory safety when Drop is not implemented at all? IIUC, Rust does not guarantee that drops will occur, so we should not depend on them for memory safety purposes anyway. Right?
But it is strongly wanted (and often assumed, though not for memory safety) for destructors to be called. Not doing so would be quite counterintuitive. If destructors are not to be called on unsafe fields, I would require a Copy bound.
@ticki, @Kimundi, you seem to be referring to dropping the value in the field that is marked unsafe, right? I thought Kimundi was referring to a drop of the struct containing the field. But unsafe code already should ensure (beforehand) that it is safe to drop any sensitive fields before they get dropped, right? So the unsafe fields proposal merely helps formalize this obligation: any field that might be unsafe to drop at any time should be marked an unsafe field, and brought to a safe state before it gets dropped. Hence, I think the compiler should drop values in unsafe fields exactly when it would in regular ones. Am I missing something?

This is also an important property if we want porting modules (with unsafe code) to use unsafe fields to be an easy enhancement of documentation, with no associated danger.
@daniel-vainsencher: I'm referring to this case:

```rust
struct Foo {
    unsafe bar: SomeType,
}
```

where … If … If, however, … So either …

A third option would be to silently not drop them, but that seems like a footgun.
@Kimundi Thank you for taking the time to write your view very explicitly. The term "unsafe" is used in several senses: …

You seem to treat accessing unsafe fields as unsafe in sense (1): actually can cause UB, like … As far as I am concerned, unsafe fields are exactly like unsafe functions, a tool for (3). Marking a field unsafe is sufficient warning to any maintainer of the code that when the compiler does its usual thing around … Therefore in my opinion unsafe fields should be dropped exactly when regular fields are; they should not require manual dropping, an implementation of Drop, or a Copy bound. All of those are tools an implementor might use to avoid unsafety in sense (1), but it's his choice.
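A current-Rust analogue of this position (sketch, names mine): the compiler drops fields as usual, and where teardown genuinely needs care, the obligation surfaces as an ordinary `unsafe` block inside a `Drop` impl.

```rust
pub struct OwnedBuf {
    // Invariant: null, or a pointer/capacity pair taken from a `Vec<u8>`.
    ptr: *mut u8,
    cap: usize,
}

impl OwnedBuf {
    pub fn new(cap: usize) -> Self {
        let mut v = Vec::with_capacity(cap);
        let ptr = v.as_mut_ptr();
        std::mem::forget(v); // ownership moves into the raw parts
        OwnedBuf { ptr, cap }
    }
}

impl Drop for OwnedBuf {
    fn drop(&mut self) {
        // The obligation lives here, in an explicit unsafe block; the
        // compiler still runs this drop exactly like any other.
        unsafe {
            if !self.ptr.is_null() {
                drop(Vec::from_raw_parts(self.ptr, 0, self.cap));
            }
        }
    }
}
```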
I think you convinced me :) To put what you said in different words, would you agree that unsafe fields could be defined as this: …
That is a concise and precise definition that I agree with.
I think unions will eventually need to support a very similar capability, but with the opposite sign - making fields safe.
I feel like rust-lang/rust#41622 is another point where unsafe fields could have helped: If we make it so that types with unsafe fields do NOT get any automatic instances of unsafe OIBITs, that would have prevented accidentally implementing …

Really, there is no way to predict how the invariant of a type with unsafe fields could interact with the assumptions provided by unsafe traits, so implementing them automatically is always dangerous. The author of a type with unsafe fields has to convince himself that the guarantees of an unsafe trait are satisfied. Not getting these instances automatically is an ergonomics hit, but the alternative is to err on the side of "what could possibly go wrong", which is not very Rust-y.
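In today's Rust the default goes the other way, and one has to opt out of the automatic instances by hand; a common sketch (marker and type names mine):

```rust
use std::marker::PhantomData;

pub struct Handle {
    pub raw: usize, // pretend: a resource id that must not cross threads
    // `*const ()` is neither Send nor Sync, so this zero-sized marker
    // suppresses both automatic impls for `Handle`.
    _not_auto: PhantomData<*const ()>,
}

// Opting back in is then a deliberate, reviewable act -- the explicit
// counterpart of the automatic instance the comment above warns about.
unsafe impl Send for Handle {}
```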
@RalfJung good point. I am pretty strongly in favor of unsafe fields at this point. The only thing that holds me back is some desire to think a bit more about the "unsafe" model more generally.
Bit OT: I wonder how hard it would be to implement a way to express invariants maintained by said struct, and to insert lints (and possibly runtime checks in debug mode) about them. Probably it wouldn't help with broken …
I've a question: I want to be able to point to the same entity with one pointer being owned, the other being mutably borrowed. My example is to implement a singly linked list which is optimized for appending elements to the end, so I have a first_node and a last_node which can potentially be overlapping:

```rust
pub struct SingleLinkedList<T> {
    unsafe first_node: Option<Node<T>>,
    unsafe last_node: &Option<Node<T>>,
    pub length: mut u32,
}
```

Would that be possible? Or is there any workaround to this?
@sighoya This won't be possible, as it would break Rust's aliasing model. The only thing this does is make accessing these fields unsafe.
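For completeness, the usual workaround for this kind of list (a sketch along the lines of std's `LinkedList`, which stores raw links; names mine): keep the tail as a raw pointer and confine the aliasing to `unsafe` code rather than holding a borrow in the struct.

```rust
use std::ptr;

struct Node<T> {
    elem: T,
    next: *mut Node<T>,
}

pub struct SinglyLinkedList<T> {
    head: *mut Node<T>,
    tail: *mut Node<T>, // null when the list is empty
    pub length: u32,
}

impl<T> SinglyLinkedList<T> {
    pub fn new() -> Self {
        SinglyLinkedList { head: ptr::null_mut(), tail: ptr::null_mut(), length: 0 }
    }

    pub fn push_back(&mut self, elem: T) {
        // Ownership is transferred to a raw pointer; reclaimed in `drop`.
        let node = Box::into_raw(Box::new(Node { elem, next: ptr::null_mut() }));
        if self.tail.is_null() {
            self.head = node;
        } else {
            // Invariant: `tail` points at the last node reachable from `head`.
            unsafe { (*self.tail).next = node };
        }
        self.tail = node;
        self.length += 1;
    }
}

impl<T> Drop for SinglyLinkedList<T> {
    fn drop(&mut self) {
        let mut cur = self.head;
        while !cur.is_null() {
            // Re-box each node to free it and drop its element.
            let boxed = unsafe { Box::from_raw(cur) };
            cur = boxed.next;
        }
    }
}
```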
This would (finally) be resolved by #3458. |
Multiple developers have registered a desire to allow public access to fields that expose abstraction-breaking impl details (rather than following the current convention of making all such fields private to the structure in question, thus containing the abstraction-breakage to the structure's module).
See the following postponed (or otherwise closed) RFCs:
unsafe qualifier on struct fields #80