Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std/OsString: Add detailed comment on args meaning #95139

Closed
wants to merge 1 commit into from

Conversation

Xuanwo
Copy link
Contributor

@Xuanwo Xuanwo commented Mar 20, 2022

Signed-off-by: Xuanwo [email protected]

Part of #91789

Address the concern:

do we need better docs about what the argument value means (not bytes) (#91789 (comment))

@rust-highfive
Copy link
Collaborator

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 20, 2022
@Xuanwo
Copy link
Contributor Author

Xuanwo commented Mar 20, 2022

Also, ping @dtolnay & @TennyZhuang for a look, thanks.

@Xuanwo
Copy link
Contributor Author

Xuanwo commented Mar 20, 2022

r? @dtolnay

@rust-highfive rust-highfive assigned dtolnay and unassigned m-ou-se Mar 20, 2022
@ChrisDenton
Copy link
Member

ChrisDenton commented Mar 20, 2022

Hmm, I'm not sure I understand. A str is able to be losslessly referenced as an OsStr so their internal representation (and therefore byte size) must be exactly equal. That's the promise AsRef<OsStr> makes, no?

The only difference would be where an OsStr isn't a valid str. But in that case the byte size of the str would be undefined because it's unencodable as UTF-8.

@dtolnay
Copy link
Member

dtolnay commented Mar 27, 2022

A str is able to be losslessly referenced as an OsStr so their internal representation (and therefore byte size) must be exactly equal.

@ChrisDenton the internal representation and therefore byte size are the same. And str's length/capacity are defined as bytes. But nothing says that OsStr capacity is measured in bytes. It could be measured in nibbles for example, and that would be a correct standard library implementation conforming to the currently documented guarantees.

@ChrisDenton
Copy link
Member

ChrisDenton commented Mar 27, 2022

Hm, are the docs intentionally vague on this or is it an unintended omission?

str len and OsStr len returning the length in different units would be very surprising to me on any platform.

@dtolnay
Copy link
Member

dtolnay commented Mar 28, 2022

I think it's not vague and it's not an unintended omission. This part of the OsString docs discusses it:

As discussed in the OsString introduction, OsString and OsStr store strings in a form best suited for cheap inter-conversion between native-platform and Rust string forms, which may differ significantly from both of them, including in storage size and encoding.

This is saying that the "storage size" of an OS string (which is the size that len returns, and the same units as the additional arg of try_reserve) "may differ significantly" from the storage size of the "Rust string form" of the same data, i.e. that os_str.len() and os_str.to_str().unwrap().len() may differ significantly.

len also says:

Note that this does not return the number of bytes in the string in OS string form.

This number is simply useful for passing to other methods, like OsString::with_capacity to avoid reallocations.

And with_capacity deliberately uses "length units of OS string", not "bytes", for the meaning of the argument i.e. it's supposed to be a quantity that you obtained by calling len on an OS string.

The string will be able to hold exactly capacity length units of other OS strings without reallocating.

At a previous job I worked on the compiler for an accelerator that used 16-bit bytes but the application processor attached to it had 8-bit bytes, so it's not strange to me that data for different purposes would be measured differently. On that system the accelerator could store an OS string in half the number of bytes that it would take to store a native string, because the OS's bytes are all small. If you took a native string and cast it to an OS string, keeping the same memory representation, you'd have an OS string that was twice as long as your original native string.

@Xuanwo
Copy link
Contributor Author

Xuanwo commented Mar 28, 2022

This PR has been replaced by #95392, I'm going to close it now. Please take a look at the new PR instead.

Thank @ChrisDenton & @dtolnay for reviewing! 💘

@Xuanwo Xuanwo closed this Mar 28, 2022
@Xuanwo Xuanwo deleted the try_reserve_2_doc branch March 28, 2022 02:04
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Jun 17, 2022
…olnay

std: Stabilize feature try_reserve_2

This PR intends to stabilize feature `try_reserve_2`, closes rust-lang#91789

This PR will also replace the previous PR: rust-lang#95139
workingjubilee pushed a commit to tcdi/postgrestd that referenced this pull request Sep 15, 2022
std: Stabilize feature try_reserve_2

This PR intends to stabilize feature `try_reserve_2`, closes rust-lang/rust#91789

This PR will also replace the previous PR: rust-lang/rust#95139
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants