-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
impl Step for char (make Range*<char> iterable) #72413
Conversation
Enables Range<char> to be iterable Note: https://rust.godbolt.org/z/fdveKo An iteration over all char ('\0'..=char::MAX) includes unreachable panic code currently. Updating RangeInclusive::next to call Step::forward_unchecked (which is safe to do but not done yet becuase it wasn't necessary) successfully removes the panic from this iteration.
(rust_highfive has picked a reviewer for you, use r? to override) |
Seems okay to me. Note: I have not reviewed the implementation whatsoever. |
This PR adds @rfcbot fcp merge |
Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Is We’ve been bitten before by inference regressions or name resolution regressions when adding an impl where both the trait and type have existed for a long time, so a Crater run could be good here. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
Yes, and this is pretty intrinsic to
It would be nice to further call this out, sure, but I definitely don't know where best to do so.
No arguments here, though I cannot think of how this could potentially lead to any inference failures. |
I understand very well that surrogate code points are excluded from Anyway, these semantics sound good 👍 |
The definition of the successor/predecessor operations used by Furthermore, one of the rules is that if there exist some Certainly, this makes sense for |
Rest of code looks fine but 💥 there is a lot of |
@bors r+ |
📌 Commit 64d79eaf43329fdb7a8d3df6e55f09db3f6b66b5 has been approved by |
(I really need to figure out why my format-on-save is disagreeing with |
@bors r+ |
📌 Commit cd6a8ca has been approved by |
impl Step for char (make Range*<char> iterable) [[irlo thread]](https://internals.rust-lang.org/t/mini-rfc-make-range-char-work/12392?u=cad97) [[godbolt asm example]](https://rust.godbolt.org/z/fdveKo) Add an implementation of the `Step` trait for `char`, which has the effect of making `RangeInclusive<char>` (and the other range types) iterable. I've used the surrogate range magic numbers as magic numbers here rather than e.g. a `const SURROGATE_RANGE = 0xD800..0xE000` because these numbers appear to be used as magic numbers elsewhere and there doesn't exist constants for them yet. These files definitely aren't where surrogate range constants should live. `ExactSizeIterator` is not implemented because `0x10FFFF` is bigger than fits in a `usize == u16`. However, given we already provide some `ExactSizeIterator` that are not correct on 16 bit targets, we might still want to consider providing it for `Range`[`Inclusive`]`<char>`, as it is definitely _very_ convenient. (At the very least, we want to make sure `.count()` doesn't bother iterating the range.) The second commit in this PR changes a call to `Step::forward` to use `Step::forward_unchecked` in `RangeInclusive::next`. This is because without this patch, iteration over all codepoints (`'\0'..=char::MAX`) does not successfully optimize out the panicking branch. This was mentioned in the PR that updated `Step` to its current design, but was deemed not yet necessary as it did not impact codegen for integral types. More of `Range*`'s implementations' calls to `Step` methods will probably want to see if they can use the `_unchecked` version as (if) we open up `Step` to being implemented on more types. --- cc @rust-lang/libs, this is insta-stable and a fairly significant addition to `Range*`'s capabilities; this is the first instance of a noncontinuous domain being iterable with `Range` (or, well, anything other than primitive integers). I don't think this needs a full RFC, but it should definitely get some decent eyes on it.
Rollup of 9 pull requests Successful merges: - rust-lang#67460 (Tweak impl signature mismatch errors involving `RegionKind::ReVar` lifetimes) - rust-lang#71095 (impl From<[T; N]> for Box<[T]>) - rust-lang#71500 (Make pointer offset methods/intrinsics const) - rust-lang#71804 (linker: Support `-static-pie` and `-static -shared`) - rust-lang#71862 (Implement RFC 2585: unsafe blocks in unsafe fn) - rust-lang#72103 (borrowck `DefId` -> `LocalDefId`) - rust-lang#72407 (Various minor improvements to Ipv6Addr::Display) - rust-lang#72413 (impl Step for char (make Range*<char> iterable)) - rust-lang#72439 (NVPTX support for new asm!) Failed merges: r? @ghost
fn test_char_range() { | ||
use std::char; | ||
assert!(('\0'..=char::MAX).eq((0..=char::MAX as u32).filter_map(char::from_u32))); | ||
assert!(('\0'..=char::MAX).rev().eq((0..=char::MAX as u32).filter_map(char::from_u32).rev())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woah doesn't this loop over all char
?^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cc #72812
While here clean up all pkglint warnings. Changes since 1.44.1: Version 1.45.2 (2020-08-03) ========================== * [Fix bindings in tuple struct patterns][74954] * [Fix track_caller integration with trait objects][74784] [74954]: rust-lang/rust#74954 [74784]: rust-lang/rust#74784 Version 1.45.1 (2020-07-30) ========================== * [Fix const propagation with references.][73613] * [rustfmt accepts rustfmt_skip in cfg_attr again.][73078] * [Avoid spurious implicit region bound.][74509] * [Install clippy on x.py install][74457] [73613]: rust-lang/rust#73613 [73078]: rust-lang/rust#73078 [74509]: rust-lang/rust#74509 [74457]: rust-lang/rust#74457 Version 1.45.0 (2020-07-16) ========================== Language -------- - [Out of range float to int conversions using `as` has been defined as a saturating conversion.][71269] This was previously undefined behaviour, but you can use the `{f64, f32}::to_int_unchecked` methods to continue using the current behaviour, which may be desirable in rare performance sensitive situations. - [`mem::Discriminant<T>` now uses `T`'s discriminant type instead of always using `u64`.][70705] - [Function like procedural macros can now be used in expression, pattern, and statement positions.][68717] This means you can now use a function-like procedural macro anywhere you can use a declarative (`macro_rules!`) macro. Compiler -------- - [You can now override individual target features through the `target-feature` flag.][72094] E.g. `-C target-feature=+avx2 -C target-feature=+fma` is now equivalent to `-C target-feature=+avx2,+fma`. - [Added the `force-unwind-tables` flag.][69984] This option allows rustc to always generate unwind tables regardless of panic strategy. - [Added the `embed-bitcode` flag.][71716] This codegen flag allows rustc to include LLVM bitcode into generated `rlib`s (this is on by default). - [Added the `tiny` value to the `code-model` codegen flag.][72397] - [Added tier 3 support\* for the `mipsel-sony-psp` target.][72062] - [Added tier 3 support for the `thumbv7a-uwp-windows-msvc` target.][72133] \* Refer to Rust's [platform support page][forge-platform-support] for more information on Rust's tiered platform support. Libraries --------- - [`net::{SocketAddr, SocketAddrV4, SocketAddrV6}` now implements `PartialOrd` and `Ord`.][72239] - [`proc_macro::TokenStream` now implements `Default`.][72234] - [You can now use `char` with `ops::{Range, RangeFrom, RangeFull, RangeInclusive, RangeTo}` to iterate over a range of codepoints.][72413] E.g. you can now write the following; ```rust for ch in 'a'..='z' { print!("{}", ch); } println!(); // Prints "abcdefghijklmnopqrstuvwxyz" ``` - [`OsString` now implements `FromStr`.][71662] - [The `saturating_neg` method as been added to all signed integer primitive types, and the `saturating_abs` method has been added for all integer primitive types.][71886] - [`Arc<T>`, `Rc<T>` now implement `From<Cow<'_, T>>`, and `Box` now implements `From<Cow>` when `T` is `[T: Copy]`, `str`, `CStr`, `OsStr`, or `Path`.][71447] - [`Box<[T]>` now implements `From<[T; N]>`.][71095] - [`BitOr` and `BitOrAssign` are implemented for all `NonZero` integer types.][69813] - [The `fetch_min`, and `fetch_max` methods have been added to all atomic integer types.][72324] - [The `fetch_update` method has been added to all atomic integer types.][71843] Stabilized APIs --------------- - [`Arc::as_ptr`] - [`BTreeMap::remove_entry`] - [`Rc::as_ptr`] - [`rc::Weak::as_ptr`] - [`rc::Weak::from_raw`] - [`rc::Weak::into_raw`] - [`str::strip_prefix`] - [`str::strip_suffix`] - [`sync::Weak::as_ptr`] - [`sync::Weak::from_raw`] - [`sync::Weak::into_raw`] - [`char::UNICODE_VERSION`] - [`Span::resolved_at`] - [`Span::located_at`] - [`Span::mixed_site`] - [`unix::process::CommandExt::arg0`] Cargo ----- Misc ---- - [Rustdoc now supports strikethrough text in Markdown.][71928] E.g. `~~outdated information~~` becomes "~~outdated information~~". - [Added an emoji to Rustdoc's deprecated API message.][72014] Compatibility Notes ------------------- - [Trying to self initialize a static value (that is creating a value using itself) is unsound and now causes a compile error.][71140] - [`{f32, f64}::powi` now returns a slightly different value on Windows.][73420] This is due to changes in LLVM's intrinsics which `{f32, f64}::powi` uses. - [Rustdoc's CLI's extra error exit codes have been removed.][71900] These were previously undocumented and not intended for public use. Rustdoc still provides a non-zero exit code on errors. Internals Only -------------- - [Make clippy a git subtree instead of a git submodule][70655] - [Unify the undo log of all snapshot types][69464] [73420]: rust-lang/rust#73420 [72324]: rust-lang/rust#72324 [71843]: rust-lang/rust#71843 [71886]: rust-lang/rust#71886 [72234]: rust-lang/rust#72234 [72239]: rust-lang/rust#72239 [72397]: rust-lang/rust#72397 [72413]: rust-lang/rust#72413 [72014]: rust-lang/rust#72014 [72062]: rust-lang/rust#72062 [72094]: rust-lang/rust#72094 [72133]: rust-lang/rust#72133 [71900]: rust-lang/rust#71900 [71928]: rust-lang/rust#71928 [71662]: rust-lang/rust#71662 [71716]: rust-lang/rust#71716 [71447]: rust-lang/rust#71447 [71269]: rust-lang/rust#71269 [71095]: rust-lang/rust#71095 [71140]: rust-lang/rust#71140 [70655]: rust-lang/rust#70655 [70705]: rust-lang/rust#70705 [69984]: rust-lang/rust#69984 [69813]: rust-lang/rust#69813 [69464]: rust-lang/rust#69464 [68717]: rust-lang/rust#68717 [`Arc::as_ptr`]: https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#method.as_ptr [`BTreeMap::remove_entry`]: https://doc.rust-lang.org/stable/std/collections/struct.BTreeMap.html#method.remove_entry [`Rc::as_ptr`]: https://doc.rust-lang.org/stable/std/rc/struct.Rc.html#method.as_ptr [`rc::Weak::as_ptr`]: https://doc.rust-lang.org/stable/std/rc/struct.Weak.html#method.as_ptr [`rc::Weak::from_raw`]: https://doc.rust-lang.org/stable/std/rc/struct.Weak.html#method.from_raw [`rc::Weak::into_raw`]: https://doc.rust-lang.org/stable/std/rc/struct.Weak.html#method.into_raw [`sync::Weak::as_ptr`]: https://doc.rust-lang.org/stable/std/sync/struct.Weak.html#method.as_ptr [`sync::Weak::from_raw`]: https://doc.rust-lang.org/stable/std/sync/struct.Weak.html#method.from_raw [`sync::Weak::into_raw`]: https://doc.rust-lang.org/stable/std/sync/struct.Weak.html#method.into_raw [`str::strip_prefix`]: https://doc.rust-lang.org/stable/std/primitive.str.html#method.strip_prefix [`str::strip_suffix`]: https://doc.rust-lang.org/stable/std/primitive.str.html#method.strip_suffix [`char::UNICODE_VERSION`]: https://doc.rust-lang.org/stable/std/char/constant.UNICODE_VERSION.html [`Span::resolved_at`]: https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.resolved_at [`Span::located_at`]: https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.located_at [`Span::mixed_site`]: https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.mixed_site [`unix::process::CommandExt::arg0`]: https://doc.rust-lang.org/std/os/unix/process/trait.CommandExt.html#tymethod.arg0
[irlo thread] [godbolt asm example]
Add an implementation of the
Step
trait forchar
, which has the effect of makingRangeInclusive<char>
(and the other range types) iterable.I've used the surrogate range magic numbers as magic numbers here rather than e.g. a
const SURROGATE_RANGE = 0xD800..0xE000
because these numbers appear to be used as magic numbers elsewhere and there doesn't exist constants for them yet. These files definitely aren't where surrogate range constants should live.ExactSizeIterator
is not implemented because0x10FFFF
is bigger than fits in ausize == u16
. However, given we already provide someExactSizeIterator
that are not correct on 16 bit targets, we might still want to consider providing it forRange
[Inclusive
]<char>
, as it is definitely very convenient. (At the very least, we want to make sure.count()
doesn't bother iterating the range.)The second commit in this PR changes a call to
Step::forward
to useStep::forward_unchecked
inRangeInclusive::next
. This is because without this patch, iteration over all codepoints ('\0'..=char::MAX
) does not successfully optimize out the panicking branch. This was mentioned in the PR that updatedStep
to its current design, but was deemed not yet necessary as it did not impact codegen for integral types.More of
Range*
's implementations' calls toStep
methods will probably want to see if they can use the_unchecked
version as (if) we open upStep
to being implemented on more types.cc @rust-lang/libs, this is insta-stable and a fairly significant addition to
Range*
's capabilities; this is the first instance of a noncontinuous domain being iterable withRange
(or, well, anything other than primitive integers). I don't think this needs a full RFC, but it should definitely get some decent eyes on it.