-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic implementation of the string pattern API #22466
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
r? @aturon |
|
||
#[inline] | ||
fn match_starts_at(self, haystack: &'a str, idx: usize) -> bool { | ||
let mut matcher = self.into_searcher(haystack); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be self.into_searcher(haystack[idx..])
, and then the loop becomes unnecessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, indeed that could work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, actually, it may not. E.g. asking if a match for abab
starts at 2 in abababab
should return false, since the matches are indices 0-4 and 4-8, or does the searching look for all matching substrings, even if they overlap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, tricky!
Its true that as defined only all non-overlapping matches are valid, which means you can not slice the haystack as a shortcut. But that also means any potential implementer of this default method could not do that trick either, which makes the method kinda useless to even provide, so I think I have to change something:
- Either allow those methods to give answers that don't need to be identical to the yielded indices of the searcher.
- Remove the index argument and turn them into something like
is_front_match()
andis_end_match()
, relying on the caller of this method to slice the haystack beforehand.
Both seem fine to me, but the latter is more minimal, so I'll probably go with that one. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the latter approach. As you say, it's more minimal, and I think it's also more clear.
OK, I've taken a review pass and this looks basically good to me. There's a lot of detail work -- adding documentation, cleanup, etc -- but I think the core functionality is in reasonable shape to land. Thanks @Kimundi! |
@@ -677,6 +677,8 @@ macro_rules! iterator { | |||
fn next(&mut self) -> Option<$elem> { | |||
// could be implemented with slices, but this avoids bounds checks | |||
unsafe { | |||
::intrinsics::assume(!self.ptr.is_null()); | |||
::intrinsics::assume(!self.end.is_null()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked with dotdash about - its not exactly duplication because the assumes are at a different location, and depending how the code will be inlined there might be cases where either of the two changes might not apply. Also, giving llvm multiple optimization hints can't really hurt.
However, currently these hints are unneeded for my code anyway, because the optimized codepaths that depended on them did not end up in the current iteration of this PR, so if you'd rather have them removed for now I can do that too.
Added a few bugfixes and additional testcases
505d7ca
to
c8dd2d0
Compare
⌛ Testing commit c8dd2d0 with merge 41fb6ac... |
💔 Test failed - auto-win-32-nopt-t |
@bors: retry |
⌛ Testing commit c8dd2d0 with merge 562ee80... |
💔 Test failed - auto-win-32-nopt-t |
Hm, those failures seem unrelated to the PR? |
Unless that timeout error is because of the slower string search... |
@bors: retry Looks unrelated |
⌛ Testing commit c8dd2d0 with merge cf04813... |
💔 Test failed - auto-mac-64-nopt-t |
@bors: retry |
⌛ Testing commit c8dd2d0 with merge e4fff79... |
💔 Test failed - auto-linux-64-opt |
Could you see if |
(The |
@bors: retry Hm. |
This is not a complete implementation of the RFC: - only existing methods got updated, no new ones added - doc comments are not extensive enough yet - optimizations got lost and need to be reimplemented See rust-lang/rfcs#528 Technically a [breaking-change]
This is not a complete implementation of the RFC:
See rust-lang/rfcs#528
Technically a
[breaking-change]