Deprecate contains to occursin, deprecate callable regexes #26283
Maybe also change the deprecation message for |
I also haven't yet done the complete rename of |
base/operators.jl
Outdated
        isfound(needle, haystack)

    Determine whether `needle` is found within the collection `haystack`.
"is found" isn't very specific. I'd copy the description given for `findfirst`, and point to it. I think it makes sense to define officially `isfound` in terms of `findfirst`, even if more efficient implementations can be used under the hood.
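A minimal sketch of the definition suggested in this comment, assuming Julia 1.x. The name `isfound_sketch` is hypothetical, and only string needles and predicate functions are exercised here, since those are `findfirst` methods known to exist.

```julia
# Hypothetical helper: defined purely in terms of `findfirst`, as proposed.
isfound_sketch(needle, haystack) = findfirst(needle, haystack) !== nothing

# Substring needle: `findfirst` returns a range of indices or `nothing`.
println(isfound_sketch("Julia", "JuliaLang"))
println(isfound_sketch("Python", "JuliaLang"))

# Predicate needle on an array: `findfirst` returns an index or `nothing`.
println(isfound_sketch(iseven, [1, 3, 5]))
```

A specialized method could still short-circuit for particular argument types, as long as it agrees with this fallback.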
base/operators.jl
Outdated
    ```
    """
    isfound(needle, haystack) = findfirst(needle, haystack) !== nothing
I would put this in array.jl near `findfirst`.
There's an issue here during bootstrap, where |
|
cb1feeb to ea55930
The macOS failure is the spawn thing and the 32-bit Windows failure is a timeout.
I've edited your first post and the title to make it a bit more clear what's happening here. Deprecating
|
Now also includes deprecation of contains
Good call, thanks Matt. 🙂
👍 But, there might be something to be said for |
base/array.jl
Outdated

    # Examples
    ```jldoctest
    julia> isfound('x', "hexonxonx")
This currently leads to a deprecation warning and an error. I think it would make sense to add a method `isfound(::Char, ::AbstractString)` to handle this.
Good catch. Though for `Char` containment in `String` should we just encourage using `in`? Having `isfound(scalar, collection)` work when its documented fallback, `findfirst(scalar, collection) !== nothing`, does not work is a little weird.
I believe the only reason `findfirst` doesn't support that is due to a deprecation, but that case would not have been useful (looking for a string in a char; yes you read that right). So we could add the method to `findfirst` as well, either now or after 0.7 if we want to be super conservative.
IIRC I didn't add `findfirst(::Char, ::String)` just because it wasn't strictly necessary (i.e. the standard pattern for general collections based on `equalto` works well), and because it can always be added later. At least that's what I noted in the description of #24673. OTOH the `findfirst(::String, ::String)` method is needed because there is no generic API for looking for a subsequence in general collections at the moment.
It could make sense to add a special case for `findfirst(::Char, ::String)` for convenience, though, now that the basic API has stabilized.
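For reference, the generic pattern mentioned above can be sketched as follows. This assumes Julia 1.x, where the single-argument `isequal(x)` plays the role that `equalto(x)` plays in this thread; `find_char` is a hypothetical name used only for illustration.

```julia
# Hypothetical convenience wrapper around the generic predicate-based search.
find_char(c::AbstractChar, s::AbstractString) = findfirst(isequal(c), s)

s = "hexonxonx"
println(find_char('x', s))   # index of the first 'x'
println(find_char('z', s))   # `nothing` when the character is absent

# Substring search has its own method and returns a range of indices.
println(findfirst("xon", s))
```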
Okay, so action items here:
- Add `findfirst(::Char, ::AbstractString)`, which facilitates a corresponding `isfound` method
- Add back `contains(::AbstractString, ::Union{Char,AbstractString})`
Currently all methods for `contains` deal with strings and regular expressions. So effectively rather than replacing `contains` with `isfound`, which this PR currently does, we'll add `isfound` and replace the `Regex` methods for `contains` with methods for `isfound`. That is, deprecate `contains(::AbstractString, ::Regex)` in favor of `isfound` and otherwise leave `contains` alone.
Is that right?
We could possibly use `in(::String, ::String)` for this. I think a special case there would be just as valid as having a special `contains` method. Big plus side: that'd use the same argument ordering as `isfound` and `match`, whereas `contains` would be backwards (cf. #25584, which this PR fixes).
I'd rather not do that; `in` is exceptionally clean and consistent in having all of its methods do the same thing (checking for element containment).
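The distinction drawn here, element containment versus subsequence containment, can be illustrated with a short sketch. It assumes Julia 1.x, where `occursin` is the name this PR eventually settled on for the subsequence check.

```julia
s = "hello"

# `in` checks element containment: the elements of a string are characters.
println('e' in s)
println('e' in collect(s))

# Substring (subsequence) containment goes through a separate function.
println(occursin("ell", s))
```

Keeping the two operations under different names means `in` never has to guess whether its left argument is an element or a sub-collection.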
Okay, so updated action items:
- Add `findfirst(::Char, ::AbstractString)`
- Revert deprecation of `contains`
So the PR would effectively be what it originally was, just adding `isfound` and deprecating callable regexes. Is that right?
I meant to go ahead with removing `contains` as agreed, and separately consider whether we want `contains(string, string)` as a convenience function.
Okay, so just add `findfirst(::Char, ::AbstractString)` and leave the rest of the PR as-is.
NEWS.md
Outdated
    @@ -1070,6 +1070,10 @@ Deprecated or removed
    * The `remove_destination` keyword argument to `cp`, `mv`, and the unexported `cptree`
      has been renamed to `force` ([#25979]).

    * `contains` has been deprecated in favor of a more general `isfound` function ([#26283]).
I'd also mention the arguments, since their order is reversed.
base/regex.jl
Outdated
    @@ -141,20 +141,18 @@ function getindex(m::RegexMatch, name::Symbol)
    end
    getindex(m::RegexMatch, name::AbstractString) = m[Symbol(name)]

    function contains(s::AbstractString, r::Regex, offset::Integer=0)
    function isfound(r::Regex, s::AbstractString; offset::Integer=0)
Shouldn't `offset` be a positional argument just like the index passed to `findnext`? Then there could be a generic definition `isfound(needle, haystack, i) = findnext(needle, haystack, i) !== nothing`.
Another possibility would be to have keyword arguments like `start` and `stop`, which would allow specifying the ending index without passing the starting index when needed.
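The generic three-argument definition floated in this comment could look roughly like the sketch below, assuming Julia 1.x. The name `isfound_from` is hypothetical, and the `start`/`stop` keyword variant is not shown.

```julia
# Hypothetical three-argument form, defined via `findnext` as suggested.
isfound_from(needle, haystack, i) = findnext(needle, haystack, i) !== nothing

s = "hello"
println(isfound_from("lo", s, 1))    # search starts at index 1
println(isfound_from("lo", s, 5))    # search starts at index 5, past the match
println(isfound_from(r"l+", s, 3))   # regex needles go through the same path
```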
base/path.jl
Outdated
    contains(a[end:end], path_separator_re) ? string(C,a,b) :
    string(C,a,pathsep(a,b),b)
    isempty(a) ? string(C,b) :
    isfound(path_separator_re, a[end:end]) ? string(C,a,b) :
Not really related, but `a[end:end]` is a one-character string, right? So why call `isfound` rather than `==`? Because of the regex?
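To illustrate the question: a regex check on the one-character string `a[end:end]` amounts to testing the last character against a set. The sketch below assumes Julia 1.x (`occursin`) and uses a hypothetical `sep_re` as a stand-in for `path_separator_re`, whose actual definition is platform-dependent and not shown in this thread.

```julia
# Hypothetical stand-in for `path_separator_re`; matches '/' or '\'.
sep_re = r"[/\\]"

# Both helpers assume a non-empty string, as the surrounding code checks `isempty(a)` first.
ends_with_sep_regex(a) = occursin(sep_re, a[end:end])
ends_with_sep_char(a)  = a[end] in ('/', '\\')

for a in ("dir/", "dir\\", "dir")
    println(a, " => ", ends_with_sep_regex(a), " ", ends_with_sep_char(a))
end
```

A plain `==` only works when there is exactly one separator character, which is presumably why the regex is used.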
base/strings/search.jl
Outdated
    @@ -101,6 +101,8 @@ julia> findfirst("Julia", "JuliaLang")
    findfirst(pattern::AbstractString, string::AbstractString) =
        findnext(pattern, string, firstindex(string))

    findfirst(c::AbstractChar, s::AbstractString) = findfirst(equalto(c), s)
While you're at it, better add a `findnext` method, and change the deprecations to use the new methods where appropriate.
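A sketch of what a matching pair of character-search helpers might look like, assuming Julia 1.x with `isequal(c)` in place of `equalto(c)`. The names `find_char_first`, `find_char_next`, and `char_positions` are hypothetical and chosen to avoid extending Base in an example.

```julia
# Hypothetical helpers mirroring the `findfirst`/`findnext` pairing.
find_char_first(c::AbstractChar, s::AbstractString) = findfirst(isequal(c), s)
find_char_next(c::AbstractChar, s::AbstractString, i::Integer) =
    findnext(isequal(c), s, i)

# Collect every index at which `c` occurs by resuming the search after each hit.
function char_positions(c::AbstractChar, s::AbstractString)
    positions = Int[]
    i = find_char_first(c, s)
    while i !== nothing
        push!(positions, i)
        i = find_char_next(c, s, nextind(s, i))
    end
    return positions
end

println(char_positions('x', "hexonxonx"))
```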
stdlib/REPL/src/REPLCompletions.jl
Outdated
    @@ -419,7 +419,7 @@ function afterusing(string::String, startpos::Int)
    r = findfirst(r"\s(gnisu|tropmi)\b", rstr)
    r === nothing && return false
    fr = reverseind(str, last(r))
    return contains(str[fr:end], r"^\b(using|import)\s*((\w+[.])*\w+\s*,\s*)*$")
    return isfound(r"^\b(using|import)\s*((\w+[.])*\w+\s*,\s*)*$", str[fr:end])
Could use offset here.
test/strings/search.jl
Outdated
    @@ -65,6 +65,7 @@ end
    for str in (u8str, GenericString(u8str))
        @test_throws BoundsError findnext(equalto('z'), str, 0)
        @test_throws BoundsError findnext(equalto('∀'), str, 0)
        @test findfirst('z', str) == nothing
Could also test in a case that works, just in case? An easy way would be to loop over the predicate, using `identity` and `equalto`.
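One way the suggested loop might look, as a sketch only. It assumes a recent Julia 1.x release in which `findfirst(::AbstractChar, ::AbstractString)` exists and `isequal` has replaced `equalto`; the test string is made up for the example rather than taken from the test suite.

```julia
using Test

str = "hexonxonx"

# `identity` passes the character itself; `isequal` wraps it in a predicate.
for f in (identity, isequal)
    @test findfirst(f('z'), str) === nothing   # character absent
    @test findfirst(f('x'), str) == 3          # character present
end
```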
That's a good point: |
We've just got rid of the confusing Anyway there's no hurry to add this to 1.0, right? |
The only thing I really want out of this in 1.0 is something like |
Yes, the highest priority is to rename
|
I like |
I'm not quite sold on |
|
Yes; "string search" is the classical name for the problem of looking for a substring.
To be fair, that's only the case for string and regex needles (because there's no ambiguity).
Yes, currently there's no way to do that. The Search & Find Julep evoked several possible solutions. Proposals 1 and 2 included
I'm not saying that would necessarily be a good idea. I'd also like something like Anyway I think for 1.0 it would be enough to find a good replacement for |
Triage sees some more votes for |
Game plan from triage:
Will update the PR for this.
Windows failures are timeouts.
NEWS.md
Outdated
    * `contains` has been deprecated in favor of a more general `occursin` function, which
      takes its arguments in reverse order from `contains` ([#26283]).

    * `Regex` objects are no longer callable. Use `isfound` instead ([#26283]).
`occursin`!
HAH whoops. 🙃
base/array.jl
Outdated
    false
    ```
    """
    occursin(needle, haystack) = findfirst(needle, haystack) !== nothing
This should be removed. The change should just deprecate `contains` to `occursin`.
Okay
Then the `findfirst` commit is irrelevant, so I'll drop that
occursin takes its arguments in reverse order from contains.
Deprecate contains to occursin, deprecate callable regexes
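As a brief illustration of the argument order described in this commit message, assuming Julia 1.x: `occursin` takes the needle first and the haystack second, for both substring and regex needles, and replaces calling a `Regex` object directly.

```julia
s = "JuliaLang"

# Needle first, haystack second, for both substring and regex needles.
println(occursin("Lang", s))
println(occursin(r"^Julia", s))

# `match` uses the same (pattern, string) ordering.
m = match(r"L\w+", s)
println(m === nothing ? "no match" : m.match)
```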
Ref #26211 (comment). There are two distinct commits here:
- `isfound` function that's effectively `findfirst(needle, haystack) !== nothing`. Edit: this is now a generalized replacement for `contains`.
- `Regex` objects in favor of `isfound(::Regex, s)`