-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow nchar to be greater than string length in first/last #24556
Conversation
fc7573c
to
49f2b6e
Compare
49f2b6e
to
1272653
Compare
base/strings/basic.jl
Outdated
end | ||
str[1:nextind(str, 1, nchar-1)] | ||
idx = min(endof(str), nextind(str, s, nchar-1)) | ||
str[s:idx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you can use @inbounds
here now? That would (or could, if @boundscheck
is implemented where needed) offset the cost of callind endof
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that @inbounds
will help (but please correct me if I am wrong or I have misunderstood something what you have meant), because neither
getindex(s::String, r::UnitRange{Int})
nor getindex(s::AbstractString, r::UnitRange{Int})
call endof
(they work differently, but perform the checks without calling endof
and what they do does not seem to me to be influenced by @inbounds
(but maybe I am wrong).
The optimization would be to write a specialized method for first
for String
only which would call unsafe_string
instead of getindex
. It would be a bit faster, but I do not know if we want to have such a messy code (the same could be done for last
).
Any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I guess leave it that way for now if the performance impact isn't too large.
this seems reasonable to me. if someone wants the old semantics they can always check the length themselves |
Note that there will be a small inconsistency between |
@nalimilan The issue is rather with |
Mmm, why doesn't |
@nalimilan Good point - I will update the implementation (but it will lead to the inconsistency you have observed). |
base/strings/basic.jl
Outdated
@@ -595,7 +595,7 @@ end | |||
""" | |||
first(str::AbstractString, nchar::Integer) | |||
|
|||
Get a string consisting of the first `nchar` characters of `str`. | |||
Get a string consisting of at most the first `nchar` characters of `str`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wording makes it sound like the function may just randomly decide to give you any number of characters up to the number you asked for. It would be clearer to leave this sentence as is but add:
unless
str
is shorter thannchar
characters, in which case a string equal tostr
is returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
base/strings/basic.jl
Outdated
end | ||
|
||
""" | ||
last(str::AbstractString, nchar::Integer) | ||
|
||
Get a string consisting of the last `nchar` characters of `str`. | ||
Get a string consisting of at most the last `nchar` characters of `str`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
CI failures seem unrelated. |
closing. current implementation of |
PR following the discussion in #23960 (comment).
For string
str
it allowsnchar
to be greater thanlength(str)
. In such a case the whole contents of the original string is returned.This returned whole string does not have to be of the same type as
str
as in generalgetindex
my return a string of different type than the original string.CC @stevengj