Does this issue reproduce with the latest release?
Yes.
What did you expect to see?
A function in the utf8 package which, for a given string and rune index, returns the byte index of that rune in the string.
What did you see instead?
No such function.
What did you try to fix it?
Write my own implementation, like so:
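// RuneIndexToByteIndex returns the byte offset at which the rune with the
// given index starts in s. It returns len(s) when runeIndex equals the number
// of runes in s, and -1 when runeIndex is out of range.
func RuneIndexToByteIndex(s string, runeIndex int) int {
	currentRuneIndex := 0
	// Ranging over a string yields the starting byte offset of each rune.
	for i := range s {
		if currentRuneIndex == runeIndex {
			return i
		}
		currentRuneIndex++
	}
	if currentRuneIndex == runeIndex {
		return len(s)
	}
	return -1
}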
Additional comments
YES, this IS a wasteful way to do it. However, it can be part of idiomatic code. e.g.:
Usually, we only add functions to packages like utf8 if they're very commonly needed, or if they are tricky to implement correctly. Does this fall into either category?
I've never needed this function, and if I did, you yourself show that it can be implemented in under ten lines. So it seems to me like it's not necessary to add it to the standard library.
Also, any reason why we should have RuneIndexToByteIndex and not ByteIndexToRuneIndex?
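For comparison, the reverse direction can be sketched in a few lines as well. This is only an illustration: `ByteIndexToRuneIndex` is a hypothetical name taken from the question above, not an existing function, and returning -1 for offsets that are out of range or not on a rune boundary is an assumption made here. It relies only on `utf8.RuneStart` and `utf8.RuneCountInString`, which do exist in `unicode/utf8`.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// ByteIndexToRuneIndex is a hypothetical helper, not part of unicode/utf8.
// It returns the number of runes that start before byteIndex in s, or -1 if
// byteIndex is out of range or points into the middle of a rune (an assumed
// convention for this sketch).
func ByteIndexToRuneIndex(s string, byteIndex int) int {
	if byteIndex < 0 || byteIndex > len(s) {
		return -1
	}
	// A continuation byte means byteIndex is not on a rune boundary.
	if byteIndex < len(s) && !utf8.RuneStart(s[byteIndex]) {
		return -1
	}
	return utf8.RuneCountInString(s[:byteIndex])
}

func main() {
	s := "h\u00e9llo" // U+00E9 occupies bytes 1 and 2
	fmt.Println(ByteIndexToRuneIndex(s, 3)) // 2: the first 'l' is the third rune
	fmt.Println(ByteIndexToRuneIndex(s, 2)) // -1: middle of U+00E9
}
```

Like the forward direction, each call scans the string from the start, so it costs time proportional to byteIndex.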
You make a fair point. It is easy to implement correctly. But then again, omitting it supports the spread of misc packages and such in people's projects. (Which I am guilty of myself – part of a different, much larger issue)
I don't feel qualified to make a decision here, and would prefer to see some more people chew on this.
Although your example use is not in a loop, if this existed, inevitably people would use it inside loops processing the entire string. In that context, the overall loop would run in quadratic time: there would be N calls (N = len(s)), and each call takes longer the further into the string you get, requiring N/2 time on average. So overall you'd get a loop that runs in N^2 time. We work very hard to avoid making these kinds of accidents easy. They are already too easy in general. (See the excellent https://accidentallyquadratic.tumblr.com/ blog.)
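To make the concern concrete, here is a rough sketch of the two loop shapes. `visitQuadratic` and `visitLinear` are hypothetical names invented for this illustration; `RuneIndexToByteIndex` is the helper from the issue, repeated so the snippet compiles on its own. Both visit every rune with its rune index and byte offset, but the first rescans the string on every iteration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// RuneIndexToByteIndex is the helper proposed in the issue.
func RuneIndexToByteIndex(s string, runeIndex int) int {
	currentRuneIndex := 0
	for i := range s {
		if currentRuneIndex == runeIndex {
			return i
		}
		currentRuneIndex++
	}
	if currentRuneIndex == runeIndex {
		return len(s)
	}
	return -1
}

// visitQuadratic (hypothetical) looks up each byte offset through the helper.
// Every lookup walks the string from the start, so the loop is O(N^2) overall.
func visitQuadratic(s string, fn func(runeIndex, byteIndex int)) {
	n := utf8.RuneCountInString(s)
	for r := 0; r < n; r++ {
		fn(r, RuneIndexToByteIndex(s, r))
	}
}

// visitLinear (hypothetical) produces the same pairs in one pass, because
// ranging over a string already yields each rune's byte offset.
func visitLinear(s string, fn func(runeIndex, byteIndex int)) {
	r := 0
	for i := range s {
		fn(r, i)
		r++
	}
}

func main() {
	s := "h\u00e9llo, \u4e16\u754c"
	show := func(r, b int) { fmt.Printf("rune %d starts at byte %d\n", r, b) }
	visitQuadratic(s, show)
	visitLinear(s, show)
}
```

The two functions report identical pairs; only the amount of work differs, and that difference is invisible at the call site.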
What version of Go are you using (go version)?
1.12.4
Other constructs are possible, such as retrieving a slice of indices. Would like to hear some thoughts on this.
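One possible reading of the "slice of indices" idea, as a minimal sketch: build the table of rune start offsets once, then answer any number of rune-index lookups in constant time. `RuneOffsets` is a hypothetical name, and the trailing `len(s)` sentinel is an assumption made here so that adjacent entries can be used directly as slice bounds.

```go
package main

import "fmt"

// RuneOffsets is a hypothetical helper, not part of any standard package.
// It returns the byte offset at which each rune of s starts, followed by
// len(s) as a sentinel, so rune i occupies s[offsets[i]:offsets[i+1]].
func RuneOffsets(s string) []int {
	// len(s)+1 is an upper bound on the result size; it over-allocates
	// when s contains multi-byte runes.
	offsets := make([]int, 0, len(s)+1)
	for i := range s {
		offsets = append(offsets, i)
	}
	return append(offsets, len(s))
}

func main() {
	s := "h\u00e9llo" // U+00E9 occupies bytes 1 and 2
	offs := RuneOffsets(s)
	fmt.Println(offs)               // [0 1 3 4 5 6]
	fmt.Println(s[offs[1]:offs[3]]) // runes 1 and 2 of s
}
```

This trades one linear pass and an extra slice for cheap repeated lookups, which avoids rescanning the string on every query.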