Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow invalid UTF-8 string literals, deprecate b"..." #25073

Merged
merged 3 commits into from
Dec 15, 2017

Conversation

StefanKarpinski
Copy link
Member

also print invalid UTF-8 characters correctly

related: #25072

@StefanKarpinski StefanKarpinski changed the title invalid UTF-8 in string literals: allow it and print it allow invalid UTF-8 string literals, deprecate b"..." Dec 14, 2017
@StefanKarpinski StefanKarpinski force-pushed the sk/invalid-utf8 branch 5 times, most recently from aa2fba3 to 726d003 Compare December 14, 2017 10:47
@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Dec 14, 2017
@StefanKarpinski
Copy link
Member Author

This is ready to go. After the first commit, b"..." is no longer any more expressive than normal string literals, and we can get rid of cases where we have written String(b"...") to construct string literals containing invalid UTF-8, since they can now just be written as "..." directly. The additional decision is whether to deprecate b"abc" as a shorthand for Vector{UInt8}("abc") or keep it.

@quinnj
Copy link
Member

quinnj commented Dec 14, 2017

I would vote to keep it; it can be handy sometimes.

@StefanKarpinski
Copy link
Member Author

Ok, I can revert the deprecation and just keep the String(b"...") => "..." changes.

@StefanKarpinski
Copy link
Member Author

This is pretty bad though (pointed out by @Keno):

julia> function f(p)
           v = b"123"
           p && (v[1] = 'x')
           return v
       end
f (generic function with 1 method)

julia> f(false)
3-element Array{UInt8,1}:
 0x31
 0x32
 0x33

julia> f(true)
3-element Array{UInt8,1}:
 0x78
 0x32
 0x33

julia> f(false)
3-element Array{UInt8,1}:
 0x78
 0x32
 0x33

We should consider changing this while we're at it.

@StefanKarpinski
Copy link
Member Author

Triage resolution: make b"..." create a new vector each time instead of the same one every time.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Dec 14, 2017
@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Dec 14, 2017
@StefanKarpinski StefanKarpinski merged commit 4b5d067 into master Dec 15, 2017
@StefanKarpinski StefanKarpinski deleted the sk/invalid-utf8 branch December 15, 2017 07:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants