Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

joinpath gives surprising (wrong?) results for unc network paths in Windows #33455

Closed
jonathan-saunders-ge opened this issue Oct 2, 2019 · 9 comments

Comments

@jonathan-saunders-ge
Copy link

Using joinpath on a UNC path does not always behave as expected (or as I expect).

In the following example:

  • Case 7 seems like a definite defect.
  • Cases 5 & 6 may be debatable, but I would also argue that joinpath should be able to join all the parts of a UNC path even if the first part is not the full server+share part. I can certainly see how it would be useful to have it behave that way.

Code:

import InteractiveUtils
InteractiveUtils.versioninfo()

println()
@show "1: " * joinpath("D:", "a", "b", "c")
@show "2: " * joinpath("D:\\", "a", "b", "c")
@show "3: " * joinpath("D:\\a", "b", "c")
@show "4: " * joinpath("D:\\a\\", "b", "c")

println()
@show "5: " * joinpath("\\\\server", "share", "b", "BAD")   # expect "\\\\server\\share\\b\\BAD"
@show "6: " * joinpath("\\\\server\\", "share", "b", "BAD") # expect "\\\\server\\share\\b\\BAD"
@show "7: " * joinpath("\\\\server\\share", "b", "BAD")     # expect "\\\\server\\share\\b\\BAD"
@show "8: " * joinpath("\\\\server\\share\\", "b", "GOOD")  # expect "\\\\server\\share\\b\\GOOD"

Result:

Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

"1: " * joinpath("D:", "a", "b", "c") = "1: D:a\\b\\c"
"2: " * joinpath("D:\\", "a", "b", "c") = "2: D:\\a\\b\\c"
"3: " * joinpath("D:\\a", "b", "c") = "3: D:\\a\\b\\c"
"4: " * joinpath("D:\\a\\", "b", "c") = "4: D:\\a\\b\\c"

"5: " * joinpath("\\\\server", "share", "b", "BAD") = "5: \\\\server\\sharebBAD"
"6: " * joinpath("\\\\server\\", "share", "b", "BAD") = "6: \\\\server\\sharebBAD"
"7: " * joinpath("\\\\server\\share", "b", "BAD") = "7: \\\\server\\sharebBAD"
"8: " * joinpath("\\\\server\\share\\", "b", "GOOD") = "8: \\\\server\\share\\b\\GOOD"
@fredrikekre
Copy link
Member

Thanks for the report. Likely a duplicate of #33127, fixed by #33116. Can you try with Julia's master branch?

@jonathan-saunders-ge
Copy link
Author

I'm not in a position to build from source, but I pulled the latest from the nightlies (https://julialang.org/downloads/nightlies.html) page and got the same result:

Julia Version 1.4.0-DEV.231
Commit ef0c9108b1 (2019-10-02 07:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

"1: " * joinpath("D:", "a", "b", "c") = "1: D:a\\b\\c"
"2: " * joinpath("D:\\", "a", "b", "c") = "2: D:\\a\\b\\c"
"3: " * joinpath("D:\\a", "b", "c") = "3: D:\\a\\b\\c"
"4: " * joinpath("D:\\a\\", "b", "c") = "4: D:\\a\\b\\c"

"5: " * joinpath("\\\\server", "share", "b", "BAD") = "5: \\\\server\\sharebBAD"
"6: " * joinpath("\\\\server\\", "share", "b", "BAD") = "6: \\\\server\\sharebBAD"
"7: " * joinpath("\\\\server\\share", "b", "BAD") = "7: \\\\server\\sharebBAD"
"8: " * joinpath("\\\\server\\share\\", "b", "GOOD") = "8: \\\\server\\share\\b\\GOOD"

@musm
Copy link
Contributor

musm commented Oct 5, 2019

This is just a bug in joinpath. The way it is defined to call it self for multiple arguments, is unlikely to work in cases like this.

@StefanKarpinski
Copy link
Member

I'm fine with rewriting joinpath to be non-recursive, but why is it unlikely to work? It seems like whatever the definition, it should always follow the identity that

joinpath(p1, p2, rest...) = joinpath(joinpath(p1, p2), rest...)

Which is currently how it's defined and therefore guaranteed to be true. If that's not the case, that's quite concerning and I'd like to understand why.

@StefanKarpinski
Copy link
Member

It seems that UNC paths are always absolute, so consider the 2-arg version, joinpath(a, b):

  • If b is an absolute path, ignore a
  • If a is a UNC path and b is a relative path without a drive, join with a path separator
  • If a is a UNC path and b is a relative path with a drive letter... then what?

The only choices for the last case seem to be:

  • ignore the UNC path
  • throw an error

Neither one is a great option. In any case, I don't see how this is incompatible with the recursive identity since:

  • If b is an absolute path then joinpath(a, b) will be absolute
  • If a is a UNC path and b is is a relative path without a drive then joinpath(a, b) will also be a UNC path
  • If a is a UNC path and b is a relative path with a drive letter, no matter which choice we make for the behavior the result is compatible with the recursive identity.

@StefanKarpinski
Copy link
Member

I'm not disputing that there's a bug here since the BAD results are clearly wrong, but I want to get to the bottom of this claim that the correct behavior of joinpath is somehow incompatible with a recursive definition.

@musm
Copy link
Contributor

musm commented Oct 9, 2019

If a is a UNC path and b is a relative path with a drive letter... then what?

If that's the case we default b, like in previous cases. Since b is "absolute" (even if it's relative, because the user specified the drive the path is relative to)

There's a really subtle bug here related to the behavior of pathsep

const path_separator    = "\\"
const path_separator_re = r"[/\\]+"


function pathsep(paths::AbstractString...)
    for path in paths
        m = match(path_separator_re, String(path))
        m !== nothing && return m.match[1:1]
    end
    return path_separator
end


function _joinpath(a::String, b::String)
    isabspath(b) && return b
    A, a = splitdrive(a)
    B, b = splitdrive(b)
    !isempty(B) && A != B && return string(B,b)
    C = isempty(B) ? A : B
    isempty(a)                              ? string(C,pathsep(a,b),b) :
    occursin(path_separator_re, a[end:end]) ? string(C,a,b) :
                                              string(C,a,pathsep(a,b),b)
end

This code is identical to the joinpath function in Base. However run

julia> _joinpath("\\\\server\\share","b")
"\\\\server\\share\\b"

julia> joinpath("\\\\server\\share","b")
"\\\\server\\shareb"

For some reason pathsep completely returns the wrong value for joinpath.

I suspect something related to compiler is going on here. Because if I then use Revise and make change to joinpath, by e.g. adding a print statement. The recompiled function magically works.

So in fact, joinpath is correct, but compilation seems to screw up something in the function when the sysimage is created?

@musm
Copy link
Contributor

musm commented Oct 20, 2019

fixed by #33477

@jonathan-saunders-ge
Copy link
Author

I ran my little test on the latest nightly and it look good.

Thanks!

Julia Version 1.4.0-DEV.349
Commit 864e6d63d2 (2019-10-19 00:41 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

"1: " * joinpath("D:", "a", "b", "c") = "1: D:a\\b\\c"
"2: " * joinpath("D:\\", "a", "b", "c") = "2: D:\\a\\b\\c"
"3: " * joinpath("D:\\a", "b", "c") = "3: D:\\a\\b\\c"
"4: " * joinpath("D:\\a\\", "b", "c") = "4: D:\\a\\b\\c"

"5: " * joinpath("\\\\server", "share", "b", "GOOD") = "5: \\\\server\\share\\b\\GOOD"
"6: " * joinpath("\\\\server\\", "share", "b", "GOOD") = "6: \\\\server\\share\\b\\GOOD"
"7: " * joinpath("\\\\server\\share", "b", "GOOD") = "7: \\\\server\\share\\b\\GOOD"
"8: " * joinpath("\\\\server\\share\\", "b", "GOOD") = "8: \\\\server\\share\\b\\GOOD"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants