Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize copy!(::Dict, ::Dict) #34101

Merged
merged 5 commits into from
May 5, 2020
Merged

Optimize copy!(::Dict, ::Dict) #34101

merged 5 commits into from
May 5, 2020

Conversation

tkf
Copy link
Member

@tkf tkf commented Dec 14, 2019

This PR implements two optimizations for copy!(dst::Dict, src::Dict). This came up in another PR for implementing setindex for Dict: #33495 (comment)

Case 1: copy!(dst::D, src::D) where {D <: Dict}

This is a straightforward case where you can copy properties as-is:

function unsafe_copy!(dst::D, src::D) where {D <: Dict}
    copy!(dst.vals, src.vals)
    copy!(dst.keys, src.keys)
    copy!(dst.slots, src.slots)

    dst.ndel = src.ndel
    dst.count = src.count
    dst.age = src.age
    dst.idxfloor = src.idxfloor
    dst.maxprobe = src.maxprobe
    return dst
end


using BenchmarkTools
d = Dict{Int,Int}(zip(1:10^5, 1:10^5))
c = Dict{Int,Int}()
@btime copy!($c, $d)
@btime unsafe_copy!($c, $d)

prints

  2.146 ms (0 allocations: 0 bytes)
  299.572 μs (0 allocations: 0 bytes)

in Julia 1.3.

Case 2: copy!(dst::Dict{Kd}, src::Dict{Ks}) where {Ks,Kd>:Ks}

This is the case you don't need to re-compute hashes:

using Base: isslotfilled

function unsafe_copy!(dst::Dict, src::Dict)
    resize!(dst.vals, length(src.vals))
    resize!(dst.keys, length(src.keys))

    svals = src.vals
    skeys = src.keys
    dvals = dst.vals
    dkeys = dst.keys
    @inbounds for i = src.idxfloor:lastindex(svals)
        if isslotfilled(src, i)
            dvals[i] = svals[i]
            dkeys[i] = skeys[i]
        end
    end

    copy!(dst.slots, src.slots)

    dst.ndel = src.ndel
    dst.count = src.count
    dst.age = src.age
    dst.idxfloor = src.idxfloor
    dst.maxprobe = src.maxprobe
    return dst
end


using BenchmarkTools
d = Dict{Int,Int}(zip(1:10^5, 1:10^5))
c = Dict{Any,Float64}()
@btime copy!($c, $d)
@btime unsafe_copy!($c, $d)

This prints

  4.030 ms (130549 allocations: 1.99 MiB)
  1.889 ms (99489 allocations: 1.52 MiB)

in Julia 1.3.

@KristofferC KristofferC added the performance Must go faster label Dec 14, 2019
@thofma
Copy link
Contributor

thofma commented Dec 16, 2019

Does the case 2) assume that hash(convert(T, x)) == hash(x) for x::S and S <: T?

@rfourquet
Copy link
Member

Does the case 2) assume that hash(convert(T, x)) == hash(x) for x::S and S <: T?

Do you have an example where x !== convert(T, x) ? Unless S == T, T would be abstract, and I would expect convert(T, x) to just return x because x is already a T.

@thofma
Copy link
Contributor

thofma commented Dec 16, 2019

Why would T be abstract? I can define convert for any type I want and the documentation does not mention that it has to preserve hashes.

So to answer your question:

julia> struct A
       end

julia> struct B
       end

julia> Base.convert(::A, ::Type{B}) = B()

julia> a = A(); convert(a, B) !== A
true

There are also methods of the form convert(::S, ::Type{T}) for concrete S, T with S !== T in Base itself.

@rfourquet
Copy link
Member

Why would T be abstract?

When S <: T and S != T, generally T is abstract (there are probably some corner cases, I believe there is an issue where two types are not equal but represent the same set of values). If S is concrete and S != T, I think T is always abstract, but I'm not certain).

There are also methods of the form convert(::S, ::Type{T}) for concrete S, T with S !== T in Base itself.

Besides the reversed order of the argument, yes there is no problem with that. My remark was based on the S <: T relation that you mentioned.

@rfourquet
Copy link
Member

I can define convert for any type I want and the documentation does not mention that it has to preserve hashes.

Correct, for example x = rand(); hash(x) != hash(convert(Float16, x)) (most of the time).

@thofma
Copy link
Contributor

thofma commented Dec 16, 2019

Sorry, you are right.

So in the signature of the second case, either Kd === Ks or Kd is abstract and thus convert(::Type{Kd}, x::Ks) === x?

@rfourquet
Copy link
Member

So in the signature of the second case, either Kd === Ks or Kd is abstract and thus convert(::Type{Kd}, x::Ks) === x?

Yes. Even without considering the abstract/non-abstract distinction, if Kd>:Ks, then it is expected that convert(Kd, x::Ks) === x because x is already of the requested type (x isa Ks implies x isa Kd).

base/dict.jl Show resolved Hide resolved
base/dict.jl Outdated Show resolved Hide resolved
base/dict.jl Outdated Show resolved Hide resolved
test/dict.jl Outdated Show resolved Hide resolved
@tkf tkf merged commit cf123a8 into JuliaLang:master May 5, 2020
@tkf tkf deleted the inplace-copy-dict branch May 5, 2020 04:14
@Keno
Copy link
Member

Keno commented May 5, 2020

Seems to have broken CI. I'm gonna suggest a revert until this can be investigated.

Keno added a commit that referenced this pull request May 5, 2020
@tkf
Copy link
Member Author

tkf commented May 5, 2020

Sounds good to me. The CI was all green as of the last commit so I thought it was good to go.

@tkf tkf restored the inplace-copy-dict branch May 5, 2020 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants