fix dict `x == x` to return `missing` if `x` contains it #34809
Force-pushed from 386fcb9 to 1c8f885.
Nice, this seems closer to the kind of semantics people seem to be aiming for.
As a general note - I still find it interesting that `missing` (and `NaN`) have to get handled in generic code rather than just falling out. It also seems unfortunate that I can't create `my_missing` in a user package (or have `==` return anything other than `Bool` or `Missing`) and have it work well with the `Base` containers (when having most stuff possible in user space seemed to be a design goal of the language/standard library).
Would you recommend that we write generic code that assumes only `missing` and `NaN` are allowed to violate `==` being reflexive (and not user-defined types)?
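For readers following along, here is a minimal illustration of the two reflexivity violations mentioned above, using only `Base` functions. `isequal` is included for contrast, since it treats both values as equal to themselves.

```julia
x = missing
y = NaN

println(x == x)           # prints "missing": comparisons with missing propagate
println(y == y)           # prints "false": NaN is not == to itself
println(isequal(x, x))    # prints "true"
println(isequal(y, y))    # prints "true"

# The same behavior carries over to containers compared with ==.
println([NaN] == [NaN])                      # prints "false"
println(ismissing([missing] == [missing]))   # prints "true"
```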
base/abstractdict.jl (outdated)
@@ -474,14 +474,16 @@ function isequal(l::AbstractDict, r::AbstractDict)
 end

 function ==(l::AbstractDict, r::AbstractDict)
-    l === r && return true
+    if l === r
+        return any(ismissing, values(l)) ? missing : true
What about when `NaN` is a value of the `Dict`? Perhaps something like this:
return any(ismissing, values(l)) ? missing : !any(isnan, values(l))
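A sketch of an alternative to the `isnan` check, offered as an assumption and not as what the PR does: `isnan` has no method for non-numeric values such as strings, so the hypothetical helper `self_equal` below folds `v == v` over the values with three-valued logic instead. It covers `missing`, `NaN`, and any other value that is not `==` to itself.

```julia
# Hypothetical helper (not part of Base): the result a dict should give
# when compared with itself, decided value by value.
function self_equal(d::AbstractDict)
    acc = true
    for v in values(d)
        eq = (v == v)                 # Bool or missing
        eq === false && return false  # e.g. NaN: definitely not equal
        acc &= eq                     # true & missing === missing
    end
    return acc
end

println(self_equal(Dict("a" => 1, "b" => 2)))   # prints "true"
println(self_equal(Dict("a" => NaN)))           # prints "false"
println(self_equal(Dict("a" => missing)))       # prints "missing"
```

The early return on `false` means a definite mismatch wins over `missing`, matching the usual three-valued `&`.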
And out of curiosity, do you know if the compiler can elide the entire `any` when the `eltype` doesn't intersect `Missing` or `AbstractFloat`?
It can:
julia> f(a) = any(ismissing, a)
julia> @code_llvm f([1])
; @ REPL[2]:1 within `f'
define i8 @julia_f_17290(%jl_value_t addrspace(10)* nonnull align 16 dereferenceable(40)) {
top:
ret i8 0
}
On the other hand
julia> @code_llvm f(Dict("foo" => 1))
; @ REPL[10]:1 within `f'
define i8 @julia_f_17814(%jl_value_t addrspace(10)* nonnull align 8 dereferenceable(64)) {
top:
; ┌ @ reduce.jl:765 within `any'
%1 = call i8 @julia__any_17815(%jl_value_t addrspace(10)* nonnull %0)
; └
ret i8 %1
}
julia> using BenchmarkTools, Random  # Random provides randstring
julia> data = Dict([randstring(10) => rand(1:10) for i in 1:10^6]);
julia> @btime f(data)
8.170 ms (0 allocations: 0 bytes)
false
Ah, but we can't do this for `any(ismissing, values(d))`. Dict iteration seems to be too complex. Another reason to switch to the ordered representation perhaps.
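One way to avoid relying on the compiler here, sketched under the assumption that an explicit type check is acceptable: test whether the value type can hold `missing` at all before scanning. The helper name `has_missing` is hypothetical and not part of Base or of this PR.

```julia
# Hypothetical helper: skip the O(n) scan when the value type rules out missing.
function has_missing(d::AbstractDict)
    Missing <: valtype(d) || return false   # decided from the type alone
    return any(ismissing, values(d))        # otherwise fall back to scanning
end

println(has_missing(Dict("foo" => 1)))                                   # prints "false"
println(has_missing(Dict{String,Union{Int,Missing}}("foo" => missing)))  # prints "true"
println(has_missing(Dict{String,Any}("foo" => 1)))                       # prints "false"
```

For a `Dict{String,Int}` the first line returns without touching the data, so the cost no longer depends on how well dict iteration is optimized.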
Ok, now there's another problem. For maximum generality I'd like to just remove the `===`, but we have tests for circular dictionaries that then break 🤦‍♂️. Do we really need that? Arrays etc. stack overflow in that case.
I wouldn't care about the circular dictionaries, personally, but I dunno - maybe someone does? The `===` is not a complete protection in any case, depth two nestings give a stack overflow (I'm guessing you need to maintain a stack of identities to do it properly? We seem to do that with `IOContext`s in `show`...)
(Out of interest, I'd also mention that `===`, at least on the `keys`, is a nice shortcut for `isequal` on each key, which is used heavily as a (massive) optimization in Dictionaries.jl as it lets you both skip comparing keys as well as having to look up values by key (you get fast co-iteration of the values via vectors, so you may even get that elision of `any`, I'll have to look into that...))
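To make the "stack of identities" idea concrete, here is a rough sketch. It is an assumption about how such a check could look, not code from Base or from this PR; `cyclesafe_eq` and `seen` are illustrative names. A pair of dicts that is already being compared further up the call stack is treated as equal, which breaks the recursion.

```julia
# Hypothetical sketch: dict equality that tolerates self-referential values.
function cyclesafe_eq(l, r, seen = Tuple{UInt,UInt}[])
    # Anything that is not a pair of dicts is compared with plain ==.
    (l isa AbstractDict && r isa AbstractDict) || return l == r
    id = (objectid(l), objectid(r))
    id in seen && return true            # this pair is already on the stack
    length(l) == length(r) || return false
    push!(seen, id)
    try
        acc = true
        for (k, v) in l
            haskey(r, k) || return false
            eq = cyclesafe_eq(v, r[k], seen)
            eq === false && return false
            acc &= eq                    # keeps missing if any value gave missing
        end
        return acc
    finally
        pop!(seen)                       # leave the stack as we found it
    end
end

a = Dict{Any,Any}(); a[1] = Dict{Any,Any}(2 => a)   # depth-two cycle
b = Dict{Any,Any}(); b[1] = Dict{Any,Any}(2 => b)
println(cyclesafe_eq(a, b))   # prints "true"
```

The sketch only guards against cycles through dict values; keys and other container types are compared as usual.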
     if isa(l,IdDict) != isa(r,IdDict)
         return false
     end
     length(l) != length(r) && return false
     anymissing = false
     for pair in l
-        isin = in(pair, r, ==)
+        isin = in(pair, r)
Should the key be compared with `isequal` and value compared with `==`?
Yes, that's what this does. I just want to phase out the third argument to `in`.
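The key/value split described in this exchange can be sketched as follows. `pair_in` is a hypothetical stand-in written for illustration, not the Base method: the key lookup goes through the dict's hashing (and therefore `isequal`), while the value is compared with `==`, so the result may be `true`, `false`, or `missing`.

```julia
# Hypothetical helper: membership of a key => value pair in a dict.
function pair_in(p::Pair, d::AbstractDict)
    haskey(d, p.first) || return false   # key lookup uses isequal semantics
    return d[p.first] == p.second        # value comparison uses ==
end

println(pair_in(NaN => 1, Dict(NaN => 1)))           # prints "true": NaN key is found
println(pair_in(1 => NaN, Dict(1 => NaN)))           # prints "false": NaN value is not == itself
println(pair_in(1 => missing, Dict(1 => missing)))   # prints "missing"
```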
Oh I see, thanks.
So are we using `==` everywhere for `in`, except for `AbstractSet`, which uses `isequal`?
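A quick illustration of the distinction asked about here, using a `Vector` and a `Set` from Base. The generic `in` compares elements with `==`, while `Set` membership is hash-based and follows `isequal`.

```julia
println(NaN in [NaN])                       # prints "false": uses ==
println(NaN in Set([NaN]))                  # prints "true": uses isequal
println(ismissing(missing in [missing]))    # prints "true": == propagates missing
println(missing in Set([missing]))          # prints "true"
println(1 in [missing, 1])                  # prints "true": a definite match wins
```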
That's a good point. We could do a little better by checking whether
closes #34744 use `isequal` to compare keys in `ImmutableDict`
Force-pushed from 1c8f885 to 167ca59.
Maybe I'm missing something but why
function ==′(l::AbstractDict, r::AbstractDict)
    if isa(l,IdDict) != isa(r,IdDict)
        return false
    end
    length(l) != length(r) && return false
    acc = true
    for pair in l
        isin = in(pair, r)
        isin === false && return false
        acc &= isin
    end
    return acc
end
This way, we only require that Maybe
It's a good point, @tkf - that does seem more generic. Further curiosity: in that code snippet, will the compiler know when
closes #34744
Also use `isequal` to compare keys in `ImmutableDict`. I'm not sure why it was using `==`, but this is not exported so we might as well change it to be more similar to `Dict`.