Improve inferability of unique #20317
Is it worth having a helper function that takes the iterator state, |
I compared the performance of this PR vs. such an approach (`unique_recursive`) in the following case: `N = 1000`,
`A = [ones(Int,N);ones(N);im*ones(Int,N);ones(BigInt,N);trues(N);ones(BigFloat,N);ones(String,N)]`, and I see
I'm not too convinced that the difference is going to be that significant in general, but if anyone has a strong opinion in favor of the "split" approach I can consider it.
If that's all there is to gain it's probably not worth the extra complication. Probably the code generated for BTW, I think you should have done |
Might be worth benchmarking for small |
@@ -212,6 +212,9 @@ u = unique([1,1,2])
@test length(u) == 2
@test unique(iseven, [5,1,8,9,3,4,10,7,2,6]) == [5,8]
@test unique(n->n % 3, [5,1,8,9,3,4,10,7,2,6]) == [5,1,9]
# issue 20105
@test @inferred(unique(x for x in 1:1)) == [1]
@test unique(x for x in Any[1,1.0])::Vector{Real} == [1]
Perhaps also worth testing the expected output types? Best!
Do you mean to also type assert the first one?
More so the second. Did the `::Vector{Real}` appear in the original? If so, apologies, I missed it. Best!
Yes, it's been there all along, but don't sweat it! ;)
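For readers following the exchange above: the `::Vector{Real}` assertion matters because `==` on arrays compares values only and ignores the element type. A minimal sketch of the difference, using two hypothetical helper functions (`ints` and `reals` are illustrative names, not part of this PR):

```julia
using Test

# Hypothetical stand-ins for a call whose result type we want to pin down.
ints() = [1]          # returns Vector{Int}
reals() = Real[1]     # returns Vector{Real}

# `==` compares element values only, so this passes despite differing eltypes.
@test ints() == reals()

# The type assertion checks the container type before the comparison runs.
@test reals()::Vector{Real} == [1]

# Vector{Int} is not a subtype of Vector{Real}, so the assertion throws.
@test_throws TypeError ints()::Vector{Real}
```

Without the assertion, a regression that returned a narrower or wider vector type would go unnoticed as long as the values matched.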
OK, here are more timings
I still think that this is not much worse and is not worth the extra code complexity.
The best case for the recursive implementation is probably |
OK, found a slightly more compact way to take the loop out than in the gist above and made that change here accordingly.
base/set.jl
Outdated
if !done(itr, i)
    x, i = next(itr, i)
end
return unique(itr, out, seen, x, i)
Are you attempting to inline the body of `_unique` specialized for `T` here?
Yeah
base/set.jl
Outdated
    return unique(itr, out, seen, x, i)
end

@inline unique(itr, out, seen, x, i) = _unique(itr, out, seen, x, i)
We probably don't want to define this signature under the same public name?
Right. Fixed.
base/set.jl
Outdated
if !done(itr, i)
    x, i = next(itr, i)
end
return _unique(itr, out, seen, x, i)
IIRC from earlier comments, you wish to inline the body of `unique_from` here. But IIUC, this construction will only inline the body of `_unique` here?
That was my thinking, but somehow I got the idea from a question I asked @yuyichao that the body of `unique_from` would also be inlined. Maybe I assumed it would be the case and it is not.
There was a typo. What I meant was that you should make `unique_from` always inline and use a wrapper to pick the specific signature that will actually inline the wrapper.
Yeah, I thought of that, but I believed what you wrote wasn't a typo. Thanks for clarifying. I'll fix this.
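The pattern discussed in this thread (an always-inlined loop body plus a separate wrapper used when the element type has to widen) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the names `sketch_unique`, `sketch_unique_from`, and `_sketch_unique_from` are hypothetical, the wrapper is marked `@noinline` as one possible reading of the suggestion, the empty case returning `Any[]` is a simplification, and the code uses the current `iterate` protocol rather than the `start`/`next`/`done` API shown in the hunks.

```julia
# Hypothetical wrapper: called only when the element type must widen, so each
# widened (out, seen) pair is handled by its own specialization.
@noinline _sketch_unique_from(itr, out, seen, state) =
    sketch_unique_from(itr, out, seen, state)

# Hypothetical always-inlined loop body, specialized on the element type T.
@inline function sketch_unique_from(itr, out::Vector{T}, seen::Set{T}, state) where {T}
    next = iterate(itr, state)
    while next !== nothing
        x, state = next
        if !(x isa T)
            # Element does not fit: widen both containers with typejoin and
            # continue through the wrapper with the new element type R.
            R = typejoin(typeof(x), T)
            seenR = Set{R}(seen)
            outR = Vector{R}(out)
            if !(x in seenR)
                push!(seenR, x)
                push!(outR, x)
            end
            return _sketch_unique_from(itr, outR, seenR, state)
        end
        if !(x in seen)
            push!(seen, x)
            push!(out, x)
        end
        next = iterate(itr, state)
    end
    return out
end

# Hypothetical entry point: seed the containers with the first element's type.
function sketch_unique(itr)
    next = iterate(itr)
    next === nothing && return Any[]   # assumption: simplest choice for empty input
    x, state = next
    S = typeof(x)
    return sketch_unique_from(itr, S[x], Set{S}((x,)), state)
end
```

With this sketch, `sketch_unique([1, 1, 2])` stays a `Vector{Int}`, while `sketch_unique(x for x in Any[1, 1.0])` widens to `Vector{Real}` on the second element, mirroring the test added in this PR.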
Force-pushed from b73129f to c3dc349.
If no one objects, will merge once CI passes.
base/set.jl
Outdated
x, i = next(itr, i)
if !isleaftype(T)
    S = typeof(x)
    return _unique_from(itr, S[x], push!(Set{S}(), x), i)
Can `push!(Set{S}(), x)` simplify to `Set{S}(x)`?
Not really. The `Set` constructor is defined for iterables, so that won't work if `x` is not iterable. But `Set{S}((x,))` should work.
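A small illustration of the point about the `Set` constructor, using a string as the element because it is itself iterable (the variable names are only for this example):

```julia
x = "ab"                       # a single element that happens to be iterable

a = push!(Set{String}(), x)    # the form used in the hunk
b = Set{String}((x,))          # wrapping in a 1-tuple: exactly one element

# Passing `x` directly makes the constructor iterate it, so a string
# contributes its characters instead of itself.
c = Set{Char}(x)

println(a == b)      # true
println(length(c))   # 2
```

So even when `x` is iterable, `Set{S}(x)` would not generally produce the one-element set that `push!(Set{S}(), x)` and `Set{S}((x,))` do.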
Looks great! Thanks Pablo! :)
#20317 improved inference of unique, but problematic cases still arise for containers with known but abstract eltypes. Here, we short-circuit the `typejoin` when the return type is determined by the element type of the input container. For `unique(f, itr)`, this commit also allows the caller to supply `seen::Set` to circumvent the inference challenges.
Fixes #20105 and supersedes #20106
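As a rough illustration of the two ideas in the follow-up commit message above: when the input declares its element type, the output container can be typed from `eltype` directly with no widening step, and a by-key variant can take a caller-supplied `seen` set. Both functions below are hypothetical sketches (`sketch_unique_known_eltype` and `sketch_unique_by` are illustrative names and signatures), not the code from that commit.

```julia
# Hypothetical: only handles inputs whose element type is known up front.
function sketch_unique_known_eltype(itr)
    Base.IteratorEltype(itr) isa Base.HasEltype ||
        throw(ArgumentError("element type is not known up front"))
    T = eltype(itr)            # may be abstract, e.g. Real
    out = Vector{T}()
    seen = Set{T}()
    for x in itr
        if !(x in seen)
            push!(seen, x)
            push!(out, x)
        end
    end
    return out
end

# Hypothetical by-key variant: the caller provides the typed `seen` set,
# so the key type does not have to be inferred from `f`.
function sketch_unique_by(f, itr, seen::Set)
    out = Vector{eltype(itr)}()
    for x in itr
        y = f(x)
        if !(y in seen)
            push!(seen, y)
            push!(out, x)
        end
    end
    return out
end
```

For example, `sketch_unique_known_eltype(Real[1, 1.0, 2])` returns a `Vector{Real}`, and `sketch_unique_by(iseven, [5,1,8,9,3,4,10,7,2,6], Set{Bool}())` keeps the first odd and the first even element.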