Add a default implementation of length using iterate. #35947
I don't think |
I agree with @tkf; I think this violates the spirit of |
Yeah - we could do with a trait to figure out if I'm assuming I should focus efforts on #35946 until then? |
I don't think there is a way to implement
julia> using Base.Broadcast: broadcasted, instantiate
julia> itr = (y ? missing : x for (x, y) in instantiate(broadcasted(tuple, 1:1000, broadcasted(rand, Bool))));
julia> Base.IteratorSize(itr)
Base.HasShape{1}()
julia> xs = skipmissing(itr);
julia> count(_ -> true, xs)
512
julia> count(_ -> true, xs)
470 |
@tkf Your example is that of a random iterator. You can't do anything reliable with it! 🤣 I would expect
Sorry @JeffBezanson I didn't quite understand this. Don't we use
I guess that's my point - I'm not sure I understand the spirit; I always assumed it was linked to the |
Is the argument that sometimes for some stateful iterators you may wish to "peek" at the length without popping everything off? |
Or is it this: If the |
This PR can introduce a segfault in programs that previously threw an error safely. For example:
function maptoints(f, xs)
    ys = Vector{Int}(undef, length(xs))
    for (i, x) in enumerate(xs)
        @inbounds ys[i] = f(x)
    end
    return ys
end
The tension here is |
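One way to keep a function like `maptoints` safe regardless of how `length` ends up behaving is to branch on `IteratorSize`. The following is only a sketch: `maptoints_safe` is a hypothetical name, and it assumes that preallocation is worthwhile only when the iterator advertises `HasLength` or `HasShape`.

```julia
# Hypothetical safe variant of `maptoints` (not from the PR).
# It preallocates only when the iterator reports a reliable size and
# falls back to `push!` otherwise, so no `@inbounds` write can overrun.
function maptoints_safe(f, xs)
    sz = Base.IteratorSize(typeof(xs))
    if sz isa Union{Base.HasLength, Base.HasShape}
        ys = Vector{Int}(undef, length(xs))
        for (i, x) in enumerate(xs)
            ys[i] = f(x)  # bounds-checked on purpose
        end
        return ys
    elseif sz isa Base.IsInfinite
        throw(ArgumentError("cannot collect an infinite iterator"))
    else
        ys = Int[]
        for x in xs
            push!(ys, f(x))  # grows as needed; no length required
        end
        return ys
    end
end

maptoints_safe(abs, Iterators.filter(isodd, -5:5))  # SizeUnknown branch
maptoints_safe(abs, -2:2)                           # HasShape branch
```

The cost is a second code path, but an iterator that yields a different number of items on each pass can no longer write out of bounds.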
Yes, the idea is that there are some iterators you should not call |
Maybe it's reasonable to have
nitems(x) = haslength(x) ? length(x) : count(_ -> true, x)
in It may be useful to have a common interface you can overload when you have a better implementation than |
Ref. #35530 for sneaky bugs that can come up if the |
Thinking about this more, I think it makes sense to have
nitems(x::Reverse) = nitems(x.itr) # no need to go backward
nitems(x::Generator) = nitems(x.iter) # no need to evaluate `f`
nitems(x::Accumulate) = nitems(x.itr) # ditto
nitems(x::Flatten) = sum(nitems, x.it) # inner iterators may have nice `nitems`
and so on. |
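To make the idea concrete, here is a minimal runnable sketch of the generic fallback plus two of the specializations. `nitems` is the hypothetical name from this thread and is not defined in Base; `haslen`, `double`, and `calls` are illustration-only helpers, and the `init` keyword of `sum` assumes Julia 1.6 or later.

```julia
# `nitems` is the hypothetical function discussed above; it is not in Base.
# `haslen` is a local helper mirroring the `haslength` check in the comment.
haslen(x) = Base.IteratorSize(x) isa Union{Base.HasLength, Base.HasShape}

# Generic fallback: O(1) when the size is known, otherwise count by iterating.
nitems(x) = haslen(x) ? length(x) : count(_ -> true, x)

# Specializations that skip unnecessary work.
nitems(x::Base.Generator) = nitems(x.iter)                  # never calls `x.f`
nitems(x::Iterators.Flatten) = sum(nitems, x.it; init = 0)  # reuse inner `nitems`

# Demonstration: the mapped function is not evaluated while counting.
calls = Ref(0)
double(x) = (calls[] += 1; 2x)
gen = (double(x) for x in Iterators.filter(isodd, 1:9))

n = nitems(gen)        # counts the underlying filter, not the generator
ncalls = calls[]       # stays at zero because `double` was never invoked

flat = Iterators.flatten((1:3, Iterators.filter(isodd, 1:5)))
m = nitems(flat)       # 3 items from the range plus 3 odd numbers
```

Note that the fallback still consumes stateful iterators when it has to count, so callers would need to treat `nitems` as potentially destructive for those.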
Yes, something along the lines of Can we just use
EDIT: What I like is that the "count" verb gives the connotation that it might literally iterate through and count the items, as opposed to |
I can see that |
Thanks, I couldn't quite recall the reasoning. @tkf wrote
Haha. Do we need to rename |
Actually, my comment was misleading as the reason why we can't use
Oops :) Though that comment was in a different context. It was rather a byproduct that it was possible to get the curried form Anyway, I think currying is rather orthogonal to the current issue because we can't break how |
True. Damn - naming things is hard. It seems that a lot of languages use A quick trip to the thesaurus only left me with
I note that so does |
Totally agree... Thanks for the survey. Yeah, it's a bit unfortunate that what we have is incompatible with other languages. Looking at this,
Two-argument case as well, I think. |
Yes - good point! |
Another
lastitem(xs) = foldl(right, xs)
right(_, x) = x
We can't use Like
lastitem(x::AbstractArray) = last(x)
lastitem(x::Reverse) = firstitem(x.itr)
lastitem(x::Filter) = firstitem(filter(x.flt, reverse(x.itr)))
lastitem(x::Generator) = x.f(lastitem(x.iter))
lastitem(x::Flatten) = lastitem(lastitem(x.it))
and so on. It'd also be better to have I started to wonder if it makes sense to consistently use |
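As a quick check that the `foldl`-based fallback behaves as described, here is a small sketch. `lastitem` and `right` are the hypothetical names from the comment above, not Base functions, and only the generic method and the `AbstractArray` specialization are included.

```julia
# `lastitem` and `right` are hypothetical names from this thread.
right(_, x) = x                        # keep the right-hand operand

# Generic fallback: fold over the whole iterator, keeping the latest item.
# This is O(n) and throws on an empty iterator, because `foldl` has no `init`.
lastitem(xs) = foldl(right, xs)

# Arrays can answer in O(1).
lastitem(x::AbstractArray) = last(x)

a = lastitem(Iterators.filter(isodd, 1:10))         # walks the filter
b = lastitem(skipmissing([1, missing, 3, missing])) # skips trailing missing
c = lastitem([10, 20, 30])                          # dispatches to `last`
```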
Here is an alternative to #35946.
For all iterables with SizeUnknown we default to using iterate to get the length when requested. Algorithms can still take care to check IteratorSize on unknown iterables in order to know if the function is O(1) or not.
The particular (new) behavior I am seeking is:
but I've had to write this definition elsewhere before so I thought it made sense in Base. I also believe we should tie length to the iterate API, generically.