Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

length method for iterables #13772

Closed
my-little-repository opened this issue Oct 26, 2015 · 6 comments
Closed

length method for iterables #13772

my-little-repository opened this issue Oct 26, 2015 · 6 comments

Comments

@my-little-repository
Copy link

Shouldn't there be a default length method based on start/next/done? At the moment, collect fails on iterables which do not have a length method.

Arguably, this means that collect may hang if done never returns a true value. But there are many functions that exhibit such a behavior at the moment, so this does not really worsen the matter.

@mschauer
Copy link
Contributor

The iterator protocol does not guarantee at this moment/to my knowledge that the elements and the length of the iterator does not change when julia iterates over the iterator again. So calling n = length(iter), reserving n slots and then filling the slots by restarting the iterator is undefined behaviour. So only iterators with fix length should provide length (and they do, barring #13276) , but there cannot be a general fallback counting the elements and giving useful information.

@my-little-repository
Copy link
Author

@mschauer What do you mean by "there cannot be a general fallback counting the elements and giving useful information."? Here is a general fallback counting the elements and giving useful information.

mylength(l) = mapfoldl(x->1, +, 0, l)

The following shows that it returns useful information.

julia> length(take(countfrom(), 10))
ERROR: MethodError: `length` has no method matching length(::Base.Take{Base.Count{Int64}})
julia> mylength(take(countfrom(), 10))
10

There are plenty of functions on iterators that do not rely on a length method and that actually compute the length of an iterator. It is true that such a length does not always return pertinent information on all iterators, but so it is for most methods that work on iterables without length methods.

@toivoh
Copy link
Contributor

toivoh commented Oct 26, 2015

I think it would be pretty unpleasant to call length on eg an input stream, getting a count on the avaliable input data and then finding that the data had been thrown away.

It seems better that those functions that could deal with iterables without a length method are written to do so.

@mschauer
Copy link
Contributor

@my-little-repository You can check your intuition against the following iterator:

import Base: start, eltype, next, done 
immutable BernoulliTrials
    b::Float64
end

eltype(it::BernoulliTrials) = Bool

start(it::BernoulliTrials) = rand() > it.b

function next(it::BernoulliTrials, state)
    return true, rand() > it.b
end

function done(it::BernoulliTrials, state)
    return state
end

Defining length as you like would break the collect function.

julia> collect(BernoulliTrials(0.9))
5-element Array{Bool,1}:
 true
 true
 true
 true
 true

julia> import Base.length; length(it::BernoulliTrials) = mapfoldl(x->1, +, 0, it)
length (generic function with 58 methods)

julia> collect(BernoulliTrials(0.9))
9-element Array{Bool,1}:
  true
  true
  true
 false
  true
  true
 false
 false
 false

@my-little-repository
Copy link
Author

@toivoh At the moment, collect relies on length for some types of iterables.

 julia> type Iterable x::Float64 end;
    Base.start(x::Iterable) = 0;
    Base.next(x::Iterable, state) = (0, state+1);
    Base.done(x::Iterable, state) = true;
    collect(Iterable(0))

 => 0-element Array{Any,1}

 julia> collect(enumerate(Iterable(0)))
 ERROR: MethodError: `length` has no method matching length(::Iterable)
  in collect at array.jl:253
  in collect at array.jl:272

So you suggest rewriting it so it works even if length is not defined for the given type? This seems legit but it also looks pretty difficult to find a list of all methods that should work but fail on unlengthable iterables and fix them. Providing a default length should be simpler, no?

@JeffBezanson
Copy link
Member

I agree with @mschauer. A default length that is both slow (O(n), or perhaps non-terminating) and possibly destructive is not useful.

The general problem with iterators like Enumerate is that they have a length only if their wrapped iterator does, and this is very hard to express. Adding a default length doesn't really solve this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants