Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: redesign comprehensions #14782

Closed
wants to merge 10 commits into from
Closed

Conversation

JeffBezanson
Copy link
Member

This lowers [ x for i in y ] to collect(Generator(i->x, y)) (which currently just calls map), and continues @mbauman 's generator syntax (x for i in y), which lowers to just the Generator part. Dict comprehensions work by calling Dict on a Generator.

So far only 1-d comprehensions are implemented, but they perform as well as they used to (this is based on jb/functions). To get that I had to modify map to iterate over its argument instead of indexing, which is significantly faster for ranges. Enabling bounds-check-elim for ranges might fix this.

My current inclination is to keep a simple Generator or MapIterator that just applies a function to one iterator, and have a separate type for N-d.

This approach keeps the current parsing of comprehensions and uses no macros. I also kept the existing implementation of typed comprehensions, since there's nothing wrong with it. That can easily be changed though.

To do:

  • N-d comprehensions (will probably use product iterator plus some special-case methods)
  • Define Dict(::Generator) using the type accumulation behavior of map
  • Return Array{Union{}} from map in the empty case

Would fix: #4470 #7258 #14786
Makes it easy to implement #3166 #550 #4867
References: #11398 #11034

@carnaval
Copy link
Contributor

glorious !

for the n-d case I think I remember fixing bugs in the lowering while doing #11398 so it may also be a good time to port those to the scheme code. IIRC they were all related to using a "colon index" in the iteration spec

edit: nvm seems we got rid of this feature

@anthonyclays
Copy link
Contributor

Can we also get #550 (filters in 1D comprehensions, like (x for x in xs if pred(x)))? Seems like a generator would be perfect for this.
Maybe this could be implemented using a FilteredGenerator type?

@mbauman
Copy link
Member

mbauman commented Jan 24, 2016

Very cool! I love how jb/functions makes these things fast enough that comprehensions can just be a sugar for collecting generators. That does make a good case for multidimensional generators, which (as I recall), makes the syntax a little tougher.

@anthonyclays
Copy link
Contributor

This would almost entirely supersede the sum(f::Function, itr) functionality (also for prod, minimum and maximum, any and all etc.)

It may also be a good idea to revisit #11750. In many use cases for any / all, a comprehension is far cleaner than the function-as-first-argument syntax, for example

julia> any(x % 7 == 0 || x % 11 == 0 for x in 1:6)
WARNING: 
    Using non-boolean collections with any(itr) is deprecated, use reduce(|, itr) instead.
    If you are using any(map(f, itr)) or any([f(x) for x in itr]), use any(f, itr) instead.
 in depwarn at ./deprecated.jl:73
 in nonboolean_any at ./deprecated.jl:816
 in any at ./reduce.jl:363
 in eval at ./boot.jl:267
while loading no file, in expression starting on line 0
false

vs the same thing with lambdas

any(x -> x % 7 == 0 || x % 11 == 0, 1:6)

or even

any(1:6) do x
    x % 7 == 0 || x % 11 == 0
end

I'm in favor of dropping the deprecation warning. Instead we should throw a TypeError when the iterator yields a non-boolean. This is consistent with the current behaviour when a non-boolean is used in the condition of an if or while statement: ERROR: TypeError: non-boolean (<Type>) used in boolean context. Essentially, it makes any(itr) behave like any(identity, itr).

@JeffBezanson
Copy link
Member Author

@anthonyclays +1 sounds good to me.

@JeffBezanson JeffBezanson force-pushed the jb/functions branch 2 times, most recently from 45ab442 to 3661f16 Compare January 25, 2016 22:09
@StefanKarpinski
Copy link
Member

One issue is that our sum function does pairwise summation, not straight linear summation. I'm not sure that can be supported with a generator – it might have to be something like a lazy collection instead so that the values can be accessed out of order.

@JeffBezanson JeffBezanson force-pushed the jb/functions branch 2 times, most recently from 828bfd2 to e7050ad Compare January 26, 2016 04:26
@JeffBezanson JeffBezanson force-pushed the jb/comprehensions branch 2 times, most recently from 5592013 to 216d19a Compare January 26, 2016 15:52
@mschauer
Copy link
Contributor

With the n-d case in mind, wouldn't it be better to introduce a new name for the collect-function of the generator, for example tabulate. collect is expected to produce a one-dimensional object, but the result for more-dimensional comprehensions are arrays. Then length and collect can be seen as linear versions of size and tabulate.

@JeffBezanson JeffBezanson force-pushed the jb/functions branch 2 times, most recently from 6b04549 to 12fbb1a Compare January 27, 2016 20:21
blakejohnson and others added 2 commits January 28, 2016 09:53
blakejohnson and others added 7 commits January 28, 2016 21:19
WIP: redesign closures, then generic functions
fix #4470 Generalized comprehension syntax
fix #7258 type-inference-independent comprehensions

[ci skip]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants