
Early termination for loops and comprehensions #3158

Closed
aroben opened this issue Sep 6, 2013 · 15 comments

Comments

@aroben
Contributor

aroben commented Sep 6, 2013

It would be useful to have a syntax to terminate loops and comprehensions early. I'm imagining something like this:

numbers = [0, 1, 1, 2, 3, 5, 8, 13, 21]

# Comprehension syntax

smallNumbers = (n for n in numbers while n < 10)
# => [0, 1, 1, 2, 3, 5, 8]

# This is equivalent to:
smallNumbers = for n in numbers
  break unless n < 10
  n

smallOddNumbers = (n for n in numbers when n % 2 until n >= 10)
# => [1, 1, 3, 5]

# This is equivalent to:
smallOddNumbers = for n in numbers when n % 2
  break if n >= 10
  n

# Loop syntax

for n in numbers while n < 10
  alert "#{n} is small"

# This is equivalent to:
for n in numbers
  break unless n < 10
  alert "#{n} is small"

for n in numbers when n % 2 until n >= 10
  alert "#{n} is small and odd"

# This is equivalent to:
for n in numbers when n % 2
  break if n >= 10
  alert "#{n} is small and odd"

The while or until clause would come after any when or by clauses.

This would result in more concise code with clearer intent than manual break statements, and it wouldn't require any new keywords. I believe the syntax is also unambiguous, because it is currently an error to write for n in numbers while and for n in numbers until.
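
For comparison, here is a minimal sketch of the same semantics in Python, whose standard library already provides itertools.takewhile. Python is used only for illustration; nothing here is CoffeeScript syntax. The until form is modelled as take-while with the condition negated, and the when filter is assumed to run before the early-termination test, matching the desugared loops above.

```python
from itertools import takewhile

numbers = [0, 1, 1, 2, 3, 5, 8, 13, 21]

# `n for n in numbers while n < 10`: stop at the first n that fails the test.
small_numbers = list(takewhile(lambda n: n < 10, numbers))

# `n for n in numbers when n % 2 until n >= 10`: filter first, then stop at
# the first surviving n for which the `until` condition holds.
odd_numbers = (n for n in numbers if n % 2)
small_odd_numbers = list(takewhile(lambda n: not (n >= 10), odd_numbers))

print(small_numbers)      # [0, 1, 1, 2, 3, 5, 8]
print(small_odd_numbers)  # [1, 1, 3, 5]
```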

@aroben
Contributor Author

aroben commented Sep 6, 2013

I'd be happy to try my hand at implementing this if it seems like a good idea.

@hmaurer
Contributor

hmaurer commented Sep 10, 2013

I agree this would be a lovely addition to the current comprehensions' features.

@epidemian
Contributor

Beware that the proposed syntax is already valid syntax in some cases. For example, this currently compiles:

smallNumbers = (n for n in numbers while n < 10)
# Equivalent to:
smallNumbers = ((n for n in numbers) while n < 10)

Which evaluates to an empty array as the while body is never executed.

@aroben
Contributor Author

aroben commented Sep 10, 2013

Interesting, I didn't see that the comprehension syntax with while was valid. Loop syntax with while causes a syntax error as I mentioned in the original description.

I wonder if anyone is actually using n for n in numbers while ... though. It doesn't seem very useful since the while is actually scoped outside of the comprehension.

@epidemian
Contributor

I wonder if anyone is actually using n for n in numbers while ... though. It doesn't seem very useful since the while is actually scoped outside of the comprehension.

It's not so much about who's using it, but about consistency. n for n in numbers is an expression, and <expr> while <expr> is also a valid expression, so one would expect n for n in numbers while <expr> to be a valid expression too.

(I find it difficult to come up with a meaningful example though, mostly because of the postfix while, which I seldom use.)

@jashkenas
Owner

Cool idea, but the "manual" break syntax is a fine way to do this. Let's not add another.

@aroben
Contributor Author

aroben commented Sep 16, 2013

Thanks for considering it!

I'm interested in your thought process here. To me, adding while/until to terminate loops seems very similar to using when to filter/skip items: the latter is the equivalent of using continue, while the former is the equivalent of using break. Why does one make sense and not the other? (I'm not trying to argue, just to understand the philosophy.)
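
To make the analogy concrete, here is a rough sketch in Python (chosen only for illustration). The helper name select and its when and take_while parameters are hypothetical, not part of any library: the filter is implemented with continue and the early-termination clause with break.

```python
def select(items, when=None, take_while=None):
    """Collect items, skipping those that fail `when` and stopping
    at the first kept item that fails `take_while`."""
    result = []
    for item in items:
        if when is not None and not when(item):
            continue  # `when`: skip this item, keep looping
        if take_while is not None and not take_while(item):
            break     # proposed `while`: abandon the rest of the loop
        result.append(item)
    return result


numbers = [0, 1, 1, 2, 3, 5, 8, 13, 21]
odds = select(numbers, when=lambda n: n % 2 == 1)
small = select(numbers, take_while=lambda n: n < 10)

print(odds)   # [1, 1, 3, 5, 13, 21]
print(small)  # [0, 1, 1, 2, 3, 5, 8]
```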

@jashkenas
Owner

It does make a decent amount of sense. If you can find a way to have a single new (or perhaps reused) keyword instead of two, and no conflicts with the current valid syntax ... let us know and we'll reopen.

@aroben
Contributor Author

aroben commented Sep 24, 2013

The common terminology for this is "take while" and "take until" (used in at least Python, Haskell, C#, Scala, Groovy, and Clojure). Whatever we choose, we could probably skip the "until" variant, though it would be nice to find a syntax that worked for both.

Some options using new keywords:

smallNumbers = (n for n in numbers takewhile n < 10)
smallNumbers = (n for n in numbers takeuntil n >= 10)

smallNumbers = (n for n in numbers whilst n < 10)

smallNumbers = (n for n in numbers aslongas n < 10)

Another option is to reuse break if and break unless, but I don't think it reads as well as any of the above, and it seems almost pointless in the loop syntax since you could just add a newline before break to end up with code that works today:

smallNumbers = (n for n in numbers break unless n < 10)
smallOddNumbers = (n for n in numbers when n % 2 break if n >= 10)

for n in numbers break unless n < 10
  alert "#{n} is small"

for n in numbers when n % 2 break if n >= 10
  alert "#{n} is small and odd"

@aroben
Contributor Author

aroben commented Sep 25, 2013

I thought about this a bit more and came up with what I think is an even more useful/general feature that only requires a single new keyword: take. It lets you terminate a loop/comprehension early based on a condition, just as the above proposals do. But it also lets you limit a loop/comprehension to only a certain number of elements, which is not easily possible with the above proposals. This is something I've wanted quite a bit in my CoffeeScript usage, usually when searching through an array for an item I care about:

theItem = (item for item in items when item.foo is 'bar')[0]

This both reads a little awkwardly and is inefficient, since it continues searching through the whole array even after the one item I care about has been found.
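
As a point of comparison, this is roughly how the "find the first match and stop" pattern looks in Python, used here only as an illustration. The helper first_match and the inspected list are hypothetical names added to show that the search stops at the first hit instead of scanning every element.

```python
def first_match(items, predicate, default=None):
    # next() pulls from the generator only until the first match is produced.
    return next((item for item in items if predicate(item)), default)


inspected = []  # records which items the predicate actually looked at


def is_bar(item):
    inspected.append(item)
    return item["foo"] == "bar"


items = [{"foo": "baz"}, {"foo": "bar"}, {"foo": "bar"}, {"foo": "qux"}]
the_item = first_match(items, is_bar)

print(the_item)        # {'foo': 'bar'}
print(len(inspected))  # 2
```

Only the first two items are inspected; the filter-then-index version would have tested all four.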

The proposal

Any loop or comprehension can be suffixed with a new take clause. These clauses have two forms:

  1. take while <expr>/take until <expr>. Loop terminates once <expr> becomes false/true (respectively).
  2. take N. Loop terminates once N elements have been taken.

smallNumbers = (n for n in numbers take while n < 10)
# => [0, 1, 1, 2, 3, 5, 8]

smallOddNumbers = (n for n in numbers when n % 2 take until n >= 10)
# => [1, 1, 3, 5]

firstThreeEvenNumbers = (n for n in numbers when not (n % 2) take 3)
# => [0, 2, 8]

for n in numbers take while n < 10
  alert "#{n} is small"

for n in numbers when n % 2 take until n >= 10
  alert "#{n} is small and odd"

for n in numbers when not (n % 2) take 3
  alert "#{n} is one of the first three even numbers"

I believe this doesn't introduce any ambiguous syntax. take while is unambiguous because take cannot be used as the last word of an expression. And you can still use postfix while/until if you like (though it gets a bit crazy looking):

start = Date.now()
doSomething n for n in numbers take while n < 10 while Date.now() - start < 5000

# Equivalent to:
start = Date.now()
while Date.now() - start < 5000
  for n in numbers
    break unless n < 10
    doSomething n
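
The take N form corresponds closely to lazy slicing in other languages. Below is a small sketch in Python using itertools.islice, for illustration only; the take helper, the evens generator, and the seen list are hypothetical names. It assumes, as in the proposal, that the when filter runs before the count is applied.

```python
from itertools import islice

numbers = [0, 1, 1, 2, 3, 5, 8, 13, 21]
seen = []  # every element the loop actually visited


def evens(items):
    # Equivalent of `when not (n % 2)`, recording each element visited.
    for n in items:
        seen.append(n)
        if n % 2 == 0:
            yield n


def take(count, iterable):
    # Equivalent of `take N`: stop pulling once `count` items were produced.
    return list(islice(iterable, count))


first_three_even = take(3, evens(numbers))

print(first_three_even)  # [0, 2, 8]
print(seen)              # [0, 1, 1, 2, 3, 5, 8]
```

The loop never visits 13 or 21, which is the early termination the take clause is meant to provide.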

@davidchambers
Contributor

I like your most recent proposal, @aroben. I'd like to go through some of the CoffeeScript projects I work on to see how frequently this form is applicable.

@jashkenas
Owner

@aroben Nice work. Do you have any interest in taking a stab at this and reopening as a pull request?

@aroben
Contributor Author

aroben commented Nov 1, 2013

I'm definitely interested in implementing this.

I've been mulling over whether the order of take and when or by clauses should matter. I.e., should these two expressions be equivalent?

(n for n in numbers when n % 2 take until n >= 10)
(n for n in numbers take until n >= 10 when n % 2)

To me, even though the order is different, these both read as the when applying first, then the take. (I think it might sound different if when were filter because they would sound more like operations to be applied in order.)

But I can certainly imagine situations where you would want the take to apply before the when. Imagine you have a list of US states with their populations and average per capita incomes, and you want to answer the question "Which of the 10 most populous states have average per capita income of over $40,000?". In that case you want to sort the list by population, take 10 of them, then filter them when state.income > 40000. Reversing these operations does not give you the same result.

It kind of feels like we're building a query language, similar to .NET's LINQ.

Am I over-thinking this? In CoffeeScript today, the ordering of when and by clauses is irrelevant: by always applies first, no matter the order specified. We could just say that the overall ordering is:

  1. by
  2. when
  3. take

…no matter what order they appear in the source. But it would make it harder to ask "Which of the 10 most populous states have income over $40,000?" as I explained above.
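
A toy illustration of why the order matters for take N, sketched in Python for convenience. The records below are invented placeholders, not real state data, and the example takes 3 instead of 10 to keep it short.

```python
# Hypothetical records: (name, population, income). All figures are made up.
states = [
    ("A", 900, 35000),
    ("B", 800, 45000),
    ("C", 700, 38000),
    ("D", 100, 60000),
    ("E", 50, 52000),
]

by_population = sorted(states, key=lambda s: s[1], reverse=True)

# take, then when: "which of the 3 most populous have income over 40000?"
take_then_when = [name for name, _, income in by_population[:3] if income > 40000]

# when, then take: "the 3 most populous among those with income over 40000"
when_then_take = [name for name, _, income in by_population if income > 40000][:3]

print(take_then_when)  # ['B']
print(when_then_take)  # ['B', 'D', 'E']
```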

@xixixao
Contributor

xixixao commented Nov 18, 2013

take should always be applied second. Otherwise it would just be:

(n for n in numbers take 10) == (n for n in numbers[0...10])

And you could keep the ordering arbitrary (or not).

@aroben
Contributor Author

aroben commented Nov 18, 2013

That's true, in the take N case it isn't really useful to have take apply before when, since you could just slice the array instead.

And I think for take while/take until the order of take vs. when makes no difference. You'll end up with the same result either way.
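
This is worth checking against a concrete case. The sketch below, in Python and for illustration only, uses hypothetical helpers when_then_take and take_then_when. For the sample list in this thread the two orders agree, but they appear to diverge when an element that ends the take would have been skipped by the when filter anyway.

```python
from itertools import takewhile


def when_then_take(items, when, cond):
    # Filter first, then stop at the first kept item failing `cond`.
    return list(takewhile(cond, filter(when, items)))


def take_then_when(items, when, cond):
    # Stop at the first item failing `cond`, then filter what was taken.
    return list(filter(when, takewhile(cond, items)))


def is_odd(n):
    return n % 2 == 1


def is_small(n):
    return n < 10


fib = [0, 1, 1, 2, 3, 5, 8, 13, 21]
print(when_then_take(fib, is_odd, is_small))  # [1, 1, 3, 5]
print(take_then_when(fib, is_odd, is_small))  # [1, 1, 3, 5]

unsorted = [1, 10, 3]
print(when_then_take(unsorted, is_odd, is_small))  # [1, 3]
print(take_then_when(unsorted, is_odd, is_small))  # [1]
```

In the second list, 10 is even, so filtering first never lets it trigger the stop, while taking first ends the loop before 3 is reached.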


6 participants