[TODO] Nim now supports true lambdas; eg allows map(a~>a*localVar) (wo limitations of sugar.nim =>) #8679

Closed
wants to merge 7 commits into from


timotheecour
Member

@timotheecour timotheecour commented Aug 18, 2018

I found a clean design to support lambdas in Nim via a ~> some_expr(a)
EDIT it now works with 0 or more arguments, eg: () ~> expr ; a~>expr, (a,b)~>expr

advantages over => from sugar.nim

advantages over something like mapIt(expr(it)) or foldl(expr(a,b))

  • hardcoding it is too magical, unhygienic and less familiar than ~> (eg [1] [2])
  • hardcoding a,b in foldl is even worse, as it's not even in fold function name (the user has to know which template uses the magic variables)
  • mapIt doesn't allow nesting, eg: file.byLine.mapIt(it.split.mapIt(bar(it))) won't work
  • not clear how mapIt style could be extended for passing multiple lambdas
  • simpler and more explicit (lambda(a)) use inside the template that uses the lambda: no need for inject
  • it works with arbitrary number of arguments without magic conventions (unlike implicit magic ones in (1~>it, 2~>a,b etc) in mapIt, foldl)
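For readers unfamiliar with the conventions being criticized, here is a minimal sketch of the implicit variable names in today's `sequtils`: `mapIt` injects `it`, while `foldl` injects `a` and `b`. The input values are made up for illustration.

```nim
import std/sequtils

let nums = @[1, 2, 3, 4]

# `it` is injected by mapIt; the caller never declares it.
doAssert nums.mapIt(it * 2) == @[2, 4, 6, 8]

# `a` (accumulator) and `b` (current element) are injected by foldl.
doAssert nums.foldl(a + b) == 10
```

Nothing at the call site tells the reader which names a given template injects; that has to be looked up in the `sequtils` documentation.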

example usage

template testCallFun[T](fun: untyped, a:T): auto =
  makeLambda(fun, lambda)
  lambda(lambda(a))

doAssert testCallFun(x ~> x * 3, 2) == (2 * 3) * 3

# after lambda-ifying sort, map etc
s.sort((a,b) ~> a.score < b.score)
s.sortBy(a ~> a.score)
echo stdin.byLine.map(a ~> a.splitter.filter(b ~> b.startsWith("foo"))).toSeq
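
The implementation of `makeLambda` is not shown in this thread. As a rough illustration of the general idea only, the sketch below uses a hypothetical macro, `applyLambda`, which is not part of the PR. It receives `param ~> body` as an untyped AST and inlines the body with the parameter bound to the argument. It handles a single parameter only.

```nim
import std/macros

# Hypothetical helper (NOT the PR's makeLambda): applies `param ~> body` to `arg`
# by expanding to a block, so no proc or closure is created.
macro applyLambda(lam, arg: untyped): untyped =
  expectKind lam, nnkInfix
  doAssert lam[0].eqIdent("~>"), "expected an expression of the form `a ~> expr`"
  expectKind lam[1], nnkIdent
  let param = lam[1]   # the lambda parameter, eg `x`
  let body = lam[2]    # the lambda body, eg `x * 3`
  result = quote do:
    block:
      let `param` = `arg`
      `body`

let localVar = 5
doAssert applyLambda(x ~> x * 3, 2) == 6
# The body is type-checked only after expansion, so it can use local variables.
doAssert applyLambda(a ~> a * localVar, 2) == 10
```

Because the argument is `untyped`, `~>` never has to be declared as an operator; the macro only inspects its syntax tree.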

fixes these:

tasks for future PR's

  • allow passing a regular proc in place of a lambda expression for makeLambda
  • allow passing a compile time string for makeLambda
  • type constraint: (a:int, b) => expr(a,b)
  • find a better name for map2 ; maybe just name it map and make sure everything works as before for clients of previousmap function
  • deprecate mapIt; map2 is cleaner
  • are there still use cases for do notation given this? (eg: f2 = filter(colors) do (x: string) -> bool : x.len > 5)
  • are there still use cases for => from sugar.nim given this?
    (possible use case: returning a function pointer)

Notes

[1] #8675 (comment)

I'm not sure about the final syntax but I always thought the "it" in mapIt and the a and b in foldl just came out of nowhere.

[2] #8675 (comment)

@timotheecour timotheecour changed the title [WIP] Nim now supports lambdas; eg allows map(a=>a*localVar) [WIP] Nim now supports lambdas; eg allows map(a=>a*localVar) (wo limitations of sugar.nim =>) Aug 18, 2018
@timotheecour timotheecour changed the title [WIP] Nim now supports lambdas; eg allows map(a=>a*localVar) (wo limitations of sugar.nim =>) [WIP] Nim now supports true lambdas; eg allows map(a=>a*localVar) (wo limitations of sugar.nim =>) Aug 18, 2018
@Araq
Member

Araq commented Aug 18, 2018

I like mapIt and find this implicit assumption "I want to save typing but can't understand a single Nim-specific convention" seriously unconvincing.

@drslump
Contributor

drslump commented Aug 18, 2018

Glad you found a better solution @timotheecour!

I'm not yet familiar enough with the macro infrastructure to fully understand the changes, so I have a few doubts:

  • won't this have a very limited applicability since it can only be used with templates/macros? The familiarity of => would make that behaviour very confusing.
  • the multiple arguments version a,b => a+b will require parenthesis for the arguments, looking at the grammar rules I don't think they can be elided.

@timotheecour
Member Author

timotheecour commented Aug 18, 2018

won't this have a very limited applicability since it can only be used with templates/macros? The familiarity of => would make that behaviour very confusing.

Yes, a proc by definition won't be able to accept an (untyped) lambda, since the lambda is untyped and type inference is delayed until inside the instantiated template/macro.
That's actually a feature: it allows type inference (see the bugs this PR fixes) and zero-cost (inlining the lambda body). I wouldn't call that "very limited applicability" though, it's not actually limiting what you can do (just use a template to accept a lambda).

An enhancement could be made to let a proc accept a lambda in cases where type inference is doable at call site (as done in sugar's =>) but that's not essential IMO.

the multiple arguments version a,b => a+b will require parenthesis for the arguments, looking at the grammar rules I don't think they can be elided.

Maybe, I'd need to think more. Either (a,b)=>a+b or a,b=>a+b will be supported though. EDIT: indeed, I've updated the PR to support (a,b)=>a+b etc

@mratsim
Collaborator

mratsim commented Aug 18, 2018

Nice!

For variadic map, you can have a look into how I do it in loop-fusion: https://github.com/numforge/loop-fusion. I can update sequtils later.

Ideally those should work on all "Indexable" containers from nim-lang/RFCs#50. Unfortunately it will only work for seq/array/openarray at the moment, for truly generic maps I'm pending #7737.

@Araq

I like mapIt and find this implicit assumption "I want to save typing but can't understand a single Nim-specific convention" seriously unconvincing.

There is no convention, you have to know sequtils to understand it or a and b.
The writer of another library could use x, y, z.

You can argue that it's a small thing to learn but lots of small things add up.

@cooldome
Member

cooldome commented Aug 18, 2018

There is nothing wrong with mapIt
After using it for quite some time, I find it MUCH better than what other languages with lambdas propose. mapIt template is both efficient and crystal clear. Other languages don't have Nim's functionality to pass a piece of code as the argument directly, so you need wrap it in some silly lambda for no reason. Nim does NOT need to replicate this crap.

@zielmicha
Contributor

There is no convention, you have to know sequtils to understand it or a and b.
The writer of another library could use x, y, z.

Maybe we should add the convention to NEP-1?

(it's still sad that mapIt doesn't nest)

@Araq
Member

Araq commented Aug 18, 2018

You can argue that it's a small thing to learn but lots of small things add up.

"Nim -- how do you write mapIt today?" also adds up!

@bluenote10
Contributor

bluenote10 commented Aug 18, 2018

There is nothing wrong with mapIt

I always saw it as one of Nim's weaknesses because

  • enforcing a generic variable name reduces readability
  • the style does not scale because of lack of nesting

The latter point was always the biggest issue for me. It means that you have to switch styles when going from unnested to nested structures, which always looks inconsistent. And when working with 2-d structures like e.g. tables nested iteration isn't rare because you often operate on elements of rows/columns.

@timotheecour Does using untyped arguments for map mean that the users can't properly overload map for their own types similar to the issue with toSeq (nim-lang/RFCs#512)? If so, would it help to make s typed?

@kaushalmodi
Contributor

kaushalmodi commented Aug 20, 2018

@mratsim (#8675 (comment)), @drslump (#8675 (comment)):

I'm not sure about the final syntax but I always thought the "it" in mapIt and the a and b in foldl just came out of nowhere.

The "it" is not from "nowhere". The "it" notation comes from Common Lisp's anaphoric macros. For more, read the start of Ch. 14, Anaphoric Macros, of the book On Lisp [free pdf, html version].

In natural language, an anaphor is an expression which refers back in the conversation. The most common anaphor in English is probably “it,” as in “Get the wrench and put it on the table.”

The anaphoric it symbol has also been ported to the dash.el Emacs-Lisp library, which happens to have a --map function similar to the current mapIt. Taken from the examples in its docs, the snippet below should be easy to read even for non-Lispers:

(-map (lambda (n) (* n n)) '(1 2 3 4)) ;; normal version
(--map (* it it) '(1 2 3 4)) ;; anaphoric version
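
For comparison, a minimal sketch of the same two forms in Nim, using `map` with an explicit anonymous proc and `mapIt` with the anaphoric `it` (both from `sequtils`):

```nim
import std/sequtils

let xs = @[1, 2, 3, 4]

# normal version: explicit parameter name
doAssert xs.map(proc (n: int): int = n * n) == @[1, 4, 9, 16]

# anaphoric version: `it` is supplied by the template
doAssert xs.mapIt(it * it) == @[1, 4, 9, 16]
```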

@Araq
Member

Araq commented Aug 20, 2018

enforcing a generic variable name reduces readability

On the contrary, universally applied names like result and it increase readability. And this is all about a concise form of expression, so in practice a, b, x, y etc would be used instead which are not better names.

@dom96
Contributor

dom96 commented Aug 20, 2018

This is a really cool PR but I think that overloading =>'s meaning like this will confuse programmers.

The Anaphoric convention (as referenced by @kaushalmodi, TIL btw :)) is great. But I can't help but wonder, can we naturally extend it to more than one variable? For example, how could we define a sortIt template?

@timotheecour timotheecour changed the title [WIP] Nim now supports true lambdas; eg allows map(a=>a*localVar) (wo limitations of sugar.nim =>) Nim now supports true lambdas; eg allows map(a=>a*localVar) (wo limitations of sugar.nim =>) Aug 21, 2018
@timotheecour
Member Author

timotheecour commented Aug 21, 2018

  • I've added support for 0, 1, or more arguments, see tests in tests/stdlib/tlambda.nim
  • I've un-exported map2 in sequtils.nim to focus this PR just on lib/pure/lambda.nim and tests/stdlib/tlambda.nim

see tests/stdlib/tlambda.nim, it shows examples with 0, 1, 2, 3 arguments, examples using more than 1 lambda, examples with nested lambdas; some of these would be awkward or impossible to do with the magic mapIt(expr(it)), foldl(expr(a, b)); especially the ones with nested lambdas and multiple lambdas.

This is a really cool PR but I think that overloading =>'s meaning like this will confuse programmers.

It shouldn't conflict (please provide an example if it does). Furthermore, this PR's => should always be preferred over sugar's =>, since the latter has serious limitations, as noted under "advantages over => from sugar.nim" in the top-level message. So in a future PR we could alias sugar's => to the zero-cost lambda, deprecate it, or make it use something else, eg ==>. The preferred syntax (=>) should apply to the preferred semantics (zero-cost lambda).

/cc @bluenote10

@timotheecour Does using untyped arguments for map mean that the users can't properly overload map for their own types similar to the issue with toSeq (#7322)? If so, would it help to make s typed?

fixed; I now use typed for the 1st arg: template map2(s: typed, lambda: untyped): untyped

@GULPF
Member

GULPF commented Aug 21, 2018

If this is accepted I think sugar.=> must be deprecated, since it uses the same syntax for something different.

Just a thought: what if the makeLambda macro falls back to the it syntax? So both of these would work:

let s = [1,2,3]
echo s.map2(a => a + 1)
echo s.map2(it + 1)

This option might be less controversial, and should be possible to implement.

@timotheecour
Member Author

timotheecour commented Aug 21, 2018

note that I intend to make makeLambda work with a proc (in future PR maybe), eg:

let s = [1,2,3]
echo s.map2(a => a + 1)
echo "AbC".map2(toLower)

that's not magical since the proc would be expected to be in scope

for fallback to it syntax, it would bring back all the gotchas and edge cases about the magic it, a, b (etc?) variables:

# What if `it`, `a`, or `b` is in scope?
let it = 10 # could be from a nested lambda for example, or from a local var, eg in an import...
echo s.myfun(it) # is that `a=>a` or `a=>it` ?

# what if `it`, `a`, or `b` is not mentioned in RHS (ie, a projection)
echo s.myfun(a => c) # ok, it's projection: ignores input and returns local var `c`
echo s.myfun(c) # is that `a => 10`? `(a,b) => c` ? `() => 10` ?  impossible to tell from looking at this
echo s.myfun(a)  # is that a=>a? (a,b) => a? (a,b,c)=>a ? or maybe it=>a where `a` is a local?

# even trickier: 
echo s.myfun(a => (b => 10 + b)) # without any magic `makeLambda` overload, this is not ambiguous
# but with `it` shorthand this is:
echo s.myfun(b => 10 + b) # but that already has a different meaning...

# and it gets even trickier with UFCS

So I'd rather not conflate makeLambda with an overload that would understand magic variable names (it, a, b); that would have to be a different template (eg makeLambdaUnderstandsMagic, maybe for a transition period), and even then it should IMO be deprecated, for all the reasons already discussed. There is 0 disadvantage to using the standard a=>expr(a), and it avoids all those weird edge cases and the per-function convention (it's not a global convention: for example, there's no way to know that foldl takes a,b without already being aware of it).

@GULPF
Member

GULPF commented Aug 21, 2018

for fallback to it syntax, it would bring back all the gotchas and edge cases about the magic it, a, b (etc?) variables:

My proposal would only concern it, not any other variable name.

What if it, a, or b is in scope?

If it is in scope and needs to be referred to, then obviously the shorthand syntax can't be used. The same thing would happen if you explicitly name the lambda parameter it (it => expr), it's not really a surprising behavior. This feature would just be a shorthand for it => expr anyway.

The other issues can easily be solved by requiring that an it-lambda actually references it. So s.map2(10) wouldn't be valid for example.
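
As a small illustration of the scoping question, and assuming the current `sequtils.mapIt` (not the `map2` from this PR), an outer variable named `it` is shadowed inside the expression, so it cannot be referred to there:

```nim
import std/sequtils

let it = 10          # an unrelated outer variable that happens to be called `it`
let s = @[1, 2, 3]

# Inside mapIt, `it` refers to the current element, not to the outer `it`.
doAssert s.mapIt(it + 1) == @[2, 3, 4]

# To use the outer value, it has to be rebound under another name first.
let outer = it
doAssert s.mapIt(it + outer) == @[11, 12, 13]
```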

There is 0 disadvantage to using the standard a=>expr(a)

Changing conventions is always a disadvantage. a=>expr(a) is not the standard, it's a new proposal.

@dom96
Contributor

dom96 commented Aug 21, 2018

If this is accepted I think sugar.=> must be deprecated, since it uses the same syntax for something different.

The => has been designed as a short-hand for proc () = ... and is similar to how closures are written in other languages. It works well and it cannot be replaced by a static compile-time variant.

We should flip this the other way around. This PR should use something other than =>, since it uses the same syntax for something different.
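
To make the contrast concrete, here is a minimal sketch of sugar's `=>` used as proc shorthand. The helper `applyTwice` is hypothetical and only mirrors the shape of `testCallFun` from the top-level message; here the lambda is passed as an ordinary proc value to a proc, which an untyped compile-time lambda cannot be.

```nim
import std/sugar

# Hypothetical helper: an ordinary proc taking a proc value.
proc applyTwice(f: proc (x: int): int, x: int): int =
  f(f(x))

# The parameter type of `x` is inferred from applyTwice's signature at the call site.
doAssert applyTwice(x => x * 3, 2) == (2 * 3) * 3

# With an explicit parameter type, `=>` can also be stored in a variable.
let triple = (x: int) => x * 3
doAssert triple(4) == 12
```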

@Clyybber
Contributor

Clyybber commented Aug 21, 2018

@dom96

I think we should fix up => from sugar.nim to support the features this PR enables(if possible).

EDIT: I think the only real issue with sugar.nim's => is #7816 which unfortunately seems pretty hard to fix.
#7435 seems pretty easy to work around, and doesn't look essential for => to me.

@timotheecour
As for zero-cost inlining, can the compiler inline those procs automatically? I think we should definitely benchmark this PR to test that out.

@mratsim
Collaborator

mratsim commented Aug 21, 2018

Due to operator precedence rules I suppose it must start with =.

The 2 big languages with lots of esoteric symbols are C (>>=) and Haskell (>>= <*> >>> <$>, ...), but fortunately neither uses =>> or ==>

See C operator reference and Haskell Hoogle

Thoughts?

@timotheecour timotheecour changed the title Nim now supports true lambdas; eg allows map(a=>a*localVar) (wo limitations of sugar.nim =>) Nim now supports true lambdas; eg allows map(a~>a*localVar) (wo limitations of sugar.nim =>) Aug 24, 2018
@timotheecour
Member Author

timotheecour commented Aug 24, 2018

/cc @dom96 PTAL
I changed => to ~> which is short and sweet and not already taken (at least in stdlib)

Due to operator precedence rules I suppose it must start with =.

actually, no, it must be an arrow operator, defined here: https://github.com/nim-lang/Nim/blob/devel/doc/manual.rst#L573
Note: #8759 is affecting it in some cases (it also affects =>...), but I'm assuming it'll be fixed at some point. No big deal, one can always use parens explicitly in this corner case.

I think we should fix up => from sugar.nim to support the features this PR enables(if possible).

Looks like we need to keep =>, which fits a different need: this PR's ~> returns a zero-cost lambda (with delayed type inference), whereas => returns a function pointer (not zero cost, and no delayed type inference). There are use cases for both I suppose, although ~> should now be the preferred choice in most cases IMO.

let's delay new features to future PR's (eg as I listed in top-level PR msg)

@dom96
Contributor

dom96 commented Aug 24, 2018

That's much better, but sadly there are still a couple of things that I dislike with this approach. If we can fix these then there is a chance I would accept this (please note that this isn't a guarantee that it will be accepted, you need to convince @Araq and the implementation needs to be sound).

My main problem is that defining the map2 template isn't elegant. There is a lot of subtle detail in the template. I'm not sure if this is possible but the ideal implementation would allow something like this:

template map[T](s: openarray[T], ctl: static[CompileTimeLambda[T, T]]): seq[T] =
  result = @[]
  for i in s:
    result.add ctl(i)

Where CompileTimeLambda takes a variable number of type arguments, the last of which is the return type of the lambda. I.e. CompileTimeLambda[void] is equivalent to taking zero arguments and returning void, and CompileTimeLambda[int, string, int] is equivalent to taking an int and a string parameter and returning int.

This sounds doable as long as:

  • Overloading isn't a problem
  • We can live with a possibly really complex macro that transforms a template into whatever needs to be done

But the benefits are huge and would make writing operations this way really easy.

@timotheecour
Member Author

timotheecour commented Aug 24, 2018

/cc @dom96 @Clyybber @skellock
I really don't see how your approach with CompileTimeLambda could work with today's compiler.
anything except untyped will have to typecheck at lambda declaration time (ie at the point where user writes a~>b) instead of using delayed type-checking, ruining the whole idea (and making it inferior to mapIt): most of the points marked as "fixes these" in top-level PR would not be fixed under your proposal.

when defined(case1):
  type CompileTimeLambda[T]=object
  template foo[T](expr: CompileTimeLambda[T]): untyped = discard
when defined(case2):
  type CompileTimeLambda[T]=object
  template foo[T](expr: static[CompileTimeLambda[T]]): untyped = discard
when defined(case3):
  type CompileTimeLambda[T]=concept T
  template foo[T](expr: static[CompileTimeLambda]): untyped = discard
when defined(case4):
  type CompileTimeLambda=concept b
  template foo(expr: static[CompileTimeLambda]): untyped = discard
when defined(case5):
  type CompileTimeLambda=concept b
  template foo(expr: CompileTimeLambda): untyped = discard

# Error: undeclared identifier: '~>'
foo(a~>a)

if you have a concrete counter-proposal to this PR, please show me some sample code (I'd be very happy to be proven wrong!)
Note that mapIt is already using untyped, so this PR is in 0 ways worse than what we have today, and has all the improvements that I described in top-level PR post.

Note: in D, they solve your concern (allowing type constraints to be specified) via template constraints; this is not currently possible in Nim, but see my RFC for that: [1]

@Araq
Member

Araq commented Oct 11, 2018

Sorry, but no. The last thing Nim needs is yet-another-way to write lambdas.

@Araq Araq closed this Oct 11, 2018
@timotheecour timotheecour changed the title Nim now supports true lambdas; eg allows map(a~>a*localVar) (wo limitations of sugar.nim =>) [TODO] Nim now supports true lambdas; eg allows map(a~>a*localVar) (wo limitations of sugar.nim =>) Oct 11, 2018