Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "procedures": functions that are guaranteed to not return a value #2726

Closed
wants to merge 2 commits into from
Closed

Add "procedures": functions that are guaranteed to not return a value #2726

wants to merge 2 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Feb 26, 2013

"Procedures" are functions that do not implicitly return a value:

test = ->> 'hi'
test() # undefined

They can, however, manually return a value. Trying to return a value from a procedure will throw a SyntaxError, as it does if you try to return a value from a constructor:

test = ->> return 'hi'
# SyntaxError: In j.coffee, cannot return a value from a constructor or procedure.

Procedures are created with the ->> glyph. Like regular functions, they can be bound to the current "this" context, using the =>> glyph.

I figured I'd get the ball rolling on this by putting together some code. It was a relatively simple change - just adding in the additional symbols, and having the Code node set the @noReturn flag to true if the tag is procedure or boundprocedure. Tests are included, and it passes all current ones.

I know this has been brought up before (specifically in #899, #1812, #1836, #1855,...), but perhaps with a concrete implementation, this could actually go somewhere.

Thoughts?

"Procedures" are functions that do not implicitly return a value:

    coffee> test = ->> 'hi'
    [Function]
    coffee> test()
    undefined

They can, however, manually return a value:

    coffee> test = ->> return 'hi'
    [Function]
    coffee> test()
    'hi'

Procedures are created with the '->>' glyph. Like regular functions,
they can be bound to the current "this" context, using the '=>>' glyph.
@ghost
Copy link
Author

ghost commented Feb 26, 2013

Another solution, which I'm now partial to, would be to outright ban return statements that pass back a value from procedures - in other words, procedures would be guaranteed to return undefined.

Rationale

Coder Context Switches and Asynchronous Functions

Asynchronous functions that operate with the standard callback mechanic should not themselves return a value - the passing back of data should be handled exclusively through the specified callback. Yet, with the current state of the language, ensuring this requires a context switch / backtracking when writing a function.

Consider:

readFiles = (directory, cb) ->
  fs.readdir directory, (err, files) ->
    if err?
      cb err
    else
      output = {}
      for file in files then do (file) ->
        fs.readFile path.join(directory, file), 'utf8', (err, data) ->
          if err?
            cb err
          else
            output[file] = data
            files.pop()
            cb undefined, output if files.length is 0

  return

The return at the end of the function sticks out like a sore thumb. Writing it requires placing the return first, then backing up and filling in the function body, or, at the end, finding the indentation level of the top function, moving back to it, and slotting in the last statement. Either way, the developer has to switch contexts in their train of thought.

As it is, also, the callbacks to the fs functions will also be returning implicit values - including an inadvertent loop comprehension. Preventing this requires even further complexity.

readFiles = (directory, cb) ->
  fs.readdir directory, (err, files) ->
    if err?
      cb err
      return
    else
      output = {}
      for file in files then do (file) ->
        fs.readFile path.join(directory, file), 'utf8', (err, data) ->
          if err?
            cb err
            return
          else
            output[file] = data
            files.pop()
            cb undefined, output if files.length is 0
            return
      return
  return

More backing up, more complexity, and more painful indentation tracking. All those return statements end up littering the code and making it harder to read. Procedures make this far more explicit, and far less complex:

readFiles = (directory, cb) ->>
  fs.readdir directory, (err, files) ->>
    if err?
      cb err
    else
      output = {}
      for file in files then do (file) ->>
        fs.readFile path.join(directory, file), 'utf8', (err, data) ->>
          if err?
            cb err
          else
            output[file] = data
            files.pop()
            cb undefined, output if files.length is 0

Combining the callbacks and the return statements into one line reduces complexity even further.

readFiles = (directory, cb) ->>
  fs.readdir directory, (err, files) ->>
    (cb err; return) if err?
    output = {}
    for file in files then do (file) ->>
      fs.readFile path.join(directory, file), 'utf8', (err, data) ->>
        (cb err; return) if err?
        output[file] = data
        files.pop()
        cb undefined, output if files.length is 0

This way, return is only used for early termination.

Readability

Currently, the same can be accomplished by adding a return, null, undefined, etc. to the end of a regular function. Doing this, however, requires reading through the entirety of a function to determine that it does not, in fact, return a value. Furthermore, the existence of a such a statement at the end of a function does not guarantee that the function will not return a value: there could be another code path elsewhere that still results in an output.

The addition of a separate symbol allows for the intention of a function to be enforced at its definition. return statements paired with a value inside these "procedures" would throw a syntax error; because of this, the presence of a ->> or =>> symbol at the beginning of a function would ensure its behavior. This guards against both inadvertent return values and spaghettification as a code base grows in complexity - documentation can get out of date; later people working with the code could add in return values where none were meant to be.

Existence within the CoffeeScript Design Philosophy

An oft-espoused principle of implicit returns is that they force one to consider good return values when writing a function. I believe that the addition of first-class procedures is complementary to, rather than contradictory of, this design tenet.

The choice between the two types of function glyphs forces the consideration of the return type and purpose of a function from the very beginning. Rather than simply turning off implicit returns, these procedures are guaranteed not to return a value, in a way far more explicit than the addition of a return statement at a function's end. The decision of whether or not to return a value is just as important as that of what type of value should be returned.

Avoidance of Inadvertent Implicit Returns

Compatibility with Testing Frameworks

Several testing frameworks (and other libraries, for that matter) are sensitive to the return types of functions - see vows.

vows = require 'vows'
assert = require 'assert'

vows.describe('Example Case').addBatch
  'example topic':
    topic: -> someAsyncFunction this.callback
    'returns true': (topic) -> assert.equal topic, true

In vows, "topics" for tests can be calculated asynchronously; this relies, however, on a function returning undefined. If the topic function does not return undefined, it is assumed to be synchronous - and thus, the callback value is silently ignored. This can lead to frustrating-to-debug issues cropping up: in the case above, if someAsyncFunction happens to return a value, the test will break.

The addition of a separate symbol for procedures prevents such problems from occurring in the first place, as the decision of whether or not a function should return a value is handled explicitly at its definition.

Performance

It's been said before: the way loop comprehensions are done means that the usage of loops in functions can result in the unintentional construction of arrays.

unlinkFiles = (files) -> fs.unlinkSync file for file in files
# returns an array - most likely not what was intentioned

unlinkFiles = (files) ->
  fs.unlinkSync file for file in files
  return
# does not return an array
# intention to do so is specified indirectly, rather than directly
# cannot be a one-liner, unless semi-colons are used

unlinkFiles = (files) ->> fs.unlinkSync file for file in files
# clearly denoted as a procedural function
# remains a one-liner

API Design

The explicit demarcation of functions as procedural prevents the leakage of internal values, which could lead to unintended usage / breakage as internal implementations change and evolve.


I'll work on an implementation of this (banning return statements with values inside procedures) later today. (This has been implemented.) For now, thoughts?

@shesek
Copy link

shesek commented Feb 26, 2013

They can, however, manually return a value

Why give that option? If someone wants to return a value, he can just use a regular function. If I see ->>, I'd rather know that its guaranteed not to return anything.

@ghost
Copy link
Author

ghost commented Feb 26, 2013

@shesek See my second comment in the issue - this is exactly the direction I'm heading in with this.

@shesek
Copy link

shesek commented Feb 26, 2013

Ah, right - "Rather than simply turning off implicit returns, these procedures are guaranteed not to return a value". sorry, I missed that :)

@epidemian
Copy link
Contributor

Yay!

I must say that i'm not strongly in favour or against having a syntax for procedures. But after that awesome rationale of yours, i agree: it can make the language much better for some of people (maybe lots of people, how knows? =P). You've got my vote 👍

@epidemian
Copy link
Contributor

About the and return construct, instead of

cb err and return if err?

You can write:

(cb err; return) if err?

Maybe not as English-like readable as the first snippet; but it's an option to avoid some of the nesting.

@goto-bus-stop
Copy link

+1 for proposal: CS has needed something like this for a long time. Coco's !-> auto-return suppressing is super handy, but this is much more explicit in its behavior.
+1 for syntax: ->> still looks like a function, but is clearly different and can (therefore) be quickly and easily distinguished from ->

@epidemian
Copy link
Contributor

+1 for syntax: ->> still looks like a function, but is clearly different and can (therefore) be quickly and easily distinguished from ->

->> strikes a bit of an uncanny valley on me. The choice of -> for function is inspired, i think, on the mathematical notation for functions, e.g: f: ℂ → ℝ. So when i see a -> i'm thinking in terms of "take this and transform it into that". ->> looks very similar to ->, but it is semantically very different, i can no longer think of it as a transformation of values.

Anyway, it's just a very minor peeve. !-> (or /-> or -/> or whathaveyou) has that little connotation of "not" which would make it more adequate to indicating "this isn't a transformation; it doesn't return anything", but i think i would prefer ->> over !-> in everyday usage as it's much easier to type 😊

Generalize the routine used to block constructors from returning values
to forbid the same in all Code blocks marned as `@noReturn`.

Throw a SyntaxError in this case.
@ghost
Copy link
Author

ghost commented Feb 26, 2013

I've changed it a bit so that procedures are forbidden from returning a value; doing so will now throw a SyntaxError. I generalized the logic used to prevent constructors from returning values to apply to all Code blocks marked as @noReturn (as these are the only two cases in which that flag is used).

@epidemian I agree that ->> is easier to type than !->, etc, though I'd be willing to sacrifice it if it becomes an issue. The >> in ->>, for me, resembles a "fast forward" symbol, implying motion; the two > signs follow each other, in the same way that a procedure is a series of steps executed one-after-the-other. But that's all totally subjective, of course.

@mehcode
Copy link

mehcode commented Feb 27, 2013

I'd vote for !-> or -/> if we're going to have this syntax.

@epidemian
Copy link
Contributor

The >> in ->>, for me, resembles a "fast forward" symbol

– We should use procedures here.
– Why?
– They are much faster than functions. I read it from one of the CoffeeScript developers.

xD

But yeah, i played a little bit in the text editor and i'm quite convinced that ->> is the best option.

We kinda need the - to also have it's fat form, =. So changing that i think is out of question.

And the > is aesthetically quite important, as it stands out a lot compared to other symbols and i think that's good because very often the function or procedure token is the only thing indicating "this is a separate piece of code!". I tried many other combinations; one that was interesting to me was -: and =:, but i think it doesn't stand out as much:

imAProcedure = -:
  console.log "but i hate my glyph :("

And having more than 2 different (non-alphanumeric) symbols for the same token would make it very inconvenient in my opinion. I think the use of procedures could turn out to be very common (e.g. in all kinds of unit testing code, callbacks, etc), so i'd prefer their symbol to be as keyboard friendly as possible.

@shesek
Copy link

shesek commented Feb 27, 2013

Coming to think about this a bit more...

return cb err if err? is quite commonly used [1] in callbacks, which are great candidates for using procedures. If this won't be possible, you'll have to wrap every callback that delegates its error with an if..else, which is somewhat cumbersome. Even if #2729 makes it in, it doesn't make much sense to forbid regular returns and allow postfix returns.

Also, what about the more general case of using return in procedures to stop the function execution? Maybe only return without a value can be allowed? With value-less returns you still couldn't do return cb err if err?, but returning the return value of the callback is admittedly somewhat hacky and not quite right (tho I personally still do it), so it wouldn't be that much of a loss.

[1] 355 CoffeeScript repositories according to https://github.com/search?q=%22return+cb+err+if+err%22&ref=commandbar&type=Code

@ghost
Copy link
Author

ghost commented Feb 27, 2013

Only returns with values are forbidden inside procedures at the moment, so value-less returns (for flow control) still work as usual.

If #2729 doesn't make it in, you could still do, as you mentioned before:

(cb err; return) if err?

inside a procedure, and that would work.

@jashkenas
Copy link
Owner

I'm afraid I'm going to have to close this for the same reasons as before -- It's a great PR, and a great example implementation, but the reasoning for the current function options haven't changed. Ideally every function returns a meaningful value -- if it's asynchronous, maybe a promise -- even if that value is just true or null, or this for chaining. And if you really don't want to return a value, the nicest place to put that information is at the end of the function, where the value normally comes out. Anyhow, it's all been said before ;)

@jashkenas jashkenas closed this Feb 28, 2013
@ghost
Copy link
Author

ghost commented Feb 28, 2013

Aw, shucks. Was worth a go!

@davidchambers
Copy link
Contributor

I certainly appreciate the time and thought you put into this pull request, @mintplant. For what it's worth, I find your points compelling. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants