

JavaScript Needs Blocks - wycats
http://yehudakatz.com/2012/01/10/javascript-needs-blocks/

======
munificent
> In order to have a language with return (and possibly super and other
> similar keywords) that satisfies the correspondence principle, the language
> must, like Ruby and Smalltalk before it, have a function lambda and a block
> lambda. Keywords like return always return from the function lambda, even
> inside of block lambdas nested inside.

In case you want to get your google/wikipedia on, what Katz is talking about
here is a "non-local return". Normal returns return from the immediately
enclosing (i.e. local) lambda/fn/block/thing. Non-local ones unwind past
multiple enclosing functions to cause an outer one to return.

In Smalltalk, the distinction is between methods and blocks. A return
expression always returns from the enclosing _method_ and will unwind past any
blocks that the return expression is contained in.

In Common Lisp, I believe you name functions and then indicate the name of the
enclosing function that you want to return from by doing `(return-from <fn
name> 3)`.

I thought non-local returns were a bit impure and kind of a weird language
novelty for a while. Recently, I added them to a hobby language of mine
(Finch) and then wrote some code that uses them.

Holy crap are they awesome.

Being able to early return is, I think, one of the things that's really handy
about imperative languages. I use it all the time in my code. But that
cripples your ability to define your own flow-control-like structures that
take functions.

For example, here's some code that sees if an array contains a given item
using a for loop:

    
    
        function contains(array, seek) {
          for (var i = 0; i < array.length; i++) {
            if (array[i] == seek) return true;
          }
    
          return false;
        }
    

Let's say you don't like doing an explicit C-style loop and you want to make
your own forEach() control-flow-like function:

    
    
        function forEach(array, callback) {
          for (var i = 0; i < array.length; i++) {
            callback(array[i]);
          }
        }
    

Now we try to refactor `contains` to use it:

    
    
        function contains(array, seek) {
          forEach(array, function(item) {
            if (item == seek) return true;
          });
    
          return false;
        }
    

Crap, that doesn't work. There's no easy way to make `forEach()` stop early
unless you go out of your way to make it support that by having the callback
return some magic sentinel value. Or you do something hideous like throw a
"return" exception.

Non-local returns solve this neatly and end up being pretty delightful to use
in practice.
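
To make the failure above concrete, here is a minimal sketch that reuses the `forEach` and `contains` definitions from this comment (the sample array is only an illustration). The `return true` exits the anonymous callback, `forEach` ignores that value, and `contains` always falls through to `return false`:

```javascript
function forEach(array, callback) {
  for (var i = 0; i < array.length; i++) {
    callback(array[i]);
  }
}

function contains(array, seek) {
  forEach(array, function (item) {
    // This returns from the anonymous callback only, not from contains().
    if (item == seek) return true;
  });

  // Always reached, whether or not the item was found.
  return false;
}

console.log(contains([1, 2, 3], 2)); // false, even though 2 is present
```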

~~~
ricardobeat
This problem doesn't exist:

    
    
        // returns true/false
        array.some(function(item){
          return item === seek
        })
    

Your "magic sentinel value" would be a Boolean, which is exactly how `some` is
implemented:

    
    
        function forEach(array, callback) {
          for (var i = 0; i < array.length; i++) {
            if (callback(array[i]) === true) break;
          }
        }
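
A small sketch of the short-circuiting, assuming the built-in `Array.prototype.some`; the `visited` counter is added purely for illustration and is not part of the example above:

```javascript
function contains(array, seek) {
  var visited = 0; // illustration only: counts how many items were inspected
  var found = array.some(function (item) {
    visited++;
    return item === seek; // a truthy return value stops the iteration
  });
  return { found: found, visited: visited };
}

console.log(contains([1, 2, 3, 4], 2)); // { found: true, visited: 2 }
```

Note that this only covers the case where the early exit carries a Boolean; the callback's return value is consumed by `some` itself.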

~~~
pak
Let's modify the GP's example--instead of checking if an array contains
something, he wants to return a copy where a callback is applied to each
element. Kind of like "map". However, if the array contains other arrays, he
wants to return an empty array.

No problemo in Ruby:

    
    
      def map_unless_nested(arr)
        arr.map {|v| return [] if v.is_a? Array; yield v}
      end
    
      map_unless_nested([2,3,4,5]) {|v| v+1 }
      # => [3,4,5,6]
      map_unless_nested([2,[3,4],5]) {|v| v+1 }
      # => []
    

This starts looking ugly in JavaScript if we want to generalize "map". Either
we can keep "map" clean and wrap our callback so it throws exceptions on array
input:

    
    
      function map(arr, callback) {
        var ret = [];
        for (var i = 0; i < arr.length; i++) {
          ret.push(callback(arr[i]));
        }
        return ret;
      }
      
      function map_unless_nested(arr, callback) {
        try {
          return map(arr, function(v) { 
            if (v instanceof Array) throw "nested!"; 
            return callback(v);
          });
        } catch (e) {
          if (e==="nested!") { return []; }
          throw e;
        }
      }
    
      map_unless_nested([2,3,4,5], function(v){ return v + 1; })
      // => [3,4,5,6]
      map_unless_nested([2,[3,4],5], function(v){ return v + 1; })
      // => []
    

Or, if we want to keep the callback clean, we have to start fiddling with map,
and then it's no longer really just map (it's map with halting parameters),
and then you're using special return values to inform calling functions that
the halting condition was met.

    
    
      function map_unless_test(arr, callback, test) {
        var ret = [];
        for (var i = 0; i < arr.length; i++) {
          if (test(arr[i])) { return false; }
          ret.push(callback(arr[i]));
        }
        return ret;
      }
      
      function map_unless_nested(arr, callback) {
        return map_unless_test(arr, callback, function(v) { 
          return v instanceof Array;
        }) || [];
      }
    

If you still think that exceptions are the best language feature to solve this
example, consider the situation where you want to do something with those
nested arrays. For example, let's take the Ruby example and modify it so it
will return a copy of the first-encountered innermost array with the callback
applied.

    
    
      def map_first_innermost(arr, &block)
        arr.map do |v|
          return map_first_innermost(v, &block) if v.is_a? Array
          yield v
        end
      end
    

That was easy. You can see that non-local returns become useful very quickly.

~~~
ricardobeat
They look useful from a specific mindset. How about these?

    
    
        function map_unless_nested(arr, cb){
          var res = []
          var has_nested = arr.some(function(item){ 
            if (item instanceof Array) return true
            res.push(cb(item))
          })
          return has_nested ? [] : res
        }
    
        function map_first_innermost(arr, cb){
          var res = []
          var nested
          arr.some(function(item){ 
            if (item instanceof Array) { nested = item; return true }
            res.push(cb(item))
          })
          // recurse into the nested array itself; res only holds mapped values
          return nested
            ? map_first_innermost(nested, cb)
            : res
        }

~~~
pak
Well, now you're dodging the spirit of the problem, which was to work through
a single generalization of "map" (or, as tweaked in the second try,
"map_unless_test") that is responsible for executing the callback and
collecting the results into a new array. Here you are doing the result-
collection on your own. You can't express these functions elegantly if you are
required to silo the .push(cb(item))-ing into a separate "map" or map-like
function.

That's not an outlandish requirement; imagine that the callback can be
expensive, and we would like to be able to adjust a centralized "map" function
later so it farms things out to different processes/workers/etc.

To solve this for map_first_innermost, you'd have to throw an exception with
the nested array, but this is just getting hideously ugly.

    
    
      function map_first_innermost(arr, callback) {
        try {
          return map(arr, function(v) { 
            if (v instanceof Array) throw {itWasNested: v}; 
            return callback(v);
          });
        } catch (e) {
          if (e.itWasNested) { 
            return map_first_innermost(e.itWasNested, callback); 
          }
          throw e;
        }
      }

~~~
polotek
I think you're inventing specific requirements to win a debate. So essentially
"I can imagine a scenario that would be hard for you to accomplish". Are you
saying there is no scenario you can fathom that's difficult to accomplish in
ruby? If that's the case, you should probably be lobbying to replace
javascript with ruby. Not to spend lots of time and effort and pain to turn
javascript into ruby.

~~~
pak
Of course there are scenarios that are difficult in either language. And
lobbying to replace JavaScript with Ruby (I assume you mean in web browsers?)
is simply madness for a host of reasons that aren't worth repeating. Lobbying
for blocks in JavaScript is sensible, though, since it is actually being
considered for the new spec.

It sounds like you think my requirements are contrived. If you prefer to use
functions as iterators in JavaScript, and who doesn't (unless you like
polluting methods with counter variables?) ... there is no way to have
iterators halt early without throwing exceptions or coming up with special
return values. _Every_ programmer needs to iterate and break out of loops. It
is easy to break from iterator functions in Python and Ruby, since the
languages natively support this concept. JavaScript doesn't--that's all we're
saying, and it _would be nice_.

------
oinksoft
I can't take Katz' suggestions for JavaScript seriously. He desperately wants
JS to be his blub (ruby) and doesn't seriously think about how to accomplish
his goals with what the language provides.

> There are two very common problems here. First, this has changed contexts.
> We can fix this by allowing a binding as a second parameter, but it means
> that we need to make sure that every time we refactor to a lambda we make
> sure to accept a binding parameter and pass it in. The var self = this
> pattern emerged in JavaScript primarily because of the lack of
> correspondence.

_Or_, you could use Function.prototype.bind where it's needed. Only
functions where an execution context is frequently provided should accept an
execution context (as sugar, basically), and it's generally better to assume
prudent use of bind(). But I guess I'm crazy.
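
A minimal sketch of the two options being contrasted here. `Counter` and its methods are made-up names for illustration; `Function.prototype.bind` and the optional `thisArg` parameter of `Array.prototype.forEach` are the standard built-ins:

```javascript
function Counter() {
  this.total = 0;
}

// Option 1: bind the callback where it is needed.
Counter.prototype.addAllWithBind = function (numbers) {
  numbers.forEach(function (n) {
    this.total += n; // `this` is the Counter because of bind()
  }.bind(this));
  return this.total;
};

// Option 2: the iterator accepts an execution context as a second argument.
Counter.prototype.addAllWithThisArg = function (numbers) {
  numbers.forEach(function (n) {
    this.total += n; // `this` is the Counter because of forEach's thisArg
  }, this);
  return this.total;
};

var a = new Counter();
console.log(a.addAllWithBind([1, 2, 3])); // 6

var b = new Counter();
console.log(b.addAllWithThisArg([1, 2, 3])); // 6
```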

He then proposes a wholesale change to function semantics in the language.
There are no "acrobatics" performed if you want to return a value passed to
forEach(). You just enclose a variable and assign from within the callback.
Any experienced JS dev will tell you that the disadvantage of forEach() and
relatives is _not_ that you can't easily return, but rather, that you can't
easily _break_. However, proponents of a functional style would argue that you
should be filtering your list first so that you only have to deal with
interesting values, and so that there is no need to break: If you need
`break`, use for ()!
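
Roughly what the two approaches look like side by side (a sketch; `firstNegative` and `firstNegativeWithBreak` are hypothetical names). The first encloses a variable and assigns to it from the callback, but still visits every element; the second uses a plain `for` loop so it can `break`:

```javascript
function firstNegative(numbers) {
  var result; // enclosed variable, assigned from inside the callback
  numbers.forEach(function (n) {
    // forEach cannot stop early, so guard against overwriting the first hit
    if (result === undefined && n < 0) result = n;
  });
  return result;
}

function firstNegativeWithBreak(numbers) {
  var result;
  for (var i = 0; i < numbers.length; i++) {
    if (numbers[i] < 0) {
      result = numbers[i];
      break; // stops the iteration immediately
    }
  }
  return result;
}

console.log(firstNegative([3, -1, -5])); // -1
console.log(firstNegativeWithBreak([3, -1, -5])); // -1
```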

I mean, come on, this is the same guy who wrote a reopenClass function in
Ember.js -- for a language with plainly open prototypes.

This is nothing but a post glorifying Ruby and bashing JS for not being Ruby.
There are plenty of valid nits to pick with JS, but being unlike Ruby is not
one of them.

~~~
wycats
_He then proposes a wholesale change to function semantics in the language._

Actually, I linked to a proposal by Brendan Eich, the creator of JavaScript.

_I mean, come on, this is the same guy who wrote a reopenClass function in
Ember.js_

JavaScript's open prototypes suffer from the inability to define a number of
new properties at once using an object literal. reopen (not reopenClass),
provides that functionality.

_This is nothing but a post glorifying Ruby and bashing JS for not being
Ruby_

Nope. It's a post glorifying the correspondence principle, which Smalltalk and
Lisp had before Ruby, and arguing that JS would be better with it.

~~~
oinksoft
> Actually, I linked to a proposal by Brendan Eich, the creator of JavaScript.

Naming names doesn't change the idea: It is surely a wholesale change to
function semantics in the language.

> JavaScript's open prototypes suffer from the inability to define a number of
> new properties at once using an object literal. reopen (not reopenClass),
> provides that functionality.

Sure, that's why you use a general-purpose merge function, prototypes just
being objects themselves.

    
    
      var merge = function(o, o2, force) {
          for (var p in o2)
              if (o2.hasOwnProperty(p) && (!(p in o) || force))
                  o[p] = o2[p];
          return o;
      };
    
      var C = function() { /* ... */ };  
      merge(C.prototype, {
          foo: function() { /* ... */ },
          bar: function() { /* ... */ }
      });
    

In any case a name like reopenClass() sounds very much like an attempt to make
JS smell like Ruby even if all the function does is perform a merge.

~~~
raganwald

      I can't take Katz' suggestions for JavaScript seriously.
      ...
      Naming names doesn't change the idea.
    

I’m confused! Are we judging ideas by the source or by their merits?

~~~
oinksoft
He named a name rather than challenge the suggestion ... I named a name to
show that I am noticing a common quality in Katz' recent foray into
JavaScript, that being his desire that it be like Ruby, the language for which
he's best known.

~~~
MartinMond
That's not what happened. You started out with "I can't take Katz' suggestions
for JavaScript seriously." and he pointed out that it wasn't his, but Brendan
Eich's proposal.

To which you answered "Naming names doesn't change the idea."

------
jdale27
By the way, Google seems to think that it's called _Tennent's_ Correspondence
Principle, not _Tennet's_. Might be helpful to those who, like me, hadn't
heard of it.

~~~
wycats
Fixed in the original post. Thanks for letting me know about the typo!

~~~
bodhi
It's described briefly in R. D. Tennent's book _Principles of Programming
Languages_, but the book doesn't go into much depth about the ramifications.
Did Tennent write a paper with more discussion of the principle?

(I got hold of a copy after I ran into the topic previously:
<http://blog.marcchung.com/2009/02/18/how-closures-behave-in-ruby.html>)

------
Volpe
Doesn't Coffeescript 'solve' this problem? (quotes because of the coffeescript
vs javascript debate)

The binding of this is solved with => (fat arrow), and the returning value is
resolved with the fact that both the inner function and outer function will
return their result (I think? :\) and thus the refactoring is essentially
equivalent to the ruby version?

Anyone else read this as "Ruby guy wants javascript to be more like ruby." ?

------
jason_slack
Weird question: can one download the 'source' to JavaScript, compile it, and
therefore build on top of it? I guess I have never thought about it. It is
free, but is it open too?

~~~
ZitchDog
v8: <http://code.google.com/p/v8/>

TraceMonkey: <http://hg.mozilla.org/tracemonkey>

rhino: <http://www.mozilla.org/rhino/>

IE: n/a

~~~
BrendanEich
knowtheory was right, follow the MDN SpiderMonkey links to
<http://hg.mozilla.org/mozilla-central/js/src> -- beware that
<http://hg.mozilla.org/tracemonkey> is an inactive repo.

------
sitharus
I'm confused, aren't blocks just a hack around a lack of first class functions
and/or closures?

The actual complaint seems to be that JavaScript's scoping rules are different
to his expectations and 'this' binding is, well, we know.

~~~
raganwald
No, blocks are not a hack around first-class functions. Let’s talk about “what
we’re making first-class.” A first-class function is something we both
understand: A thingy with its own variable scope, its own notion of “this,”
some parameters, and some executable code that may or may not return a result.

So what is a block? That’s easy in JavaScript. Here’s a block:

    
    
      if (foo === bar)
      // The block starts here v
      {
        return foo;
      }
      // The block ends here ^
    

Blocks are the chunks of executable code living _inside_ of a function. They
aren’t functions! They share their enclosing function’s scope. They share
their enclosing function’s notion of “this”. You can’t “return” from a block:
if you execute a return from a block, you return from the surrounding
function.

Blocks _already exist_ in JavaScript, but at the moment they only exist for
built-in keyword constructs like “if” and “for.” Yehuda is simply explaining
why it would be valuable to create a way to pass blocks to functions. The
blocks would continue to be associated with their enclosing function
invocation, unlike passing a function.

To summarize, JavaScript already has first-class functions, and it already has
blocks, this proposal concerns a way to make blocks first-class. Blocks are
not a hack around first-class functions, they’re something else that
JavaScript already has.
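
A small sketch of that difference as it stands in today's JavaScript (the object and method names are invented for illustration). Code in a plain block sees the enclosing function's `this`; code moved into a callback function gets its own:

```javascript
var obj = {
  name: "obj",

  viaBlock: function () {
    var seen;
    if (true) {
      // A plain block: same scope and same `this` as the enclosing function.
      seen = this.name;
    }
    return seen;
  },

  viaCallback: function () {
    var sameThis;
    [1].forEach(function () {
      // A function: it has its own `this` (the global object in sloppy mode,
      // undefined in strict mode), so it is not obj.
      sameThis = (this === obj);
    });
    return sameThis;
  }
};

console.log(obj.viaBlock());    // obj
console.log(obj.viaCallback()); // false
```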

~~~
oinksoft
Actually, while it's almost never used (I certainly haven't seen code using
it), blocks cooperate with labels too. The following is perfectly valid:

    
    
      var fn = function() {
          nowWeDance: {
              dance();
          }
      };
    

The block uses the same execution context and scope, of course, so it's not
useful.
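
One thing a labeled block does allow, for what it's worth, is `break label` to leave the block early. A minimal sketch (the `dance` function and its `steps` parameter are made up for illustration):

```javascript
function dance(steps) {
  var log = [];
  nowWeDance: {
    log.push("start");
    if (steps === 0) break nowWeDance; // leaves the labeled block early
    log.push("dancing");
  }
  log.push("done"); // execution continues here after the break
  return log;
}

console.log(dance(0)); // [ 'start', 'done' ]
console.log(dance(2)); // [ 'start', 'dancing', 'done' ]
```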

A block is legal alone too:

    
    
      (function() {
          {
              return 1;
          }
      })(); // 1
    

I suppose for a very long switch statement (god forbid), blocks could be
useful:

    
    
      switch (x) {
          case 1: {
              // do stuff
              break;
          }
          case 2: {
              // do stuff
              break;
          }
      }
    

But I wouldn't favor it because it makes the break seem implicit to the reader
when it in fact is not.

------
ericbb
Here's a question about blocks: should they even have return values? I was
wondering about that because (1) they are supposed to have an equivalence with
a code sequence and code sequences don't have values and (2) one awkward
difference between blocks and closures is that a closure can return a value
early whereas a block can only (as I understand) return the last value of its
body.

Edit: Also, as I think about it, one could maybe argue that closures are to
expressions as blocks are to statements. In which case, it seems that blocks
should never be bound to values but things that invoke blocks should only be
able to do so as in the Ruby 'yield' where the block is implicit. You could
then avoid the whole escaping problem.

Edit: And further, a thing which invokes a block should maybe belong to yet
another category from closures and blocks? In a sense, such a thing is
analogous to:

    
    
        for (...)
    

or

    
    
        if (...)
    

In other words, such a thing is half-a-statement. You have to append a block
to make a whole statement. :)

------
aufreak3
Here is another way to refactor that example in JS -

    
    
        // constructor for the FileInfo class
        FileInfo = function(filename) {
        
          function withFile(block) {
            var f;
            try {
              f = File.open(filename, "r");
              return block(f);
            } finally {
              if (f) f.close(); // f is still undefined if File.open threw
            }
          }
    
          this.mtime = function () {
            var mtime = withFile(File.mtime);
        
            if (mtime < new Date() - 1000) {
              return "too old";
            }
    
            sys.print(mtime);
          };
        
          this.body = function () {
            return withFile(function (f) { return f.read(); });
          };
        
        };
     
    

An advantage with this refactoring is that the file handle, which isn't
required after getting the mtime, is closed before doing other things.

------
edsrzf
Interestingly, there are places where even Ruby doesn't follow Tennent's
Correspondence Principle: the next and break keywords. They always act on the
innermost block.

------
amasad
+1 For some kind of keyword for Return-from (Although could be emulated with
exceptions).

-1 For adding block syntax.
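
For reference, a sketch of what the exception-based emulation might look like. `withReturn` and `ret` are hypothetical names, not an existing API, and this is only one possible shape for such a helper:

```javascript
// Hypothetical helper: runs body(ret). Calling ret(value) anywhere inside,
// even from nested callbacks, unwinds back here and makes withReturn
// return that value.
function withReturn(body) {
  var token = {}; // unique per call, so nested uses don't collide
  function ret(value) {
    token.value = value;
    throw token;
  }
  try {
    return body(ret);
  } catch (e) {
    if (e === token) return token.value;
    throw e; // not ours: let real errors propagate
  }
}

function contains(array, seek) {
  return withReturn(function (ret) {
    array.forEach(function (item) {
      if (item === seek) ret(true); // "returns" from contains()
    });
    return false;
  });
}

console.log(contains([1, 2, 3], 2)); // true
console.log(contains([1, 2, 3], 9)); // false
```

The obvious weakness is that any intermediate `try`/`catch` that swallows all exceptions will also swallow the token.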

~~~
MartinMond
Why? Adding a new keyword also means adding new syntax and also will break
stuff, e.g.

    
    
      var new-return-from-keyword = "test";

~~~
amasad
It's not about "breaking stuff". It's about keeping the language as small as
possible. Part of JS's success is being small as a language.

[EDIT]

For example, you could teach functions to beginner programmers once, and could
then easily introduce lambdas/callbacks/generators (and maybe block lambdas)
without introducing much new syntax.

------
ericbb
Alternate formulation with hypothetical shift/reset (delimited continuation
support) and blocks that return the same way functions do:

    
    
        mtime: function () {
          return reset {
            var mtime = shift (succeed) {
              this.withFile ({ |f|
                var mtime = this.mtimeFor (f);
                if (mtime < new Date () - 1000) {
                  return "too old";
                }
                return succeed (mtime);
              });
            };
            sys.print (mtime);
            return "young enough";
          };
        },
    

The succeed function is a first class, indefinite-extent function equivalent
to:

    
    
        function (mtime) {
          sys.print (mtime);
          return "young enough";
        }
    

It's the computation within the reset block that comes after the shift form is
evaluated.

Calling it after returning from the method would not raise an exception.

Edit: Keep blocks with 'this' inheritance (I had originally used a standard
closure).

~~~
BrendanEich
Delimited continuations are not gonna happen. See
<http://wiki.ecmascript.org/doku.php?id=strawman:shallow_continuations>,
<http://calculist.blogspot.com/2010/04/single-frame-continuations-for-harmony.html>,
and <http://calculist.org/blog/2011/12/14/why-coroutines-wont-work-on-the-web/>.

~~~
ericbb
Thanks for the links. It's neat to see such things being considered. My
feeling is that such complicated language devices are out of place in
JavaScript but it was fun to run the experiment.

While I'm thinking about it... I wonder about the choice to overload/reuse the
keyword 'return' in the blocks proposal. Any thought on using something else
to emphasize the distinction in semantics?

Edit: Never mind. That would break the Tennent Correspondence Principle,
wouldn't it? Bad idea.

------
n0thing2kn0
To play devil's advocate here... This specific example could be solved with
implicit returns in Coffeescript:

    
    
        class FileInfo
    
          constructor: (@name)->
    
          mtime: ->
            #implicit return
            @withFile (f)->
              if f.time isnt 999 then "too old"
    
          withFile: (block)->
            try
              block
                name: @name
                time: 1000
            finally
              console.log 'finished'
    
        f = new FileInfo('name')
        console.log f.mtime()
        #finished
        #too old
    

------
mattdeboard
Doesn't most of this apply to Python anonymous functions?

~~~
endlessvoid94
Technically python doesn't have closures, which is what he's talking about
here. Python has lambdas, which are a sort of crippled version of anonymous
functions.

<http://mail.python.org/pipermail/python-3000/2006-November/004395.html>

EDIT: of course the replies are correct. i suppose it was unclear what the
parent meant by "anonymous functions". if he meant lambdas, then they are
indeed closures but without the same capabilities as normal python functions.
their capabilities are unrelated to the fact that they're anonymous or
closures.

~~~
spacemanaki
Technically Python has what I like to call "Java closures" (although I admit
that is not really fair). In Python the body of an anonymous function (lambda)
can only be an expression and the bindings closed over by a named inner
function are immutable. You can get around the second limitation with a single
element list and this is the same way you get around Java's limitation that a
local variable referenced by an anonymous inner class be final.

~~~
narm60
The single element list is no longer necessary w/ Python 3's nonlocal keyword
(along w/ the old global keyword)

------
devongovett
I might be missing something, but it seems to me like @wycats' examples could
be done quite easily in JS with a few extra return statements and the use of
Function.prototype.bind. CoffeeScript handles that quite nicely with its `=>`
binding syntax and implicit returns as well if you're into that.
<https://gist.github.com/1593480>

------
ricardobeat
Why do Tennent's Correspondence and Abstraction Principles matter? That's
missing.

~~~
tomdale
From the article:

 _This is also known as the principle of abstraction, because it means that it
is easy to refactor common code into methods that take a block._

I can attest that quickly refactoring code in Ruby is much more
straightforward than in JavaScript. In practice, a small reduction in
cognitive load means that it's easy to just do it now, instead of putting it
off (to the point it never gets done).

------
wslh
JavaScript needs fibers and futures, asynchronous programming is incomplete
without that. Happily Node supports that with
<https://github.com/laverdet/node-fibers/>

------
cloudhead
You forgot haskell and erlang, they need blocks too!

------
dahlia
No. What Katz really needs is not blocks, but RAII. See Python’s with
statement or C#’s using statement.

~~~
LeafStorm
You can do RAII using `with` or `using` blocks, but you can't control
execution with those like you can with blocks - the body of the `with` or
`using` statement will always execute once. You can't prevent it from
executing (unless you throw an exception), and you can't re-execute it with
different values in the `as` clause.
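
To illustrate the kind of control meant here, a sketch using an ordinary JavaScript callback as a stand-in for a block. `withRetries` is a hypothetical helper: it decides whether the body runs at all, how many times, and with which argument, which a `with`/`using` statement cannot do:

```javascript
// Hypothetical helper: calls body(attempt) up to maxAttempts times and stops
// as soon as body returns a truthy value. Returns the attempt number that
// succeeded, or 0 if none did (including when maxAttempts is 0).
function withRetries(maxAttempts, body) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    if (body(attempt)) return attempt;
  }
  return 0;
}

var calls = [];
var succeededOn = withRetries(5, function (attempt) {
  calls.push(attempt); // re-executed with a different value each time
  return attempt === 3;
});

console.log(succeededOn); // 3
console.log(calls);       // [ 1, 2, 3 ]
```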

