
Resolving and Binding - tu7001
http://www.craftinginterpreters.com/resolving-and-binding.html
======
kaosjester
As someone with years of experience writing interpreters, this seems a rather
sloppy solution to a straightforward problem. There are two canonical
solutions, and he uses neither:

1. The first canonical solution, which the author even describes, is
persistent environments. Unfortunately, this requires passing the current
environment as part of the recursive pattern, which means heavily modifying
the existing code. He doesn't do this because it would require revisiting too
much.

2. The second canonical solution, which you would find in any modern
compiler, is to uniquely rename all the variables. His "resolver" traversal
has ample opportunity to do this, and would provide a far cleaner solution
(with less space overhead than his expr/scope depth lookup table).

Instead, the author creates a stack of environments and annotates variables
with information akin to de Bruijn indices _for scopes_. Compared to the
alternatives above, this is over-engineered and inelegant, and it complicates
reasoning about scopes far beyond what's necessary.
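For concreteness, the renaming pass described in point 2 can be sketched in a
few lines of Python (the AST shape and all names here are hypothetical, not
anything from the book):

```python
# Hypothetical sketch of solution 2: a resolver pass that renames every
# declaration to a globally unique name, so no expr/scope-depth lookup
# table is needed at runtime. AST nodes are plain tuples.
import itertools

_fresh = itertools.count()

def resolve(node, scope):
    """Return a copy of `node` with every variable uniquely renamed."""
    tag = node[0]
    if tag == "lit":                      # ("lit", value)
        return node
    if tag == "ref":                      # ("ref", name)
        return ("ref", scope[node[1]])
    if tag == "var":                      # ("var", name, initializer)
        _, name, init = node
        init = resolve(init, scope)       # initializer sees the old binding
        unique = f"{name}#{next(_fresh)}"
        scope[name] = unique
        return ("var", unique, init)
    if tag == "block":                    # ("block", [statements])
        child = dict(scope)               # copy: inner declarations stay inner
        return ("block", [resolve(s, child) for s in node[1]])
    raise ValueError(f"unknown node: {tag}")
```

Because the initializer is resolved before the new name is bound, `var a = a;`
inside a block would refer to the outer `a` under this sketch, which is
exactly the semantic question the quoted assertion is about.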

As an aside, this assertion is also completely absurd:

> Shadowing is rare and often an error so initializing a shadowing variable
> based on the value of the shadowed one seems unilkely [sic] to be
> deliberate.

~~~
munificent
> Unfortunately, this requires passing the current environment as part of the
> recursive pattern, which means heavily modifying the existing code. He
> doesn't do this because it would require revisiting too much.

Right. When I was first writing the code for this interpreter, I implemented
it using persistent environments (basically the typical Scheme association
list approach).
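For readers who haven't seen it, the association-list approach amounts to
something like this (a hypothetical Python sketch, not the book's code):

```python
# Persistent environments as immutable association lists: defining a
# variable conses a new (name, value, rest) cell onto the front, so
# older environments are never mutated and a closure can safely keep a
# reference to the environment that was current at its creation.

def define(env, name, value):
    """Return a new environment extending `env`; `env` is untouched."""
    return (name, value, env)

def lookup(env, name):
    while env is not None:
        n, v, rest = env
        if n == name:
            return v
        env = rest
    raise NameError(name)

globals_env = define(None, "a", "global")
locals_env = define(globals_env, "a", "shadow")
```

`lookup(locals_env, "a")` finds the shadowing binding while `globals_env`
still sees the original, with no copying and no mutation.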

It worked, but it had some strikes against it:

1. The persistent approach is not right for global variables, which _are_
dynamically bound. So I ended up needing two Environment classes, a Map-based
one for the global scope, and then a persistent one for locals. That, of
course, also requires an interface so that most code can work polymorphically
with both types.

2. The Interpreter class and its visit methods are introduced several
chapters before Environment. So all of those preceding visit() methods would
have to be redone to take an extra environment parameter, and that parameter
would have to be threaded through some other methods as well.

Storing the current environment in a field helped, but the code for updating
that field still looked grungy to me. With a book, every bit of extra
boilerplate feels _really_ heavy and I try to keep the code clean and simple.

3. It becomes really unclear _why_ we want persistent local environments.
Since we would have to introduce them well before the point in the book where
closures can actually cause problems without them, it ended up feeling like
the code was poorly motivated.

If I did persistent local scopes when locals are first introduced, there's no
way to show the reader a sample program that would go wrong without them: we
don't have functions, function calls, or closures yet.

The current organization lets the reader go down a naive obvious-seeming path
(and, better, one that reuses all the code we already need for global scopes)
and then lets them viscerally experience the problem caused by thinking about
blocks as a single scope. Then once they feel that pain, they get the
solution.

There are some benefits to the current approach:

1. It gave me an opportunity to show an example of semantic analysis, and of
adding another pass to a compiler. Those are generally useful techniques, and
I think it's worth walking readers through one.

2. It lets the reader cover more ground. When we get to the second
interpreter, it takes a different approach to local variables. Variables are
resolved during parsing and stored directly on the interpreter's stack, and
accessed by index. The block scopes are discarded and have no runtime
representation.
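In miniature, that second approach looks something like this (hypothetical
Python, not the second interpreter's actual C):

```python
# Compile-time resolution to stack slots: the compiler keeps a list of
# locals that mirrors the runtime stack, so a variable reference becomes
# a plain integer index and block scopes vanish entirely at runtime.

class Compiler:
    def __init__(self):
        self.locals = []   # (name, scope_depth), parallel to the VM stack
        self.depth = 0

    def begin_scope(self):
        self.depth += 1

    def end_scope(self):
        self.depth -= 1
        # Pop the scope's locals; nothing about the block survives.
        while self.locals and self.locals[-1][1] > self.depth:
            self.locals.pop()

    def declare(self, name):
        self.locals.append((name, self.depth))
        return len(self.locals) - 1    # the new variable's stack slot

    def resolve(self, name):
        # Walk backward so the innermost (shadowing) declaration wins.
        for i in range(len(self.locals) - 1, -1, -1):
            if self.locals[i][0] == name:
                return i
        return -1                      # not a local; fall back to globals
```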

I've tried to add other differences between the two interpreters too, just to
reduce the amount of redundancy between them. For example, the Java one lexes
the entire file to a list of tokens while the C one lexes on demand, driven by
the parser. The Java one is an AST walker, the C one compiles to bytecode,
etc.

Of course, if you are an experienced language hacker, some of the stuff in
the Java interpreter may seem weird because it's not the "normal" way to do
things. (Though, for what it's worth, I have seen plenty of interpreters that
_do_ create hashtable-based environments for each lexical scope.)
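That hashtable-per-scope representation looks roughly like this in Python,
mirroring the Java interpreter's Environment class (a map per scope plus a
pointer to the enclosing one; `get_at` is the ancestor walk that the
resolver's depth annotation enables). The code itself is an illustrative
sketch, not the book's:

```python
# Hashtable-per-scope environments: each scope is a dict chained to its
# enclosing scope. With the resolver's depth annotation, a lookup walks
# a known number of links instead of searching every scope by name.

class Environment:
    def __init__(self, enclosing=None):
        self.values = {}
        self.enclosing = enclosing

    def define(self, name, value):
        self.values[name] = value

    def get_at(self, distance, name):
        env = self
        for _ in range(distance):      # hop out `distance` scopes
            env = env.enclosing
        return env.values[name]

globals_env = Environment()
globals_env.define("a", "global")
block_env = Environment(globals_env)
block_env.define("a", "shadow")
```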

I hope that seems reasonable. The way environments are represented in the Java
interpreter was the most difficult design decision I made. I went back and
forth on it a lot and I'm still not certain I made the right choice. But,
ultimately, if the book is ever going to exist, I had to just pick and move
forward.

> 2. The second canonical solution, which you would find in any modern
> compiler, is to uniquely rename all the variables. His "resolver" traversal
> has ample opportunity to do this, and would provide a far cleaner solution
> (with less space overhead than his expr/scope depth lookup table).

I considered adding an exercise to do effectively that, but I felt like it
might be reaching a little too far for a first-time language implementer.

> As an aside, this assertion is also completely absurd:
>
> > Shadowing is rare and often an error so initializing a shadowing
> > variable based on the value of the shadowed one seems unilkely [sic]
> > to be deliberate.

What is absurd about it? Did I not word this well? The point I was trying to
get across is that code like this is not common:

    
    
        var a = "global";
    
        fun foo() {
          // Initialize a same-named local variable based on the global.
          var a = a;
        }
    

I can't recall ever seeing code like this in the wild, and if I ever saw it in
a code review, I would certainly tell the author to rename the local variable.

Given that, it seems reasonable to me to _not_ take the approach of deferring
putting the local in scope until after its initializer has run.

Note that the above (or equivalent) code is a compile error in Java and C#. In
JavaScript, using "let", it's an error. In C, it accesses the uninitialized
new variable (!).
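The equivalent in Python is a runtime error too: assigning to a name anywhere
in a function makes that name local throughout the function, so the
initializer's read fails.

```python
a = "global"

def foo():
    # 'a' is local to foo because it is assigned here, so the read on
    # the right-hand side raises UnboundLocalError when foo() is called.
    a = a
```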

~~~
chubot
Honest question: why have local scopes at all? IIRC, Lox is a dynamic
object-oriented language.

Python has a flat scope per function, and closures aren't all that common in
idiomatic Python code. Objects are much more common.

JavaScript also had a flat scope per function until ES6. On the other hand,
closures are very idiomatic in JavaScript, and objects are a little weak.

Looking at the previous chapter [1], from someone familiar with OOP, the
makeCounter and Point examples just seem to be awkward ways of writing
classes, no?

Particularly for an educational language, why have two ways of doing the same
thing?

[1]
[http://www.craftinginterpreters.com/functions.html](http://www.craftinginterpreters.com/functions.html)

~~~
munificent
> Python has a flat scope per function, and closures aren't all that common in
> idiomatic Python code.

Closures are common in "idiomatic" Lox code. Or, at least, I want Lox to be a
language that doesn't have pitfalls around using closures and local variables.

Because Lox is syntactically C-ish, I also think it's important that its
scoping rules roughly follow C and friends.

> JavaScript also had a flat scope per function until ES6.

Right, which is evidence that _not_ having local scopes was a mistake. Adding
"let" was a very big deal and wouldn't have been done unless the semantics of
"var" really were not what users wanted.

I talk a little more about function scope in the context of implicit
declaration here:

[http://www.craftinginterpreters.com/statements-and-state.html#design-note](http://www.craftinginterpreters.com/statements-and-state.html#design-note)

> Looking at the previous chapter [1], from someone familiar with OOP, the
> makeCounter and Point examples just seem to be awkward ways of writing
> classes, no?
>
> Particularly for an educational language, why have two ways of doing the
> same thing?

I think it's possible to go too far down the rabbit hole when deciding two
things are the "same". Since you can implement integers using functions [1]
why have both numbers and functions? You can implement control flow using
closures and dynamic dispatch, so why have "if" [2]?

The pragmatic answers are:

1. Users probably don't think of functions and classes as "the same" even
though only one is required for Turing completeness.

2. Most interpreters do not implement one in terms of the other, so having
both lets us show the implementation techniques that are unique to each.

[1]:
[https://wiki.haskell.org/Peano_numbers](https://wiki.haskell.org/Peano_numbers)
[2]:
[https://www.gnu.org/software/smalltalk/manual/html_node/Conditions.html](https://www.gnu.org/software/smalltalk/manual/html_node/Conditions.html)

~~~
chubot
Are there any examples in the book of closures that couldn't be better written
as classes? (Or perhaps you disagree about the Counter/Point examples.)

I'm not sure I agree about "let" being evidence of local scopes being
necessary. I believe the primary problem with "var" is the "hoisting"
behavior, i.e. roughly speaking code "later" can affect variable references
now.

In Python this isn't as big of a deal, because you get an UnboundLocalError
for code like this:

    
    
        x = 1
        def f():
          print(x)  # raises UnboundLocalError: the assignment below makes x local
          x = 2
          print(x)
    

So instead of a silent bug, you'll get a crash, which is easy to debug. In
JavaScript I believe similar examples will lead to subtle bugs, which "let"
fixes. Also in JS "x = 1" is global but "var x = 1" is local.

-----

"let" also added local scopes, but I don't believe it proves that you needed
them. Anecdotally, coming from C/C++ to Python, I've never missed local
scopes, simply because I keep my functions small.

Mild tangent: Here is something I realized recently about the perennial large
functions vs. small functions debate: It depends on whether you're a C
programmer or not.

You can find some documents by John Carmack and Jon Blow arguing for large
functions. That's because they are C programmers, and in C there is a very
large cost to creating a function: reasoning about ownership/memory management
(and the lack of ability to return non-primitives without out-params). It's
easier to keep things all in one function, especially if you're only calling
the function in one place.

In Java, JavaScript, and Python, there is no such cost. Garbage collection
takes care of it for you. You can return as big an object as you like. And
indeed the idiomatic style in those languages is small functions.

So I claim that local scopes are much less necessary in Java, JavaScript, and
Python. C doesn't have true functions, which makes local scopes quite useful.

-----

I understand your point about going overboard making things the same (there
was an old Paul Graham post pondering getting rid of integers in his defunct
Lisp dialect ARC).

But I think it's a different story for classes and closures. There is
obviously an efficiency problem with integers as functions, just like there is
an efficiency problem in Haskell with strings being lists of characters. In
engineering you obviously care about efficiency.

I know this is programming language heresy, but I honestly don't see a real
need for closures if you have classes. (And if you have the Dart-like or Wren-
like constructor initialization shortcuts for classes, which I plan to add to
the language I'm designing.)

I'm sure I'm biased because I'm coming from C and Python, and neither really
has closures. (Python didn't have the ability to mutate variables in an
enclosing scope until the 'nonlocal' keyword was introduced several years ago
-- and I've never seen it used in the wild.)

JavaScript is the only language I've really used with closures and I don't use
it that often (and Scheme in college, but that doesn't really count). I just looked
through some JS code of mine, and I pretty much always have an explicit model
of the page state and pass it explicitly using a dependency injection style.
With closures, state is implicit. I don't like the fact that your outer
function can have 8 locals but only 2 of them are captured, and you have to
search up to see which ones they are.

There's one use of closures for an onclick handler, but I would solve that
with __call__ in Python (C++ also has this as "functors", but Java doesn't).

-----

I also asked a similar question about prototypes vs. classes here [1], which
was a pretty good discussion.

I liked the Wren design because it's like JavaScript/Lua but with classes
instead of prototypes.

And I liked your red function/blue function post about async, which is
another important use of closures. And there was a recent Ryan Dahl interview
[2] where he admitted that the Go style was better for servers, and conceded
the "callback pyramid" problem with closures.

-----

So what's left for closures then? If one agrees that Counter/Point are
naturally classes (which you might not), if you want to make an explicit model
of state in GUI code (e.g. Elm advocates this strongly), and if you believe in
the Go-style async is better, then what are some natural uses of closures?
This is an honest question -- as I said I could be biased coming from
languages that don't have them.

If closures just "fell out of" implementing classes, I would probably
implement them in the language I'm designing. But this chapter shows that
there are some non-trivial issues so I'd be inclined to leave them out.

My pet theory is that the industry learned how to use classes "correctly"
around 2005 or so. From perhaps 1995 to 2005, you were more likely than not to
encounter a mess. (Although Go might be late to the party [3].)

This is another contrarian opinion, but I actually think classes relate more
strongly to functional programming than closures (though closures have more of
a historical relation). Classes are more rigorous about state (random local
vars aren't captured), and functional programming is also about being rigorous
about state. I use classes but I think of it like functional-programming-in-
the-large [4]. There is a false dichotomy between FP and OOP -- the modern
styles of both are converging (explicit state params, dependency injection).

Sorry for the long post -- tl;dr I would like to see some examples of code in
the book which are more natural for closures than classes :)

tl;dr #2 -- If you have a short syntax for initializing class members for
constructor params, and if you have a way of bridging classes and functions,
like __call__/operator(), -- then I claim you don't really need closures.

[1]
[https://news.ycombinator.com/item?id=14415431](https://news.ycombinator.com/item?id=14415431)

[2]
[https://news.ycombinator.com/item?id=15140669](https://news.ycombinator.com/item?id=15140669)

[3]
[https://news.ycombinator.com/item?id=14523728](https://news.ycombinator.com/item?id=14523728)

[4]
[https://news.ycombinator.com/item?id=11841893](https://news.ycombinator.com/item?id=11841893)

~~~
munificent
> Are there any examples of the book of closures that couldn't be better
> written as classes?

The book itself doesn't really have a lot of "representative" Lox code since
it's so focused on implementing the interpreters themselves. But certainly, in
other languages with first-class functions, callbacks are very common and
would be painfully cumbersome to do with classes. Note, for example, that Java
has long supported anonymous inner classes, but still added lambdas later
because the former were so annoying to use in many cases.

> Anecdotally, coming from C/C++ to Python, I've never missed local scopes,
> because I simply keep my functions small.

C and C++ have local scopes. Would you expect this to _not_ have an error?

    
    
        #include <stdio.h>
    
        int main(void) {
          for (int i = 0; i < 10; i++) {
            // ...
          }
          printf("%d\n", i); // <-- ?
        }
    

> in C there is a very large cost to creating a function: reasoning about
> ownership/memory management (and the lack of ability to return non-
> primitives without out-params.)

There is also the overhead of the call itself. In the kind of performance-
critical code often written in C/C++, that can matter too. Good compilers will
inline when it makes sense, but those heuristics aren't perfect.

> I know this is programming language heresy, but I honestly don't see a real
> need for closures if you have classes.

You don't need them, but once you get used to them, they sure are handy.
Implicitly closing over locals in the surrounding scope is a little magical,
but it's a really convenient kind of magic that generally seems to help more
than it harms.

All the world's Smalltalk, Scheme, C#, Ruby, Dart, Scala, Swift, JavaScript,
Kotlin etc. programmers probably aren't wrong in liking them. (Although, as a
language designer, it's of course fun and potentially rewarding to
deliberately try to get _off_ the beaten path.)

> if you believe in the Go-style async is better, then what are some natural
> uses of closures? This is an honest question -- as I said I could be biased
> coming from languages that don't have them.

The bread and butter use cases I see are:

1. Modifying collections. map(), filter(), etc. are so much clearer and more
declarative than imperatively transforming a collection.

2. Callbacks for event handlers or the command pattern. (If you're using a
framework that isn't event based, this may not come up much.)

3. Wrapping up a bundle of code so that you can defer it, conditionally
execute it, execute it in a certain context, or do stuff before and after it.
Python's context managers handle much of this for you, but then _that's_
another language feature you have to explicitly add.
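Each of those three, sketched briefly in Python (the helper names are made up
for illustration):

```python
# 1. Transforming collections declaratively instead of imperatively.
evens = list(filter(lambda n: n % 2 == 0, range(10)))
doubled = list(map(lambda n: n * 2, evens))          # [0, 4, 8, 12, 16]

# 2. An event-handler callback that closes over local state.
def make_click_handler(label):
    clicks = []
    def on_click():
        clicks.append(label)     # captured from the enclosing scope
        return len(clicks)
    return on_click

# 3. Wrapping a bundle of code so it can be deferred and run in a
#    measured context.
def timed(thunk):
    import time
    start = time.perf_counter()
    result = thunk()
    return result, time.perf_counter() - start

handler = make_click_handler("save")
```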

~~~
chubot
OK thanks for the answer.

I totally agree with first class functions, and I probably agree with the
Python-style ability to read outer variables (especially in the case that the
inner call doesn't survive longer than the outer call).

What I don't agree with is capturing locals to rebind them -- this is the
explicit vs. implicit state argument.

So for #1 map/filter, I don't really see this as a use case for closures. It's
more about function literals and first-class functions.

#2 I am on the fence about... I would be interested in examples. Like I said,
with my somewhat limited JS experience, I understand why people like them, but
I think you can do OK without mutating the surrounding scope. There's a
distinction between calling a mutating method on an object in the surrounding
scope, and actually rebinding something in the surrounding scope.
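That distinction is easy to see in Python, where mutation needs no
declaration but rebinding does (a small illustrative sketch):

```python
# Mutating an object from the enclosing scope: allowed, because the
# inner function only *reads* the variable 'items'.
def outer_mutate():
    items = []
    def add():
        items.append(1)
    add()
    return items

# Rebinding the enclosing variable itself: requires an explicit
# 'nonlocal' declaration, or the assignment creates a new local.
def outer_rebind():
    total = 0
    def bump():
        nonlocal total
        total += 1
    bump()
    return total
```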

#3 might be convincing although I would need examples. The Go-style defer is
scope-based which seems more limited than general closures. Python's context
managers are sort of a syntactic sugar/protocol around using certain kinds of
classes -- much in the same way that iterators are.

The other problem with closures is that they're not really one language
feature -- they differ across languages. There is more than one issue with
capture, like the one with loops you mentioned in another comment. And C++
now has closures, but there are a few different options with regard to
lvalues and rvalues that I don't remember at the moment. It doesn't feel that
solid to me, but I'll continue to play with it, and this chapter and some of
the comments will certainly help.

On the one hand, it seems like a ton of code has been written in C++, Java,
and Python without closures. On the other hand, C++ and Java both added some
closure-like features in the last decade, and Python added nonlocal, so that's
probably a trend. (But like I said I've never actually seen anyone use
nonlocal, it feels like something done "for completeness" rather than based on
actual usage.)

------
falsedan
I like that the author ran out of steam for whimsical allegory & contrived
examples after a page or two, and the rest of the chapter reads very tightly &
well.

Also nice to see another explanation of lexical scoping, to complement my
understanding of perl's pads[0].

0:
[http://hop.perl.plover.com/book/pdf/03CachingAndMemoization.pdf](http://hop.perl.plover.com/book/pdf/03CachingAndMemoization.pdf)
pp65-80

