Hacker News new | past | comments | ask | show | jobs | submit login

The issue is not single-pass vs multi-pass. It is instead, what constitutes a compilation unit, i.e., a pass over what?

Clojure, like many Lisps before it, does not have a strong notion of a compilation unit. Lisps were designed to receive a set of interactions/forms via a REPL, not to compile files/modules/programs etc. This means you can build up a Lisp program interactively in very small pieces, switching between namespaces as you go, etc. It is a very valuable part of the Lisp programming experience. It implies that you can stream fragments of Lisp programs as small as a single form over sockets, and have them be compiled and evaluated as they arrive. It implies that you can define a macro and immediately have the compiler incorporate it in the compilation of the next form, or evaluate some small section of an otherwise broken file. Etc, etc. That "joke from the 1980's" still has legs, and can enable things large-unit/multi-unit compilers cannot. FWIW, Clojure's compiler is two-pass, but the units are tiny (top-level forms).

What Yegge is really asking for is multi-unit (and larger unit) compilation for circular reference, whereby one unit can refer to another, and vice versa, and the compilation of both units will leave hanging some references that can only be resolved after consideration of the other, and tying things together in a subsequent 'pass'. What would constitute such a unit in Clojure? Should Clojure start requiring files and defining semantics for them? (it does not now)

Forward reference need not require multi-pass nor compilation units. Common Lisp allows references to undeclared and undefined things, and generates runtime errors should they not be defined by then. Clojure could have taken the same approach. The tradeoffs with that are as follows:

1) less help at compilation time 2) interning clashes

While #1 is arguably the fundamental dynamic language tradeoff, there is no doubt that this checking is convenient and useful. Clojure supports 'declare' so you are not forced to define your functions in any particular order.

#2 is the devil in the details. Clojure, like Common Lisp, is designed to be compiled, and does not in general look things up by name at runtime. (You can of course design fast languages that look things up, as do good Smalltalk implementations, but remember these languages focus on dealing with dictionary-carrying objects, Lisps do not). So, both Clojure and CL reify names into things whose addresses can be bound in the compiled code (symbols for CL, vars for Clojure). These reified things are 'interned', such that any reference to the same name refers to the same object, and thus compilation can proceed referring to things whose values are not yet defined.

But, what should happen here, when the compiler has never before seen bar?

    (defn foo [] (bar))
or in CL:

    (defun foo () (bar))
CL happily compiles it, and if bar is never defined, a runtime error will occur. Ok, but, what reified thing (symbol) did it use for bar during compilation? The symbol it interned when the form was read. So, what happens when you get the runtime error and realize that bar is defined in another package you forgot to import. You try to import other-package and, BAM!, another error - conflict, other-package:bar conflicts with read-in-package:bar. Then you go learn about uninterning.

In Clojure, the form doesn't compile, you get a message, and no var is interned for bar. You require other-namespace and continue.

I vastly prefer this experience, and so made these tradeoffs. Many other benefits came about from using a non-interning reader, and interning only on definition/declaration. I'm not inclined to give them up, nor the benefits mentioned earlier, in order to support circular reference.

Rich




Most of the user annoyance vanishes if declare were to support qualified names: (declare other-namespace/symbol). Is there a technical limitation here?

-steve


One problem is what to do if the other-namespace doesn't already exist. It would have to be created, and that initialization is unlikely to be the same as the declared one. Possibly follow-on effects when it is subsequently required. The other option, with similar issues, is to allow fully qualified references to non-existent vars in code.


If it doesn't already exist, then you issue a compilation error. Better then than the way it works today, IMO.


CL is a language. Not an implementation. One may need to differentiate between a language and an implementation. CL is defined so that different implementation strategies are possible.

CL itself has two different compilation interfaces COMPILE and COMPILE-FILE. A file compiler may implement a multipass strategy, where calls to later in that file provided functions are resolved.

A compiler may also provide different kinds of compilation units. IIRC CMUCL does that. See "block compilation".


Thank you, Rich, very informative.

What is the rationale behind not letting 'declare' declare vars in other namespaces? I've run into that when dealing with circular dependencies, and sometimes it seems that would help.


Such declarations are possible (as would be accepting fully-qualified references to not-yet-existing things), but the devil's in the details again - e.g. what if the other ns doesn't yet exist?

And the complexity/utility tradeoffs must be considered.


"CL happily compiles it, and if bar is never defined, a runtime error will occur. ... You try to import other-package and, BAM!, another error - conflict, other-package:bar conflicts with read-in-package:bar. Then you go learn about uninterning.

fwiw, you get a warning at compile time regarding the undefined symbol. And in the AllegroCL IDE, for example, uninterning is just a matter of hitting [Return] when you get the error message (dialog). I believe uninterning the conflicting symbol is a common restart in other implementations as well.


"what constitutes a compilation unit, i.e., a pass over what?"

So, how is this compilation unit different from all other compilation units?


One way is that Rich's example compilation units (forms input at a REPL) don't all exist at the same time, like compilation units that are files often do.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: