
Uniform Structured Syntax, Metaprogramming and Run-Time Compilation - informatimago
https://m00natic.github.io/lisp/manual-jit.html
======
discreteevent
This is very impressive in a way that only Lisp can be. But part of why it
works is that SQL is a well-defined interface. There was a comment on LtU a
good while back:

------------

"I would say that a system that allowed other metathings to be done in the
ordinary course of programming (like changing what inheritance means, or what
is an instance) is a bad design. (I believe that systems should allow these
things, but the design should be such that there are clear fences that have to
be crossed when serious extensions are made.)"

The fact Kay realized this fine point of design in the late '60s (according to
him) is why he is a Turing Award Winner.

I know Lisp programmers who even today don't understand this point - their
code is succinct but the API has a massively unnecessary learning curve due to
unclear boundaries.

Sometimes my coworkers object to me paying extraordinary attention to detail
about what the boundaries are. However, if we don't pay attention to
boundaries, we may as well all be Netron Fusion COBOL programmers munging VSAM
records and EDI data formats.

-- Z-Bo at Wed, 2009-04-15

[http://lambda-the-ultimate.org/node/3265#comment-48165](http://lambda-the-ultimate.org/node/3265#comment-48165)

~~~
neilv
I think more analogous to the examples Kay was giving would be for a single
program to quietly change the rules for function application, or to quietly
change the behavior of CLOS.

I think of an embedded DSL, based on a macro like it is here, as a bit
different than that, because it's not doing anything quietly -- it applies
only to the parenthesized extent of the code that begins with your macro name:

    
    
      (my-dsl-macro my dsl fills the rest of the parentheses)
    

People who don't already know what `my-dsl-macro` is will see the documentation
or code that says it's a macro. Even if it were a normal function, people
reading the code should probably know what the function does, so they might
have to glance at the syntax or documentation the IDE displays for them anyway.
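
For concreteness, a toy definition of such a macro might look like the
following (a hypothetical sketch; the name and behavior are invented), showing
that the rewrite is confined to the forms inside the macro call:

    ;; Hypothetical sketch: a DSL macro whose effect is confined to its own form.
    ;; Code outside the (my-dsl-macro ...) parentheses is never touched.
    (defmacro my-dsl-macro (&rest clauses)
      ;; stand-in translation: turn each symbol clause into a lowercase string
      `(list ,@(mapcar (lambda (c) (string-downcase (symbol-name c))) clauses)))

    ;; (my-dsl-macro my dsl fills the rest of the parentheses)
    ;; => ("my" "dsl" "fills" "the" "rest" "of" "the" "parentheses")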

Ideally, your editor will also color the macro names differently, especially
for readers who don't know which names are macros and which are functions.
(Rust adds an exclamation mark to a macro name at its use site, which is a
good idea when you don't already know which names are macros, but maybe a bit
annoying to have all those exclamation marks when you do know that, say,
`println` is a macro.) But even
if you don't know the name is a macro, you'll be clued in if the text within
it doesn't look like your top-level language, which it often doesn't (e.g.,
SQL in s-expressions doesn't look like base CL).

A key is to use DSLs judiciously -- for improved readability (for your base-
language programmers, or domain experts), maintainability, and/or performance.
Maybe not merely as a convenience for a minor gain in code terseness.

Of course, SQL is a whopper of a language, and perhaps overkill if you
invented it as a DSL for this particular application. (More suspicious would
be to invent your own relational query language that seems gratuitously
different than SQL, when SQL already existed.)

As for changes more like the ones I think Kay was talking about: in the Racket
(Lisp-family) universe, you normally wouldn't do that outside of a macro, but
when you have a good reason -- say, you want to prototype a lazy language, or
implement a specialized language for a GPU programming backend, or a DSL for
your domain experts -- you can. In that case, you'd have a `#lang` line at the
very top of the file that tells you what different language this is, instead
of `#lang racket`. You might make that produce modules that can interoperate
with modules of other `#lang`s, but you're not doing anything sneaky that
breaks the language of those other modules.

~~~
quelltext
You still have the problem that the `my-dsl-macro` is a black box.

That macro can do anything it wants to the sub-AST. The problem is not about
hygiene with respect to variables but about the meaning of symbols. A function
application within it that doesn't obviously have anything to do with the DSL
can also have its meaning changed at the macro's whim. This can make
composition difficult. Who guarantees that if you combine `my-dsl-1-macro`
with `my-dsl-2-macro`, or simply with general-purpose code, they don't
interfere in odd ways?
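
To make the concern concrete, here is a hedged toy sketch (all names invented)
of that kind of interference: a code-walking DSL macro that rewrites every
`like` form it finds in its body, whether or not that form was meant for this
DSL.

    ;; Hypothetical sketch of the interference problem (names invented).
    (defun query-like (string pattern)
      "Toy matcher: true when PATTERN occurs in STRING."
      (search pattern string))

    ;; WITH-QUERY-OPS walks its body and rewrites every (like ...) form it
    ;; finds, including forms belonging to unrelated code or to another DSL
    ;; macro that has not yet been expanded.
    (defmacro with-query-ops (&body body)
      (labels ((walk (form)
                 (cond ((and (consp form) (eq (first form) 'like))
                        `(query-like ,@(mapcar #'walk (rest form))))
                       ((consp form) (mapcar #'walk form))
                       (t form))))
        `(progn ,@(mapcar #'walk body))))

    ;; (with-query-ops (like "YYX" "YY"))        ; => 0 (truthy), as intended
    ;; (with-query-ops (other-dsl (like 1 2)))   ; the inner LIKE is rewritten
    ;;                                           ; before OTHER-DSL ever runs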

Granted, this is not a big concern for a well-designed and widely used macro.
However, the bottom line is that the inherent freedom of the abstraction
allows issues to arise whose debugging requires a deep understanding of the
involved macros and their implementations.

You don't really have the same problem if there is only procedural
abstraction. The boundaries are clearer there. Even if you do something like
the Interpreter pattern or Haskell-style AST-constructing EDSLs, you have the
guarantee that whatever AST is constructed cannot inspect whatever non-DSL
code you combined with it.
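
For contrast, a minimal Common Lisp sketch of that safer style (all names
invented): the "AST" is plain data built by ordinary functions, and
host-language code enters only as opaque closures the interpreter can call but
never inspect or rewrite.

    ;; Hypothetical sketch of an AST-constructing EDSL (names invented).
    (defstruct node op args)

    (defun fand (&rest nodes)    (make-node :op :and  :args nodes))
    (defun flike (field pattern) (make-node :op :like :args (list field pattern)))
    (defun fhost (fn)            (make-node :op :call :args (list fn))) ; opaque host closure

    (defun eval-filter (node row)
      "Interpret filter NODE against ROW, a property list of field values."
      (ecase (node-op node)
        (:and  (every (lambda (n) (eval-filter n row)) (node-args node)))
        (:like (destructuring-bind (field pattern) (node-args node)
                 (search pattern (getf row field))))
        (:call (funcall (first (node-args node)) row))))

    ;; Usage: DSL nodes and arbitrary host closures compose, and neither side
    ;; can rewrite or inspect the other.
    ;; (eval-filter (fand (flike :cxr "YY")
    ;;                    (fhost (lambda (row) (> (getf row :count) 3))))
    ;;              '(:cxr "YYX" :count 5))   ; => T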

~~~
comma_at
> Who guarantees that if you combine `my-dsl-1-macro` with `my-dsl-2-macro` or
> simply with general purpose code, that they don't interfere in odd ways?

Nobody does. But that's where the power lies. You can't have unrestricted
power with restricted features. Lisp gives you unrestricted power. It's up to
you to use it correctly, for whatever definition of "correct" makes sense in
_your_ context.

~~~
quelltext
Yes, but the argument was rather along the lines of "here's why I think (some
people think) we shouldn't have unrestricted features" or "I don't want ...
because...". Languages don't necessarily _need_ macros (many don't have them),
so arguing that the dangers of macros are an inevitable cost of having them is
correct, but it rests on the assumption that macros are indeed desired by
everybody.

It's a bit like arguing about pointers and memory safety.

I do think there is a lot of value in looking into these things and
researching alternatives to macros. For instance, the problem I have described
could be addressed by a new macro-like feature that has to respect certain
boundaries.

------
m-felleisen
Does it matter that McCarthy won the Turing Award before Alan Kay? What a
silly point.

Is it awful that functions are black boxes? That they are bundled in
libraries?

Does it matter that for loops don't leave a stack trace? How can poor
programmers debug them w/o a stack trace?

Can you imagine that overriding a method completely changes what a bundle of
methods does in a class?

It is so sad that every time a new idea comes out, the "resistance"
(programmers who often lack experience with it, or who experienced an inferior
implementation of the idea) is the loudest to complain and holds software
development back.

------
barrkel
This is talking about evaluating dynamic expressions over (semi-?)-structured
row-oriented data, for the purpose of filtering.

It contrasts a tree interpreter in C++ with a JITted dynamically generated
Lisp expression, with some hand-waving away of what the equivalent JIT in C++
would be, seemingly dismissing it as taking too long (is that what "unpause
cosmic time" alludes to? I'm not sure).

The tree interpreter is a little unorthodox - it isn't how I'd write an AST-
walking interpreter - and other interpreter techniques like generating linear
programs for a simple virtual machine aren't considered. These can be pretty
fast, especially with some use of implementation-specific computed goto,
available in gcc and clang. It would get rid of the author's worries about
recursion and lack of TCO, increase locality and decrease cache usage.

But of course there's not much need to write such an interpreter. Why not use
a JIT framework for C++? Depending on the library, it wouldn't be a whole lot
more complex than a traversal of the AST.

And the next question is, if the problem is querying plain-text databases, why
not use Apache Impala? It's written in C++, and uses LLVM to compile SQL
expressions into native code, and can evaluate filters (but not just filters,
the full power of SQL) over CSV text.

Maybe Impala and its dependencies are too big, but if that's the case then your
data is small and a simple interpreter would be plenty fast enough.

~~~
comma_at
Unless you provide one of

- a simple VM implementation
- code using a JIT framework
- using Apache Impala on this particular dataset

which performs on par _and_ didn't take astonishingly long to write, these are
just vague claims and hypotheses.

~~~
firethief
A lot of people have experience with simple VMs. There's nothing super unique
about this problem that requires every claim about it to be accompanied by a
proof by construction.

~~~
comma_at
Either you missed the context or are pulling a straw man. Building a simple VM
is not unique, but can you build one that will be competitive with the other's
in a reasonable timeframe?

~~~
firethief
Yes, I could. Sorry, I wasn't trying to fight a straw man. I sincerely thought
you were saying you wouldn't accept estimates of the project's difficulty in
lieu of proof.

------
Veedrac
I'm not sure if I'm missing something, or if this is meant to be allegorical
for some other, more difficult problem, but the argument here seems very
strange. For sure, C++'s string handling is an awful sight, but the jump to
DSLs seems unmotivated. This issue can be handled with simple, traditional
helper functions.

    
    
        select(recordS5, [](cxr, subcode, commercial_name, date_disc) {
            return cxr.like("YY|XX") &&
                   ...etc;
        });

~~~
m00natic
It's allegorical in the sense that this is a general Lisp technique, useful not
just in this case. The DSL is targeted at (non-programmer) end users and
supposed to be fired through a REPL. Wouldn't want to make them write C++ with
lambdas, semicolons and whatever syntax traps of the latest standard.

~~~
Veedrac
That doesn't support the post's argument, that it's _not_ about aesthetics,
but about qualitatively simpler solutions.

> Often times I hear the claim that (programming language) syntax doesn't
> matter or if it does, it's only to provide some subjective readability
> aesthetics. It's somewhat surprising to me how large majority of supposedly
> rational people like programmers/computer scientists would throw out
> objectivity with such confidence. So let me provide a simple real world case
> where uniform syntax enables out of the box solution which is qualitatively
> simpler.

That making non-programmers use complex syntax and type semicolons is bad is
fair, but it's a rather different claim than the post's.

~~~
m00natic
Well, dynamic queries by end users are the main goal here. Your static helper
functions are completely unusable in that context. Analysing a query and
generating code at runtime is easy and idiomatic with uniform syntax (and
accompanying language support), and the claim is that the solution is not only
speedier but that the implementation in CL is qualitatively simpler than the
alternatives.
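
For readers unfamiliar with the technique, here is a minimal hedged sketch
(not the article's actual code; names invented) of how a query form read at
runtime can be turned into a compiled predicate in CL:

    ;; Minimal sketch (not the article's code): a query form read from the user
    ;; at run time is spliced into a lambda expression and handed to COMPILE,
    ;; yielding native code instead of an interpreted tree.
    (defun build-filter (query-form fields)
      "FIELDS names the row columns the generated predicate destructures."
      (compile nil `(lambda (row)
                      (destructuring-bind ,fields row
                        (declare (ignorable ,@fields))
                        ,query-form))))

    ;; Usage: the returned predicate is an ordinary compiled function.
    ;; (let ((pred (build-filter '(and (search "YY" cxr) (> cnt 3))
    ;;                           '(cxr cnt))))
    ;;   (funcall pred (list "YYX" 5)))   ; => T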

~~~
Veedrac
The point my first comment made is that you don't need to analyse a query at
runtime, you just need to provide functions. I will agree again that Lisp
makes REPLs easier to write than does C++, but a CL with Lua syntax could
still, just as easily, provide functions and expose a REPL. It's the same
solution, just without the unjustified AST transformations.

~~~
m00natic
How can you provide a REPL language without analyzing it at runtime? Write Lua
random syntax in the REPL? Not a great improvement over C++. Not to mention
that you'll probably use something like `eval` which is not compilation thus
inferior.

By the way, even your original example - I honestly can't see how it can work.
How can you identify fields through (lambda) parameter names only (no mention
of types either)? Probably the least boilerplate-heavy solution would be
stringly typed.

~~~
Veedrac
A Lua REPL is hardly worse than a Lisp one.

> something like `eval` which is not compilation thus inferior

I honestly don't know what that means. Turning text into code _is_
compilation; there is no difference between the two in that regard, except
perhaps that in the Lisp DSL case it's more manual.

> How can you identify fields through (lambda) parameter names only (no
> mention of types either)?

Not familiar enough with Lua, but in Python you just use keyword arguments.

------
jaytaylor
This looks super interesting, but very difficult to read on mobile with Chrome
(Android, Samsung S10e):

[https://i.imgur.com/bJw95FB.jpg](https://i.imgur.com/bJw95FB.jpg)

Rotating sideways helped to a degree, but I'm posting in hopes the author sees
this thread...

Update:

Sadly, Firefox isn't any better:

[https://i.imgur.com/ONswSNb.jpg](https://i.imgur.com/ONswSNb.jpg)

I guess this means this instance can't be chalked up to "Chrome is becoming the
new IE6".

 _grin_

~~~
TeMPOraL
True.

Fortunately, there's a workaround - "request desktop site" in Firefox (and
Chrome) menu.

~~~
jaytaylor
Oh, wow, this is excellent, and fixes it!

I never knew about this option. Thank you, sincerely, TeMPOral!

~~~
TeMPOraL
You're welcome. It's a life-saver, in cases like this, or when a mobile site
disables the ability to pinch-zoom.

~~~
jan_g
Pinch zooming can also be force-enabled for all sites in the browser's
accessibility settings.

------
nudq
If I remember correctly, Postgres was initially developed in Lisp, but then
rewritten. Was that a mistake, or evidence against the thesis of this article?

~~~
lispm
Not really, the original Postgres was developed in a mix of 17000 lines of
Lisp and 63000 lines of C. This was difficult to develop/debug at that time.
Probably still would be.

It had a 'gigantic' memory footprint of 4 MB - the all-in-C version only used
1 MB. The Lisp version was also slower and they didn't use features like GC...

[http://db.cs.berkeley.edu/papers/ERL-M90-34.pdf](http://db.cs.berkeley.edu/papers/ERL-M90-34.pdf)

~~~
nudq
> By a process of elimination, we decided to try writing POSTGRES in LISP. We
> expected that it would be especially easy to write the optimizer and
> inference engine in LISP, since both are mostly tree processing modules.
> Moreover, we were seduced by AI claims of high programmer productivity for
> applications written in LISP.

Yes, that's what I remembered, they started out using LISP.

> Our feeling is that the use of LISP has been a terrible mistake for several
> reasons.

"Terrible mistake" is pretty unambiguous language.

~~~
lispm
> Yes, that's what I remembered, they started out using LISP.

Only for the optimizer and the inference engine.

The authors also had no prior experience developing an application in a hybrid
of C and Lisp.

> "Terrible mistake" is pretty unambiguous language.

4MB memory footprint was a terrible mistake at that time.

~~~
nudq
They obviously started writing Postgres in LISP, because _"we soon realized
that parts of the system were more easily coded in C"_ wouldn't make sense if
writing a hybrid had been the initial plan.

They tried going all LISP at first, and failed. Was it them, or was it LISP?

~~~
lispm
Since they had no experience in Lisp programming, they chose the wrong
language just for 'doing something different'.

Writing a database in a performant way isn't something for a Lisp newbie.

"By the time Version 1 was operational, it contained about 17000 lines of LISP
and about 63000 lines of C".

Version 1 was written in a mix of C and Lisp.

That's also not surprising, since that would have been a common approach for
some technical reasons. But it's a bit difficult to do - again, especially as
a newbie.

> Was it them, or was it LISP?

Their lack of experience, their approach, the LISP implementation they were
using, the hardware constraints (4 MB footprint was not acceptable to them),
... A conservative approach using a lower-level systems programming language
like C was a good choice at that time and they were much more successful with
that approach.

There were/are a bunch of databases written in Lisp and even in a mix of C and
Lisp: Statice (Symbolics), Zeitgeist (TI), Orion/Itasca, AllegroStore (Franz),
... But they were written by Lisp experts.

