
Are “jets” a good idea? (2017) - networked
http://lambda-the-ultimate.org/node/5482
======
ianbicking
It occurs to me that asm.js (before WebAssembly) was full of jets. That is,
they wrote weird idioms that the interpreter could understand and optimize,
while technically keeping the semantics of the language itself (JavaScript).
The most visible example was someVar|0, a way to assert that the expression
would be an integer. Not really a type declaration, since it didn't ensure
someVar would be a number, only that the expression was an integer.
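
(A rough analogy in Common Lisp, mine rather than anything from asm.js: the
THE form plays a similar role. It is semantically transparent whenever the
assertion holds, but a compiler may use it to emit untagged integer
arithmetic, much as an asm.js engine treats someVar|0 as a 32-bit integer.)

    
        ;; Sketch only: THE is a no-op at the language level, but a
        ;; compiler such as SBCL can use it to emit raw fixnum adds.
        (defun sum-below (n)
          (let ((acc 0))
            (dotimes (i n acc)
              (setf acc (the fixnum (+ acc i))))))
    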

In the examples though, clearly the goal is to have some provability in the
core language, make that provability feasible with a small language, and keep
that as the language becomes further optimized. It seems subtly different from
an optimizing compiler.

~~~
theoh
Somebody mentioned "failure of full abstraction" on HN a few months ago
(actually they said something a bit different), which I tracked down to a late
90s paper by Martin Abadi. It refers to a situation where the object code is
exploitable because it doesn't fully reflect the semantics of the source code.

One use of jets, in the secure code setting, is to provide a core language
that is so simple that it can obviously be compiled in a "fully abstract" way.

That is my understanding, anyway. The notion of fully abstract compilation
seems to be quite hot, and it comes originally from Abadi's paper:
[http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-154.pdf](http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-154.pdf)

------
kjeetgill
This is already a standard piece of how JITs work today. See the discussion of
Java intrinsics and Arrays.fill() here [0].

> An intrinsic function is applied as a method substitution [...] and the
> compiler will replace its apparent source-level implementation with a pre-
> cooked implementation. In some cases intrinsics are sort of compilation
> cheats [...]

> The intrinsics are all set up in vmSymbols.hpp and if you look, you'll see
> Arrays.fill is NOT on the list. [...] Because it is something like an
> intrinsic... [...] What is it then? Arrays.fill is a code pattern
> substitution kind of compiler shortcut, basically looking for this kind of
> loop:

    
        for (int i = 0; i < arrayB.length; i++) {
            arrayB[i] = b;
        }
    

And replacing it with a call into the JVM memset implementation.

[0]: [http://psy-lob-saw.blogspot.com/2015/04/on-arraysfill-intrin...](http://psy-lob-saw.blogspot.com/2015/04/on-arraysfill-intrinsics-superword-and.html)

~~~
shub
That has bugged me since the first time I read about Nock. JITs use intrinsics
to work around inefficiency of generated code when the maintenance burden of
the intrinsic is worth the performance boost, and the grand plan is to make a
VM so terrible that you need a shitload of intrinsics for programs to
terminate before the heat death of the universe? Years later I still haven't
gotten past that initial bafflement. It just makes no sense at all.

~~~
lmm
> JITs use intrinsics to work around inefficiency of generated code when the
> maintenance burden of the intrinsic is worth the performance boost, and the
> grand plan is to make a VM so terrible that you need a shitload of
> intrinsics for programs to terminate before the heat death of the universe?

Put it this way: the JVM is perhaps the most sophisticated JIT on the planet,
maybe a million lines of code, the product of decades of programming research
and billions of programmer-hours, and it still needs intrinsics to improve
performance. So how about we skip the billions of programmer hours and just
use lots of intrinsics?

I don't know if it's going to work, but it's interesting enough to be worth a
try.

~~~
kjeetgill
I'm baffled by your conclusion and that of your parent post.

> So how about we skip the billions of programmer hours and just use lots of
> intrinsics?

This is very nearly the equivalent of: Why don't we skip writing a compiler
and just USE glibc?

My post was about the opportunity to forgo the use of an intrinsic and instead
do pattern recognition as part of compilation. The point is that a "jet-like"
pattern-matching phase already exists in many compilers today; my example was
from the HotSpot JVM's JIT compiler, but GCC will likewise match that for loop
and INTRODUCE a call to memset.
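
To make that concrete, here is a toy sketch of this kind of pattern
substitution (written in Lisp for brevity; my illustration, not how HotSpot
or GCC actually implement it): recognize the fill-loop shape in the source
tree and rewrite it to the runtime's optimized FILL.

    
        ;; Matches (dotimes (i (length a)) (setf (aref a i) v)) and
        ;; rewrites it to (fill a v); anything else passes through.
        (defun recognize-fill-loop (form)
          (or (ignore-errors
                (destructuring-bind (head (var (len arr)) body) form
                  (destructuring-bind (op (acc arr2 var2) val) body
                    (when (and (eq head 'dotimes) (eq len 'length)
                               (eq op 'setf)      (eq acc 'aref)
                               (eq arr arr2)      (eq var var2))
                      `(fill ,arr ,val)))))
              form))
        
        ;; (recognize-fill-loop
        ;;   '(dotimes (i (length b)) (setf (aref b i) 0)))
        ;; => (FILL B 0)
    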

~~~
shub
I was using "intrinsic" inappropriately, I think, to mean both substitution of
method calls and substitution of bytecode patterns. Yeah, the JVM already does
the second thing. It's not roses for them, either[0]. The thing that I've been
looking for in all this discussion is an acknowledgement that the JVM team has
been doing this and has encountered engineering challenges to the point that
they're willing to change javac to simplify a jet, along with an explanation
of how these weird VMs are so different that none of that matters and using a
shitload of jets is perfectly fine.

[0] [http://openjdk.java.net/jeps/280](http://openjdk.java.net/jeps/280)

------
derefr
If you assume the program is being passed through a JIT (a thing that is
already common for interpreted code, and which _also_ tries to guarantee that
it doesn't change the resulting program's semantics), then one can think of
jets as simply explicit hints built into a JIT about how to optimize
particular idioms that are well known in the corpus of programs the JIT was
designed to work with.

This is exactly how GPU drivers get gradually more optimized for existing
games: the driver's shader compiler actually has an ever-growing library of
shader-source-code patterns, with hardware-vendor-optimized replacement
sections.

------
valarauca1
"Are optimizing compilers a good idea?"

Look: if you replace one representation with another that is _correct_ (to
within the human author's reasoning) but faster, you have the identical
discussion, and an answer (yes).

~~~
koala_man
Well, there's another aspect: "Is it OK to design a language that will be
unusable without an optimizing compiler?"

Running any normal language without an optimizing compiler is totally fine and
routinely done. Here we're talking about languages that might be thousands or
millions of times slower.

~~~
setr
Compared to an optimizing compiler, a fresh implementation of a language would
probably also be thousands/millions of times slower, unless the codebase
implements the optimizations itself.

And then you end up speeding up the compiler, to reach a decent state, to
speed up the code.

But the time to reach an initial compiler is much faster with a jet-based one
(there's barely anything to the language).

And then you end up adding jets to reach a decent state.

I'm not sure there's a significant difference in this regard. The main
question, I think, is which is easier/simpler/faster to optimize into a proper
state, and, if compiler optimization is better, whether it's sufficiently
better to beat the ease of initial implementation for a jet-dependent
language.

~~~
koala_man
It would be great if optimizers made code 1000x or 1000000x faster, but apart
from some edge cases like the removal of ineffectual loops, disabling
optimization in any compiler shows that ~10x is more realistic.

My day job is writing compilers (for traditional languages), and I have no
doubt that it's significantly easier/simpler/faster to implement the jet based
one.

However, the jets only apply to known algorithms, so it's really more like an
optimized library than a compiler technique. Code that uses the "library" will
be really fast with little effort, but anything that requires functionality
not directly found in the library will get the 1000-1000000x overhead, while
the traditional compiler will run it all with ~10x overhead.
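
To make the "optimized library" framing concrete, here is an invented
miniature (toy language and names are mine, not from any real jet system): a
jet table maps a recognized formula tree to native code, and anything not in
the table falls through to the deliberately naive evaluator, paying the full
overhead.

    
        ;; Toy formulas: (lit n), (arg), (inc e), (dec e). DEC is defined
        ;; the slow way -- counting up from zero -- in the spirit of Nock.
        (defvar *jets* (make-hash-table :test #'equal))
        
        (defun eval-formula (formula arg)
          (let ((jet (gethash formula *jets*)))
            (if jet
                (funcall jet arg)            ; hit: native replacement
                (slow-eval formula arg))))   ; miss: pay full overhead
        
        (defun slow-eval (formula arg)
          (ecase (first formula)
            (lit (second formula))
            (arg arg)
            (inc (1+ (eval-formula (second formula) arg)))
            (dec (let ((n (eval-formula (second formula) arg)))
                   (loop :for i :from 0 :when (= (1+ i) n) :return i)))))
        
        ;; Registering a jet for the exact (dec (arg)) tree makes that
        ;; one program O(1); any formula not in the table stays slow.
        (setf (gethash '(dec (arg)) *jets*) #'1-)
    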

I totally agree that it's the perfect fit for long-lived, highly heterogeneous,
correctness first applications like blockchain, but I don't see how it'll be
useful for general purpose languages.

------
_bxg1
This is the problem I have with Lisp, but exponentially worse. Languages like
this try and stretch ever-closer to an imaginary "perfectly pure language".
It's an interesting exercise, but in the process they get further and further
away from concepts that are palatable to the human mind, and end up esoteric
and minimalist beyond usefulness. We sometimes forget that programming
languages are not designed for computers, but for humans who want to express
ideas to computers.

~~~
bunderbunder
Do you mean Scheme, specifically? Because I have a hard time seeing that in
lisp in general. Racket's small but fairly pragmatic, Clojure is pretty big,
and Common Lisp is frankly a bit of a kitchen sink language.

~~~
vinceguidry
Not your parent, but I personally didn't find S-expression syntax to be
superior. Treating code as data and vice versa is metaprogramming, and it
should be something you know how to do when you need it: easy enough that
you're not going out of your way to do it when you do need it, but distinctive
enough that when you come across it, you immediately notice you'll need to put
on your thinking cap to understand the program.

I've yet to see a better approach than Ruby's.

~~~
taeric
Amusingly, I've never hit as much frustration as when using a few ruby
packages at work. Holy hell is it confusing to figure out where things came
from, sometimes.

I don't necessarily care for the data-as-code, most of the time; but I do like
that almost everything has a sensible way to serialize to a file. Even better,
unless I want to optimize, the round trip can take care of itself.

Curious what the syntax is for a basic list of key-value pairs? Well, it is
the same as any list of pairs. Which is the same as any list, unless you are
trying to get fancy.
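
For instance (toy example mine), a key-value "table" is just a list of pairs,
and the literal reads back in exactly as written:

    
        (defvar *opts* '((:name . "demo") (:retries . 3)))
        (cdr (assoc :retries *opts*))   ; => 3, and *opts* round-trips
                                        ; through a file unchanged
    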

This freedom really is liberating, at times. Yes, some folks have made macros
that give them a different literal syntax for different styles of
maps/tables/etc. By and large, life is easier if you don't use those and just
stick to the minimal syntax necessary.

Which really helps with your concern. When I can see what the transformation
is, I can literally substitute into my mental model what the macro is doing,
so that I don't have to juggle a bunch of crap around.

And all of this, oddly enough, lands you in not caring whether you are calling
a function or a macro. I have more trust in calling things in most lisps I've
worked with, because I can understand both in terms of themselves easily
enough. With ruby, I'm often wary of using a new method in a source file,
because I really can't trust all of the interactions that are assumed for it
to work out.

~~~
vinceguidry
I'm not sure how much of our respective experiences is due to familiarity. For
example, I have REPL power tools and a really good understanding of Ruby's
object model to help me figure out what something's doing. So that almost
never concerns me all that much.

Whenever I pull in a new gem, I usually budget a few minutes to wrap my head
around the semantics of how the gem is expected to be used. Occasionally the
documentation isn't good enough and I need to source dive.

What I never have to deal with is new syntax. No matter how Ruby is being
used, the syntax is always the same comforting, pleasing-to-read near-prose.
What changes when you use metaprogramming in Ruby is semantics. You can't add
new syntax to the language unless you actually change the language.

Like you, I find people misusing metaprogramming. For me usually this is
fairly easy to detect. If I look for the source of a method and it's
dynamically generated, then showing the source will take me to where it was
generated.

I won't pretend to have chosen Ruby over Lisp because I liked the language
better. I chose it because I figured it would give me better job prospects.
But as time went on I found myself at home with Ruby.

I don't think it would take me all that long to get used to any given Lisp.
But I don't believe it actually gives anyone superpowers, at least, not
superpowers you can't have with Ruby. I have yet to hear a lisper actually
articulate a solid advantage for code as data.

When there isn't any functional reason to choose between alternatives, all
that's left is the aesthetic. Which is where all those parentheses finally
become important.

~~~
taeric
Hmm, you seem to be indicating that people use macros to create new syntax all
of the time. I have not actually found that to be the case very often. In
particular, there are only two macros I know I use somewhat regularly that do
this: LOOP and FORMAT. And even those always follow the same general syntax of
lisp. Unless you get crazy, it is unusual to actually use things that
drastically change the full syntax of lisp.

But again, in lisp, you probably make use of more metaprogramming than you'd
realize, precisely because it is indistinguishable from normal programming.

(I'm tempted to contrast this with some of the DSLs I've seen in the likes of
scala and ruby... I have a hypothesis that all of the hoops you have to jump
through to make a macro-like construct cause people to go all in when they do
so.)

Now, I'm also not claiming lisp gives superpowers. Again, I'd almost go the
other route. Most lisp is typically quite simple, because you can accomplish
quite a lot that way.

Which does bring it back to aesthetics some. The parens just flat out don't
bother me. They are a bemusing joke that never actually made sense. Too often,
the difference is minimal compared to most C programs: Foo(bar) becomes
(Foo bar), which isn't that big of a deal. The change from infix to prefix
math takes a bit of getting used to. However, I find it much more pleasant to
deal with the generic (+ matrix1 matrix2) than with the matrix1.plus(matrix2)
of java and the like. Of course, I was also a huge fan of my HP calculator in
college. RPN for the win! :)

So, yes, I do think it comes down to preference. I don't think your
preference is wrong/bad. I do think your view of lisp sounds colored more by
bemusing jokes than actual problems in lisp code. But, I fully grant my
experience with ruby may be colored by an overly ambitious project written in
it.

~~~
kazinator
In ANSI CL, the macro related to formatting is called _formatter_ ; _format_
is just a function.

The _format_ function can take a function as an argument in place of the
format string.

The _formatter_ macro can compile a format string into such a function.

    
    
        (format nil (formatter "(~a-~a)") x y)
    

_formatter_ spits out a lambda that is passed to _format_ ; _format_ then
passes it the stream, and the remaining arguments.
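
Roughly (a sketch of the equivalence, eliding the requirement that the
function return any arguments it did not consume), the call above behaves
like:

    
        (format nil (lambda (stream &rest args)
                      (apply #'format stream "(~a-~a)" args))
                x y)
    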

~~~
taeric
Indeed, apologies for messing that up. :) My point was more on the rarity of
"syntax breaking" macros. And, even that one doesn't really change the syntax
much. It is just a very complicated function, for all intents and purposes.

~~~
kbp
> My point was more on the rarity of "syntax breaking" macros.

Why don't you consider macros like DEFUN, DEFCLASS, COND, etc to be "syntax
breaking"?

~~~
kazinator
Not speaking for grandparent, I would say that _cond_ isn't syntax-breaking
because it doesn't have to employ grammar-driven parsing to recognize and
delimit the clauses and their constituent forms. The nested list structure of
_cond_ that you see _is_ the real syntax.
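
For instance, the clause structure is visible as plain nesting, with nothing
left for a grammar to re-discover:

    
        (defun sign (x)
          (cond ((minusp x) :negative)    ; each clause is just a list,
                ((zerop x)  :zero)        ; (test form*), already delimited
                (t          :positive)))  ; by the time cond sees it
    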

 _defun_ could be like this, though its ANSI CL specification is quite screwed
up.

The syntax is given as:

    
    
       defun function-name lambda-list [[declaration* | documentation]] form*
    

where the double square brackets are described in "1.4.1.2.1 Splicing in
Modified BNF Syntax" which denotes that a list of elements is to be spliced
into a larger structure, and the elements may appear in any order. The
description of this idiotic bullshit is convoluted, but an example section
gives the gist of just how idiotic:

 _" For example, the expression_

    
    
      (x [[A | B* | C]] y)
    

_means that at most one A, any number of B's, and at most one C can occur in
any order. It is a description of any of these:_

    
    
      (x y)
      (x B A C y)
      (x A B B B B B C y)
      (x C B A B B B y)
    

_but not any of these:_

    
    
      (x B B A A C C y)
      (x C B C y)
    

Note how the B elements generated by B* can be interspersed! So in the case of
_defun_ it means that we can have variations like:

    
    
      (defun name (args ...) decl doc decl decl decl body...)
    
      (defun name (args ...) doc decl decl decl decl body...)
    
      (defun name (args ...) decl decl decl decl doc body...)
    

There can be at most one docstring, and an arbitrary number of declarations,
and they can be mixed in any order.

Note that if we interpret A|B*|C as a simple regex notation, it wants all the
B's clumped together; someone really had to work hard to concoct this
counterintuitive BNF extension.

~~~
taeric
That all said, I can't imagine that most folks are surprised by the general
shape of most defun statements. The complexity that goes into how it is
implemented is crazy, but it is not at all like the first time you see a LOOP
invocation, is it?

~~~
kbp
> I can't imagine that most folks are surprised by the general shape of most
> defun statements.

The general shape of a LOOP expression is just a list of symbols and lists. A
complicated LOOP expression that uses many features at once doesn't end up
looking very Lispy, but a shorter one like (loop :repeat 10 :do (print
'hello)) shouldn't look at all foreign, especially if you write the keywords
as keywords, which many people do.

I guess when people talk about the regularity of Lisp syntax, I think they
usually mean that (op ...args) applies the operator OP to the list ARGS. LOOP
obeys this very well: the only LOOP syntax I can think of that breaks it is
`using (hash-key k)` or with HASH-VALUE. Universally accepted macros break
this rule all the time, though. By shoving everything into lists, they totally
abandon trying to keep with the (op ...args) form. If you see that regularity
as a great advantage of Lisp's, then it seems like you should be much more
bothered by the more "mundane" macros than you should be by LOOP.

    
    
        (loop :for i :below 10 :do (print i))
    

Is less "syntax-breaking" than:

    
    
        (dotimes (i 10) (print i))

~~~
kazinator
Commonly used _loop_ syntax is replete with syntax that breaks "op args ...":

    
    
      when <condition> do
      for <var> in <list>
      for <var0> = <init0> then <step0> and
        for <var1> = <init1> then <step1>
      collecting <expr> into <var>
    

The normal macros that don't have a flat argument structure but shove
everything into lists are keeping with the idea of leveraging the nesting, so
that the construct is "parsed on arrival" (POA?) into the macro. All that
remains is to access its structure.

~~~
kbp
Syntactically, those are things that could have been implemented as operators
but aren't; they're just LOOP keywords. Lots of functions take keyword
arguments (edit: even if LOOP's are a little different from &key parameters).
LOOP avoids using unquoted lists for things other than calling operators.

edit: Oh, right, destructuring, too! I guess that also counts.

Out of curiosity, how would you feel about a macro like this (ignoring that it
doesn't enforce any keyword order)?

    
    
        (defmacro for (var &key = (then =) while do)
          `(do ((,var ,= ,then))
               ((not ,while))
             ,do))
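
For reference (example call mine), a use and its expansion would look like:

    
        (for i := 0 :then (1+ i) :while (< i 10) :do (print i))
        ;; == (do ((i 0 (1+ i))) ((not (< i 10))) (print i))
    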

------
nv-vn
Unrelated, but have LTU's site admins just disappeared? I signed up for an
account probably over a year ago and it still hasn't been verified.

~~~
nerpderp83
Same. Maybe you just have to hijack them at ICFP.

~~~
naasking
There were some issues with the signup process over the past year. You should
email them directly:

[http://lambda-the-ultimate.org/node/5535](http://lambda-the-ultimate.org/node/5535)

------
core-questions
> Urbit's Nock, which, other than the jet concept, I honestly think is either
> a strange art project or snake oil, so I won't discuss this further.

Urbit is very real, though whether it will go anywhere or not is a different
question. You can run the code today; the language is... weird as, but there
it is, running and doing what it does.

~~~
icebraining
Snake oil != vaporware.

------
ychen306
I think the standard term is Idiom Recognition.

------
refulgentis
Did this end up going anywhere interesting? As written up, it sounds like an
attempt to simplify programming that would just turn into a "turtles all the
way down" problem.

~~~
akvadrako
They haven't made a lot of public gains yet but the project has only recently
raised enough money to grow beyond a few developers.

------
redleggedfrog
What about debugging? Without an IDE and/or a stepping debugger you're in for
a lot of no fun.

~~~
nils-m-holm
Are you referring to debugging in insanely simple languages or to debugging in
general? Because I have written code in a text editor and debugged with PRINT
(or whatever the language offers) for decades. It was a lot of fun!

------
cousin_it
The language specification should guarantee that when you implement an
algorithm from your algorithm book, you get the same time and space complexity
as described in the book, with reasonably small constants. You should be able
to hit your big-O target without knowing which "jets" get special treatment by
the compiler. That's all I ask for, and it's not too much to ask.

------
paradroid
This is basically asm or microcode at a higher level.

~~~
Jach
Yeah I was pretty sure I've seen this in both Nim and Common Lisp, but don't
compile-time macros in general allow for this? A comment even mentions source-
text replacement with C functions. So it seems like the "jet" is just a type
of macro that is formally specified to behave as a pure transformation...

~~~
derefr
If macros are "GOTO", a jet is a "COMEFROM". A programmer _knows_ they're
calling a macro, and expects that code to change. A jet, meanwhile, affects
code that was written without knowledge of the jet.

~~~
coding123
It sounds like aspect-oriented programming.

