
Smaller Code, Better Code - jpt4
http://www.sacrideo.us/smaller-code-better-code/
======
arcfide
As the author of this code in question, I'd like to make the offer to the
Hacker News community and anyone at large. I'll do a live screen cast
demonstration for interested persons and walk you through the entire compiler
in 30 minutes to an hour. In the end you won't have a complete understanding
of the compiler, but if you have reasonable prior programming experience, I
claim you will understand it more fully and completely than if you had spent
the same amount of time on most other compiler designs. At that point, you
would be able to continue your own self-study and to start making
contributions to the compiler rather
quickly. This is an offer to demystify the code to people so that they have an
opportunity to see how it really does make the whole compiler simpler and
easier to work with.

If people express interest, I'll run such a live session and let people judge
for themselves what they think of the code and my approach to "simplicity"
after they've been introduced personally to the code base.

~~~
dang
That's a great idea. If you'd be interested in doing this semi-officially on
HN (maybe something along the lines of an AMA) please email hn@ycombinator.com
and let's co-ordinate it!

~~~
arcfide
Done.

------
natch
From the project:

...

      rth,←' A zs;A rs=scl(r.v(0));rr##mf(zs,rs,p);if(c==1){z.v=zs.v;R;}\',nl
      rth,←'  array v=array(z.s,zs.v.type());v(0)=zs.v(0);\',nl
      rth,←'  DO(c-1,rs.v=r.v(i+1);rr##mf(zs,rs,p);v(i+1)=zs.v(0))z.v=v;)\',nl
      rth,←' DL(zz,if(rr##scl){rr##df(z,l,r,p);R;}\',nl

...

No.

And commit messages like "Hopefully that does it." No again.

~~~
arcfide
Don't complain that Chinese is ugly and unreadable just because you speak
English as your native tongue.

Technically, the above is a snippet of C++ put into an APL variable "rth",
but there's so much more to it than that, and so much more to the design that
you're missing.

The design and choice of aesthetic in the compiler is a _very_ intentional
one. It is arguably one of the main issues that has caused me to rewrite the
compiler so many times over the years and has led to this massive code
adjustment.

There are very good reasons that the compiler is written in the style that it
is, and you cannot compare it to other projects' style guides.

Keep in mind that this compiler is designed to run natively on the GPU in a
fully data-parallel fashion.

One major issue that I had to address, and I discuss a little bit in a thread
above, is the idea of the malleability of the code base. It's critically
important to this project that I be able to adapt and alter the compiler
rapidly. For example, I recently had to rewrite the entire backend due to a
shift in some underlying core technology. This shift led to a shrinkage of
about 2,000 lines of code, because the underlying supporting libraries were a
better fit for what I needed than what I was using before. But I might not
have been willing or able to make this change if I didn't have confidence
that the rewrite would be swift. Indeed, it took only two months to rewrite
the backend from scratch, add new features, improve robustness, and so forth.
The code also got cleaner.

This obsessive need to be highly adaptable leads me to the desire to have
exceptionally "disposable" code. The cost for replacing or deleting code
should be as low as possible.

This has a few follow-ups. To achieve the above, I need to understand the
ramifications of deleting code as quickly and as readily as possible. This
basically means that I need to be able to squeeze as much of the compiler
into my head as possible, and what doesn't fit, I need to be able to "see"
and "read" as quickly and as readily as possible.

The compiler is designed so that I can see as much as possible with as little
indirection as possible, so that when I see a piece of code I not only know
how it works in complete detail, but how it connects to the world around it,
and every single dependency related to it in basically one single half screen
full of code (usually much less than that) without any jumps, paging,
scrolling or any movement. It means that I can completely understand the
ramifications of any edit I make in nearly complete detail without any
dereferencing or indirection. There are one or two places where there are some
helper utilities which are on a different page, but these are part of the
"domain vocabulary" which is basically in my mental cache any time I'm working
with that code. I keep these "helpers" to a minimum, so that they can fit with
anything else I want and not waste mental space in my head. Too many helpers
leads to a failure to understand the complete macro picture and thus defeats
my ability to delete code.

In order to make the code more readable, it has to be highly consistent and
idiomatic. I take this to an extreme level. This code is _highly_ regular and
predictable, to an almost obsessive degree. I do this by enforcing a style
discipline on the code that allows me to eliminate the use of a host of
abstractions, further paring down the complexity of the programming language
in which I'm working and allowing me to think in the same mental plane at all
times.

The idea of semantic density is critical to this point. The APL code I'm
using to solve the problem has a certain semantic density. I maintain a
consistent density by choosing my variable names so that they visually align
with the expressivity per character of the built-in primitive symbols. This
means that the cadence when reading the code is
maintained. The "universal" naming scheme allows me to take any given name and
know exactly its purpose, parentage, place, and use in the compiler without
adding any additional cognitive overhead of inheritance syntax, datatypes,
classes, or anything more than a name.

The C++ code above is written the way it is to allow it to stylistically align
with the semantic density of the APL code. This means that I can jump between
the runtime and the compiler portions of the code with minimal mental shifts
between the two, because the style and approach are similar. The code can be
"read" in much the same way with minimal change. I am intentionally
prioritizing internal semantic and stylistic consistency over satisfying the
popular expectations of how C or APL code should look. I believe the internal
consistency within the project contributes more strongly to the day-to-day
readability and hackability of the project.

Furthermore, I strongly restrict my use of programming language features.
This simplifies self-hosting, but it is primarily a means of maintaining
stylistic and cognitive power. Since I know how I need to think about my
problem "compilation on the GPU" in order to make it go, I can restrict myself
to a paradigm that only allows me to think in this way. I choose a paradigm
that is also exceptionally expressive to allow me to be productive as well. By
selecting the right core paradigm, I can eschew further programmatic
abstractions since they contribute nothing and only cost something.

One way in which I do this is to write the core of the compiler with only one
or two syntactical conventions, and only one main programming method: function
composition. The entire core of the compiler is a single, almost entirely
point-free, data-flow, data-parallel expression. Names provide the anchor points of the
"macro" level ideas, but the language is expressive enough that I need very
few other anchor points. Instead, I use only function composition over the
core primitives with a syntax known as "trains" to create the mental effect of
working with normal expressions when in reality I create new functions with
every line in the core compiler (which is 90 lines or so). By restricting
myself to only writing in this style, the mental effect works. If I had to
switch between expression-level and trains/point-free style in the code, it
would be much less readable. But because I can now treat my point-free
programs as regular expressions for all intents and purposes, it actually
simplifies my cognitive load, as there is only one thing to think about:
function composition.
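To make the "trains" idea concrete for readers who don't know APL, here is a loose sketch in Python. This is hypothetical illustration, not the compiler's actual code: the `compose` and `fork` helpers are invented names for the two composition forms being described.

```python
from functools import reduce

def compose(*fns):
    """Right-to-left function composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fns), x)

def fork(f, g, h):
    """An APL-style 3-train: (f g h) x  ==  g(f(x), h(x))."""
    return lambda x: g(f(x), h(x))

# The classic train example: mean is "sum divided by count" (+/ ÷ ≢).
mean = fork(sum, lambda a, b: a / b, len)
assert mean([1, 2, 3, 4]) == 2.5

# A tiny pass pipeline built purely from composition, with no statements.
normalize = compose(str.strip, str.lower)
assert normalize("  Hello ") == "hello"
```

Every line defines a new function, yet each reads like an ordinary expression, which is the mental effect being described.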

~~~
arcfide
By keeping the code as visible (read, small) as possible, I see more code and
can better reason at a macro level. To scale this down into the micro level of
dealing with individual compiler passes, I replace all the traditional
programming paradigms with others in a sort of one-for-one exchange. In this way,
I develop a new set of idiomatic programming methods that are so concise, they
can begin to be read as we read and chunk English phrases. By doing so, it
becomes actually _easier_ to just write out most algorithms, because the
normal name for such an algorithm is basically as long as the algorithm itself
written out. This means that I start to learn to chunk idioms as phrases and
can read code directly, without the cost of name lookup indirection. I can get
away with this because I've made reusability and abstraction less important
(vastly so) because I can literally _see_ every use case of every idiom on the
screen at the same time. It literally would take more time to write the
reusable abstraction than it would to just replace the idiomatic code in every
place. It's a case of the disposability of code reaching a point that
reusability is much less valuable.
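One way to picture the claim that "the normal name for such an algorithm is basically as long as the algorithm itself" is this hypothetical Python sketch (not from the compiler; `grade_up` is an invented helper name):

```python
data = [3, 1, 2]

# The named abstraction: one more thing to define, look up, and maintain.
def grade_up(x):
    """Indices that would sort x ascending (APL's grade-up idiom)."""
    return sorted(range(len(x)), key=x.__getitem__)

# The idiom itself is barely longer than the call site, so inlining it
# at every use can be cheaper than maintaining the abstraction.
assert sorted(range(len(data)), key=data.__getitem__) == [1, 2, 0]
assert grade_up(data) == [1, 2, 0]
```

When every use site fits on one screen, replacing the inlined idiom everywhere really can cost less than writing and maintaining the named version.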

This means that in those cases where reuse _is_ valuable, it's very valuable,
and it comes to the fore and you can see it as the critical thing that it is.
It doesn't get drowned in otherwise petty abstractions that assist
reusability, since we don't need that anymore.

Furthermore, if I write my code correctly, there is very, very little
boilerplate in the compiler. Almost none. This means that every line is
significant. You don't get the fun of feeling like you're accomplishing
something by typing in lots of excess boilerplate, but you also have no
wasted architecture. Because rewriting the architecture is so trivial,
basically everything becomes important, and you don't have petty bookkeeping
code lying around. You know that everything is important and that there are
no superfluous bits.

The result, as mentioned elsewhere, is code that is getting continuously
simpler, rather than continuously more complex. The code is getting _easier_
to change over time, not harder. The architecture is getting simpler and more
direct and easier to explain. Because it costs so little to re-engineer the
compiler, I can do so constantly, resulting in little to no technical debt.

This is an intentional, synergistic choice of a host of programming
techniques, styles, disciplines, and design choices that enables me to
program this way. Give up one of them and things start to break down. It
allows for a highly optimized code base that has all of the desirable
properties people wish their code bases had, and it scares people. I think
that's a good thing, because I don't want people to see this codebase as just
another thing.
I want them to see that this is something truly different. How can I get away
with no module system? How can I get away with no hierarchy? How can I get
away with having everything at the top-level, with almost no nested
definitions? How can I get away with writing a compiler that is not only
shorter, but fundamentally simpler from a PL standpoint than standard
compilers of similar complexity by using only function composition and name
binding? How can I get a code base that has more features but continues to
shrink?

By chasing smaller code. :-)

I assure you, and I'll make good on this in another reply here, I could get
you up and running on understanding the code and how it works faster than just
about any other compiler project out there. In the end, one of the goals I
want for this compiler is for people to say, "Woah, wait, that's it? That's
trivially simple." The more I can push people to think of my compiler as so
trivial as to be obvious, the more I win. The compiler really is so dirt
simple as to shock any normal compiler writer.

But to make it that simple, I have to do things in ways that people don't
expect, because people expect complexity and indirection. They expect
unnecessary layers for "safety," and they expect code that needs built-in
protections because the code is too complex to be obviously correct.

I'm pushing the other direction. If you can see your entire compiler at one go
on a standard computer screen, what sort of possibilities does that open up?
You can start thinking at the macro level, and simply avoid a whole host of
problems because they are obviously wrong at that level. When you aren't
afraid to delete your entire compiler and start from scratch, what sort of
possibilities does that open up to you?

~~~
thealistra
Two things I want to say/ask.

1\. What happens if you get sick? You say this is a project in production and
there is money on the table (I assume not only yours). What if you get sick
and are unable to work for 3 weeks or 6 months? Don't you think this code
would be very hard to grasp for someone else who had to temporarily take over
your position?

2\. It is weird that you wrote such a long essay, spanning two comments, yet
it has so few examples from the actual code. Usually when people explain
things they move between the abstract concepts and how they are materialized
in the code. Here you only explain the idea behind writing it and how it makes
you feel/operate/gain flexibility and performance but the closest to the code
information I've got from it is that it has compiler passes and that it has a
C++ runtime in a string variable. Just a thought, what do you think about
that?

~~~
arcfide
At this point, if I get sick, the code doesn't move much. If I were
permanently disabled, then someone else could take over. I have people
contribute bugs, tests, and other things fairly often. If you had to
temporarily work on the code base and weren't familiar with the background of
the project, I would say you'd be lost. It's just not the sort of code base
where you can start tweaking things here and there, because almost
everything that needs changing is a matter of addressing architectural or
serious questions that require you to really understand the project. Because
of the way the code is written, there's basically no "code monkey" type work.
That means that you only do meaningful work, but it also means that only
people who are knowledgeable architects can work on the code. You can imagine
the same thing in other code bases. Imagine that you didn't need any of your
lower-level programmers anymore for work because there was nothing for them to
do. Now imagine how the bus factor changes on the code when only your chief
architects are necessary for working on that code base. That's very nice in
one dimension, but it does create quite a different picture.

You're right about the code examples. I figured that people were already
posting some code snippets, so I wanted to give the big ideas rather than any
specifics. The reason for this is basically that if you take any single line
of code out of context, it's a bit hard to explain why I'm doing the things
that I'm doing. It's very much a macro design, which is why I am offering the
live session to go through it. It's sort of, but not quite, an "all or
nothing" thing. If you let me sit down with you and go through the entire
code base,
then I can explain how it all fits together and why things are the way they
are, but if you just take a single piece of code out, you're missing the
picture.

If I took out a single compiler pass, for instance, you'd have between 1 and
12 lines of code to look at. I could explain a few features, but how would I
explain that when you look at this piece of code you're able to see it
entirely in context? Well, I can't, because the code is completely out of
context at that point. Or what about demonstrating how the naming conventions
exhibit structured, informative regularity? Again, I can't, because that's a
visual design element of the code. It's something you have to "see" by
looking at the whole painting, as it were.

The naming convention is actually a great example. Out of context, there's
apparently no rhyme or reason to it. But in context, it forms a key component
to the visual regularity and continuity throughout the code. The names are an
important part of how you can see the structure of the code. They help to
orient you in the big picture. But if I were to quote a single line here,
there's no pie to look at, no sky to navigate by. It's just a single
constellation. By analogy, it does little good to say, "here's the Big
Dipper; it's useful." Why? Because it's easy to find amidst the context of
the other stars, and its shape helps you find the North Star. On its own it
doesn't seem as valuable; at that point it is just another constellation. The
same thing happens with this code.

So I'll go through and explicate it all in detail in the live session, where
I can provide the "painting" and workflow in their entirety so people can see
how it works. Then you can see how my comments here match up with the code.

------
burgerdev
At first I was wondering how he managed to write a compiler in 750 loc. Then I
noticed it's for APL, which I would call terse:

    
    
      Y0←{⊃,/((⍳≢⊃n⍵)((⊣sts¨(⊃l),¨∘⊃s),'}',nl,⊣ste¨(⊃n)var¨∘⊃r)⍵),'}',nl}
    

See also
[https://en.wikipedia.org/wiki/APL_(programming_language)#Exa...](https://en.wikipedia.org/wiki/APL_\(programming_language\)#Examples)

~~~
Silhouette
Does anyone here program APL? I've tried to look into it occasionally because
the idea of powerful, concise syntax appeals to me, but the unfamiliar syntax
was always too much to get my head around within a reasonable amount of time.
I'm curious to know whether it really does become second nature after a while,
in the same way that some of us might read a printf format string or regular
expression quite fluently after many years of working with them.

~~~
RodgerTheGreat
I program in K, a close relative, and I have done some tinkering with APL. The
symbols actually don't take long to memorize- perhaps a few days of practice.
It's a bit like learning to read prose. At first you have to sound out words
letter by letter, but eventually you're able to "see" words and phrases built
out of common patterns of symbols. I see ,/f' and think _flatmap_, ~~': and
think _heads of uniform runs_, {x@<x} and think _sort up_, etc.

A dense expression can still take a while to puzzle out sometimes, but
certainly no longer than the equivalent logic spelled out in a more verbose
language across many lines.
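For readers without K, the idioms above translate roughly as follows. This is a loose Python paraphrase of the chunked readings, not exact K semantics:

```python
from itertools import groupby

# ,/f'  -- "flatmap": apply f to each item, then join the results.
flatmap = lambda f, xs: [y for x in xs for y in f(x)]
assert flatmap(lambda n: [n, n], [1, 2]) == [1, 1, 2, 2]

# {x@<x} -- "sort up": index x by the grade (sorting permutation) of x.
sort_up = lambda x: [x[i] for i in sorted(range(len(x)), key=x.__getitem__)]
assert sort_up([3, 1, 2]) == [1, 2, 3]

# ~~': -- "heads of uniform runs": first element of each run of equal items.
heads = lambda x: [k for k, _ in groupby(x)]
assert heads([1, 1, 2, 2, 2, 3]) == [1, 2, 3]
```

The point is that each three-or-four-character K phrase maps onto one of these named operations, which is why practiced readers "see" the word rather than the symbols.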

~~~
willhslade
What's it like looking for a K job?

~~~
RodgerTheGreat
In my experience, if you do enough open source stuff with K, jobs find you.

------
franciscop
The numbers are nothing like this, but I had a really similar experience to
the author when doing Umbrella JS. With exceptions, but I've tried to keep
every function down to few lines of code by doing heavy code reuse:

    
    
    // src/addclass/addclass.js
    // Add class(es) to the matched nodes
    u.prototype.addClass = function () {
      return this.eacharg(arguments, function (el, name) {
        el.classList.add(name);
      });
    };
    

While they don't do exactly the same (Umbrella JS is more flexible but jQuery
supports IE9), compare that to jQuery's addClass():

    
    
    addClass: function( value ) {
        var classes, elem, cur, curValue, clazz, j, finalValue,
            i = 0;

        if ( jQuery.isFunction( value ) ) {
            return this.each( function( j ) {
                jQuery( this ).addClass( value.call( this, j, getClass( this ) ) );
            } );
        }

        if ( typeof value === "string" && value ) {
            classes = value.match( rnothtmlwhite ) || [];

            while ( ( elem = this[ i++ ] ) ) {
                curValue = getClass( elem );
                cur = elem.nodeType === 1 && ( " " + stripAndCollapse( curValue ) + " " );

                if ( cur ) {
                    j = 0;
                    while ( ( clazz = classes[ j++ ] ) ) {
                        if ( cur.indexOf( " " + clazz + " " ) < 0 ) {
                            cur += clazz + " ";
                        }
                    }

                    // Only assign if different to avoid unneeded rendering.
                    finalValue = stripAndCollapse( cur );
                    if ( curValue !== finalValue ) {
                        elem.setAttribute( "class", finalValue );
                    }
                }
            }
        }

        return this;
    },

------
rakoo
Reminds me of that good ol' folklore:
[http://www.folklore.org/StoryView.py?story=Negative_2000_Lin...](http://www.folklore.org/StoryView.py?story=Negative_2000_Lines_Of_Code.txt)

~~~
ScottBurson
That's a great story. I can't resist quoting Dijkstra: _If we wish to count
lines of code, we should not regard them as "lines produced", but as "lines
spent" -- the current conventional wisdom is so foolish as to book that count
on the wrong side of the ledger._

~~~
arcfide
I am very inspired by Dijkstra's high level ideas on programming. Importantly,
one of the fundamental assumptions of Dijkstra was that you could actually
understand your code base and reason about it. Excessive abstraction may
create a degree of robustness that protects against programmers who don't
understand the code base, under the assumption that no one will. But it comes
at the cost of eventually ensuring that no one can understand the code base
in its entirety, or even reason efficiently at the macro and micro levels at
the same time across a large part of it.

------
finin
I've found that when teaching, I sometimes work on an example program too
much, producing what I think is elegant and compact code but that the
students find hard to understand. I suspect that the same may be true when I am
collaborating with others on a program. There can be value in writing code in
a straightforward, easy to comprehend style.

~~~
delinka
"Compact code" is not the same thing as "less code." It is said that you
don't truly understand the problem you're trying to solve until you've
implemented it a few (3?) times. Once you begin to understand the problem, you can often
find places that required no code: either an existing API solved that problem
and you weren't aware; or perhaps you found a more 'pure' solution and you can
remove much of your code. This does not mean that you need to write compact,
unintelligible code.

Also consider Saint-Exupéry: "In anything at all, perfection is finally
attained not when there is no longer anything to add, but when there is no
longer anything to take away, when a body has been stripped down to its
nakedness."

Lately in my programming career, I find myself simplifying code, distilling it
to solve the problem at hand, then clarifying the code (with good variable
names, explicit comparisons to NULL/nil, fully demarcated if/else, small well-
named functions, etc) so that future me can grasp it faster. This has the
added benefit of pleasant peer review and getting new devs acquainted with the
code.
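A tiny before/after sketch of that clarifying pass (hypothetical Python, illustrating the comment's style points rather than any particular code base):

```python
# Before: compact, and the reader must reconstruct the intent.
def f(xs):
    return [x for x in xs if x is not None]

# After: a good name, an explicit comparison, and fully demarcated flow,
# so future readers (and reviewers) can grasp it faster.
def drop_missing_values(values):
    """Return values with None entries removed."""
    kept = []
    for value in values:
        if value is not None:
            kept.append(value)
    return kept

assert f([1, None, 2]) == [1, 2]
assert drop_missing_values([1, None, 2]) == [1, 2]
```

Both versions do the same thing; the second simply spends a few more lines buying clarity for the next reader.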

~~~
chrisweekly
Nice post. On a related note, I just recently found this gem of a Hickey talk
about "Simple" vs "Easy":
[https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/SimpleMadeEasy.md](https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/SimpleMadeEasy.md)

------
jcoffland
It's interesting to note that the author has written more lines here in this
thread than are contained in the compiler in question. The English language is
not nearly as concise as APL.

------
dude01
Woah! From the article: "added roughly 4,062,847 lines of code to the code
base, and deleted roughly 3,753,677".

~~~
zzzcpan
This is not a good thing, though; it means the language and abstractions are
not expressive and reusable enough. Self-hosting compilers like the author's
feel wrong to me because of that. Meta-DSLs for compilers should serve as
much better abstractions and save a lot of work.

~~~
arcfide
Except that your meta DSL probably isn't able to solve the problem that this
compiler is solving, which is putting an entire compiler natively onto the GPU
in a way that the code is actually maintainable in a "native GPU" version,
rather than requiring translation from some other state.

This compiler has gone through many core paradigm shifts in an attempt to find
an appropriate way to express a solution to the problems that it encountered.
Each iteration revealed some new insight into how to solve the problem, but
inevitably led to a need to rethink the system.

Now, the system is so expressive and capable that reusability isn't even an
issue. At this point, reusability is about as useful in the compiler as
having a new word to represent the word "the." Why not just write "the"?
Anything else you could write is likely to create confounding layers of
indirection and distance between definition and use in the code, which will
actually obscure clarity.

Instead, I take the intentional approach to make the code as "disposable" as
possible. Why change a compiler pass that is two lines long when you can just
rewrite it from scratch in less time? By leveraging a different aesthetic,
architecture, and language, I'm able to have more expressivity by removing
unnecessary abstraction and making it as easy as possible to re-engineer the
whole thing at the drop of a hat. This means that I never have to "live with"
code bloat or some design decision that's annoying me. The cost to re-engineer
is so low that I have almost no technical debt. If an architecture fails to
scale, replace it and move on, without any loss of productivity, and a net
gain since the code gets easier and easier to work with on each iteration.

------
skybrian
It seems like there is a missing explanation of the language this compiler
compiles and why someone would want to use it? (Searches on "dfns" and "co-
dfns" don't find much.)

------
fourier
Here is the link:
[https://www.youtube.com/watch?v=gcUWTa16Jc0](https://www.youtube.com/watch?v=gcUWTa16Jc0)
and proper q/a thread:
[https://news.ycombinator.com/item?id=13638086](https://news.ycombinator.com/item?id=13638086)

------
jfoutz
As pointed out in PAIP, clarity and concision are at odds. It takes good
taste to balance the two.

~~~
BurningFrog
I think it also takes empathy. In the sense that you can imagine how the code
would read to someone else, who was new to it.

------
nattaylor
>for every one of those 750 lines, I've had to examine, rework, and reject
around 5400 lines of code.

I guess there's no such thing as "good enough" with a compiler?

Those are staggering numbers to me. Kudos to the author.

------
arcfide
The live session is up and running now. You can find more information about
the stream and ask your questions at the following post:

[https://news.ycombinator.com/item?id=13638086](https://news.ycombinator.com/item?id=13638086)

------
known
AKA
[https://en.wikipedia.org/wiki/Pareto_principle](https://en.wikipedia.org/wiki/Pareto_principle)

------
n0mad01
That's roughly 1,369 LOC added per commit, or 1,855 LOC per day.

------
edblarney
Smaller is better, but that does not mean 'fancy pants super dense cryptic
code'.

I think 'simpler' would be a better term than 'smaller'.

Also - every line of code has cost. A lot of cost. Maintaining code and its
complexity is not only expensive in itself; it also adds to the cost of
maintaining all the other code.

So less code to solve the problem is almost always better.

~~~
arcfide
At heart you are absolutely right: we're after simplicity and clarity.
However, I have found that "small" really does make a difference, especially
if you push yourself to be small at the macro rather than the micro level. If
I just chose "simple," it would be too easy to believe that the code is
"simple enough." If I force myself to maintain poetry-like smallness, then I
can't just get by with "simple enough" but have to seek macro levels of
simplification, which we often fail to see when the code is so large that all
we look at is the single, local view of a single function.

By forcing myself to ever more ascetic code sizes, I ensure that small, cute
micro-hacks in a given function don't work. At that point the "fancy pants"
hacks fail, and I am forced to find macro simplifications that obviate the
need for whole classes of programming techniques.

So, yes, we want simple, but it's about how we can push ourselves and our
minds to get there.

~~~
edblarney
Yup, I agree on the 'smaller architecture' bit.

One more point: I find that there are a lot of very common things that we, as
developers, have not 'standardized' on - but if we did, it would be
beneficial.

The underscore/lodash JS libraries are great examples of this.

They are not just a bunch of 'helper functions' - they are really a series of
new 'functional keywords' that in a way represent a new paradigm in software:
we all get used to these 'mini patterns' and call them the same thing, and
when used in code they can make things a _lot_ simpler.

Map, reduce, find, each, pull, filter, etc. - at first glance it might seem
excessive to jam all of these into some code, but once the developers are
familiar with them... guess what - they become almost part of the programming
language itself.

So I think this is a pretty good example of a 'meta' way to facilitate
simplicity: agree on names for very common patterns, and abstract them away
with tools or linguistic constructs.
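For instance, once that shared vocabulary is familiar, a chain like the following reads as a single phrase. This is plain Python with invented sample data, used only to illustrate the point:

```python
from functools import reduce

orders = [
    {"item": "apple", "qty": 2},
    {"item": "pear", "qty": 0},
    {"item": "plum", "qty": 5},
]

# filter -> map -> reduce: "total quantity across the shipped orders".
shipped = filter(lambda o: o["qty"] > 0, orders)
quantities = map(lambda o: o["qty"], shipped)
total = reduce(lambda a, b: a + b, quantities, 0)
assert total == 7
```

To a reader fluent in the mini-patterns, the three names carry the whole meaning; the lambdas are just the blanks being filled in.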

~~~
arcfide
Congratulations, you just listed some of the main APL primitives. They've been
standardized and given standard names, even in Unicode, since the 70's or
80's? Sorry, I couldn't resist. :-)

~~~
lmm
Many of them are present in the original lisp paper from 1960. Under standard
names like "map" and "search" that are concise but still words.

~~~
arcfide
Indeed, many of these ideas as expressed in APL come from the 1962 book, "A
Programming Language."

I find it unfortunate that these ideas are only now beginning to find some
general acceptance in larger, more mainstream languages.

