
The Hardest Program I've Ever Written - skybrian
http://journal.stuffwithstuff.com/2015/09/08/the-hardest-program-ive-ever-written/
======
greggman
Great blog post and super interesting.

My feeling though is the problem is they have a line limit. Maybe they should
rethink their style. I'm serious.

Before I worked at Google, in 30 years of programming I never worked at a
company that had a line limit. Adding a line limit at Google did not make me
more productive. At first I thought "hey, I guess 80 chars makes side by side
comparison easier" but then I thought back, hmm. I never had problems
comparing code before when I didn't have a line limit.

Instead what I found was that the 80-character limit was a giant waste of time.
The article just pointed out a year of wasted time. I can point to searching
and replacing an identifier and then having to go manually reformat hundreds
of lines of code, all because of some arbitrary style guide. I also had code
generators at Google that had to generate code that followed the line limit. I
too wasted days futzing with the generator to break the lines at the correct
places, all because of some arbitrary line limit.

That should be the real takeaway here. Make sure each rule of your style guide
actually serves a purpose or that its supposed benefits outweigh its costs.

~~~
ryandvm
I can get behind most code style guidelines, but people who harp on 80
character line limits drive me nuts. It's such a funny anachronism. Is there
really someone out there using an editor that can't soft-wrap?

The width of a developer's editor window is so fundamentally a presentation
issue - I have trouble imagining anything more so. Having line length limits
is like mandating editor color schemes. How about I set my soft-wrap
preferences the way I like and you can do the same?

~~~
dragonwriter
> I can get behind most code style guidelines, but people who harp on 80
> character line limits drive me nuts. It's such a funny anachronism. Is there
> really someone out there using an editor that can't soft-wrap?

Personally, I'd never know. I assume my editors _can_ soft-wrap, but, IME, in
terms of being able to easily work on code, lines that fit the window are
better than long lines that aren't soft-wrapped, and long lines that aren't
soft-wrapped are better than long-lines that are soft-wrapped, so I don't ever
use soft-wrapping features.

I'm personally not that tied to 80 characters as a perfect line limit, but it's
a not unreasonable general guideline for most code in most languages. Like
most guidelines, there are times when it's inconvenient as a hard limit.

------
sytelus
I have tried a few code formatters and almost always regretted it. The big
problem is that code formatting carries a significant portion of intent and
explanation. Sometimes I want to put two assignments on the same line because
it emphasizes relationship and atomicity, but other times it's better to keep
them on separate lines for sparsity. There are actually quite a few times I
wanted a line to go well beyond 80 chars because I wanted to de-emphasize an
unimportant, monotonous part that would otherwise take all the attention, and
have the far more important steps immediately stand out to the reader. I take
code formatting very seriously and consider it an integral part of my
expression. Style guides are good, but they shouldn't be followed like a robot,
let alone enforced by a robot.

In fact, code formatting tells a lot about the culture and philosophy of an
author. For example, K&R C starts braces on the same line to emphasize
compactness as elegance; C# doesn't, to emphasize sparse code as elegance. In
SQL, sometimes it's great to put a subquery on the same line and sometimes it
isn't; it really depends on what you want to emphasize and convey rather than
hard and fast rules on number of tokens and syntax analysis. Code formatting is
not just a set of fixed rules, it's a communication mechanism that guides the
reader on what to focus on, what is unimportant, where there is a gasoline
spill and where wildfires may burn.

This is not to say everyone takes their formatting seriously, which is where an
automated formatter would probably add value (as well as the case where you are
importing or copy-pasting from somewhere else). I think K&R C is likely the
gold standard for code formatting. You should try out your formatter on those
snippets ;).

~~~
rwallace
It's funny, I hold exactly the opposite view for exactly the same reason! I
regard a code formatter nowadays as an essential tool for programming
productivity, the fourth most important tool after an editor, compiler and web
browser. The reason is that the limit on how much I can get done is not so
much wall clock time as mental energy. The thing that costs mental energy is
making design decisions. Without a code formatter, every few lines provides
another invitation to make a design decision about layout.

~~~
CardenB
What formatting tools do you use?

~~~
rwallace
These days, clang-format

------
ridiculous_fish
Thanks for writing this, it was a great read!

I maintain a source beautifier too [1], and it's not as nice as I would like.
One of the issues I run into is that the correct indent on a broken line is
context dependent. For example:

    
    
        while (someReallyReallyReallyReallyLongFunction() &&
               anotherLongFunction()) {
            loopBody();
        }
    

is a nicer indenting than:

    
    
        while (someReallyReallyReallyReallyLongFunction() &&
            anotherLongFunction()) {
            loopBody();
        }
    

In the first case, the two conditions are aligned which makes the code
clearer. Does dartfmt handle this? If not, do you have ideas on how it might?

Also, how does it handle invalid input? I may want to reindent my code before
it's correct.

Also, did you explore constraint solvers instead of a graph traversal? It
seems like they would be a natural fit.

[1]: fish_indent, [https://github.com/fish-shell/fish-
shell/blob/master/src/fis...](https://github.com/fish-shell/fish-
shell/blob/master/src/fish_indent.cpp)

~~~
vog
In that specific case, isn't it better style to write it differently anyway?

 _Either:_

    
    
        conditionWithReadableName: function() {
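            // Parenthesized so the expression stays attached to `return`;
            // in JavaScript a bare `return` before a line break yields undefined.
            return (
                someReallyReallyReallyReallyLongFunction()
                && anotherLongFunction()
            );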
        }
        ...
        while (conditionWithReadableName()) {
            loopBody();
        }
    

_Or if you don't want to make up a good name for your conditional:_

    
    
        while (true) {
            if (!someReallyReallyReallyReallyLongFunction()) break;
            if (!anotherLongFunction()) break;
            loopBody();
        }
    

_Or, in a more rules-based fashion:_

    
    
        conditionWithReadableName: function() {
            if (!someReallyReallyReallyReallyLongFunction()) return false;
            if (!anotherLongFunction()) return false;
            return true;
        }
        ...
        while (conditionWithReadableName()) {
            loopBody();
        }
    

I find it troubling to make up "good" indentation rules when the code to
indent isn't well-written in the first place. Multi-line conditionals are an
anti-pattern in themselves (no matter if they appear in "while", "if" or "for").

~~~
ikurei
I agree with you that multi-line conditionals should go, and no beautifier
is going to make up for bad code, but it's still great that they try. You
can't always devote as much time as you'd like to making the code perfect, and
that is when a beautifier comes in most handy. Also, having a good beautifier
when you have to deal with other not-so-perfectionist programmers' code is
great.

~~~
vog
_> having a good beautifier when you have to deal with other [...]
programmer's code is great._

Good point! I didn't think of that.

------
sudo_bang_bang
"you’d expect it to do something pretty deep right?... Nope. It reads in a
string and writes out a string."

We're all just doing the same thing in one way or another :) Good work and
nice article.

~~~
aiiane
Along these lines, we have a joke at Google that all of our problems are just
transforming one protocol buffer into another.

~~~
feelix
And programming anything, or taking over the universe for that matter, is just
a matter of hitting the right keys on the keyboard in the right order.

~~~
ant6n
[https://xkcd.com/722/](https://xkcd.com/722/)

~~~
jonahx
And all human activity since the dawn of time is just using our bodies to move
molecules from one place in space to another....

------
jcizzle
Huh, I thought the 'Reformat to Dart Style' option in IntelliJ was using
dartfmt and I was so disappointed at its output I stopped using it. Just went
and tried dartfmt from the command line after reading this - dartfmt is
significantly better. Fun to hear the approach that went into it.

For anyone that hasn't tried it, grab the Dart SDK and the IntelliJ Dart
plugin. Takes less than 5 minutes to set up. It's been a great platform for
building server side stuff - I haven't tried it for front end web stuff. It
took about 3 reads of the language tour ([https://www.dartlang.org/docs/dart-
up-and-running/ch02.html](https://www.dartlang.org/docs/dart-up-and-
running/ch02.html)) and about a week and I already felt very comfortable with
the entire platform.

~~~
munificent
> Just went and tried dartfmt from the command line after reading this -
> dartfmt is significantly better.

\o/ It improved a _lot_ in the past two months. That's when rules and the new
splitter landed.

------
kitd
Good article!

The most complex single piece of code I ever wrote was a scheduler. The user
could specify a pattern of when events should be raised (eg on this date, at
this time, every other hour on the last day of every month, at midnight for me
in this TZ on a server in another TZ, etc), and the scheduler would raise the
events at the prescribed instant(s).

That took about 9 months, and my biggest takeaway was that how humans measure
time is completely f***ed up!

------
justinator
This made me curious as to what the Perl Tidy formatter was like, as I use it
often. You know, it's Perl, so maybe a few regexes here and there, and much
wizardry.

The Tidy.pm module is 1.1M in size, and over 30,000 lines long. I have much
respect for formatters now, I thought the job they do was an easy one.

Fantastic-looking source code, btw:

[https://metacpan.org/source/SHANCOCK/Perl-
Tidy-20150815/lib/...](https://metacpan.org/source/SHANCOCK/Perl-
Tidy-20150815/lib/Perl/Tidy.pm)

------
bigger_cheese
In the intro to programming course I took at university, the final assignment
we had was something similar (Format Text): we had to write a program that,
when given three arguments (a text file to read from, an int specifying the
characters per line and an int specifying the lines per page), would output the
text file formatted correctly, breaking on words etc.

From memory there were other requirements, like indenting the first word of
each paragraph.

As the article alludes to it was a surprisingly complex problem - we also had
to worry about memory allocation as we were using C. I remember I was quite
proud when I got the sample text (which was a few paragraphs from "The
Hobbit") to render correctly.
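For a rough idea of the core of such an assignment, here is a minimal Python sketch (not the original C program) of greedy word wrapping plus pagination. The names `wrap_paragraph` and `paginate` and the `first_indent` parameter are made up for illustration, and it assumes that a word longer than the line width is simply left overflowing.

```python
def wrap_paragraph(text, width, first_indent=0):
    """Greedy wrap: keep adding words to the current line while they fit."""
    lines = []
    current = " " * first_indent  # only the first line gets the indent
    for word in text.split():
        if current.strip() == "":
            # Nothing but indent on this line yet: always place the word,
            # even if it is longer than the width.
            current += word
        elif len(current) + 1 + len(word) <= width:
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current.strip():
        lines.append(current)
    return lines


def paginate(lines, lines_per_page):
    """Split a flat list of lines into pages of at most lines_per_page lines."""
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


if __name__ == "__main__":
    wrapped = wrap_paragraph("the quick brown fox jumps over the lazy dog",
                             width=10, first_indent=2)
    for page in paginate(wrapped, lines_per_page=4):
        print("\n".join(page))
        print("-" * 10)
```

Greedy wrapping only ever looks at the current line, which is why it stays simple; the harder part of a code formatter is that one break can change what is best for the lines around it.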

I've never thought about writing a code formatter I just trust emacs to format
my code for me. I'd be interested in digging up my old code and seeing how
easily I could modify it to operate on source code.

------
chriswarbo
I think code formatters are a great idea, but they're not _quite_ clever
enough for me yet.

For example, if there's a common pattern among a set of lines, I'll often line
them up vertically to make the repetition clear and focus attention on the
differences rather than the commonalities; for example:

    
    
        if (foo  ||
            quux ||
            baz) {
          ....
        }
    
        let foo  = 10
            quux = foo  * 2
            baz  = quux + 1
         in baz * 2
    
        fields = ['name', 'address', 'country',
                   'dob',  'status',  'salary']
    

To me, those few extra spaces make it easier to glance over the code than
without:

    
    
        if (foo ||
            quux ||
            baz) {
          ....
        }
    
        let foo = 10
            quux = foo * 2
            baz = quux + 1
         in baz * 2
    
        fields = ['name', 'address', 'country',
                  'dob', 'status', 'salary']
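
A minimal Python sketch of the simplest version of this alignment, covering only the `=` column in a run of assignments. The function name `align_assignments` is hypothetical, and it assumes the lines share the same leading indentation and that the first ` = ` on a line is the assignment. It does not attempt the operand alignment (`foo  * 2`) or the list-column alignment shown above.

```python
def align_assignments(lines):
    """Pad the left-hand sides so the '=' signs of a block line up."""
    split = [line.split(" = ", 1) for line in lines]
    # Only lines that actually contain ' = ' take part in the alignment.
    width = max((len(parts[0].rstrip()) for parts in split if len(parts) == 2),
                default=0)
    aligned = []
    for line, parts in zip(lines, split):
        if len(parts) == 2:
            aligned.append(parts[0].rstrip().ljust(width) + " = " + parts[1])
        else:
            aligned.append(line)  # leave non-assignment lines untouched
    return aligned


if __name__ == "__main__":
    for line in align_assignments(["foo = 10",
                                   "quux = foo * 2",
                                   "baz = quux + 1"]):
        print(line)
```

Deciding which lines form a "set" worth aligning is the part that needs judgment; this sketch leaves that choice to the caller.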

------
gchpaco
Pretty printing is surprisingly tricky. Last time I had to do it, I ended up
with this:
[https://github.com/rethinkdb/rethinkdb/blob/next/src/pprint/...](https://github.com/rethinkdb/rethinkdb/blob/next/src/pprint/pprint.cc)
which has become my favorite algorithm for it as it's quite tweakable for
specific needs. Fun little algorithm.

~~~
AceJohnny2
"Amazingly, surprisingly, counterintuitively, the indentation problem is
almost _totally orthogonal_ to parsing and syntax validation. I'd never have
guessed it. But for indentation you care about totally different things that
don't matter at all to parsers. Say you have a JavaScript argument list: it's
just (blah, blah, blah): a paren-delimited, comma-separated, possibly empty
list of identifiers. Parsing that is pretty easy. But for indentation
purposes, that list is rife with possibility!" -- Steve Yegge, 2008 [1]

That really struck me back then, and I've kept it in mind whenever I hear
about code beautifying/indenting.

[1] [http://steve-yegge.blogspot.com/2008/03/js2-mode-new-
javascr...](http://steve-yegge.blogspot.com/2008/03/js2-mode-new-javascript-
mode-for-emacs.html)

~~~
sklogic
And yet, the best place to add your pretty-printing and indentation hints is
the parser. Hints are attached to the grammar, so it makes sense to merge the
two things, and then generate two different tools out of the single source.
Three, actually - an AST pretty-printer, a textual code formatter and, finally,
the parser itself.

~~~
eltaco
There's an issue for a CST (AST with whitespace, comments, etc) in the estree
repo [1]. JSCS is planning on using
[https://github.com/mdevils/cst](https://github.com/mdevils/cst) for future
autofixing rules.

[1]
[https://github.com/estree/estree/issues/41](https://github.com/estree/estree/issues/41)

~~~
sklogic
The beauty of this pretty-printing solution (merging it with the parser) is
that you don't even need any parsing tree to be constructed. The parser will
simply walk the stream and annotate it with the pretty-printing instructions
(pushing and popping the indentation context, adding the weighted break
candidates, etc.).
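
To make the idea concrete, here is a small Python sketch of a printer that consumes such an annotated flat stream. Everything about the encoding is an assumption made for illustration: the `("text" | "push" | "pop" | "break", value)` tuples, the convention that lower break weights are taken first, and the function names `render` and `format_stream`. It is not taken from any particular tool.

```python
def render(stream, max_weight, indent_unit=4):
    """Print the stream, breaking at every candidate with weight <= max_weight."""
    lines, current, depth = [], "", 0
    for kind, value in stream:
        if kind == "text":
            current += value
        elif kind == "push":      # enter a nested indentation context
            depth += 1
        elif kind == "pop":       # leave it again
            depth -= 1
        elif kind == "break" and value <= max_weight:
            lines.append(current.rstrip())
            current = " " * (indent_unit * depth)
    lines.append(current.rstrip())
    return lines


def format_stream(stream, width):
    """Take more and more break candidates until every line fits."""
    weights = sorted({value for kind, value in stream if kind == "break"})
    # float("-inf") means "take no breaks at all"; that is tried first.
    for max_weight in [float("-inf")] + weights:
        lines = render(stream, max_weight)
        if all(len(line) <= width for line in lines):
            return lines
    return lines  # nothing fits: return the most broken-up version


# What a parser might emit for: process(first, second, third)
STREAM = [
    ("text", "process("), ("push", None), ("break", 1),
    ("text", "first, "), ("break", 2),
    ("text", "second, "), ("break", 2),
    ("text", "third"), ("pop", None), ("break", 1),
    ("text", ")"),
]

if __name__ == "__main__":
    for width in (40, 25, 12):
        print("\n".join(format_stream(STREAM, width)))
        print()
```

No tree is built anywhere here: the printer only needs the current indentation depth and the text accumulated for the current line.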

------
al2o3cr
"Even if the output of the formatter isn’t great, it ends those interminable
soul-crushing arguments on code reviews about formatting."

Similarly, covering all your food in Doritos dust doesn't always taste great
but it ends the interminable soul-crushing arguments about what flavor things
should have.

~~~
rfrey
Flavor (as well as texture and aroma) is the whole point of cooking
(otherwise, Soylent). Formatting is not the point of programming.

~~~
tremon
I'd say nourishment is the whole point of cooking. The advent of cooking made
more foods digestible to our intestines and other foods less risky.

Flavour, texture and aroma are learned appreciations, and are not universal.
So I think the comparison with code style is quite apt.

~~~
jdbernard
No, nourishment is just the bare essentials of cooking. In fact, we have a
very large problem, literally and figuratively, because we often don't care
about the nourishment. Consider junk foods and others that we call "empty
calories" because they have no real nutritional value.

My point being that I don't really see the persuasive value of the analogy.
It's a false-equivalence. The point was that yes many people often care deeply
about the formatting of the code (myself included), but discussions around
formatting are almost always a form of bike-shedding. A better analogy would
be "randomly selecting the restaurant doesn't always lead you to your favorite
place, but at least it prevents the interminable discussions about where to
go."

------
pjtr
I've never worked with a fmt tool, but run C# StyleCop[1] on each build to
warn about style violations. Naively to me that seems to give the same
benefit, but is probably significantly easier to write, is easily configurable
and extensible, and leaves me in control.

Isn't it annoying when a globally optimizing tool switches back and forth
between "all arguments on one line" and "all arguments on separate lines"?
E.g. producing overly complex whitespace changes in diffs for small
"triggering" changes?

[1] [https://stylecop.codeplex.com/](https://stylecop.codeplex.com/)

------
Kenji
_Every surviving line has about three fallen comrades._

My first thought was CSS.

------
lolptdr
Has anyone done any comparisons to other code formatters of other languages?
Or even other code formatters within Dart?

Wish I could gain more context on how big the arena for these types of programs
is. I was a bit lost as to how important code formatters and beautifiers were
until reading more on the difficulty of writing such a program by Mr. Nystrom.

~~~
eltaco
For javascript, there's been jsbeautifier [1], jsfmt [2], uglify.

JSCS [3] added autofixing a while back for most whitespace rules, and ESLint
has just begun autofixing as well [4]

[1] [http://jsbeautifier.org/](http://jsbeautifier.org/) [2]
[https://github.com/rdio/jsfmt](https://github.com/rdio/jsfmt) [3]
[http://jscs.info/](http://jscs.info/) [4]
[https://github.com/eslint/eslint/pull/3635](https://github.com/eslint/eslint/pull/3635)

------
qznc
This whole "Pruning redundant branches" stuff essentially reduces to "do A*
search".

It is fascinating that we don't have a definitive method for formatting, yet.

[http://beza1e1.tuxen.de/articles/formatting_code.html](http://beza1e1.tuxen.de/articles/formatting_code.html)

~~~
munificent
A* implies that you have a heuristic function pointing towards a known
destination. It's about knowing _where_ you want to go, but not _how to get
there_.

In this case, we don't know where in the solution space the best solution will
be. It's not even that easy to tell if we've found it. So that rules out
simple pathfinding algorithms like A*.
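
One way to picture a search with no known destination is a plain best-first search ordered by cost alone, with no heuristic. The Python sketch below is an illustration under made-up assumptions, not dartfmt's implementation: `chunks`, `costs`, `render` and `best_splits` are hypothetical names, each split has a fixed non-negative cost, and the search stops at the first candidate that fits the width (which is the cheapest fitting one, since adding a split never lowers the cost).

```python
import heapq
from itertools import count


def render(chunks, splits, indent=4):
    """Join chunks, starting a new indented line after each chosen split."""
    lines, current = [], chunks[0]
    for i, chunk in enumerate(chunks[1:]):
        if i in splits:           # split i sits between chunks[i] and chunks[i+1]
            lines.append(current.rstrip())
            current = " " * indent + chunk
        else:
            current += chunk
    lines.append(current.rstrip())
    return lines


def best_splits(chunks, costs, width):
    """Explore sets of splits in order of total cost; stop at the first that fits."""
    tie = count()                 # tie-breaker so the heap never compares sets
    queue = [(0, next(tie), frozenset())]
    seen = {frozenset()}
    best_overflow, fallback = None, None
    while queue:
        cost, _, splits = heapq.heappop(queue)
        lines = render(chunks, splits)
        overflow = sum(max(0, len(line) - width) for line in lines)
        if overflow == 0:
            return lines
        if best_overflow is None or overflow < best_overflow:
            best_overflow, fallback = overflow, lines
        for i in range(len(costs)):
            candidate = splits | {i}
            if candidate not in seen:
                seen.add(candidate)
                heapq.heappush(queue, (cost + costs[i], next(tie), candidate))
    return fallback               # nothing fits: least-overflowing attempt


if __name__ == "__main__":
    chunks = ["result = ", "compute(", "alpha, ", "beta, ", "gamma)"]
    costs = [3, 1, 2, 2]          # cost of splitting after each chunk but the last
    print("\n".join(best_splits(chunks, costs, width=24)))
```

The search space is every subset of split points, so this is exponential in the worst case; that is the kind of blow-up that makes pruning redundant branches matter.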

------
eliben
Yup, all of this is fairly tricky for YAPF as well
([https://github.com/google/yapf](https://github.com/google/yapf)). We ended
up reusing clang-format's algorithm.

------
amelius
Now try to write a formatter that runs incrementally (i.e., keeps formatting
while the user types).

------
_ZeD_
I suggest you guys play a little with Eclipse and its _configurable_ source
formatter.

~~~
jdbernard
I work in a team that uses the eclipse formatter to enforce our style-guide,
so I get the value. Having said that, _configurable_ defeats the point of a
community-wide formatting tool. From the article:

 _Even if the output of the formatter isn’t great, it ends those interminable
soul-crushing arguments on code reviews about formatting._

With a configurable formatter you just move those arguments into the style-
guide discussions and still have disagreements between people and teams using
different configured values.

------
adultSwim
Good article; poor title.

