
A Boggling Return to C - ColinWright
http://thraxil.org/users/anders/posts/2011/04/17/A-Boggling-Return-to-C/
======
yason
I started with assembly language and I thought C was heaven. Of course, there
are easier languages, there are more powerful languages, there are fancier
languages and whatnot. I like Python and I adore Lisp. But I still love C
most.

C is the sweet spot where I can extend my programs to do high-level stuff
while still keeping my hands down on the actual hardware I'm programming. I like
that a lot, probably because I grew up banging hardware. Or maybe it is
because I like to keep in touch with the actual device that I program: bending
some definitions, _C is what my machine does_. Even if I embed Lua and script
parts of my program, I'm still conceptually working on a C runtime, complete
with addresses, pointers, integers and registers. That's why I also like LLVM
as it abstracts away different instruction sets into a generic high-level
instruction set, much like C abstracts different assembly languages into a
generic high-level assembly language.

C also has the property of being the most enjoyable code to read. I've spent a lot
of time just reading C out of sheer enjoyment. C is tricky: it can be a total
mess or it can be terse and beautiful and clear. No matter what, it describes
exactly what it does to my machine. Read some of D.J. Bernstein's source trees
to get an idea of how neat C can be.

~~~
X-Istence
Reading DJB's code can be both awesome in that his code is written clearly and
neatly, yet at the same time it can be an exercise in frustration because he
has foregone most of the standard library and written his own (especially
string management, which is superior, in my humble opinion), so it can look
foreign or weird to people looking at it.

The source to qmail/daemontools is a pleasure to read though and having read
it I feel like I have learned a lot from what it has to offer.

~~~
CUR10US
djb once wrote a FORTH implementation for the IOCC. It would be cool to get it
running under current BSD/Linux/Solaris.

Arthur Whitney is another I would put on par with djb. He's a bit older than
djb.

For expertise you can't beat W. Richard Stevens. He also studied and wrote
about FORTH before focusing solely on C.

Both Whitney and djb have a true appreciation for speed, efficiency and
succinctness; both have solid foundations in maths; both can build very high-level
abstractions. But they have different areas of focus.

djb - secure systems administration and networking. (Stevens - documentation.)
Whitney - Lisp background; big data.

Whitney has proven that it's possible, using a matrix-based approach, to meet
or beat the speed of C with an interpreted language.

But it's difficult to write UNIX systems or networking code without knowing C.
For guidance on navigating the many pitfalls of C, djb and Stevens are as good
as it gets.

------
babarock
I absolutely love writing C. Up until late 2011, it was by far the language I
used the most. I am now learning Lisp (not exactly, I'm _using_ Lisp in SICP),
and use Python more than before.

One thing I realized is that _reading_ C is more tedious than reading code in other
languages. Sure, that's a gross generalization and is not true for every piece
of code out there. However, I find I have less trouble picking up a Python
project, understanding how it's written and starting to contribute than I have
with C.

A few weeks ago, I was looking at the code of Qemu. The code relies heavily on
preprocessor macros and some weird gcc-only syntax that made my head hurt. It
was difficult.

I guess what I'm trying to say is that my only problem with C is that it doesn't
force the programmer to write in a clear, understandable way. Or maybe that's
just me.

~~~
luriel
> The code relies heavily on preprocessor macros and some weird gcc-only
> syntax that made my head hurt.

The preprocessor is generally recognized as one of C's biggest flaws; it is not
for nothing that Ken Thompson, the first C programmer and the greatest
influence on the language other than DMR himself, cut down most of the
preprocessor when he wrote his own set of C compilers for Plan 9:
<http://doc.cat-v.org/plan_9/4th_edition/papers/comp>

And of course Go has no preprocessor.

As for your second complaint, one can't blame C for gcc's extensions ;)

You should try Go, many people would consider it C's spiritual successor (or
as somebody put it: the language the people who created C would come up with if
they had 40 years to think about how to polish and improve it), it keeps all
the simplicity of C, while being probably the most readable language I have
used, it is concise but keeps everything explicit, and figuring what code does
is very easy, because code does what it says and says what it does, no dark
magic needed.

~~~
eli_gottlieb
It keeps all the simplicity of C, without actually being able to write a
kernel or device driver!

~~~
4ad
You can write kernels in Go, it even used to ship with a bare metal runtime
and several people wrote kernels using it.

You cannot easily write device drivers for _existing operating systems_ in Go
because the existing operating systems provide a particular environment
unsuited for Go and expect certain constraints from the device drivers
themselves, constraints which Go breaks. In principle, it could be made to
work.

------
andrewcooke
maybe the biggest improvement in c programming over the last decade is
valgrind, making purify-like debugging available to everyone. it's completely
removed a whole class of annoyingly hard to fix bugs from my code.

~~~
KonradKlause
valgrind is only available for a few platforms/architectures. If you are
unable to write proper C programs without valgrind's help, please go home.
:-)

~~~
amouat
Depends what you mean by "proper". Any fool can get a C program to work, but
ensuring there are no memory leaks etc is very hard without a tool like
Valgrind.

If you don't believe me, just run Valgrind on a random sampling of C programs
that didn't use such a tool - I reckon it will find issues with most of them.

~~~
dap
Valgrind (and tools using its methodology) isn't the only way to solve these
problems. Libumem finds memory leaks too, and without imposing an immense
runtime cost. It also finds many types of memory corruption without the
runtime overhead that often changes the program's behavior that you're trying
to debug.

------
DanielBMarkham
I'd love to have time to play around with C again. I picked up C back in the
late 80s when I became convinced that no matter what other programming
languages I knew, I wasn't going to be a "true" professional unless I could
sling code in C and C++. Fun times.

I also love the idea of using a trie here. That's something else I've been
wanting to play around with for a while. Although now I'd do it in a
functional language.

He brings up a good point: people coding in C tend to stop and think about
data structures, memory usage, and clock cycles in a way people using higher-level
languages very rarely do. It's part of the way to "think in C". Internal data
structures are also much more important in FP. Interesting how different
languages cause you to think in different ways. (Sapir-Whorf anyone?) :)

~~~
X-Istence
I also think it really depends on how you are taught or where you start...

I started programming in C/C++ (my first book, I'll admit at age 9 was a C++
for Dummies book, it came with a compiler :P).

I have learned and use a lot of higher level languages, but I still think
about data structures, I still think about what the best way would be for
handling the data most efficiently, mainly because I don't want to rely on the
language doing the right thing.

I've seen Java programmers though that then start programming in C++ or even C
and never pick up the art of thinking about their data structures. I work with
one co-worker now who went through the extreme trouble of implementing
Java-like enums, which have caused all kinds of "warts" and all kinds of issues
because they are not enums and they aren't "real" classes.

Watching Java developers turn C++/C developers is really interesting, they
bring all kinds of "bad" practices back with them and the code is worse off
because of it!

------
Xion
"By way of comparison, the Python program takes 1.5 seconds to run, so that's
about a 10X speedup."

Only tenfold? Interesting. While Python is surely not the slowest interpreted
language around, a result like that borders on the performance of Java. That
seems unlikely, especially given the fact that the Python version uses a worse
algorithm.

I would think about how big is the portion of time eaten by I/O - that is,
actually reading the `words` file from disk. I wouldn't be surprised if it
eats most of the ~100ms that C needs to perform the task, leaving only a tiny
percent for actual computation.
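
One way to check this, sketched in C under some assumptions: time the file read on its own and compare it with the total run time. The helper names `now_seconds` and `time_read_lines` are made up for illustration, this is not code from the article, and it relies on C11's `timespec_get` being available.

```c
#include <stdio.h>
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get (assumed available). */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Read fp to the end, doing nothing with the data except counting newlines.
 * Stores the newline count in *lines and returns the elapsed wall-clock
 * seconds, i.e. a rough cost of the I/O alone.
 */
double time_read_lines(FILE *fp, long *lines)
{
    long n = 0;
    int c;
    double start = now_seconds();

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            n++;
    }

    *lines = n;
    return now_seconds() - start;
}
```

Calling `time_read_lines` on the opened `words` file before doing any solving gives a rough I/O-only figure. The first run after boot will usually be slower than later runs because of the OS file cache, so it is worth measuring more than once.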

~~~
shaggyfrog
> a result like that borders on the performance of Java

I haven't seen the "Java is slow" chestnut in years.

~~~
ZeroGravitas
I _think_ he's saying Java is fast (compared with Python at least).

The point is, if you assign nearly mystical properties to writing in C, but
when you rewrite a brute force approach Python program with a much fancier
algorithm in C and you "only" get 10x speedup then _something_ is amiss.

~~~
Xion
That's exactly what I meant, yes. And I used Java as comparison because of its
speed being not that far from C itself - and certainly surpassing that of
Python by a long shot. I definitely didn't intend to propagate the outdated
"Java is slow" myth.

------
fferen
It's funny, because I recently tackled this exact same problem in Python as
well. And guess what my first step was? Writing a prefix tree implementation,
in order to use the exact same approach the author took in C. It never
seriously occurred to me to do it any other way. It may be because I recently
got into C as well after spending years with only Python, but honestly I think
I would have done the same thing before that; that's just how I think. So I
don't believe it's the language that dictates how much you think about
efficiency, it's the programmer.
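
For anyone who hasn't met the data structure, here is a minimal sketch of a prefix tree in C. It is not the article's boggle.c; the names (`trie_node`, `trie_insert`, `trie_has_prefix`, and so on) are hypothetical, and it assumes lowercase a-z words in an ASCII-like character set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define ALPHABET 26

/* One node per prefix; child[i] follows the letter 'a' + i. */
struct trie_node {
    struct trie_node *child[ALPHABET];
    bool is_word;   /* true if the path from the root spells a whole word */
};

/* Allocate an empty node; returns NULL if malloc fails. */
struct trie_node *trie_new(void)
{
    struct trie_node *n = malloc(sizeof *n);
    if (n) {
        for (int i = 0; i < ALPHABET; i++)
            n->child[i] = NULL;
        n->is_word = false;
    }
    return n;
}

/* Insert a lowercase word; false on a non a-z character or failed allocation. */
bool trie_insert(struct trie_node *root, const char *word)
{
    struct trie_node *n = root;
    for (; *word; word++) {
        if (*word < 'a' || *word > 'z')
            return false;
        int i = *word - 'a';
        if (!n->child[i] && !(n->child[i] = trie_new()))
            return false;
        n = n->child[i];
    }
    n->is_word = true;
    return true;
}

/* Follow s from the root; returns the node reached, or NULL if none exists. */
const struct trie_node *trie_walk(const struct trie_node *root, const char *s)
{
    const struct trie_node *n = root;
    for (; n && *s; s++) {
        if (*s < 'a' || *s > 'z')
            return NULL;
        n = n->child[*s - 'a'];
    }
    return n;
}

/* True if at least one inserted word starts with s. */
bool trie_has_prefix(const struct trie_node *root, const char *s)
{
    return trie_walk(root, s) != NULL;
}

/* True if s itself was inserted as a word. */
bool trie_has_word(const struct trie_node *root, const char *s)
{
    const struct trie_node *n = trie_walk(root, s);
    return n != NULL && n->is_word;
}

/* Free a node and everything below it. */
void trie_free(struct trie_node *n)
{
    if (!n)
        return;
    for (int i = 0; i < ALPHABET; i++)
        trie_free(n->child[i]);
    free(n);
}
```

The prefix check is what makes this useful for a board search: as soon as `trie_has_prefix` says no, that path on the board can be abandoned without looking any further.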

~~~
rodelrod
Well, I didn't get back into C, and the last time I had to solve a similar problem
I typed:

    
    
      from Bio.trie import trie

~~~
fferen
Right, if time was an issue I would have used an existing implementation, but
since I largely code as a hobby I figured I may as well take the time to
practice implementing stuff like this.

------
thraxil
Aha. Someone posted this up here. That's why I've gotten a huge burst of
comments on the site in the last couple hours.

------
dmansen
I wrote the same program in Clojure somewhat recently. Boggle seems to be a
popular problem. <https://github.com/dmansen/boggle>

~~~
thraxil
Nice. Implementing the same problem in Clojure has been on my todo list. Now
I've seen yours though so I guess I'll have to think up another problem :)

~~~
_sh
May I suggest a wander through <http://programmingpraxis.com/>

------
galactus
The author implements an inefficient algorithm using python and then a better
one using C. He seems to feel python is for dirty brute force and C is for
"real" programming...

~~~
thraxil
Yes and no.

I wrote the Python version in probably under an hour. My girlfriend had gotten
into playing some stupid Facebook version of Boggle and I just wanted to see
her face when I came out of nowhere with implausibly high scores. I didn't
think hard about the problem, just reached for the tool I know best and
implemented the first obvious approach that came to mind. It worked as needed
and I moved on. You make it sound like I think that's a bad thing.

Later, when I had a bit of time to think about it, it occurred to me that a
trie would be a better approach, so when I was feeling like getting back into
C and wanted a toy problem, re-implementing the boggle solver in C with a trie
seemed like a good choice.

The experience of programming in the different languages does feel different
though and I think it can affect how one approaches solving problems. Python
is so good at just letting me solve the immediate problem that sometimes I
rush and don't think things through or settle for a less than optimal
solution. This will come back and bite me if that suboptimal code ends up
getting built on and re-used elsewhere.

When I write C (or Go, Erlang, Haskell, etc. basically any language that
requires me to think a little more up-front about how I'll implement it), I
know going in that I'm going to be putting some serious time and effort into
the code, so I tend to be more careful about things at every stage. The game
changes from "get a result as quickly and painlessly as possible" to "write
something that is elegant in itself". That's not always a win. Sometimes you
are much better off building the prototype quickly, seeing flaws that you
never would've thought of and then being able to approach the problem in a
whole new way. Sometimes you just need a result quickly and time spent making
things elegant or efficient actually is wasted (I'm not going to build a
framework out of the boggle solving code anytime soon, eg).

I code in Python pretty much every day. I have for years. I probably will for
years to come. It works for me. I'm just saying that sometimes other languages
push you in different directions and I can see why, despite taking more lines
of code to write, taking longer to write, having more potential for segfaults,
and so on, languages like C still find a niche for writing systems and
platforms. And that reason isn't just that it runs a little faster.

~~~
IsTom
From my experience Haskell has two faces. One is the one that says "write a
piece of software _right_ with the tools I provide" and another is a messy
"one"-liner (happens to me particularly when using pointfree style) with the
"line" being an indecipherable mess of operators and library functions that is
right as "fire and forget" kind of code. Haskell has a very rich range of
libraries on the Hackage.

------
abecedarius
My Python answer
<http://stackoverflow.com/questions/746082/how-to-find-list-of-possible-words-from-a-letter-matrix-boggle-solver/750012#750012>
runs in 200ms
on my laptop. It probably isn't actually within a factor of 2 of this C code,
since I don't know what input board it was tested with to get 100ms, and it
matters how much of the dictionary gets pruned while loading. I just stuck in
a random 5x5 board and got 412ms real time, still pretty tolerable.

Edit: The next Python answer there uses a trie and takes 16.7 seconds on the
4x4 board. I like tries because they're elegant, but I hardly ever use them
because the built-in collections are well-engineered even for problems you'd
think are made for a trie.

------
rohit89
>> The C version is functional but will probably make more experienced C
programmers cringe.

Oh boy. This looks like the type of C code I write. Could some experienced C
programmer please point out what parts are cringe inducing ?

~~~
VMG
I'm not an experienced C programmer by any measure, but this doesn't look
right to me: <https://github.com/thraxil/boggle/blob/master/boggle.c#L92>

    
    
        struct foo f() {
            struct foo f;
            return f;
        }
    

Isn't returning a stack-allocated struct a bad idea?

~~~
tosseraccount
It returns a copy of it. Example:

    #include <stdio.h>

    struct foo { char space[1024]; };

    /* Returns the struct by value: the caller gets a copy, not a pointer to f. */
    struct foo f(void)
    {
        struct foo f;
        printf("address of f is %p\n", (void *)&f);
        return f;
    }

    int main(void)
    {
        struct foo g;
        g = f();
        printf("address of g is %p\n", (void *)&g);
        return 0;
    }

produces this output:

    address of f is 0x7fbfffec30
    address of g is 0x7fbffff050

~~~
Jare
If I remember correctly (been a long time), most compilers would implement
many cases of struct return by letting the caller pass the address of a block
of (stack) memory to contain the return value. The function can then optimize
things to get rid of its local copy, operate directly on the intended target,
and the caller doesn't need to perform any copies.
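
A rough way to picture it in plain C, as a sketch only: the second function below spells out by hand the "caller passes the destination" form that the first one may be compiled into. The names `make_foo` and `make_foo_into` are made up for illustration, and what a given compiler and ABI actually emit is an assumption here, not something this code demonstrates.

```c
#include <string.h>

struct foo { char space[1024]; };

/* Return by value, as in the source: the caller ends up with its own copy. */
struct foo make_foo(void)
{
    struct foo f;
    memset(f.space, 'x', sizeof f.space);
    return f;
}

/*
 * Hand-written equivalent of the hidden-pointer convention: the caller
 * supplies the storage and the function fills it in place, so no local
 * copy is needed.
 */
void make_foo_into(struct foo *out)
{
    memset(out->space, 'x', sizeof out->space);
}
```

Either way the caller never holds a pointer into the callee's dead stack frame, which is why returning the struct itself is safe while returning `&f` would not be.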

------
dminor
If you wanted to get really fancy, you could convert that trie into a DAWG:
<http://en.wikipedia.org/wiki/Directed_acyclic_word_graph>

------
finnh
Writing a boggle solver was the first assignment in CS106X at Stanford, which
was still taught in C when I took it in the fall of 1994.

The trie code as well as the display UI were provided - you only had to write
the board-walking code.

Being as this was the first time I had ever written any program, I remember it
being quite challenging but also really fun. It was great to see your own
program utterly house you when you played against it.

I wonder if I still have that code somewhere ... it would be fun to look at /
cringe.

------
AlexFromBelgium
I'm a student who learns C#, Java, PHP, JavaScript... at school. I bought "C:
A Reference Manual" to learn this summer...

Hope it will make me a better programmer.

~~~
luriel
If you want to learn C, K&R is the right way to do it.

C: A Reference Manual is excellent and is probably the only other C book you
need, but only if you are already a C programmer, and only for what the title
implies: reference.

~~~
batista
K&R is not a suitable beginner's tutorial for C, especially modern C. As
a reference-type work, OK.

~~~
wnoise
No, but it's perfect for an experienced programmer in other languages to pick
up C.

------
programatico
Did anybody notice the eC language (www.ecere.com)? Still under early development,
documentation incomplete. Some issues with the preprocessor and other stuff fixed.
Definitely has potential.

------
Roybatty
<butthead>Uhh...hehe..uhh..hehe...he said boxen</butthead>

