

Literate Jenks Natural Breaks and How the Idea of Code is Lost - tmcw
http://macwright.org/2013/02/18/literate-jenks.html

======
randomdrake
Very interesting read. As someone who struggled through the implementation of
Jenks[1] a few years ago in PHP, I can attest to the fact that it was
extremely difficult to find any programming of Jenks that was understandable
and clear. I was able to find some examples of code, but nothing could explain
what was actually happening and, as you said, there were no comments or poor
comments in those examples.

What I was able to find were a couple of different explanations of the
mathematics. Instead of trying to decode some crappy programming
implementations, I spent my time teaching myself the math to understand what
the representation of the process[2] actually meant. From there, I was able to
understand the iterative process involved and dream up a working solution.
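
For readers who want to see the shape of that iterative process, here is a minimal sketch in Python. It is not the code from the article or from the PHP repository linked below; it is a plain dynamic-programming search that assumes the goal is to split sorted values into classes with the smallest total within-class sum of squared deviations. The names `sum_squared_deviations` and `natural_breaks` are made up for this illustration.

```python
def sum_squared_deviations(values):
    """Sum of squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(data, n_classes):
    """Split data into n_classes contiguous groups of the sorted values,
    minimizing the total within-class sum of squared deviations.

    Unoptimized on purpose: it recomputes each candidate class from scratch,
    so it is only suitable for small inputs.
    """
    values = sorted(data)
    n = len(values)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(data)")

    inf = float("inf")
    # cost[k][i]: best total cost for the first i values split into k classes
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    # split[k][i]: start index of the last class in that best split
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0

    for k in range(1, n_classes + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                # Candidate: the last class is values[j:i]
                candidate = cost[k - 1][j] + sum_squared_deviations(values[j:i])
                if candidate < cost[k][i]:
                    cost[k][i] = candidate
                    split[k][i] = j

    # Walk the recorded split points backwards to recover the classes
    classes = []
    i = n
    for k in range(n_classes, 0, -1):
        j = split[k][i]
        classes.append(values[j:i])
        i = j
    classes.reverse()
    return classes
```

Calling `natural_breaks([50, 1, 11, 2, 52, 10, 3, 51, 12], 3)` returns `[[1, 2, 3], [10, 11, 12], [50, 51, 52]]`, since any other three-way split has to put values from two distant clusters into the same class.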

Instead of concluding programmers should be better or knowledge should be more
free, I came to the conclusion that I just needed to understand the problem
and the math behind it better.

Could it be that maybe the assertion regarding the problem of availability is
actually just a problem of lack of understanding or desire for proper
comprehension? I'm not accusing the author of laziness, merely wondering if a
different conclusion and solution could be arrived at with an alternate point
of view on the problem.

[1] - <https://github.com/randomdrake/jenks> - feel free to check out my
implementation. I wrote it a few years ago, so it's probably not the greatest,
but it's commented, and worked fine and fast.

[2] - <http://randomdrake.com/jenks.gif>

~~~
tmcw
Hey - I should have mentioned your implementation - I stumbled upon it and was
like "whoah, this guy actually wrote this from scratch" :)

In this case, it's a combination - the algorithm Jenks arrived at is not just
the math implemented in code, but a clever solution that (afaik) has not been
expressed in pure-math terms.

~~~
randomdrake
Heh, glad you found it entertaining.

This project was very unique and fun for me.

Interesting that you mention the "combination." When I was able to find the
aforementioned image that showed the Jenks method, I knew I had to understand
what it meant and what the symbols were. I found it very cool that the
mathematics were simple and it was the process that made the method work so
well.

Basic exponents and algebra were all that were really needed, but the magic
was in the method; much like good programming.
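
To give a sense of how little arithmetic is involved, here is a small Python sketch of the goodness of variance fit (GVF) score that is commonly used to judge a classification of this kind. Whether the linked image shows exactly this formula is an assumption; the function names `sum_squared_deviations` and `goodness_of_variance_fit` are made up for the example.

```python
def sum_squared_deviations(values):
    """Sum of squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def goodness_of_variance_fit(data, classes):
    """GVF = (SDAM - SDCM) / SDAM.

    SDAM: squared deviations of all values from the overall mean.
    SDCM: squared deviations of each value from its own class mean, summed.
    Assumes the values are not all identical (SDAM would be zero).
    """
    sdam = sum_squared_deviations(data)
    sdcm = sum(sum_squared_deviations(c) for c in classes)
    return (sdam - sdcm) / sdam
```

A score near 1 means the classes account for almost all of the variance; putting everything in a single class scores 0. For `[1, 2, 3, 10, 11, 12]` split into `[1, 2, 3]` and `[10, 11, 12]`, SDAM is 125.5 and SDCM is 4, so the score is 121.5 / 125.5, roughly 0.968.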

The insight into the power of elegant processes wrapping simple mathematics is
something I've repeatedly experienced in my programming career. That moment
when you hit run and all the data comes out how you wanted it to. It brings
the realization that your result would only be possible with a true
understanding of the process you were implementing. A simultaneous victory and
confirmation of comprehension is a good feeling.

Beauty in the bytes.

------
sklam
Here's my implementation of Jenks in Numba:
<https://gist.github.com/sklam/4979921>

It uses NumPy arrays instead of lists. Doing so without Numba is a lot slower,
because indexing NumPy arrays and operating on array scalars from plain Python
are slow.

------
NelsonMinar
Fantastic article. I like to think of this phenomenon with a positive spin
though. I can use the Jenks algorithm without understanding anything about
how it works, just plug it in and go. And with a few test cases I can even
port it to a new language without really understanding it. I admire Tom's work
in digging in and doing it right, but as a journeyman programmer I like that I
can just use it without really understanding it. Sometimes cargo cults work.

------
stcredzero
_> If you’re a coder, consider whether the abstraction of software can be
misused to mask ignorance of basic principles._

One of the problems with Computer Science and Programming is that such a
phenomenon works most strongly within the field.

I also love this quote the author included: "The lack of interest, the disdain
for history is what makes computing not-quite-a-field." – Alan Kay

------
kybernetikos
I do love the docco style documentation, but I tend to think of 'literate' as
a specific thing, <http://en.wikipedia.org/wiki/Literate_programming> which I
don't think this quite matches.

------
shared4you
> In basic benchmarks, it’s 12x faster than a Python implementation

Oh well, my friend, why didn't you use Numpy?

~~~
tmcw
Numpy would be faster, and so would PyPy as I point out. I'm comparing
unoptimized implementations on purpose and not trying to incite some kind of
language-speed-flamewar.

~~~
stcredzero
_> I'm comparing unoptimized implementations on purpose and not trying to
incite some kind of language-speed-flamewar._

Think about this sentence for a few minutes, then come back so we can all have
a good laugh at ourselves and the fragility of human pride.

EDIT: Downvoted? This was not meant with any kind of meanness at all. No
matter what choices an author makes with benchmarks, someone will complain.
The situation is such a catch-22 that laughter is the only effective
self-defense.

------
drewda
For those who care, here's a previous implementation in Python that I've used:
<http://danieljlewis.org/2010/06/07/jenks-natural-breaks-algorithm-in-python/>

------
saosebastiao
This is awesome. I feel like I owe you a couple of beers or something.

------
JoeAltmaier
Thanks for working this out, of course. But if the old code produced
nice-looking plots, what can be said of the new, except that you like the way
it reads better?

------
ynniv
This deserves a "Documentation FTW" gold star.

