
I'm not asking this to be snide, but as somebody generally unfamiliar with the literate programming concept - for practical purposes, what makes this different from having well-encapsulated functions with thorough commenting?

For an example of what I mean, see underscore.js [1] and the prettified version of the source [2].

[1]: https://github.com/jashkenas/underscore/blob/master/undersco...

[2]: http://underscorejs.org/docs/underscore.html




Most documentation is non-linear: functions are documented without regard to where they're called from, and so on. The trouble is that this is hard to scale up to large sub-systems. If you are modifying some cross-cutting facet of a large system, it's often hard to find the right place to add an illuminating comment, given all the different places you are modifying simultaneously. Literate programming provides that place by introducing a linear narrative structure into the documentation. That can be quite valuable at large scales.
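
To make that concrete, here's a toy sketch in noweb-style notation (an illustrative example of mine, not taken from any of the programs in the OP). A named chunk can be referenced before it is defined, so the prose proceeds in narrative order while the tangler reassembles the pieces in the order the compiler needs:

    We count the lines arriving on stdin. The skeleton is plain C:

    <<count.c>>=
    #include <stdio.h>
    int main(void) {
        long lines = 0;
        <<count the lines>>
        printf("%ld\n", lines);
        return 0;
    }
    @

    The counting loop itself is the least interesting part, so it comes last:

    <<count the lines>>=
    int c;
    while ((c = getchar()) != EOF)
        if (c == '\n')
            lines++;
    @

Tangling this produces a compilable count.c; weaving typesets the prose and code together.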

That said, my critique: http://akkartik.name/post/literate-programming (I actually read many of the programs in OP to write that.)


You have an excellent point in there: if you aren't free to put your code in any order you like, then it probably won't fit very well into your prose.


With LP, comments can be arbitrarily formatted: TeX equations, beautiful figures, tables, cross-references, indices, a table of contents, and so on.

Code can be presented in a way that best suits the reader (versus the writer or the compiler), though Knuth's examples usually seem to dive right in, without the introduction that the latest material in TAOCP is meant to provide.

When I use it, I write it like a paper decorated with code, where the code can be extracted and run. I usually include the necessary makefile and perhaps some test data. Entries in figures and tables are sometimes generated automatically by the program being described.
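
Here's a hypothetical fragment to show the flavor (noweb-style notation with LaTeX prose; the function and the particular formula are purely illustrative, not from any specific project of mine):

    We approximate $\int_a^b f(x)\,dx$ with the composite trapezoidal rule,
    $h\,[\tfrac{1}{2}f(a) + f(a+h) + \cdots + \tfrac{1}{2}f(b)]$ with $h = (b-a)/n$.

    <<trapezoidal rule>>=
    double trapezoid(double (*f)(double), double a, double b, int n) {
        double h = (b - a) / n;
        double sum = 0.5 * (f(a) + f(b));   /* the two half-weighted endpoints */
        for (int i = 1; i < n; i++)
            sum += f(a + i * h);            /* interior points, weight 1 */
        return h * sum;
    }
    @

The woven output typesets the equation right next to the code that implements it; the tangled output is ordinary C that the makefile compiles.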

There are lots of old papers and critiques about LP, by Knuth and others. You should check 'em out.


As a fairly modern example, let me suggest "Physically Based Rendering" by Pharr and Humphreys [1]. It is a complete (and fairly large) book in the LP style. There are a couple of sample chapters available online that you can view [2][3]. To me, one of the real advantages is that you can present the code and commentary in a logical structure that goes in an orderly progression of concepts for human understanding rather than the more strictly physical structure that a compiler expects.

[1] http://www.pbrt.org/

[2] http://www.pbrt.org/chapters/pbrt_chapter7.pdf

[3] http://www.pbrt.org/chapters/pbrt-2ed-chap4.pdf


As Knuth himself said, literate programming facilitates thinking about a problem without having to build that understanding bottom-up in a formal language. It can therefore, for example, produce much shallower call stacks while keeping the program easy to understand and maintain, an often-overlooked aspect of the resulting code's performance. He claims that if TeX had been written without it, the call stack would have been not 4-5 subroutines deep but around 50-100 [1]. He also states that he could not have produced MMIX [2] without literate programming, because he would not have been able to reason about its design bottom-up [1].

So, a lot of complexity is added to a program just so that it can be understood and maintained in a formal language; but he also states that many modern comment styles go a long way towards literate programming.

Having worked on large codebases, I concur that constructing them with literate programming seems far-fetched and too academic. Yet I am enticed by the idea of optimizing code through literate programming, since I have seen far too many abstractions introduced purely to keep the code easy to read.

[1] http://www.codersatwork.com/

[2] http://mmix.cs.hm.edu/


One of the primary differences is that the document can be structured, i.e. ordered, to benefit readability rather than the needs of the software. When source code is needed for the tools, it can be generated from the literate version in one step. Even multiple files in a hierarchical directory structure can be generated from a single literate software document.
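
For example, a single literate document might contain root chunks named after the files they should become (a hypothetical noweb-style sketch; the file names and build rule are only for illustration):

    <<src/hello.c>>=
    #include <stdio.h>
    int main(void) {
        puts("hello, literate world");
        return 0;
    }
    @

    <<Makefile>>=
    hello: src/hello.c
    	cc -o hello src/hello.c
    @

The tangler then writes each root chunk out to its own path (with noweb this is one extraction per root chunk).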


I like to point out that when Literate Programming was first proposed, software was a great deal less malleable than it is now. LP vs. the rigid structures of the time is one thing; LP vs. the malleable structures of today is quite another. Cast back in time I might well have preferred LP, but today it's an awful lot of complexity and a huge layer to add on top of our systems for significantly less relative utility. Very little stops you from writing more narratively coherent code today, in existing systems; the problem is more human than tooling now.


To clarify, one could say literate code is ". . . ordered, to benefit readability and not for the needs of the _compiler_ or _interpreter_". Tangling is the step that extracts compilable or runnable source code from the "literate" source (which will not generally be directly compilable), and weaving produces the typeset documentation. Here's one of many links to more info: http://www.ross.net/funnelweb/tutorial/intro_what.html
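
In concrete terms, with noweb rather than FunnelWeb as the example (the exact commands and flags below are from memory, so treat them as approximate):

    # "tangle": extract the compilable source from the literate document
    notangle -Rcount.c count.nw > count.c

    # "weave": typeset the human-readable document from the same source
    noweave -latex count.nw > count.tex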



