
Ask HN: What are best practices for literate programming? - almist
I am going to translate a complex algorithm I wrote, and want to document it using literate programming. What are best practices to do that? What works well in your experience?

Also, how to find the best arrangement for code and logic? How to document all the small details without the prose becoming boring or irrelevant?
======
CyberFonic
There is no single way to do literate programming. Certainly Knuth's approach
is getting a bit long in the tooth.

Personally, I simply write a document in Markdown and then extract the
source code from the generated <code> blocks in the HTML output with a simple
script. The script also adds the boilerplate needed to make the source code
comply with my in-house standards.
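
A minimal sketch of that kind of extraction step, written in C to match the examples further down the thread. It is not the script described above: the function name `extract_code` and the `header` parameter are hypothetical, and it assumes bare `<code>` tags with no attributes and no HTML entities to decode.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: writes `header` (the boilerplate) into `out`,
 * followed by the text of every <code>...</code> span found in `html`,
 * one span per line.
 * Assumptions: tags are exactly "<code>" and "</code>" (no attributes),
 * and HTML entities such as &lt; are left undecoded.
 * Returns the number of spans copied, or -1 if `out` is too small. */
int extract_code(const char *html, const char *header,
                 char *out, size_t out_size)
{
    static const char open_tag[] = "<code>";
    static const char close_tag[] = "</code>";

    assert(html != NULL && header != NULL && out != NULL);

    /* Start the output with the boilerplate header. */
    size_t used = strlen(header);
    if (used >= out_size)
        return -1;
    memcpy(out, header, used);

    int spans = 0;
    const char *cursor = html;
    for (;;) {
        /* Find the next opening tag and skip past it. */
        const char *start = strstr(cursor, open_tag);
        if (start == NULL)
            break;
        start += sizeof open_tag - 1;

        /* Find the matching closing tag; stop if the span is unterminated. */
        const char *end = strstr(start, close_tag);
        if (end == NULL)
            break;

        /* Copy the span plus a newline, keeping room for the final NUL. */
        size_t len = (size_t)(end - start);
        if (used + len + 1 >= out_size)
            return -1;
        memcpy(out + used, start, len);
        used += len;
        out[used++] = '\n';
        spans++;

        cursor = end + sizeof close_tag - 1;
    }

    out[used] = '\0';
    return spans;
}
```

Calling it on the rendered HTML with a license or style header as `header` gives one output buffer that can be written straight to a source file. A real Markdown renderer usually emits `<code class="...">` and escapes `<` and `&`, so a production script would need to handle both.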

I also find that I write the document (specification) first and then extract
the code from that. I have never written algorithms first and then translated
them into a literate programming structure. If the program comes first, then I
simply add comments that will allow me to understand the code months or years
down the track. If there are a bunch of files then I will often have a
README.md file in the directory which provides some very high-level details.

~~~
Jtsummers
What do you mean "Knuth's approach is getting a bit long in the tooth"? Do you
mean his toolset?

------
thedevindevops
Google '<<your chosen language>> literate programming' and use the templates
you find, but remember that comments evolve with the code, so treat them as a
'first draft'.

------
Jtsummers
I like org-mode for literate programming. My better examples are from work and
cannot be shared. Some _very_ rough examples that are online are from my
Advent of Code solutions last year. I used the approach, with org-mode and
Common Lisp, of having an interactive notebook (not unlike Jupyter notebook
and other systems) to develop my solutions. That's why they're rough
(essentially all are first drafts) [0].

At work, I've used it for C++ code to good effect. As I've commented in the
past, my use at work is for my own understanding or for producing a report for
others, not for development (there's no buy-in, so it'd be a bad idea to push
it). I start with one heading per file, each holding a single code block with
that file's contents (NB: By the end it could look quite different).

    
    
      * main.c
      // code block of main.c
      * common_structs.h
      // code block of common_structs.h
      * some_file.c
      // code block of some_file.c
    

Then I'd start tearing them apart. Pull out the usual major blocks of code
(in C): includes, structs and typedefs, function declarations, function
definitions. Then pull out each function (if you have related functions, pull
them out together), and do the same with structs and typedefs. Then start
documenting what needs to be documented. If I see:

    
    
      struct point { int x, y; };
    

Maybe I don't go any further. It's pretty clear what's happening. But if it's
more complex, add some documentation. Same thing with functions:

    
    
      int max(int a, int b) { return a > b ? a : b; }
    

Why write anything about it? I'd probably create a whole section called
_Utility Functions_ just to contain stuff like that, and only document the
non-obvious ones.

For the complex functions, I break them down into several code blocks based on
their internal logic. Again, there's your signature, then your setup (variable
declarations and initializations), then some series of conditionals, loops,
and other logic. If the flow is unclear, create a block to emphasize the
particular details.
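
As a rough illustration of that breakdown, here is a small C function with comments marking where each chunk would become its own code block in the literate document. The `tally` struct and the `classify` function are made up for this sketch and are not taken from any real project.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type: how many characters fell into each class. */
struct tally { int vowels; int digits; int other; };

/* Chunk 1: the signature. In the document, this is where the prose says
 * what the function is for. */
struct tally classify(const char *text)
{
    /* Chunk 2: setup (variable declarations and initializations). */
    struct tally t = {0, 0, 0};
    assert(text != NULL);

    /* Chunk 3: the main loop. Each group of cases below could get its own
     * sub-heading if it needed a paragraph of explanation. */
    for (const char *p = text; *p != '\0'; p++) {
        int c = tolower((unsigned char)*p);
        switch (c) {
        /* Chunk 3a: vowels, described on their own. */
        case 'a': case 'e': case 'i': case 'o': case 'u':
            t.vowels++;
            break;
        /* Chunk 3b: digits. The cases are all alike, so they are
         * addressed together rather than one by one. */
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            t.digits++;
            break;
        /* Chunk 3c: everything else. */
        default:
            t.other++;
            break;
        }
    }
    return t;
}
```

In an org-mode file each commented chunk would sit in its own source block under its own heading, with the explanation written as prose instead of as comments.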

    
    
      This section of code uses a switch/case statement with 20 cases. I'll address each one in turn:
      ** Case 'a'
      code for the case and describe it
      ** Case 'b'
      code for the case and describe it
    
      ** Cases c-t
      All of these are relatively similar, addressing all of them at once here...
    

Repeat this until you hit a level of detail that you or your intended audience
(get proofreaders) are happy with. Reorganize to emphasize critical aspects.
Sometimes the control flow is less important than the primary "kernel" of the
program. I applied this to a CFD program. All the noise of loading data and
printing out results was useful, but it was not the critical part, so I moved
it to later sections and had the main section discuss only the math of the CFD
algorithm and the code that implemented it.

[0] https://github.com/rabuf/advent-of-code

