Four kinds of documentation (divio.com)
717 points by arnsholt on Oct 18, 2019 | 199 comments



I am not sure if it's missing or it's part of one of these four, but another very important part for me is the introduction/README. Probably the most important one. Introductions include:

- Project health indicators, all green. [tests | passing] and such.

- Quick general description of the problem the project solves.

- A simple code snippet showing how easy it is to use it. Not the most complex way of using it as many do, please.

- Screenshots and gifs if it's UI-related. Very important if it's UI-related.

- Quick installation guide if using a common way, or a link to an in-depth guide if it's not easy to install.

- Links to other parts, in-depth articles, etc.

Some examples where I think I got it right (feedback welcome!): https://github.com/franciscop/server https://github.com/franciscop/ola

That said, good documentation takes a lot of effort and time.


A great example of the simple code snippet is Flask https://palletsprojects.com/p/flask/


franciscop/server

> Powerful server for Node.js that just works so you can focus on your awesome project

This is not a "quick general description of the problem the project solves" and I don't even know what the code does after reading this. You can do better than this.


Agreed, I haven't liked that line for a while. What about something like this?

> A server for Node.js that works out of the box with modern Javascript

It is a Node.js server with a bunch of middleware so that you don't need to do common things like body-parser, cookies, etc. It's also based around async/await instead of callback-based, which makes it easier to work with more modern JS and that prevents me from calling it something like "express wrapper" or similar.


My first problem is that "a server" only tells me it probably listens on a socket.

Is it a web server, a web framework, a PBX, telnet or general TCP, UDP or Unix socket support functions? Which protocols does it run? Is it a library? Is it a daemon?

It says so little, it could be literally anything.

I guess it has something to do with HTTP and I guess it's a library/framework because I guess a web developer wrote this, but only because it mentions JavaScript and doesn't specify further. But these are still just guesses, I wouldn't know from the text alone.

> works out of the box with modern Javascript

Does it work with other languages too, just not out of the box? If not, the "out of the box" doesn't add anything here. Doesn't similar code work out of the box anyway?


> Doesn't similar code work out of the box anyway

No, very notably both Express and its modern counterpart Koa don't work out of the box and devs using them have to learn, install and configure quite a few packages (middleware). This includes common functions like parsing the body of an HTTP request, parsing cookies, etc. This is the reason I created `server` in the first place, to do `npm install server` and not worry about these things on a per-project basis :)

Thanks for all the feedback, I'll replace "server" with "webserver" in my previous sentence. That alone is a great improvement over the current text IMHO.


That sounds like a marketing slogan, and honestly the JS ecosystem has so many of these that they've become meaningless.

What your thing looks to be is a wrapper over Node's `http` module, with a simplified API. "Powerful" is really not an appropriate adjective, since obviously your lib is limited to what the underlying Node's http module can do (and I'm going to go out on a limb and guess that your library doesn't natively handle things like streaming or chunking or UPGRADE for web sockets or HTTP/2 push or N number of other things, considering you claim the library is also "simple").


Instead of trying to over-simplify things for a nice one-liner:

> A server for Node.js that works out of the box with modern Javascript

Just give me the more verbose, but more conversational, explanation.

> It is a Node.js server with a bunch of middleware so that you don't need to do common things like body-parser, cookies, etc. It's also based around async/await instead of callback-based, which makes it easier to work with more modern JS and that prevents me from calling it something like "express wrapper" or similar.

Sure it's more words, but I didn't have to think as much :)


Not GP but my question when I read this was “why not Express tho” and “modern JS” (as in ES6 imports etc) did click.


Node.js web server that bundles and configures a lot of middleware so you don't have to. Or something along those lines. Instead of hinting at the gist of how it works, say it straight away.


It seems to me that a good README is a summary of all four of his documentation types: it should have a brief explanation, a short how-to on getting started, a tutorial or usage example of the system, and a reference in the form of pointers to further documentation. And something he doesn't mention, graphics and screenshots!


> a good README is a summary of all four of his documentation types

Came here to say/reiterate this. A README is very flexible and can be any of the four types, the decision of which will depend on the contents of the repository and on what the README is trying to accomplish.


Half these things should be on the website for the project, not embedded in the source code attached to a readme file (where I might expect a link to the website, not a replacement for the website).


One thing I always want to see in a README: what command do I type to build and run this thing :P


That is not really documentation. That's Github Geek Marketing. And sadly nowadays most things published on Github stop there.


I don't really care what you call it. Sure, it's a lot like the sticker on the car window, but it's like business hours or a menu on a website--these are specific things to get me engaged and trivial things to let me kick the tires on a project.

If it's a UI toolkit, I'm not going to download it, grab dependencies, compile it, and mock up a basic app just to find out it's not something I want.


I don't deny it is useful. I just do not conflate it with proper Documentation.


- One sentence describing what a project does (preferably the first sentence of a README)


That should be the introduction to (or another element of) the Explanation, or possibly of the Reference.

These suggestions generally correspond to document sections within the Linux Documentation Project HOWTOs, as an example.


> A simple code snippet showing...

And the variable naming to make clear what is user vs. system defined.


My pet peeve is auto-generated documentation from configuration files or source code. It is absolutely useless and I would rather prefer no documentation than auto-generated.

Some time ago Swagger (nowadays OpenAPI) got really popular and many projects "had an API" and pointed users to their green autogenerated API documentation clusterfuck. As time went on, this green page became an indicator for me that the project does not work and that I should be very sceptical. I am sure there are projects that do it better, but for me no-content auto-generated documentation is a real code smell.


I do not share your experience, because in my experience the auto-generated docs will be kept in sync with the code/API while a separate specification will become outdated over time.

This does of course require human-readable description in all the endpoints. But that's the same as only an autogenerated function signature in code documentation vs an added human-readable description.


I've said before that the killer feature that launched Java wasn't garbage collection or checked exceptions, but javadoc. Autogenerating HTML documentation based on strictly-typed interfaces was absolute genius and I really haven't seen it topped by any modern language.


The problem with this family of tooling is that they are essentially fill-the-blanks forms and that all the surrounding ceremony is generated indiscriminately of whether the blanks were actually filled or not. Or worse: (pre-)filled with redundant placeholders like "@return returns the $Typename".

The frustration pointed out by OP is that when you see a page of "blank" generated documentation you never know if there is valuable information waiting for you maybe just one or two clicks away or if it's placeholders all the way down. Consuming a sparsely filled doc almost feels like being trapped in an illustration of the halting problem.

A javadoc/-like implementation that somehow put the actually authored subset into the spotlight while not completely skipping the inferred bits could be very valuable.

(also: a javadoc/-like compiler that detects as much delegation as possible and aggressively pulls in stronger documentation when it is available further up or down the call nesting)
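
A rough sketch of the "put the authored subset into the spotlight" idea, written in Python with docstrings standing in for Javadoc comments. The helper `split_by_docs` and the sample functions `connect` and `close` are hypothetical names made up for illustration; a real generator would also have to spot placeholder text, which this does not attempt.

```python
import inspect
import types


def split_by_docs(namespace):
    """Partition public functions into (authored, undocumented) name lists."""
    authored, undocumented = [], []
    for name, obj in sorted(vars(namespace).items()):
        if name.startswith("_") or not inspect.isfunction(obj):
            continue
        # A missing or whitespace-only docstring counts as a "blank".
        if (obj.__doc__ or "").strip():
            authored.append(name)
        else:
            undocumented.append(name)
    return authored, undocumented


# Hypothetical API used only for illustration.
def connect(url):
    """Open a connection to `url` and return a handle; raises OSError on failure."""


def close(handle):
    pass


demo = types.SimpleNamespace(connect=connect, close=close)
authored, undocumented = split_by_docs(demo)
print("Documented:", authored)
print("Signature only:", undocumented)
```

A generated page could then list the first group prominently and fold the second group into a compact signature index.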


>The problem with this family of tooling is that they are essentially fill-the-blanks forms and that all the surrounding ceremony is generated indiscriminately of whether the blanks were actually filled or not.

Still better than 90% of projects/libs at the time, who didn't have any reference documentation at all.

Just seeing the signatures and packages in an organized manner with cross links (e.g. to parent class, implementing classes) etc, was a vast improvement...


Exactly. A "/** @return String */" is absolute garbage Javadoc, but that does not mean that the concept is bad or cannot be put to good use.


Sure. Even the most sparsely populated javadoc suddenly turns from a nightmare into a valuable improvement when you stop trying to pull information with a browser and just enjoy what the IDE presents when implementing an API client, if it is presenting something. More authored content is still better than less, but the amount of empties remaining stops being a hindrance when consumed through an opportunistic push mechanism.


This is also a great feature of Haskell, especially Hoogle, which is what I miss the most when working in Java. If I want to find a function which, say, removes items from a Map based on a function over values, in Java I have to look and see if it's in the Map class. Nope. Is it in Guava Maps? Ah, there it is, "filterValues".

In Hoogle, I can type `Map k v -> (v -> Bool) -> Map k v` into the search bar, and it finds the function, even though I got the order of the arguments wrong.

https://hoogle.haskell.org/?hoogle=Map%20k%20v%20-%3E%20(v%2...
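
To make the search-by-signature idea concrete, here is a toy Python sketch that matches functions on their type annotations while ignoring argument order. It is nowhere near what Hoogle does (no type variables, no unification, no index); `search_by_signature`, `filter_values` and `count_values` are hypothetical names invented for this example.

```python
from typing import Callable, Dict, get_type_hints


def filter_values(mapping: Dict[str, int], keep: Callable[[int], bool]) -> Dict[str, int]:
    """Return the entries of `mapping` whose value satisfies `keep`."""
    return {k: v for k, v in mapping.items() if keep(v)}


def count_values(mapping: Dict[str, int]) -> int:
    """Return the number of entries in `mapping`."""
    return len(mapping)


def search_by_signature(functions, arg_types, return_type):
    """Find functions whose annotations match, ignoring argument order."""
    wanted = sorted(map(repr, arg_types))
    matches = []
    for fn in functions:
        hints = get_type_hints(fn)
        result = hints.pop("return", None)
        # Compare argument types as sorted lists so that order does not matter.
        if result == return_type and sorted(map(repr, hints.values())) == wanted:
            matches.append(fn.__name__)
    return matches


# The query lists the predicate first, i.e. in the "wrong" order.
found = search_by_signature(
    [filter_values, count_values],
    [Callable[[int], bool], Dict[str, int]],
    Dict[str, int],
)
print(found)
```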


I'm impressed. Being able to search for functionality by the function signature seems incredibly useful. In the "Verb-Noun vs Noun-Verb" thread from a couple days ago [0], people were saying that OO languages make autocomplete much easier because you start with the parameter you're operating on. But, autocomplete (at least in IntelliJ) only lets you search by method name. There have been lots of times where I want to search by return type or param type instead.

When you're looking for functions, do you generally use Hoogle, or do you have a local autocomplete-like feature hooked in to your editor? I really want to be able to use this while writing code.

[0] https://news.ycombinator.com/item?id=21271212


If you try, you'll discover it is much less useful in Java. That's probably the reason it's not available there.

Pure languages get a lot of hate, but this is the kind of thing you get when you enable better static analysis of your code.


Interesting, because when I switched to Java I absolutely hated (and still do!) the autogenerated documentation. Included was every variation of the constructor. Missing was HOW I would actually call it, and where I would get the params to it.

It's been the better part of a decade, so I can't actually quote correct APIs, but I recall trying to connect to an LDAP server - I just needed to call one of four constructor methods, which seemed to imply I needed an LDAPContext object. Looking at that object told me the 10 bajillion values it had, but no idea how to set them.

Once I saw an _Example_, which IIRC was basically calling a method to clone the default context object and setting the one or two params (such as server url), I could then pass that to one of the constructors I saw the docs for.

The generated documentation was 100% _correct_, but not _useful_.

Other languages I had been in had very example-focused documentation, and were far more accessible and usable as a result, even with occasional challenges where the docs might slip behind - that almost always tended to be corner cases, while the Java approach made the most common need into a corner case.


At the end of the day, it's up to the developer to provide useful documentation.

Especially in the enterprise world, I often come across completely useless code docs, created purely to satisfy SonarCloud or an otherwise stupidly dogmatic gated checkin of the "all public methods must have code docs!" variety. I'm sure many of us have come across it - code docs for a constructor that say "Constructs a Widget", or for an `AddWidget` method that says "Adds a Widget"; utterly pointless. I think of this as "dogma driven documentation".
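
For contrast, a small Python sketch of the two styles side by side. `WidgetRegistry` is a hypothetical class made up for this example; the point is only that the second docstring states things the signature cannot (return value, uniqueness rule, failure mode, ownership).

```python
class WidgetRegistry:
    """Hypothetical registry used only to contrast two docstring styles."""

    def __init__(self):
        self._widgets = {}

    def add_widget_terse(self, name, widget):
        """Adds a Widget."""  # Restates the method name and nothing else.
        return self.add_widget(name, widget)

    def add_widget(self, name, widget):
        """Register `widget` under `name` and return the new widget count.

        Names are unique: registering a name twice raises ValueError and
        leaves the registry unchanged. The registry keeps a reference to
        `widget`; it does not copy it.
        """
        if name in self._widgets:
            raise ValueError(f"widget {name!r} is already registered")
        self._widgets[name] = widget
        return len(self._widgets)
```

Both docstrings would satisfy an "all public methods must have docs" gate; only the second one saves the reader a trip into the source.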


I'll definitely assert that Rust has nailed the automatic generation of documentation from comments in source code. It's a somewhat unusual language so getting used to how the docs are laid out took a bit of time, but once I was using the language, the docs were the best I've ever used. Some examples:

https://docs.serde.rs/serde_json/

https://doc.rust-lang.org/std/vec/struct.Vec.html


Typedoc [1] is an excellent documentation generator for Typescript.

And every other modern language, be it statically typed or dynamic (with the help of annotations), has had some sort of auto-documentation generator either bundled or at package manager's reach, and probably any can top javadoc in many ways.

[1] https://typedoc.org/


The .NET standard library docs are a thing of beauty because they intermingle autogenerated javadoc-style documentation with generally well-written freeform "remarks" sections that include more general explanations, context, and code samples.


I've struggled through doxygen generated docs for C++ before, and I think one of the things that helps make javadoc more readable is due to the "stricter" (in a sense) structuring of the java language vs C++: namespaces & objects everywhere make grouping more likely to occur in consistent ways in the code base, which then gets translated to more readable auto-generated documentation.

This is something I think rustdoc (for rust) also has succeeded at, partially for similar reasons.


> I do not share your experience, because in my experience the auto-generated docs will be kept in sync with the code/API while a separate specification will become outdated over time.

Does it? You can just alter the code and forget to alter the documentation above the functions/methods, so I don't think there is much of a difference. And wrong documentation is worse than no documentation.

You have to write your documentation, keep it up to date and take it seriously. I would argue that if you have auto-documentation you are more likely to miss that, because you already have some kind of documentation, regardless of whether it's actually useful.


With Rust, we help mitigate this by running code examples in API documentation as tests. That doesn't stop people from opting out, and it doesn't solve every problem, but it's still quite useful!


I'm sure you know because you're Steve Klabnik, but for other readers: you can enforce documentation for public items in a crate by adding `#![deny(missing_docs)]` to the `lib.rs` file.

Then it just takes some self-control and good code reviews to make sure you're writing good documentation rather than just short stubs to silence the error.


You’d think, but I have certainly committed (in both the colloquial and git sense) a dummy /// TODO comment to silence this warning... oops


I'm going to hijack this thread to recommend Rust Skeptic - https://github.com/budziq/rust-skeptic

It checks for rust snippets in markdown files and attempts to build them when you run `cargo test`. It helps keep docs and code in sync.
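
The same idea can be sketched in a few lines of Python. This is not how rust-skeptic is implemented; it is a minimal illustration that pulls fenced `python` blocks out of a Markdown string and executes each one. `check_markdown` and the sample `readme` are hypothetical, and a real tool would need sandboxing and richer fence parsing.

```python
import re

FENCE = "`" * 3  # Built at runtime so this example can itself sit inside a fence.
BLOCK_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def check_markdown(text):
    """Run every fenced Python block in `text`; return a list of error strings."""
    errors = []
    for index, match in enumerate(BLOCK_RE.finditer(text), start=1):
        try:
            # Each snippet gets a fresh namespace, so snippets cannot depend on each other.
            exec(compile(match.group(1), f"<snippet {index}>", "exec"), {})
        except Exception as exc:
            errors.append(f"snippet {index}: {type(exc).__name__}: {exc}")
    return errors


# Hypothetical README with one working and one stale example.
readme = "\n".join([
    "# Demo",
    FENCE + "python",
    "assert sum([1, 2, 3]) == 6",
    FENCE,
    "Some prose.",
    FENCE + "python",
    "import math",
    "math.no_such_function(2)",
    FENCE,
])
errors = check_markdown(readme)
print(errors)
```

Wired into a test suite, a non-empty error list would fail the build whenever a documented example stops working.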


"Does it? You can just alter the code and forget to alter the documentation above the functions/methods"

Yes you can forget, but the barrier is much, much, much lower than having to modify a document god knows where.


For abstract descriptions of what the code does, sure. But for small things like documenting the purpose of arguments, an IDE should warn about changes that are not reflected in the doc comment (type or a new arg altogether). IntelliJ IDEs are pretty good at that.

Rust for example also warns about code included in the doc comments example section which is invalid.


I've used swagger with java and golang and both of them generate docs directly from the code, no comments needed.


But what's the point then? If there's a tool, that can make "documentation" out of source code, I can just look at the source code?


Mostly it collates all your endpoints more easily, generates clients for different languages, generates online documentation and a test page that make your service self-describing, and can be published out to third parties that don't have access to your code.


The tool parses out the bits that you're interested in. If you want to know what year the proverbial apple fell on Isaac Newton's head, you can get a biography and start reading, or you can Google and let the machine filter out the irrelevant bits.

If I'm looking at documentation, I want the inputs and outputs and a brief description of what it does. I care very little about the implementation, or I would write my own.


Presumption that all code is open source. The war is won :-)


The article talks about four kinds of necessary documentation for large projects: explanation, reference, how-to guides, and tutorials. Auto-generated docs, even when they're very good, only provide one of those -- the reference.

Even if you have great inline documentation that can be turned into a great external reference document, you need separate documentation.


I've been trying to get my head around a particular Swagger project. Here is a funny email exchange from my request for documentation:

... Hi, I'm reaching out to ask if I could get my hands on some documentation because the API is somewhat of a black box to me.

... The api documentation for [product] can be found here: https://api.[product].com/

... Sorry. That's not what I mean by "documentation". It's certainly non-linear. I don't know how to "read" this site to gain an understanding. It's kinda sparse:

   GET /v2/adjustments > Implementation Notes: Fetches a list of adjustments.

   GET /v2/reportCategories > Implementation Notes: Fetches a list of report categories.

... Hm, I think swagger documentation is pretty standard among APIs I've worked with before. I'm pretty sure it's all they have.

... "Swagger Documentation" is a special class of documentation for sure; Nobody likes writing documentation.

Talk about insider (them)/outsider (me).


Isn't that the point of an API though? To be a black box? The example you provide is fairly simple: you hit these endpoints and get this kind of data.

If you need to add data to one endpoint and see how that travels to other endpoints, that makes sense as to why you want documentation. In that case, a product / API tutorial or recipe (like the author suggests) might be useful.


I work with a very large, complicated piece of software which has quite a comprehensive API but it's basically CRUD on top of a database. There is zero documentation about what happens when you update an object - only OpenAPI. To find that out, you would have to dig in to the database triggers. Half of working with it is trial-and-error and the other half is hope-and-pray.


Same here. Why is documentation standard so low? Tell me how that buffer management works (do I provide it? delete it? when? how?); how threading is supported (reentrant? send/receive at the same time/different threads? interprocess?); dependencies (necessary initialization? teardown? states in between?); efficiency (can I hold a lock around the call? does it block?).

Instead, we often get nothing but a method name and argument types. Ridiculous.


>> Same here. Why is documentation standard so low?

Because those who control resources make a conscious decision to prioritize new features and/or bug fixes rather than documenting what exists already.


I really like writing the first draft of documentation. I get to run through the endpoints and even fine tune some of the details to fit to the big idea I'm creating about what the software does and how to use it best.

The problem comes months later when I'm in the weeds and a colleague asks a question. I try to fob them off to the documentation, but some details are out of date. I pray it's just one detail, and that I don't have to stop what I'm doing to rewrite the documentation now.


That's part of it. Plenty of working developers will also omit tests and documentation even without the feature-factory time pressure though. A lot of times, the need for technical documentation isn't even a blip on the TODO radar. If I had a dime for every README where the last change was "Initial commit" and the file only contained

# <PROJECT NAME> Service

This is the new service for <TASK>.

....I could buy a good README.


Because docs take time to write, and good docs require a passionate dev who cares to write them


Good docs take a dedicated tech writer who cares to write them. A passionate dev may or may not be a decent tech writer.


Also because it's hard work that reduces the employer's dependency on that developer.


...plus they tend to get outdated and out of sync with the code pretty fast!


Not if the first step of updating the code is updating the documentation to reflect the intended state after the code update, preferably with embedded doctests that form part of the definition of done for the code changes.

Sure, if documentation is treated as an afterthought it tends to reflect that attitude.
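
As an illustration of embedded doctests, here is a small Python example using the standard `doctest` module. The function `slugify` is a hypothetical stand-in; the point is that the examples in the docstring are both documentation and a test that fails when code and docs drift apart.

```python
import doctest


def slugify(title):
    """Turn a title into a URL slug.

    >>> slugify("Four kinds of documentation")
    'four-kinds-of-documentation'
    >>> slugify("  Hello,   World! ")
    'hello-world'
    """
    # Replace every non-alphanumeric character with a space, then join the words.
    words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
    return "-".join(words)


if __name__ == "__main__":
    # Reports a failure if the docstring examples no longer match the code.
    print(doctest.testmod())
```

Under the "docs first" workflow described above, the docstring examples would be updated before the implementation, and the change is done when `doctest.testmod()` reports no failures.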


In my experience, once you've got a bit of documentation, finding everything relevant to an intended change can be non-trivial and error-prone even if it's done first.

For documentation to remain relevant there needs to be some kind of process actually checking every part of it against reality.


Plus you cannot really automate checks for documentation up-to-dateness.

I mean, with actual code you get some help from the tools. No silver bullet, but at least you have type checks, compiler errors, something. But with docs there's no way to automate checks to see if the doc is still relevant and accurate. And the terrible tools that do exist tend to lead to "boilerplate docs", like those javadocs mentioned in a comment elsewhere.


I mean, where it's applicable doctest is amazing. But that's not everywhere.

I've been wanting some kind of system to add references to tests to documentation, in the spirit of citation. Maybe I will build it at some point.


> documentation is treated as an afterthought

Bingo.

Do note that people here on HN live in a bubble where, for example, writing tests (any tests, not even good tests) is a given. But out there in the world there's plenty of software coding, a lot of it in major companies, where developers think testing is some cute but useless thing they teach you in college and which can be safely skipped, and managers are completely oblivious about this. Same with writing useful documentation.


> Why is documentation standard so low?

Even in companies where good documentation would raise revenue in a way that the sales team notices[1], someone still needs to write it and someone still needs to make the business case for writing it.

Engineers could, but many don't. If you're passionate about good documentation but don't think you can deliver it, it would be foolish to make that case unless someone else is doing the writing.

I'm a bit of an extreme case, but many engineers feel so incredibly uncomfortable with writing prose that they avoid it. Why? The standard of writing education for STEM-minded people is low. Why? Writing education in high school is focused on literary analysis essays rather than on learning to describe facts and systems with vibrant clarity.

[1] https://getputpost.co/overhauling-api-docs-with-gocardless-9...

---

Anecdote: At age 14, my school had a poster which listed the professions one could use mathematics in. Someone pitched us on how much need there was for people who could program. Shop classes and science classes had assignments which were miniature versions of problems we could see in the real world. Nobody did this for literary analysis. I didn't know how to ask "why are we doing this?" other than as a snotty teenager saying "Hey English teacher! Justify why your life's work has meaning." In reality, I wanted to say "I'm having trouble getting oriented around this subject. I'm having trouble understanding what it means to make progress or make something good. Can you help me?"

I searched for writing advice and devoured works like Politics and the English Language and Strunk and White. But they just helped me get better at editing, not at putting thoughts onto a blank page.

Anecdote: At age 17, I told my English Literature teacher that I wanted to write really good physics tutorials. She looked confused at me and said "Why? That's so boring." At age 17, I didn't have the self-confidence to persist to find a different teacher who would be interested in that.

Anecdote: At age 20, in an engineering university, I knew that I struggled with getting the first draft of an essay done. I went to the writing center at my school. But I never built a good workflow with them for how to get the first-draft-writing process. I didn't know how to learn to write without an anxiety so strong that I felt compelled to dig my nails into my skin. I didn't know how to ask professors or TAs for help. I accepted that writing was just staring at the paper until my eyes bled. I wasn't going to learn to write. I endured my required writing classes. I hoped that once I graduated, I might be able to work in a way to

Anecdote: At age 29, I had to quit a visa-sponsoring software engineering job and very quickly find a new one, because my failures with writing first drafts interacted with a business process for immigration-law compliance.

---

I've now found two coaches and plan to spend this Saturday working on a first draft of a blog post and trying some of their strategies. Wish me luck.


One book that (unexpectedly) gave me good tips on 'how to write' was Zen and the Art of Motorcycle Maintenance. At some point the main character, who's a teacher of rhetoric, finds out that none of his students know how to write. They all learned through analysis how great authors have written this or that way, but they don't know where to start. They don't know the great authors probably didn't consciously decide to say 'here I'll put a metaphor, now a simile, and here you go, ellipsis'... They probably started with something simple, badly expressed, and refined, polished, restarted, embellished, simplified... The biggest hurdle for his students seems to be 'the first draft', the initial idea, the first words. Not even a 'blank page' problem. He gives them exercises in pure description, simpler and simpler, and most of his students can't even write the first words. The funny 'solution' is just to start and write... Something... (and I'm not doing the book justice here, sorry... I really like it, not as a philosophy reference (I've read many negative critiques of it), but for how it mixes the pain of being a father or a kid, the pain of mental illness, the pain of feeling cleverer than your peers, and the pain of being a teacher, a writer and a friend. All wrapped in a beautiful and sad storyline, lots of beautiful American scenery, and some advice on motorcycle maintenance...).


>Anecdote: At age 17, I told my English Literature teacher that I wanted to write really good physics tutorials. She looked confused at me and said "Why? That's so boring."

I laughed and then I got sad


Good luck and, for what it's worth, that was a very well-written comment.


Writing comments doesn't feel like "writing" to me; it feels like talking.

I've actually written some pretty long comments on reddit. Yesterday, I talked with one of my coaches and put some thought into why:

1) I don't have any memories of feeling anxiety from commenting on reddit. This is unsurprising since it has never been assigned to me by a teacher/parent. If I ever feel like "It's unclear why I would respond to this or what I would say to this", I just choose not to comment.

2) I have memories of writing a comment and other people upvoting it or telling me that it was helpful. I don't have this for essays. I am driven by making people happy, so that is a meaningful reward.

3) Because of those positive memories, as I am writing, I can imagine that a sentence I am about to write is going to be helpful. That imagining is a bit of positive reinforcement that I can chase, inherent to the task. It is like when I was a kid and I would do math homework and I would solve a problem and see that I'd solved it. It is one of the tricks of TDD.

So, my plan this Saturday is to seek out the things that could possibly be intrinsically rewarding about writing:

A) Look for interesting phrases that I can craft to clearly explain something.

B) When I start on a section, write a question that someone could ask on a reddit thread, which this section answers.

C) When I write a section, imagine myself saying this as an explanation in response to that question and imagine someone else expressing gratitude for that explanation.

D) To avoid procrastination, mentally rehearse the act of starting and getting into the task. Simulate the trigger-response-reward in my mind so I can build the neural pathway. The reward I imagine should not be tied to completion, but come from the "I've just gotten started" state.


I think free writing such as journaling can help one get better at writing first drafts. It's important to practice the skill of getting something written.


Yes, since my commute changed, I've been regularly freewriting on the train and it has been helpful.


Another (possible) factor driving this is service- and support-based business models.

Good documentation, enabling user self-support, eats both cost and revenue.

(To what extent this is a conscious strategy and not simply decades-of-experience-born cynicism, I'm not entirely sure.)


Documentation pulled out of code is only as good as the doc comments that have been written. It does have the advantage of making it far easier to write and maintain. If you're editing a function, if the documentation is just above then you're more likely to update it.

This from the Rust standard library is a good example - https://doc.rust-lang.org/std/result/index.html . I think that's great documentation, and it's entirely generated from the source code.

Most Rust libraries won't have this level of explanatory detail; the core team has put a lot of effort into making Rust as easy as possible to learn, and documentation effort is part of that (the Rust book is another important part).

Something else that Rust does well is that 'examples' is a standard part of project layout. For libraries that haven't done their top-level documentation well, the examples folder will usually give a good demonstration of how to use the code. The examples usually exist because that's the easiest thing for the library author, and they usually compile because they're automatically built by `cargo test`.


I have some Swagger-based documentation for a few APIs I've written, and I find it very frustrating because you can see in the renderers for Swagger that nobody actually documents with it. Nominally, several of the fields are specified as Markdown, allowing for useful formatting in your documentation, but what fields are actually rendered as Markdown in a given renderer is semi-random. The top-level description usually is, but a lot of the other fields come out random. In the worst cases, not only is the "Markdown" content rendered as plain text, it also is simply slapped between <p></p>, so the newlines are eliminated.


> My pet peeve is auto-generated documentation from configuration files or source code. It is absolutely useless and I would rather prefer no documentation than auto-generated.

For most applications this approach is generally useless and should not be used. Comments should be in the code itself, and you expect people who want to work on the application to read at least some of the code. I.e. no reference / API documentation for applications, instead high level overview (where is what, application architecture etc.) and guides (how to set up your dev environment, how to contribute, how to prepare releases etc.).

For libraries it often makes sense to generate a reference documentation from the code itself. The drawback is that the strictly formulaic nature of comments parsed by the documentation generator has to be always kept in mind when writing the code itself. I.e. the comments need to make sense and be comprehensive when you remove them from the code surrounding them.

Some modules in the Python standard library are a good example of this. Quite a large amount of prose, separate from the code, and then a reference section generated from code. However, many modules have pretty bad documentation, where even the reference is missing crucial information (quite often very basic things like what a function returns).
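To illustrate a comment that still makes sense once it is lifted out of the surrounding code, here is a minimal Python sketch. The function `parse_port` is hypothetical and exists only for illustration; the point is that the docstring states what is returned and what is raised, and `inspect.getdoc` shows roughly the text a documentation generator would extract.

```python
import inspect


def parse_port(text: str) -> int:
    """Convert a decimal string into a TCP port number.

    Returns the port as an int in the range 1-65535.
    Raises ValueError if the text is not a number or is out of range.
    """
    port = int(text)  # int() itself raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# Roughly what a documentation generator sees: the docstring alone,
# with the implementation stripped away.
reference_entry = inspect.getdoc(parse_port)
print(reference_entry)
```

If the extracted text does not answer "what does this return?", the generated reference will not answer it either.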


> Comments should be in the code itself

Rust handles this nicely; its generated docs are based off source code comments. Same with Golang.


When I was super green I argued with the principal engineer for quite a while about Swagger.

Docs generated from code do not define the contract, they describe the code-defined contract, bugs, accidental mutations, and all. How is that not a fatal flaw?


It depends on how you're party to that contract - whether you're involved in making it, or you're a consumer in a take-it-or-leave-it situation.

If I'm a consumer of some third-party API, there's no practical difference between intended behavior, accidental mutation and a bug which the supplier won't fix any time soon - all of these things are equally part of the contract of how The Thing v1.2.3 works, and that's what I want described in the documentation. Any part of the documentation that says what The Thing should do (but doesn't actually do) is worse than useless, it's actively misleading; it describes some wishful thinking with no connection to reality.

If the contract documentation describes an interface between two parts of the system that I control, and I have the ability to fix discrepancies between contract and code by altering the code, then sure, that's a different situation; but if I don't have the ability to make these changes because it's an API to code made, maintained and controlled by someone else, then accurately describing current reality is the most important thing.


You are mostly correct. However the contract can say "X in the API will change without notice so don't use it", which is valuable to know. Often there are things that must be exposed in the API for "reasons" but the user shouldn't use themselves.

Likewise if v1.2.3 has a bug I want to know to not rely on that because I'll probably update to 1.2.4 which fixes it.


I'm for swagger but only if the router uses it.

I keep hearing swagger / contract first. But then they still manually specify `/api/v1/user/login`.

The Vertx web api contract router, which takes in a swagger file and routes based on the `swagger` operation id, is the closest I've seen. https://vertx.io/docs/vertx-web-api-contract/kotlin

I've also written a library to route into ktor in a type safe way.

But if you're doing swagger. To me it should be written, then consumed by the back-end service. Then generate front-end clients. Anything less will result in bugs.
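A minimal sketch of what "written, then consumed by the back-end" could look like, in plain Python with no framework. Everything here is an assumption made for illustration: the `SPEC` dictionary only mimics the `paths` section of a Swagger/OpenAPI file, and `login`, `HANDLERS` and `build_router` are hypothetical names. The idea is that handlers are registered by operation id only, so no URL is ever typed by hand in server code.

```python
# Hypothetical spec fragment; only the fields this sketch needs.
SPEC = {
    "paths": {
        "/api/v1/user/login": {"post": {"operationId": "login"}},
    }
}


def login(request):
    """Hypothetical handler; a real one would check credentials."""
    return {"status": 200, "user": request["user"]}


# Handlers are keyed by operation id; the paths live only in the spec.
HANDLERS = {"login": login}


def build_router(spec, handlers):
    """Map (METHOD, path) to a handler by looking up each operationId."""
    router = {}
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            operation_id = operation["operationId"]
            if operation_id not in handlers:
                raise KeyError(f"no handler for operationId {operation_id!r}")
            router[(method.upper(), path)] = handlers[operation_id]
    return router


router = build_router(SPEC, HANDLERS)
response = router[("POST", "/api/v1/user/login")]({"user": "alice"})
print(response)
```

Because the router is derived from the spec, an operation that exists in the spec without a handler fails at startup rather than drifting silently.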

I've seen companies pile on so many services to double-check code-generated swagger, hit a bug, and then have to maintain the swagger spec separately without enforcing it.


> To me it should be written, then consumed by the back-end service.

This is an option, but I find that it works better to have OAS documents generated from what the server is actually doing. Specifying routes based on what the server actually does is, IMO, a more rigorous way to create an OAS spec than to hand-write the spec and then generate a server from it.

I've written a couple of libraries that do exactly this:

https://github.com/eropple/nestjs-openapi3 - OpenAPI3 library for NestJS that standardizes input validation

https://github.com/modern-project/modern-ruby - a Ruby web framework built around OAS3 concepts + rigorous validation


That's been my mantra for 5 years now. Swaggerize-express or swole for the routing, but also swaggering-mongoose or objection's schema loaded from the yaml file for the database model. As a plus, validation is available out of the box.

Less code, standard approach, fewer bugs.


A separate specification works much better as long as that specification is also enforced during the build.

A separate openapi spec that is not enforced can quickly become outdated, in which case one auto-generated from code is better.
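One way to picture "enforced during the build" is a check that compares the routes declared in the spec with the routes the server actually registers. The following is only a sketch under stated assumptions: `SPEC` mimics the `paths` section of an OpenAPI document, and `IMPLEMENTED_ROUTES` stands in for whatever route table a real server could export. Both are hypothetical.

```python
# HTTP method names that may appear as keys under a path in the spec.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# Hypothetical spec fragment, shaped like the "paths" section of a spec file.
SPEC = {
    "paths": {
        "/api/v1/user/login": {"post": {"operationId": "login"}},
        "/api/v1/user/logout": {"post": {"operationId": "logout"}},
    }
}

# Hypothetical route table exported by the server code.
IMPLEMENTED_ROUTES = {
    ("post", "/api/v1/user/login"),
    ("post", "/api/v1/user/logout"),
}


def spec_routes(spec):
    """Collect (method, path) pairs declared in the spec."""
    return {
        (method, path)
        for path, operations in spec["paths"].items()
        for method in operations
        if method in HTTP_METHODS
    }


def check_spec_matches_code(spec, implemented):
    """Return a list of human-readable mismatches; empty means in sync."""
    declared = spec_routes(spec)
    problems = []
    for method, path in sorted(declared - implemented):
        problems.append(f"in spec but not implemented: {method.upper()} {path}")
    for method, path in sorted(implemented - declared):
        problems.append(f"implemented but not in spec: {method.upper()} {path}")
    return problems


mismatches = check_spec_matches_code(SPEC, IMPLEMENTED_ROUTES)
for line in mismatches:
    print(line)
```

A build step would exit with a non-zero status whenever `mismatches` is non-empty, which is what keeps the separate spec from going stale.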


You can add two new columns to your Kanban board called "Documentation" and "Documentation Review".

Then tasks cannot move to your "Done" column unless documentation is written and passes review. If you enforce column limits, unfinished documentation will also block other tasks.


In addition to this (and going a bit off topic). I've been adding checklists to Github PR templates (it's really easy[1]) for things like, "Did you re-read the relevant API docs? Do they need to be changed?" and it helps me a ton.

[1] https://help.github.com/en/articles/creating-a-pull-request-...


Checklists work so well. A kanban board has all the same qualities if used correctly :)


I have about as much confidence that is gonna work as I'd have in a (non-automated) "Test" and "Test review" column


Cool.


I would argue that's even better than documentation that defines how the code should behave.


I use javadocs all the time. Those are generated off comments in code. Is that the kind of thing you mean?


Yes.

Now if you use human language to document your functions (methods) that is not a problem, but too often I see something like:

    public class BookStore {
        ...

        /**
         * @param book The book.
         * @return The price.
         */
        public static float getPrice(Book book) {
            return book.price();
        }
    }
No shit sherlock! I admit that this is a contrived example, but you get my point.


> No shit sherlock! I admit that this is a contrived example, but you get my point.

Not a contrived example. There are an absurd number of libraries that do this and call it documentation. There will be a nice, tidy example of how to use the library with a toy example. That's fine. The intro probably doesn't need to go that deep. Then I move to the technical documentation or API, and that's what they have to offer.


I've written MANY javadocs like this, and I agree they are useless. They are derived from draconian build processes that fail builds when there's not a javadoc present. So for trivial methods like setters and getters this is the kind of comment/javadoc you get from me.

Of course, the other side of this coin is that without these draconian build processes I probably wouldn't write the useful kinds of javadocs I write for significant methods.


Yeah, I was coming to say it can be done right, and that's your point too. If you spend the time to put what a thing does, why it does it, and how it does it, with meaningful hyperlinks to related things, then the auto-collected docs can be really slick.

Maybe "auto-collected" is a better term for this than "auto-generated". I agree that auto-generated docs almost by definition don't add much. But if you go in and write narrative and have it get nicely collected into a slick hyperlinked webpage by things like doxygen and Sphinx, then that's great.


It seems to me that the issue with auto-collected code is that, if done well, it captures the behavior of the code itself. However, it doesn't capture the specification of how the system should work (as opposed to just how it does work) or the higher level design and strategy of the system.


I agree with that too. There needs to be a lot of pure narrative in addition to the auto-collected API docs. My big project has a User Guide (with intro, vision, tutorials and how-tos), a Developer Guide (with architecture description, requirements specs, implementation overview) and then auto-collected API docs with all the details of how it's currently implemented. In the "notes" admonitions throughout the API docs, there's some historical information and a description of why it is the way it is and how it should ideally be (as appropriate). This feels like it works pretty well. Then again, I wrote a lot of it so I'm biased.

There should be a roadmap somewhere as well, possibly in a Wiki or the developer docs.


It's not a contrived example. I see it all the time in Java codebases, and it drives me mad. I always flag it in code reviews and demand an explanation: "what purpose does this comment serve? What would be unclear if we removed it?".


There is also "@param p price". How about not giving the argument a single-letter name, so that you don't have to explain it in a comment?


That comment says the code author actively stopped to think about the function and didn't find anything worth noticing about it.

It is a completely different situation from the lack of such a comment, which implies that the author didn't stop to consider the function, and that you can find all kinds of strange things when calling it.


My favorite by far is

    y++; // Bump y
Instead of

    y++; // Do we need error checking for top of y axis?


My pet-peeve is technical documentation that is not versioned with the code. I think you're confusing garbage-in / garbage-out or interface rendering with auto-"generated" documentation.

Just like you can write bad code, you can write bad docs, not update them etc.

The point of code-generated documentation is not to render the interfaces but rather to keep code and docs in sync in the same place. You're more likely to see an out-of-sync or undocumented piece during a code review, in context, than when you have to assume it was updated somewhere else.


We have a similar four-part documentation strategy: Tutorial, Technical introduction pages, Auto-generated API, and Samples

Many people hate auto-generated API documentation because library authors do not write enough of it.

For example, here is my project's auto-generated documentation from source code, for two classes:

https://gojs.net/latest/api/symbols/Diagram.html

https://gojs.net/latest/api/symbols/GraphObject.html

That's 1238 words and 1408 words before you even get to the constructor.

There should be a lot of information that comes out of the auto-generated API: what it is, what to know, how different kinds of classes interact, and where to go next.

Then of course a primary tutorial: https://gojs.net/latest/learn/index.html

And then conceptual Intro pages: https://gojs.net/latest/intro/index.html (62 of them, covering everything from high level concepts to printing)

Then, since so many people learn by example, hundreds of samples, organized with pictures and tags for each, with an explanation and commented code: https://gojs.net/latest/samples/index.html


If you want your project to get real uptake and usage, it's important to have fantastic documentation. Autogenerated documentation does not provide this. It is good as supplemental rather than primary documentation.


I have experimented with some tools to generate 3rd party docs for quick reference (didn't get far though). Not for my projects, but to analyze 3rd party project APIs.

Why do you think it's useless? What part is useless? I am considering taking on this project at some point in the future.

Edit: also I am referring to a library's API while I think you refer to a RESTful API. Would your comment also apply to a library's API?


might be a swagger/openapi issue. golang for example does generated doc quite well imo. i wish other languages had the same.


I find auto-generated documentation useful for libraries, and for powering IntelliSense-style docs for libraries - it's pretty useless for anything other than libraries/APIs.

Unfortunately, in my experience a lot of devs turn on auto docs in their project's settings and call it a day, especially if it's not a library/API!


I feel similarly, though it’s better than nothing to me. It feels like describing how a forest looks and operates by listing every tree in the forest. I want the higher-level overview of component parts.


As a practicing technical writer I can testify that these content types are a common way to organize your documentation collection and identify gaps.

It’s a useful exercise to list each doc as a row in a spreadsheet, and then mark whether each doc is a tutorial, guide, conceptual overview, or reference, or a confused combination. Many times you’ll see that you have explained how feature A works but have no tutorial that shows how to use feature A, or vice versa.
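The spreadsheet exercise can also be sketched in a few lines of Python. The inventory below is invented for illustration (`DOCS`, the feature names and the `find_gaps` helper are all hypothetical); the point is simply that once every doc is tagged with a content type, the gaps fall out mechanically.

```python
from collections import defaultdict

# Hypothetical inventory: one row per document, as in the spreadsheet.
DOCS = [
    {"title": "Getting started with feature A", "feature": "A", "kind": "tutorial"},
    {"title": "How feature A works", "feature": "A", "kind": "overview"},
    {"title": "Feature B API", "feature": "B", "kind": "reference"},
]

# The four content types, in the order they should be reported.
KINDS = ("tutorial", "guide", "overview", "reference")


def find_gaps(docs):
    """Return {feature: [missing kinds]}, with kinds in KINDS order."""
    covered = defaultdict(set)
    for doc in docs:
        covered[doc["feature"]].add(doc["kind"])
    return {
        feature: [kind for kind in KINDS if kind not in kinds]
        for feature, kinds in covered.items()
    }


gaps = find_gaps(DOCS)
for feature, missing in sorted(gaps.items()):
    print(feature, "is missing:", ", ".join(missing) or "nothing")
```

A doc that needs two tags at once is the "confused combination" case and is probably worth splitting before it goes into the inventory.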


As an OSS author this is very interesting, could you share more info or references about this please?

Also I'm curious, how do you become a technical writter? Does it involve writing articles/blogposts/etc to promote the project?



*writer

;-P


Back in the early days, the printed manuals for Research Unix and BSD consisted of two volumes. Volume 1 was the reference and consisted of all the man pages for every command (section 1), system call (section 2), library function (section 3), etc. Volume 2 contained longer documents - what this article calls tutorials, how-tos, and explanations.

The man command let you read all the pages in volume 1. Volume 2 only existed in print, with the troff source in /usr/doc but no obvious way to find it if you didn't know where to look. So naturally volume 2 fell by the wayside. When I was learning Unix in the late '90s and early '00s I had no idea there was supposed to be "official" documentation besides man pages, and filled in the gaps with random web tutorials and borrowed O'Reilly books and other "unofficial" sources.

Nowadays some "Unix purists" are insisting that man pages are all the documentation you could ever possibly need, and if the man page is too long that means the software is too bloated. I find that attitude to be ahistorical. Like anyone's going to learn to effectively use troff and eqn from a cut-and-dried syntax description.

(I could ramble a bit about the other documentation formats that have sprung up to replace troff and how, nice as they can be, they don't replace the convenience of manpages, but this comment is long enough.)


> The man command let you read all the pages in volume 1. Volume 2 only existed in print, with the troff source in /usr/doc but no obvious way to find it if you didn't know where to look. So naturally volume 2 fell by the wayside.

I guess my feeling that man pages were insufficient is not without basis.


Notably, Linux had a whole lot of howtos and faqs—not sure about authoritative sources but I guess The Linux Documentation Project is/was the largest. That was how people learned to do stuff without tearing their hair out, in the early–mid-2000s. I probably still have some of them lying around thanks to stockpiling like the apocalypse is nigh.

Of course, sparse or arcane documentation also leads to proliferation of educating books, by people ready to help for a reasonable sum. The existence of which market should say something about the truthfulness of ‘manpages are enough.’


I've had the cynical thought that Eric Allman made the documentation that came with Sendmail intentionally shoddy to increase sales of the Bat Book. Likewise Larry Wall with older versions of Perl and the Camel Book.


> Volume 2 contained longer documents - what this article calls tutorials, how-tos, and explanations.

Looks like this one: https://wolfram.schneider.org/bsd/7thEdManVol2/


I use lack of ambiguity as a measure of documentation quality. The best (honestly the only good) documentation that I've ever found is at:

https://www.php.net/<keyword>

For example:

https://www.php.net/echo

Note how even this simple arbitrary example tells us "No additional newline is appended." It's shocking to me how many other guides would leave something that critical out of the manual.

Then there are even helpful examples beneath that showcase users' experiences and any errata that they've discovered.

Contrast this with Ruby's manual:

https://docs.ruby-lang.org/en/master/ARGF.html#method-i-prin...

I can infer that to_s is probably to_string. But I'm already hit with several new concepts like $, $_ and $\ which aren't clickable, so now it requires work to track down what they mean. The related methods beneath (like puts) are similarly cryptic. A good percentage of the time in search engine results, I click both the Ruby documentation and Stack Overflow links.

I don't remember ever really learning PHP, because I realized quickly into it that it was a thin wrapper fixing any operating system shortfalls, generally leveraging concepts and contextual cues from C, C++ and the shell. Meanwhile Ruby had one of the steepest learning curves I've ever encountered outside of functional programming, even though it's similarly based on Perl and the shell.

So where Ruby is a "convention over configuration" language, PHP is more of an "existing context over surprises" language.

Writing the documentation for a language or framework can reveal these surprises, and over time, improve the tech itself and lead to a better experience.


Also, a major overlooked factor in PHP's documentation success is having a specific URL for each function. Some documentation sites use #anchors to jump to spots in a document (ex: Bootstrap), but it's not good enough. What ends up happening is people search Google for something granular like "mysql concat" and end up on tutorial sites like w3schools. Why? Because the MySQL documentation throws CONCAT() into a giant messy page called "String Functions and Operators": https://dev.mysql.com/doc/refman/8.0/en/string-functions.htm...

Google "mysql concat" and see for yourself. Giant one-page docs are terrible. PHP got it right from the beginning.


> having a specific URL for each function

Which also results in each function having its own comment section, where users can post further examples and pitfalls with that function. These often end up significantly larger and more informative than StackOverflow.


I have to agree. PHP is probably one of the easiest and most productive languages I ever learned. The docs were great and the error messages always pointed me in the right direction very quickly. It's no surprise that the language caught on as it did. Ruby, by contrast, seemed so hard to learn, and I could never see what the advantage was over PHP, so I never had the motivation to push past the difficulties of learning it.


What I’m always looking for with technical documentation - which I rarely if ever am able to find - is, what problem is this thing trying to solve? How is, say, Angular better than plain-old Javascript? How is Spark better than a shell script triggered by a cron job? How is Spring better than Java by itself? What sorts of problems are they most appropriate for? Sometimes I suspect that I can’t find this information because there are no problems that this thing actually solves…


And how it's worse! It's extremely rare, but some projects do tell you, "If you need X, consider Project Y instead."


Webpages of many many projects do this very badly. I mostly resort to reading the corresponding wikipedia articles, if any, which often are clearer about what the project does and how it relates/compares to other projects.


> Webpages

I've had hit or miss success even when I drop $50 on an O'Reilly book on the topic.


Agree. Comparisons (head to head) are a different category again. And hard to find (apart from as partisan sales pitches).

Particularly useful if you know X and wondering if/why to consider Y.

I often look at “alternativeto.net” to find products/services because we often choose things based on similarity and points of difference with things we already know.


I'd generalize this with a "why". Why does this project exist? Why does that functionality exist? Why does it use those types? etc

(though you could say that this is part of the "Explanation" quadrant)


What we've learned is that you can roll the how-to guides and tutorials into one. Once a user has learned how to use a service, they only need some inspiration on how to use it in different scenarios.

H. van der Meij wrote a nice article [1] about minimalism in documentation, referring to the first how-to guide (First_Minimal_Manual) [2] on how to use smalltalk for an IBM Displaywriter System (1980). This guide is all about just getting started, and if something goes totally wrong, you just reboot the machine.

People learn by doing. For a new user, unfamiliar with your service, you have to reassure them that they can undo everything. This allows them to explore the system freely without any anxiety about doing something permanently wrong.

People prefer to be shown what to do instead of being told what to do. Ikea and Lego manuals, for instance, never tell you to put screw B1 into hole 7A in the right side of panel 14B. Screenshots in your help center articles help a great deal with this. But these are hard to maintain, which is why we created Cliperado [3].

[1] https://www.utwente.nl/en/bms/ist/minimalism/

[2] https://www.utwente.nl/en/bms/ist/minimalism/displaywriter.p... (PDF)

[3] https://cliperado.com


That is interesting - my experience is the opposite, that how-to guides and tutorials are the most commonly confused types of documentation. My aim is to make them serve totally different needs:

* tutorials: I'm in charge (the teacher) and I know what the new learner needs to grasp and become comfortable with so that they gain sufficient basic confidence and skills. In the tutorials, the beginner doesn't even know what questions to ask or what language to use when asking questions.

* how-to guides: the user is in charge; they are able to formulate the questions, and have the basic confidence and skills. What they need from me are the recipes.

As described, it's the difference between teaching a child to cook, and a book of recipes for somebody who already knows the basics of cooking and the kitchen but wants to know how to cook a particular thing.

If you get teaching a child to cook mixed up with a book of recipes everybody concerned will have a bad time. It matters most for the child, they will never want to learn how to cook with you again.

The same goes for tutorials in my experience.


The issue is that "tutorials" and "how-to guides" sound like the same thing. I prefer to use "getting started" and "how-to guides".



As long as your tutorials aren't so basic as to be useless in practice you can roll them together sure.


The secret

Documentation needs to include and be structured around its four different functions: tutorials, how-to guides, explanation and technical reference. Each of them requires a distinct mode of writing. People working with software need these four different kinds of documentation at different times, in different circumstances - so software usually needs them all.

And documentation needs to be explicitly structured around them, and they all must be kept separate and distinct from each other.


Python developers: you can now make teaching tutorials in Jupyter notebooks and have them get automatically executed during the documentation build process and converted into theme-matching HTML by Sphinx with an extension [1]. I fired it up the other day and it's really glorious for tutorials. They're guaranteed to be up to date when you build the docs. Before that, I had a unit test that ran the tutorial with comments all over both the doc and the test that said YOU HAVE TO UPDATE THE OTHER IN SYNC!

[1] https://nbsphinx.readthedocs.io/
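For anyone who wants to try this, a minimal `conf.py` sketch follows. It assumes a standard Sphinx project with nbsphinx installed; the project name is a placeholder, and the option names are the ones I recall from the nbsphinx documentation, so check them against the docs linked above before relying on them.

```python
# conf.py (Sphinx configuration); the project name is a placeholder.
project = "example-project"

extensions = [
    "nbsphinx",  # renders .ipynb files found in the documentation source tree
]

# Re-run every notebook on each docs build so stored outputs cannot go stale.
nbsphinx_execute = "always"

# Fail the build if any notebook cell raises an exception.
nbsphinx_allow_errors = False

# Keep build output and Jupyter's autosave copies out of the docs.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

With execution forced on, a tutorial that no longer runs breaks the docs build, which replaces the hand-maintained "update the other in sync" comments.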


Not just Python, Jupyter has plenty of backends. Including C++.

Word of warning though, you might be tempted to use the tutorial prototype style for an actual application. That doesn't work in general.


nbsphinx is super handy! A cool tool to combine with it is jupytext, so you can keep your notebooks as rmarkdown files, which is a bit more human readable / GitHub editable.

I did this with a recent library, siuba, and have not regretted it!

https://github.com/machow/siuba/tree/master/docs


Thanks for sharing this guide. It's fitting like a ring to finger as I am in the process of setting up documentation for the features of my app [1] because I realized that as an early-stage startup one of the best ways to teach your users how to use your product is by writing great documentation. I'm finishing the setup of this site within my landing now using Gatsby, on the main domain, so that it can also help to bring in more traffic from search engines.

On the same topic, today I was listening to a podcast [2] titled "Getting traffic to a new website without blogging", which is an excellent match for Divio's guide.

[1] https://standups.io

[2] https://podcasts.apple.com/us/podcast/episode-344-getting-tr...


Agreed. As someone making a very technical product, I can see how lack of documentation hinders my sales process -- potential customers want to try out my software, but the lack of documentation makes it difficult for them to overcome their inertia.

As you did, I plan to spend the next couple of weeks just writing docs. Just want to lend weight to your comment. :)

Thank you for posting the podcast.


Seems we're both in the same boat! Exactly. If on live demos, they go like "wow, didn't know this case". That's exactly what you should go write after. I have a huge list of things to write about.

Hope you enjoy the podcast, there are some gems there about SEO. Ruben Gamez —the person in the podcast— was also technical and learned his way around SEO.

Mind sharing what your product is?


Very happy to share - my co-founder and I started a company called Simiotics, where we offer metadata stores for data, preprocessing functions/transforms, machine learning models, and statistics. We also have tools that integrate with these metadata stores to automate work that most data science teams perform manually today - running preprocessing jobs, updating models in production, monitoring the distribution of data and predictions in production models, things like that.

Our pitch is that, instead of having to do complicated things like set up an Airflow cluster, spin up a Kubernetes cluster and build helm charts, manage Spark, etc., a data scientist can just call out to our APIs from their Python programs (which may be running in notebooks), and we take care of the stuff they need to do but don't want to do.

This is our website: http://simiotics.com

These are our docs: http://docs.simiotics.com (They are in a very sorry state, and it embarrasses us to post them here, but we are going to use that embarrassment to push us to make them better!)


Forgot to add also the guy from SimpleAnalytics, he's doing exactly the same with documentation [1] in his app.

[1] https://docs.simpleanalytics.com/


My experience has been that having names for things makes it easier to think and communicate about them; including in documentation.

For example, once I learned the term "tail" I no longer had to say "every element except for the first one".

As another example, learning about "complete" versus "partial" functions gave me the vocabulary to better understand and communicate about certain types of errors.
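A small Python sketch of that distinction, for readers who have not met the terms ("complete" functions are often called "total"). The names `head` and `safe_head` are illustrative only: the partial version has no answer for some inputs, while the complete version returns something for every input.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def head(items: Sequence[T]) -> T:
    """Partial: has no answer for an empty sequence, so it raises."""
    if not items:
        raise ValueError("head is undefined on an empty sequence")
    return items[0]


def safe_head(items: Sequence[T]) -> Optional[T]:
    """Complete: every input, including the empty sequence, gets a result."""
    return items[0] if items else None


print(head([1, 2, 3]))
print(safe_head([]))
```

Having the word "partial" makes the class of error easy to name: the caller passed a value outside the set of inputs the function is defined for.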

Does anyone know of any resources that describe different types of useful vocabulary such as this?


Being THE term is the holy grail for consumer brands. You can only get huge when your brand is the synonym for a whole concept or industry, like Coca-Cola, Uber, Facebook, Tweet, etc. For example, Mark's explanation of Facebook [1] or the Photocopier sketch [2].

Edit: fixed link order

[1] https://youtu.be/cUNX3azkZyk?t=135 (video)

[2] https://www.youtube.com/watch?v=PZbqAMEwtOE (video)


You've got your hyperlinks on backwards.


I find "first" and "rest" to be better names for "first element" and "all but first element". Whereas "head" and "tail" sound like the same type of variable, but are different types (element vs list of element).


It's a bit of a generic issue. Until tail becomes a norm, it's hard to understand. I found embedded test/examples pretty great to quickly get the meaning of an idiom.

in python for instance:

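    def tail(l):
        '''
        Return every element of l except the first.

        >>> tail([1, 2, 3])
        [2, 3]
        >>> tail([])
        Traceback (most recent call last):
            ...
        ValueError: undefined on []
        '''
        if not l:
            raise ValueError("undefined on []")
        return l[1:]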
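Embedded examples like this are most useful when they are executed, so they cannot drift from the code. A minimal sketch using Python's standard `doctest` module follows; the `tail` body here is my own guess at the "actual logic", and running the examples through `DocTestParser` and `DocTestRunner` directly is just one way to do it.

```python
import doctest


def tail(items):
    """Return every element of items except the first.

    >>> tail([1, 2, 3])
    [2, 3]
    >>> tail([1])
    []
    """
    if not items:
        raise ValueError("undefined on []")
    return items[1:]


# Parse the examples out of the docstring and run them against the real function.
parser = doctest.DocTestParser()
test = parser.get_doctest(tail.__doc__, {"tail": tail}, "tail", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

A non-zero `results.failed` means the documented behaviour and the implementation disagree, which is the signal to fix one of them.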


I think it is also useful to be aware of the specific terms your team might need and introduce words to those more abstract concepts. This frequently happens with naming specific code patterns or important classes in the code, but can also be for how the team operates in general. A DSL for how your team operates.


A glossary is a good thing to maintain. We keep one often updated in our wiki (at work)


"Documentation needs to include and be structured around its four different functions: tutorials, how-to guides, explanation and technical reference."

I'm a sysadmin and most of the documentation I write is... well it's for me! I do something once and I know I'll do it again, I copy and paste everything I did into our "docs" area so I can just copy and paste it again. I guess that falls under tech reference. My theory for these docs is if I'm not around, someone else should just be able to copy and paste things and not need to learn my job. Doesn't apply to EVERYTHING, but it helps for all the little things.


Presumably, someone else is writing the other 3 types of documentation for the actual users?


Throughout my career as a software engineer, I've found myself reading source code to figure out how a piece of code works when there isn't sufficient documentation. As an example, writing a plugin for collectd is not very well documented IMO. So what does one do? Well, I know C, so I dove into the source code of collectd and was able to figure out how the API works.


At last! I wanted to write something like this myself, only I identified two types so far: tutorials and reference. Lots of projects only have reference docs, some only have tutorials. Using one instead of the other is a pain. Having at least these two would be splendid for most projects.

My ‘favorite’ example (in the bad sense of ‘favorite’) is Ansible, which had only tutorial docs for its YAML-based programming language―which they didn't want to recognize as a programming language. As a result, whenever I needed to look up some feature, I had to guess where in the tutorials it's likely to be introduced. Notably, plenty of important details are delivered as side notes sprinkled liberally all over the tutorial.

(This was the situation with Ansible a couple years ago, something may have changed since.)


It would be cool to have a documentation style that embeds reference in the tutorial à la Tufte notes.


> if the documentation is not good enough, people will not use it.

Counterexamples: people use operating systems, web browsers, various "productivity apps" and games without reading a shred of documentation.


Is that completely true, though? Almost all modern games include a (sometimes optional) tutorial, explaining the basics of how to play the game. Some number (a few? many? most?) of productivity apps will have in-app tutorials to get you up and running. Operating Systems... might have a tutorial? It's been a while since I booted one up, and I'd likely skip it if present.

On top of that, the GUI nature of these apps makes it easier to get started, I think, and even if there are _no_ tutorials, you can use your previous knowledge of similar apps and play around to understand it - click buttons, tap menus, etc, and learn by doing.

I'm not sure where this fits into the documentation quadrant, but it's important, and is _why_ users can get away without reading documentation.


When Windows was new, Microsoft was militant about everybody sticking to the common controls. They wanted people to get used to how they operated, so that they would be able to instantly operate a new app for the first time.


Kubernetes... (although the documentation has gotten better over the years, to the project’s credit)

Ruby on Rails as well, for the first few years of its existence.


I always found Kubernetes API reference very useful.


Yeah, that part was always fine, but they sorely lacked a theory of operations -- which is essential for any sort of state machine or orchestrator! -- and basic "man page"-type documentation around processes, config files, etc.


API's and programming languages are indeed very difficult to use effectively without documentation.

They are also not operating system user interfaces, productivity tools, games or web-browsers, and so not the topic of my little sub-thread here.



Bad documentation covers obvious things very well, and doesn't cover tricky parts at all. This is especially what you get if the writer of the documentation is paid by the amount of written text: there is no incentive to spend time investigating the hard parts, but a lot of incentive to explain the easy parts in too much detail.


In my experience, it's explanations and how-to guides which are the most essential, with a comprehensive reference filling in the gaps. I've virtually always found tutorials almost completely useless.

I also have a major gripe with the guidelines for how-to guides: these should explain things, most especially where:

1. Not following the process precisely, or appropriately to circumstances will lead to major issues.

2. Where the function, significance, or mechanism of a given step is critical to understanding and correctly applying the tool.

3. Where the reason(s) for choosing amongst a set of options is helpful in making that decision.

One of the best concise distinctions between science and technology I've found is from John Stuart Mill: technology is the study of means, science is the study of causes or mechanisms. Technology tells you how and science explains why. Both are crucial to advanced understanding and use.

This doesn't mean that a cookbook approach needs to have detailed "why" explanations, but it should at least touch on these.

The other hugely useful aspect of a good cookbook is that it shows you the range of performance, capabilities, or applications of a tool. Readers can either hunt through for their specific problem (or something close enough to it to be adapted), or look through the range of applications to get new ideas for projects or products.

One of the best cookbook texts I've ever encountered is O'Reilly's Unix Power Tools, first published in the early 1990s and still relevant. Kernighan & Pike's The UNIX Programming Environment is strongly similar, and despite dating from the 1980s, and being substantially obsolete in part, remains a valuable reference.

Straight syntax guides, say, the Bash manpage, are useful, but are complex and difficult to navigate especially for a novice, and even a user with decades of experience. Tools such as vim and emacs share this problem, and whilst references can be useful for specific command or feature syntax and behaviour, do little to expose the power and capabilities of such tools. Cookbook approaches are far more useful.


this is a great summary. I also like to add proper information about the context, the required knowledge to tackle each part, and the goal of that part of the documentation itself. in most cases, you can write a couple lines at the start of a document saying: "this document explains A. you will be interested on it if B. you should already know about C and D before proceeding. otherwise, you might be interested on E or F instead."

I also noticed an interesting situation when trying to write good documentation: if you start too early, you will have to update it and change it a billion times (both writing the docs and testing the systems will reveal a lot of parts that can be improved). but if you start too late, everything will seem to be ok until you try to write it down. when you have done a few high-level overviews and detailed technical references, that always reveals how there are a number of important parts that could be simpler and/or more harmonic. we always try to make code simpler, but sometimes we only discover simpler ways to express things when we are thinking about them in natural language. or the other way around. perspective++


Good article. The division between different types of documentation is something that I've understood intuitively, but never codified in this way.


Two more categories of documentation: FAQ and trouble-shooting guide. Maybe you could call online chat-bots "documentation" (for trouble shooting or how-to?) but I've never seen one that actually did any good.


Yes, there is a place where you can hear about those things: The Write The Docs community: https://www.writethedocs.org/ (They also organize conferences every year.)

It was really eye-opening when I visited a conference and heard those things the first time, it is highly recommended for everyone! https://www.writethedocs.org/conf/


Thanks. Did not know about writethedocs - that looks like a great community.


I like the way the author uses Quadrant Analysis (derived from Gartner's Magic Quadrants research methodology[1]; not exactly the same quadrants, but a similar way of thinking) to explain the "right" way to do documentation.

[1]: https://www.gartner.com/en/research/methodologies/magic-quad...


There's more than four kinds; the main point should be that they need to be written with their particular purpose in mind. Here's some categories that apply just to system operations:

  - Reference
  - Discussion
  - Planning
  - Tutorial/Educational
  - How-To
  - Process Template (many, many kinds)
  - Process Implementation
  - Q&A

There are also attributes: Local/Global, Draft, Approved, Certified, Published, Restricted, Versioned, etc. There's the venue: Internal, Customer-facing, Regulatory, Quality Assurance, Development, Managerial, Executive, etc. Then there's the scope of the document: high-level, deep dive, navigation, etc.

When you write documentation, you must know your audience, what they need your document for, whether your document gives them everything they need, whether it's clear & concise, and whether anyone can find it when they need to. They should know when it was written & by whom, what it was written for, who it applies to. It should provide references to everything someone needs to know to make use of the doc. And not only should the document be clear, it has to stylistically express detail and make the document easier to process.


docs.microsoft.com has a taxonomy that's a superset of this: overview, quickstart, tutorial, sample, concept, how-to, reference, resources. Most of the tables-of-contents are organized around this. Example: https://docs.microsoft.com/en-us/azure/app-service/


The only problem with Microsoft is that they delight in rearranging their web site. I'll bet that link is dead within a year.


These 4 split ideas are great. I've previously maintained a wiki for our department and the most viewed documents are always shifting from year to year. Until now, I've never been confident enough to act on the bits of information I've gotten from people who said they liked X article because Y, and now I can see that those articles are a mish-mash of the how-to and technical reference.

Now I will be able to use this as a framework for my continuing revisions, and be able to ensure that for any subject I want to be able to expect others to teach themselves to understand, I need to have the 4 quadrants ready to go.

Sidenote: I loathe the excuses I hear so much these days about self-documenting code obviating the need for __any__ documentation or code comments at all. I'm always looked at like I'm a woozle for pushing back on that. I can't decide if that POV comes from laziness or a sense of denial (this is fine) but that's a rant for another submission.


With reference docs, I really value the examples.

Describing the syntax, in a formal way is necessary sure. But I often skip down to the examples and that way I get a feel for it quickly.

Bonus points if the examples are thoughtful in the way they start with simple cases and move up to more complex ones while remaining practical and thus easy to imagine their usefulness.
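
To make that concrete, here is a rough sketch in Python of a reference entry whose examples climb from the simplest case to less obvious ones. The function `chunk` is a made-up illustration, not taken from any library mentioned in this thread.

```python
def chunk(items, size):
    """Split ``items`` into consecutive lists of at most ``size`` elements.

    Simplest case, an even split:

    >>> chunk([1, 2, 3, 4], 2)
    [[1, 2], [3, 4]]

    A leftover element ends up in a shorter final chunk:

    >>> chunk([1, 2, 3], 2)
    [[1, 2], [3]]

    Any iterable works, including strings:

    >>> chunk("abcde", 3)
    [['a', 'b', 'c'], ['d', 'e']]
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    items = list(items)  # accept any iterable, not just lists
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Because the examples are written as doctests, running `python -m doctest` on the file would also check that they stay in sync with the code.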


I like this article, and if I may I'd like to add a more meta comment.

In my experience, the biggest issue is getting people to use documentation systems in the first place. For example I have absolutely grown to hate confluence. Without plugins, and even with, it's a mess that becomes a barrier instead of a conductor.

Therefore, for technical people, I think the best documentation tends to be easily accessible raw text. I personally use a combination of emacs org mode and asciidoc/asciidoctor. If I'm already always in emacs, why not use something that is already right there, and is quick and easy?

The structure is important, but people just need to actually write the documentation in the first place. So, just write, and you will build the skills to differentiate types as the article refers to.


(great article, docs broke somewhere along the way)

so do it while making the stuff

especially once you reach the end, have it working, and are talking about it as if it's right in front of you

do. it. then!..

don't wait till people are asking about it like it's recently forgotten.. even then, you're talking about it like it's in front of you; good docs time

(point: repair broken links before breaking & appreciate and accept broken-ness as default, afair)

(inside: i have a dream of a well-documented world)

(point2: remember)


I saw Daniele Procida talk at Commit Porto on this subject and it was very compelling to me. I’ve often lacked a way to structure my documentation and this really resonated with me. I’ve since refactored and expanded documentation for one of my projects [1] to be in this format and I think it has resulted in something more coherent.

[1]: https://lightbus.org


As someone who might be adherent to "good code explains itself", this is a really great explanation of what documentation should actually be and how it should be organized.

It's interesting how one of the projects that I for a long time have believed to have great docs is VueJS, and that documentation more or less adheres to these principles.


I think the zenith of reference documentation was the Windows API documentation on the MSDN CD's of the late 90's.


Absolutely agree, those were incredible. Looking things up on the web is a pale substitute.


Good documentation IMO strongly depends on the project scope. There are many small projects that benefit from having a single, well structured long readme because it's easier to read+use. But there are some other projects that really need the longer format and everything described in this article.


What I have found helpful with documentation is when it is stored in a single location of the project and automatically formatted to various output formats: markdown, html, stdout, and possibly pdf.
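
A minimal sketch of the idea in Python, assuming a hypothetical in-memory document (`DOC`) and made-up renderer names (`to_markdown`, `to_html`, `to_text`); none of these come from a real tool, and PDF is left out because it needs an external toolchain.

```python
import html

# Hypothetical single source of truth: a title plus (heading, body) sections.
DOC = {
    "title": "mytool",
    "sections": [
        ("Install", "Copy the binary somewhere on your PATH."),
        ("Usage", "Run mytool <file> to process a file."),
    ],
}


def to_markdown(doc):
    """Render the document as Markdown."""
    lines = [f"# {doc['title']}", ""]
    for heading, body in doc["sections"]:
        lines += [f"## {heading}", "", body, ""]
    return "\n".join(lines)


def to_html(doc):
    """Render the document as an HTML fragment, escaping special characters."""
    parts = [f"<h1>{html.escape(doc['title'])}</h1>"]
    for heading, body in doc["sections"]:
        parts.append(f"<h2>{html.escape(heading)}</h2>")
        parts.append(f"<p>{html.escape(body)}</p>")
    return "\n".join(parts)


def to_text(doc):
    """Render the document as plain text suitable for printing to stdout."""
    lines = [doc["title"].upper(), ""]
    for heading, body in doc["sections"]:
        lines += [heading, "-" * len(heading), body, ""]
    return "\n".join(lines)
```

Calling `print(to_text(DOC))` gives the stdout version, while the other two functions return strings you could write to `.md` and `.html` files from the same source.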


Yes, this is critical stuff!! I've seen a lot of effort put into unguided documentation efforts that put out huge, unapproachable tomes that helped nobody.


Indeed. We have a simple process to create maximum support impact with minimal effort in documentation.

With every support request we ask ourselves: Why is this person contacting us? Is there a simple UI change or wording that would prevent the confusion? Basecamp coined the term "wordsmithing" for this endless process of fine tuning. [1]

Only after we are happy with the number of support requests a feature generates do we document it.

Doing it this way has a couple of benefits. You kind of create a long-term user test. You can't spoil the user with knowledge from documentation. There is no way you can give hints to the user to perform the task. With every support request you can multi-variant test your explanation.

[1] https://signalvnoise.com/posts/3633-on-writing-interfaces-we... (2013)


This is a nice article (though i think a bit too wordy). Note that what it calls "tutorial", "how-to guides" and "explanation" is sometimes called "guides", "howtos" and "rationale".

As an application of this, I always thought that the Windows API help, especially those around Win3.1/95 had one of the better approaches for an API/library: the API is split in functional parts/groups (windows, fonts, messages, controls, etc) and for each group there is an "overview" section (e.g. Windows introduces the windows concept), then a reference (often split in several parts itself) and finally one or two examples (though that was optional).

The help itself didn't have "howtos" or "rationales" but those were available through MSDN (later at least) as knowledge base articles.

(modern winapi documentation is a mess in that regard, especially if you do not already have a vague knowledge of what you are looking for, because even though it is largely the same text, they have split and moved things around too much and put irrelevant distracting links everywhere)

In the Unix world, the original X11 documentation follows a similar pattern, though for other more recent projects there is usually a very one-sided approach: as the article says, most projects only provide API references and perhaps a single (often unfinished) tutorial.

GNU projects usually have documentation in texinfo which is laid out as a book and is often very good (most disagreements come from the default GNU info viewer, not the source documentation system that allows for HTML and PDF output nor really the info format itself that has more usable viewers like tkinfo). It also has a similar approach to the Win3.1/95 docs, though i think the lines between "guide" and "reference" are often blurred. This largely depends on the project though (e.g. the glibc manual has these better divided, whereas the bash manual tends to be more "blurry"). Also i'm not a fan of GNU's style of function references - i prefer the more common/manpage-like style where for each function/macro/struct/etc you have an isolated page with a very brief description of its purpose, its declaration, a list of what each parameter (for functions and macros) does, its return value (if any), a detailed description (if necessary), any requirements (e.g. headers, for APIs with multiple headers) and links to other relevant functions and guides.

Also i find examples for each function to be nice though this is even more rare than guides.

As a sidenote, i loathe autogenerated documentation and "docgen comments" in source code (and not only because they tend to enforce the "reference-only" approach). I think those should be totally separate and not pollute the code with documentation (especially headers as that makes it harder to read the headers that also act as a quick overview for an API).

Though having a tool to automatically check docs and sources for mismatches (missing functions and/or functions with wrong declarations in the docs) is helpful. But i'm not aware of anything that does that.
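
The comment above is about C headers, but the idea can be sketched with Python introspection. This is only an illustration under stated assumptions: `find_doc_mismatches`, the `demo` namespace and the `documented` table are all hypothetical, and a real tool would need to parse the actual docs and headers.

```python
import inspect
import types


def find_doc_mismatches(module, documented):
    """Compare a module's public functions with the signatures the docs claim.

    ``documented`` maps function names to signature strings,
    e.g. {"area": "(width, height)"}.
    """
    actual = {
        name: str(inspect.signature(obj))
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")
    }
    return {
        # In the source, but missing from the docs.
        "undocumented": sorted(set(actual) - set(documented)),
        # In the docs, but no longer in the source.
        "stale": sorted(set(documented) - set(actual)),
        # Present in both, but the declarations disagree.
        "wrong_signature": sorted(
            name for name in set(actual) & set(documented)
            if actual[name] != documented[name]
        ),
    }


# Hypothetical stand-in for a real module, built inline to stay self-contained.
def area(width, height):
    return width * height


def perimeter(width, height):
    return 2 * (width + height)


demo = types.SimpleNamespace(area=area, perimeter=perimeter)

# What the (hypothetical) docs claim about that module.
documented = {"area": "(w, h)", "volume": "(w, h, d)"}
report = find_doc_mismatches(demo, documented)
```

Here `report` flags `perimeter` as undocumented, `volume` as stale, and `area` as having a declaration that differs from the docs.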


> I always thought that the Windows API help, especially those around Win3.1/95 had one of the better approaches for an API/library

You might be right but I don't remember being impressed with Microsoft's documentation. Half the time it was missing APIs for common useful tasks (there was a big community around demoing undocumented APIs -- granted some were genuinely only intended for internal use but there were some APIs that really should have been documented but weren't) and some of the example code Microsoft did publish was next to useless.

There was one occasion when I was teaching myself DDE (anyone else remember that?) and the example looked a bit weird because the example application would launch and call itself. "ok," I thought, "I'm obviously missing some logic when reading through. Maybe I should just run it to test its behaviour." Five minutes later I was forced to reboot after my suspicions were confirmed -- their official DDE example was literally just a fork bomb. Well done Microsoft /s

However I did learn a valuable lesson that day: never trust example code.


I do not think they ever did a good job when it comes to examples, especially in the earlier days. Others like Borland were much better there.

But ever since 3.1 docs you could learn everything you wanted from the help file alone (3.0 help files were reference only) and i mainly refer to their structure. The content was sometimes a miss (though the worst i can remember is not being sure how region object lifetime was managed since unlike other GDI objects there wasn't any function to delete it).


> most disagreements come from the default GNU info viewer, not the source documentation system that allows for HTML and PDF output

Do you know if there's a way to access info documentation in a PDF reader (or failing that a browser)? I often try to read the info pages then quickly give up because I don't want to fuss with the navigation. I would love to be able to say `info --format=pdf --open-with=evince sed`. Is anything like that possible?

EDIT: Another cool approach that would preserve hyperlinks would be a command that starts a local web server on say 4400 and launches firefox on its index.html.
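
A rough sketch of that last idea using only the Python standard library, assuming the manual has already been rendered to HTML in some directory. `serve_docs` and `open_docs` are hypothetical names, and `webbrowser.open` launches the default browser rather than Firefox specifically.

```python
import functools
import http.server
import threading
import webbrowser


def serve_docs(directory, port=4400):
    """Serve ``directory`` over HTTP on localhost in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # A daemon thread lets the interpreter exit without an explicit shutdown.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def open_docs(directory, port=4400):
    """Start the server and open its index.html in the default browser."""
    server = serve_docs(directory, port)
    url = f"http://127.0.0.1:{server.server_address[1]}/index.html"
    webbrowser.open(url)
    return server
```

Something like `server = open_docs("/path/to/html/manual")` would start it (the path is a placeholder); call `server.shutdown()` when done.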


AFAIK GNOME and KDE's help systems have the ability to view info files and there is also the tkinfo viewer i mentioned.

However note that the "source documentation system" i refer to is texinfo, not info. Texinfo is a preprocessor and documentation language (think docbook) for manuals that produces a bunch of formats, one of them being info, a text-only hypertext format (which was one of the earliest hypertext formats AFAICT). GNU info is a viewer for that format, but there are others, however all of them just view info files and have its inherent limitations (e.g. preformatted text, links using a rigid syntax, etc) - though also they have benefits such as support for topic keywords, indices, etc.

Texinfo can also be converted to other formats like PDF (through tex - i guess the initial version generated only tex and info output, thus the name texinfo) and HTML. So you won't be using texinfo to view info files as PDF, but you'd be using it to generate PDF (via tex).
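
As a sketch of what that looks like in practice, assuming you have the `.texi` source of a manual (which is typically not installed next to the compiled info files) and the `makeinfo` and `texi2pdf` tools from the Texinfo distribution on your PATH. The helper names `texinfo_commands` and `build` are made up for illustration.

```python
import shutil
import subprocess


def texinfo_commands(texi_path, out_stem):
    """Return the commands that turn a .texi source into HTML and PDF."""
    return {
        # Single-file HTML output instead of one page per node.
        "html": ["makeinfo", "--html", "--no-split",
                 "-o", f"{out_stem}.html", texi_path],
        # PDF output, produced through TeX.
        "pdf": ["texi2pdf", "-o", f"{out_stem}.pdf", texi_path],
    }


def build(texi_path, out_stem):
    """Run each conversion whose tool is installed; return the formats built."""
    built = []
    for fmt, cmd in texinfo_commands(texi_path, out_stem).items():
        if shutil.which(cmd[0]) is None:
            continue  # tool not installed; skip rather than fail
        subprocess.run(cmd, check=True)
        built.append(fmt)
    return built
```

With a source file such as `sed.texi` at hand, `build("sed.texi", "sed")` would attempt to produce `sed.html` and `sed.pdf`, which can then be opened in a browser or a PDF reader.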

I guess the "texinfo" and "info" names can be confusing and make people think that they are the same thing - the fact that AFAIK texinfo is the only info file generator (and that GNU info is called just "info" which is the same as the file format name) doesn't help :-P.


I think this article offers some very good advice about the need for different types of documentation to support software. I am far more likely to try a new piece of software if it comes with clearly signposted online documentation covering how to install/use the 'ware, and why to use it, in addition to a well presented technical reference.

In my view, getting the documentation balance right is critical - having too much documentation (I'm staring at Google and AWS here) can be almost as bad as having too little of it. Making the documentation easy to navigate is as important, for me, as making sure the information supplied is accurate and up to date.

My personal experience - principally from attempting to document my Javascript library[1-6] - is that generating the documentation is just the start of the process. Keeping that documentation accurate and up-to-date as I developed the library across minor and major versions soon became a massive burden which eventually led me to put further development on hold; the latest work I have done on the library remains in a branch on GitHub while I think of better ways of developing and presenting the necessary documentation around it.

[1-6] - My different attempts to document my Javascript library, as a demonstration of how messy the whole process can get:

[1] - http://scrawl.rikweb.org.uk/ - the Tour page, with "marketing copy" which attempts to sell the library to potential users.

[2] - http://scrawl.rikweb.org.uk/tutorial.html#HTML5_page - the "Simple Docs" page is an excellent example of confused documentation as it tries to combine tutorial, how-to and explanation in the same document.

[3] - http://scrawl.rikweb.org.uk/demos.html - I added the "Demos" page to support the "Simple Docs" page; in fact the demos were (are) the visual testing regime I developed for the code base.

[4] - http://scrawl.rikweb.org.uk/docs/ - the "Technical" documentation - generated from inline comments in the source code. I chose the wrong tool to do this, as it expects the code base to be object-oriented; the library's Javascript (v6) is procedural/prototypal, and decidedly not modular.

[5] - http://rikweb.org.uk/wp/ - at one point I decided that a good way to supply "how-to" information was through blog posts. This was one of my less clever decisions and quickly abandoned.

[6] - http://scrawl.rikweb.org.uk/learn.html#lesson001 - my best attempt at supplying potential users with tutorial documentation. Embedded Codepens make the experience a bit more interactive, but the results are probably too primary school given that my target audience for the library was more experienced front-end developers.


Well, I know they never tell you "we don't have any", which would be the truth most of the time.


I love the SQLite source code, it has a good ratio of comments per line of code


Some examples of what I consider to be good documentation:

* RethinkDB

* Stripe

* Python


A


This is brilliant.

"Explanation - Topic" sounds a bit wonky as a section/title. Does anyone have a suggestion what to call those types of articles?


I also use "background" or "discussion". Someone else here suggested "rationale".


How about Deep Dive?


Backgrounder? White paper?


Conceptual overviews



