

How GitHub Uses GitHub to Document GitHub - tmm1
https://github.com/blog/1939-how-github-uses-github-to-document-github

======
conorgil145
I personally found this write-up extremely interesting and exciting.

I have always been interested in documentation and its order in the priority
list of tasks which a development team has to tackle. It is not an original
observation that documentation is critically important to the success of a
project/code-base and yet it is often the last artifact produced (and many
skip it altogether). I have recently been extremely interested in the idea
that documentation should be moved to the top of the priority list and, rather
than being a duplicative post-processing step, should be the "ground-truth"
for generating many of the follow-on artifacts. For example, write API
documentation first and use that to generate client-side libraries, an API
test suite, and server boilerplate code/skeleton.

In my search for existing projects and approaches, I came across many
interesting things.

Swagger:
[https://helloreverb.com/developers/swagger](https://helloreverb.com/developers/swagger)

API Doc:
[http://apidoc.me/doc/gettingStarted](http://apidoc.me/doc/gettingStarted)

Slate: [https://github.com/tripit/slate](https://github.com/tripit/slate)

Write the Docs: [http://docs.writethedocs.org/](http://docs.writethedocs.org/)

It was very interesting to read this GitHub post because they presented yet
another approach to treating documentation as a first class citizen with
different methods to write docs, host docs, and keep the docs updated.

I recently updated the API docs at my workplace to use the Slate tool I
referenced above. We manually write docs in a Markdown file, manually use
Slate to compile the MD file into HTML, and then manually deploy it to our
host. This approach is incredibly basic and non-scalable, but it is light
years better than what we had previously, which was API docs directly in the
repo's README file.

I hope to learn more about the projects listed above (and many others!) as I
explore different approaches for treating docs as a first class citizen and
pick the approach which meets the requirements of my current team.

[EDIT] I am also anxiously awaiting a beta invite for
[http://readthedocs.com](http://readthedocs.com)

~~~
lapfi
I always thought it would be a good idea to generate API documentation from
tests. Kind of the same idea you have but the other way around.

The problem I used to have is that if you write your documentation by hand it
tends to get out of sync with the code. You make a quick change to the code
and forget to update the docs. After a while it's a mess unless you stay
vigilant.

But if you generate documentation from tests they can't get out of sync. The
example output that gets written to the docs comes from the application so it
can't be wrong. And if a test fails, documentation doesn't get written. It
also forces you to write tests which is a good thing. If you don't, you don't
have docs.
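
A minimal sketch of the docs-from-tests idea described above, in Python rather than Ruby/Rack. This is not the tool mentioned in this comment; `handle_request`, the paths, and the output layout are all made up for illustration. Each test calls the application, asserts on the result, and only then emits a docs entry built from the real response:

```python
import json


def handle_request(path):
    """Stand-in application (hypothetical): returns (status, body) for a path."""
    if path == "/users/1":
        return 200, {"id": 1, "name": "alice"}
    return 404, {"error": "not found"}


def run_documented_test(title, path, expected_status):
    """Run one test and return a docs entry only if the assertion passes."""
    status, body = handle_request(path)
    assert status == expected_status, f"{title}: got {status}"
    # The example output is whatever the application actually returned,
    # so the docs cannot drift away from the code.
    return "\n".join([
        title,
        f"GET {path} -> {status}",
        json.dumps(body, indent=2, sort_keys=True),
    ])


def build_docs(cases):
    """Generate docs for every case; a failing test aborts the whole build."""
    return "\n\n".join(run_documented_test(*case) for case in cases)


CASES = [
    ("Fetch a user", "/users/1", 200),
    ("Unknown user", "/users/99", 404),
]

if __name__ == "__main__":
    print(build_docs(CASES))
```

Because `build_docs` raises on the first failed assertion, no documentation is written for a broken build, which is the property the comment relies on.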

I don't know if there are tools like this. I created one for Ruby/Rack apps
that I have used in some of my projects. I think this approach works pretty
well.

~~~
gjtorikian
I wholeheartedly agree with everything above. Two additional and related
thoughts:

1\. The original documentation tool I wrote for Atom, Biscotto, kept track of
the number of undocumented classes / methods:
[https://github.com/gjtorikian/biscotto/blob/59f48ba2621a92ae...](https://github.com/gjtorikian/biscotto/blob/59f48ba2621a92aefec0774844c9d4fa4993307e/src/parser.coffee#L300-L302)
. It was hooked up to CI, and if the count fell below a certain threshold, the
test failed.

2\. Right now, we're exploring JSON Schema as a means of
providing both testing validation and accurate documentation. If the schema
says "This REST method expects a parameter of this type," it becomes very easy
to write a test to enforce that behavior; documentation can be easily
generated from it; and of course your production code is safer for it.
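
A rough sketch of point 2, assuming a made-up endpoint and a hand-rolled checker that handles only `type` and `required`. It is not a JSON Schema implementation and not what GitHub uses; it only shows one schema feeding both validation and documentation:

```python
# Hypothetical schema for one REST method's parameters, written in a
# JSON Schema-like shape. The endpoint and parameter names are invented.
SCHEMA = {
    "title": "GET /repos",
    "properties": {
        "owner": {"type": "string", "description": "Account that owns the repos."},
        "per_page": {"type": "integer", "description": "Results per page."},
    },
    "required": ["owner"],
}

PY_TYPES = {"string": str, "integer": int, "boolean": bool}


def validate(params, schema):
    """Return a list of error strings; an empty list means params are valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")
    for name, value in params.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unknown parameter: {name}")
            continue
        expected = PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for integers.
        is_bool_as_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_int:
            errors.append(f"{name} must be of type {spec['type']}")
    return errors


def render_docs(schema):
    """Render a plain-text parameter reference from the same schema."""
    required = set(schema.get("required", []))
    lines = [schema["title"], "Parameters:"]
    for name, spec in schema["properties"].items():
        flag = "required" if name in required else "optional"
        lines.append(f"  {name} ({spec['type']}, {flag}): {spec['description']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(validate({"owner": "octocat", "per_page": "ten"}, SCHEMA))
    print(render_docs(SCHEMA))
```

A test can then assert that `validate` rejects bad input, while the docs build calls `render_docs` on the same `SCHEMA`, so the two cannot disagree about parameter types.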

I'm a huge fan of introducing more cross-overs between testing and
documentation. I think a lot of time is spent on "clever" (and subjective)
validations like [http://www.hemingwayapp.com/](http://www.hemingwayapp.com/),
but not enough time is spent on basic content checks. It's very easy to drift
code, tests, and docs apart. We need to start thinking about all three of them
working together.
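
As one example of a basic content check, the undocumented-count gate from point 1 could look roughly like this in Python. This is not Biscotto's code (that is CoffeeScript); the names and the threshold are hypothetical, and the sketch assumes the build should fail when more than an allowed number of public items lack documentation:

```python
import inspect


def undocumented(namespace):
    """Return sorted names of public functions and classes with no docstring."""
    missing = []
    for name, obj in sorted(namespace.items()):
        if name.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        if not (obj.__doc__ or "").strip():
            missing.append(name)
    return missing


def check_docs(namespace, max_undocumented):
    """Raise AssertionError (failing CI) if too many items are undocumented."""
    missing = undocumented(namespace)
    if len(missing) > max_undocumented:
        raise AssertionError(f"{len(missing)} undocumented: {', '.join(missing)}")
    return missing


def documented_example():
    """Has a docstring, so it passes the check."""


def bare_example():
    pass


# Hypothetical namespace to scan; a real check might pass vars(some_module).
SAMPLE = {"documented_example": documented_example, "bare_example": bare_example}

if __name__ == "__main__":
    print(check_docs(SAMPLE, max_undocumented=1))
```

Run as a test in CI, `check_docs` turns missing documentation into a build failure in the same way a broken unit test would.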

~~~
conorgil145
I could not agree more strongly that working on code, docs, and tests as a
single unit is an excellent approach to development. I think the best way to
convince most developers of doing this is to create tools which make it as
automated as possible. If using a tool actually makes their lives easier to
accomplish something they already do (write tests? write docs?) and they get
the other for free, then that is a huge selling point.

I had a similar idea about using JSON Schema to validate my API. However, a
few issues that I thought of and a few I discovered while researching:

1) I am not aware of a decent JSON Schema editor. It has been on my
wishlist/TODO list for a looong time to write one myself which has similar
capability to something like Oxygen for XML [1], but I have not had the time.
Do you plan to write and update the JSON Schema by hand?

2) I have not researched them in great detail yet, but Swagger [2] and API Doc
[3] seem to have already defined something similar to what I had in mind with
JSON Schema. Have you looked into those tools yet?

[1] [http://www.oxygenxml.com/index.html](http://www.oxygenxml.com/index.html)
[2]
[https://helloreverb.com/developers/swagger](https://helloreverb.com/developers/swagger)
[3] [http://apidoc.me/doc/gettingStarted](http://apidoc.me/doc/gettingStarted)

~~~
gjtorikian
> Do you plan to write and update the JSON Schema by hand?

Currently, yes, unfortunately. ;____; We've talked about doing it in an
intermediate format, like YAML, which is slightly less painful, but....ugh.
Years ago I wrote a very badly-coded-but-essentially-functional tool called
Panino, which converted Markdown-to-JSON:
[https://github.com/gjtorikian/panino-docs/tree/master/lib/pa...](https://github.com/gjtorikian/panino-docs/tree/master/lib/panino/plugins).

I see that some people have picked up on that and run with it, too:
[https://github.com/apiaryio/mson](https://github.com/apiaryio/mson). So there
might be something there to explore.

2\. I've never looked into either. I know Swagger is used by several
companies, like Twitter, but API Doc is new to me.

------
afarrell
I'm curious how they write internal-facing documentation and how that affects
the development experience for new GitHub engineers.

Source diving through open source libraries, I've often wished for a
"spelunker's guide": a text file laying out where things were and what I
should read first to build a mental model I could use in understanding the
rest of the source. I'm currently trying to figure out what the best way is
for someone to write a spelunker's guide, especially if they've forgotten what
it's like to be a beginner.

~~~
gjtorikian
> I'm curious how they write internal-facing documentation and how that
> affects the development experience for new GitHub engineers.

I wish I could show you a sample, but I can't, because it's internal. ;)

Honestly, I think a lot of the engineering documentation started organically.
When you have a small team working on a feature, it's difficult to scale
explanations to the rest of the company. One day someone sits down and starts
writing all their thoughts out in Markdown, and just checks it into a _docs_
folder. That's it. It's easy-to-read, short on code, and usually full of
ASCII, like this:
[https://i.imgur.com/KTbyhyq.png](https://i.imgur.com/KTbyhyq.png)

Writing documentation is the best way to get outside contributors involved
with minimal investment on your part. It also forces you to try and explain
what you've built.

If you can't pretend to go back and look at things like a beginner, grab
someone unfamiliar with the project, and have them describe to you what they
would expect, and how they think they should proceed. They may be able to
provide you with insights on what needs to be described.

------
jondot
I'm planning to build a stack for internal company domain knowledge, and I've
been thinking about middleman
([http://middlemanapp.com](http://middlemanapp.com)) instead of Jekyll.

Middleman has impressive workflows and markdown processing (I'm guessing
parallel to that of the Github/Jekyll solution or better). Also, conrefs can be
implemented by simple partials (which means less contention over the probably
huge conref file).

Though I have to be convinced by trying the Github/Jekyll stack, this does
open my mind regarding Jekyll 2.0. I'm happy to see Github tell us their
Jekyll story :)

~~~
mtmail
you probably mean [https://middlemanapp.com/](https://middlemanapp.com/)

~~~
jondot
Much thanks, fixed :)

------
waldir
Unfortunately the repository (as suggested by the screenshot[1]) seems to be
private: [https://github.com/github/help-docs](https://github.com/github/help-docs)

I assume that's because they may be documenting upcoming features before they
are announced.

1\.
[https://cloud.githubusercontent.com/assets/64050/5449088/7ad...](https://cloud.githubusercontent.com/assets/64050/5449088/7adf83be-84a3-11e4-8c41-1b3448a2f7df.png)

~~~
fidz
I wonder if they host everything on their own site (Github.com) or on their own
Github Enterprise site, which is inaccessible from outside the network.

------
Animats
Github's convention that web pages for a project are in a different branch of
the same project is kind of strange.

Also, those things they call "conrefs" are just "macros".

~~~
gjtorikian
I think a macro implies something that can be executed, and (rightly) ought to
cause security-minded folks to double-take.

Conref isn't something we invented, it's straight out of DITA:
[http://dita.xml.org/arch-conref](http://dita.xml.org/arch-conref)
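
To illustrate the distinction, here is a small sketch of content reuse as pure text substitution, with nothing executed. The `{{ conref.NAME }}` syntax and the snippet names are invented for this example; they are neither DITA's syntax nor necessarily what the GitHub docs use:

```python
import re

# Hypothetical reusable snippets, keyed by name.
CONREFS = {
    "settings_menu": "the Settings menu",
    "save_button": "the Save button",
}

# Matches placeholders such as {{ conref.settings_menu }}.
PATTERN = re.compile(r"\{\{\s*conref\.([A-Za-z0-9_]+)\s*\}\}")


def expand(text, conrefs):
    """Replace each placeholder with its snippet in a single pass.

    Nothing is evaluated and the result is not rescanned, so a snippet
    cannot trigger further expansion.
    """
    def lookup(match):
        key = match.group(1)
        if key not in conrefs:
            raise KeyError(f"unknown conref: {key}")
        return conrefs[key]

    return PATTERN.sub(lookup, text)


if __name__ == "__main__":
    print(expand("Open {{ conref.settings_menu }}, then click {{ conref.save_button }}.", CONREFS))
```

An unknown key raises an error instead of silently leaving the placeholder in the published page.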

~~~
snogglethorpe
> _I think a macro implies something that can be executed_

That isn't true... Traditionally a macro just refers to a substitution, maybe
(but not necessarily) with parameter replacement, rescanning, etc. I'd say
that lisp-style macros which can execute arbitrary code are actually rather
rare historically....

------
bostonvaulter2
Isn't the three-second page load they list on the slow side?

~~~
gjtorikian
I wasn't super thrilled with it either, until I dug in and discovered that
it's the global average. We have a ton of international and mobile traffic,
which factors into this sum:

1\. United States (average load: 1.97 s)
2\. United Kingdom (average load: 2.29 s)
3\. India (average load: 7.48 s)
4\. China (average load: 12.11 s)

The unweighted average load across these four countries is about 6 seconds,
which seems absolutely horrid...until I tell you that the US has about seven
times more traffic than the UK.

I didn't want to fudge the graph and take out those slow outliers--the truth's
the truth.
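
The arithmetic behind that point can be sketched as follows. The load times and the 7:1 US-to-UK traffic ratio come from the figures above; no traffic weights were given for India or China, so the weighted example is limited to the US and the UK:

```python
# Average load times in seconds, as listed above.
LOADS = {
    "United States": 1.97,
    "United Kingdom": 2.29,
    "India": 7.48,
    "China": 12.11,
}


def unweighted_mean(values):
    """Plain average: every country counts equally, regardless of traffic."""
    values = list(values)
    return sum(values) / len(values)


def weighted_mean(pairs):
    """Average of (value, weight) pairs, where weight is relative traffic."""
    pairs = list(pairs)
    total_weight = sum(weight for _, weight in pairs)
    return sum(value * weight for value, weight in pairs) / total_weight


if __name__ == "__main__":
    # Treating all four countries equally gives roughly 6 seconds.
    print(round(unweighted_mean(LOADS.values()), 2))  # 5.96

    # Weighting the US 7:1 against the UK pulls the figure toward the US time.
    us_uk = [(LOADS["United States"], 7), (LOADS["United Kingdom"], 1)]
    print(round(weighted_mean(us_uk), 2))  # 2.01
```

A global average is weighted by traffic in the same way, which is why it can sit well below the plain per-country average even with slow outliers included.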

------
bcRIPster
Yo dawg!

~~~
bcRIPster
Aw, negative points? Really? You know you were thinking it when you saw the
link title.

~~~
scrollaway
If everyone was already thinking it, what makes you think they want to see it
in the comments?

~~~
bcRIPster
Well, given that I imagine only mods can set negative value to a post, I'm
going to infer that the mods didn't like seeing it in the comments.

You know, sometimes it's ok to find humor in things.

~~~
dang
Comment scores decrease when users downvote them. Moderators don't give
comments negative scores.

~~~
bcRIPster
I now understand how the system works after someone else explained it. But
thank you for the feedback.

------
forrestthewoods
GitHub Pages is one of the most shocking hacks I've ever come across. Not the
worst mind you, just the most shocking. Most of GitHub is clean and good. But
making a magic gh-pages branch is simply horrific. I'm still somewhat
dumbfounded that's the best method they could come up with.

------
SEJeff
This title alone is gitception!

~~~
myared
If you like this, also see the talk titled "How Github uses Github to build
Github".

[http://zachholman.com/talk/how-github-uses-github-to-build-g...](http://zachholman.com/talk/how-github-uses-github-to-build-github/)

