

GitHub open sources Linguist - DanielRibeiro
https://github.com/blog/881-linguist

======
kbd
Linguist is GitHub's language identifier that determines which syntax
highlighter should be used when you view a file. Some excerpts from the docs:

> Most languages are detected by their file extension. This is the fastest and
> most common situation. For script files, which are usually extensionless, we
> do "deep content inspection"™ and check the shebang of the file. Checking
> the file's contents may also be used for disambiguating languages. C, C++
> and Obj-C all use .h files. Looking for common keywords, we are usually able
> to guess the correct language.
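
To make the order of checks concrete, here is a rough sketch of that detection flow in Python. It is not Linguist's actual code (Linguist is written in Ruby). The lookup tables, the keyword patterns, and the `guess_language` / `language_from_shebang` names are all made up for illustration, and the "fall back to C for an unrecognized .h" rule is my assumption.

```python
import os
import re

# Hypothetical lookup tables for illustration; not Linguist's real data.
EXTENSIONS = {".py": "Python", ".rb": "Ruby", ".c": "C",
              ".cpp": "C++", ".m": "Objective-C"}
INTERPRETERS = {"python": "Python", "ruby": "Ruby",
                "sh": "Shell", "bash": "Shell"}

# Assumed keyword hints for the ambiguous .h case, checked in order.
HEADER_HINTS = [
    ("Objective-C", re.compile(r"@interface|@end|#import")),
    ("C++", re.compile(r"\bclass\b|\bnamespace\b|\btemplate\s*<")),
]


def language_from_shebang(content):
    """Return a language based on the shebang line, or None."""
    lines = content.splitlines()
    first = lines[0] if lines else ""
    if not first.startswith("#!"):
        return None
    tokens = first[2:].split()
    if not tokens:
        return None
    interpreter = os.path.basename(tokens[0])
    # Handle "#!/usr/bin/env python" style shebangs.
    if interpreter == "env" and len(tokens) > 1:
        interpreter = tokens[1]
    return INTERPRETERS.get(interpreter)


def guess_language(filename, content=""):
    """Extension first, then keywords for .h, then shebang if extensionless."""
    _, ext = os.path.splitext(filename)
    ext = ext.lower()
    if ext == ".h":
        for language, pattern in HEADER_HINTS:
            if pattern.search(content):
                return language
        return "C"  # assumed default when no hint matches
    if ext in EXTENSIONS:
        return EXTENSIONS[ext]
    if not ext:
        return language_from_shebang(content)
    return None
```

For example, `guess_language("deploy", "#!/usr/bin/env python\n")` goes through the shebang path, while `guess_language("foo.h", "@interface Foo\n@end\n")` is decided by the keyword hints.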

> The actual syntax highlighting is handled by our Pygments wrapper, Albino.
> Linguist provides a Lexer abstraction that determines which highlighter
> should be used on a file.

It also provides other features like generating stats on a repository.

~~~
john2x
So in order to have syntax highlighting for a language, you have to fork
Pygments, write the lexer, issue a pull request, and then fork Linguist,
add the new language/lexer, and issue another pull request?

