
A Scholarly Markdown - droque
http://www.scholarlymarkdown.com
======
lorddoig
And for everything else, there's AsciiDoc.[0]

For an extra five minutes learning you get a boatload (think container ship)
more features[1] - it compiles to DocBook: a mature, _actually standardised_ ,
highly structured format, and from that you get HTML, EPUB, PDF, slideshows,
and man pages for free.[0] For math you get MathML, ASCIIMath, and LaTeX
(along with a number of ways to render them.) It has a super nice syntax, is
equally good at little docs and huge books, and you could theoretically write
a proper academic paper in it with the LaTeX backend. And you _always_ know
what's going to happen when you try to mix bold and italic...

Also endorsed by Linus.[2]

    
    
        [0]: http://www.methods.co.nz/asciidoc/#_overview_and_examples
        [1]: Well, five minutes to be able to do everything Markdown can do; everything else will take a bit longer
        [2]: https://plus.google.com/+LinusTorvalds/posts/X2XVf9Q7MfV (comments)

~~~
treerex
I have yet to see a good stylesheet for AsciiDoc's DocBook output for creating
a decent PDF. Do you have recommendations? I love AsciiDoc, but am only using
it to generate HTML right now.

~~~
boundlessdreamz
As @jauco said try Asciidoctor -
[http://asciidoctor.org/](http://asciidoctor.org/)

In addition to default asciidoctor style, there are other themes available -
[http://themes.asciidoctor.org/preview/](http://themes.asciidoctor.org/preview/)
(See bottom right for theme switcher).

You can also take a look at Pro Git book styles - [http://git-
scm.com/book/en/v2](http://git-scm.com/book/en/v2) which was written in
asciidoc -
[https://github.com/progit/progit2](https://github.com/progit/progit2)

~~~
e12e
Thank you so much for the pointer to the git-scm-book (in the context of
asciidoc/pdf output etc). Sadly the pdf version isn't exactly well laid out.
It suffers from the same issues as a lot of html-to-pdf-based tools (although
I assume it's sgml-to-pdf in this case?) -- horrible breaks, and it also
"feels" wrong wrt. some spacing/etc. Generally standard LaTeX will look (much)
better than this without tweaking, IMNHO.

On the other hand, they also have an epub style sheet, and it appears (a
little oddly) that the layout of the epub is better than the pdf.

FWIW most "heavily optimized" custom LaTeX styles I've come across tend to
feel like being slapped in the face with MS Word -- and I think I've yet to
encounter _any_ that actually improve on the "standard" styles in any
meaningful way (with possible exception of the APA style, which is ugly, but
as it has to conform to APA, it's ugly by design. And looks better than most
other APA conforming styles I've seen).

Still, having a starting point makes the job much easier -- so this is a great
resource.

~~~
tjl
My supervisor has been writing a book (for many years now) and he has a
heavily customized LaTeX style that actually works quite well for his book. He
has a lot of special needs for formatting so none of the default styles really
worked for him. It's actually a pretty well done dynamics textbook. I've been
looking forward to the actual final published text.

~~~
e12e
You should ask him if he's considered publishing the styles under a Free/open
license. Even if parts of it are very specialized, I'd be surprised if it
couldn't work as an interesting starting point for other projects.

~~~
tjl
Next time I talk to him, I'll see about it. I don't talk to him that often now
that I've finished my PhD.

------
rhythmvs
John MacFarlane (Pandoc’s author) is heavily committed to CommonMark, the
standardization effort for Markdown (in fact he’s the principal designer of the
CommonMark spec).

CommonMark and Pandoc serve different purposes: the first is an
initiative to counter fragmentation/balkanization of the Markdown ecosystem,
and as such has to reckon with backward compatibility, consensus and
adoption. The latter is a document conversion library, which, by design, needs
to reckon with interchangeability between formats and may hence be hampered by
the lowest common denominator as regards feature support. Internally, Pandoc
keeps an Abstract Syntax Tree (accessible in JSON format), and defaults to its
own flavour, Pandoc Markdown (a Markdown superset with additional content
element types).
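
To make the AST point concrete, here is a minimal Python sketch that walks a hand-written document in the JSON layout that recent pandoc versions emit with `-t json`. The sample document and the `count_types` helper are made up for illustration, and the JSON layout has changed between pandoc releases, so treat the exact keys as an assumption rather than a stable contract.

```python
import json
from collections import Counter

# Hand-written sample, assumed to match the shape of `pandoc -t json` output
# from recent pandoc releases (older releases used a different layout).
SAMPLE = """
{"pandoc-api-version": [1, 22],
 "meta": {},
 "blocks": [
   {"t": "Para", "c": [
     {"t": "Str", "c": "Energy"},
     {"t": "Space"},
     {"t": "Math", "c": [{"t": "InlineMath"}, "E = mc^2"]}
   ]}
 ]}
"""

def count_types(node, counts=None):
    """Recursively count every element that carries a "t" (type) tag."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        if "t" in node:
            counts[node["t"]] += 1
        for value in node.values():
            count_types(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_types(item, counts)
    return counts

doc = json.loads(SAMPLE)
counts = count_types(doc["blocks"])
print(dict(counts))
```

Everything a converter or filter does to a document happens on a tree like this one, which is why a change to the AST (as in a fork) is a bigger deal than a change to the surface syntax.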

Both the CommonMark community and John MacFarlane have made it clear their
first and foremost focus is on standardization, not so much on extending the
feature set. Yet scholars and technical writers are in dire need of
something more heavyweight than the rather small set of features offered by
CommonMark implementations or Pandoc Markdown. Hence the Scholarly Markdown
initiative and the scholdoc reference implementation (a Pandoc fork).

More on how Scholarly Markdown came about can be read on the blog of one of
its pacemakers, Martin Fenner; e.g.
[http://blog.martinfenner.org/2013/11/17/the-grammar-of-
schol...](http://blog.martinfenner.org/2013/11/17/the-grammar-of-scholarly-
communication/), [http://blogs.plos.org/mfenner/2012/12/18/additional-
markdown...](http://blogs.plos.org/mfenner/2012/12/18/additional-markdown-we-
need-in-scholarly-texts/)

~~~
fiddlosopher
Although I'm involved in both these enterprises, let's not confuse their
goals.

CommonMark is currently focused on the task of giving a decent spec for the
core syntax and robust, efficient implementations; extensions will wait until
that project is done, but certainly aren't ruled out.

Pandoc has _always_ been in the game of extending the feature set. Here are
just some of the Markdown extensions pandoc supports: LaTeX math (which can be
rendered in a variety of formats, including native Word and MathML), LaTeX
macros, inline LaTeX, automatically numbered examples and cross-references to
these, automatically generated citations (using CSL styles), super and
subscripts, strikeout, figures, YAML metadata, definition lists, several
styles of tables, fenced code blocks with syntax highlighting, header
identifiers, and footnotes. scholdoc just adds a few things on top of all this
(and many of them could be implemented in pandoc filters). As noted in one of
the other comments on this thread, most of the features scholdoc adds are
under active discussion in pandoc as well. So it's not that pandoc and
scholdoc have different aims; pandoc just moves more slowly, because it has to
worry about how features are implemented in many more output formats, and it
operates under some other constraints that scholdoc rejects (e.g. trying to
avoid the use of English words like "Figure" for syntax cues).

------
_-_-_-_
I'm still skimming the documentation so I apologize if the answer is obvious,
but does anyone know offhand why this is implemented as a fork of Pandoc?
Pandoc already has extended markdown features and the creator of pandoc is
very much an academic
([http://johnmacfarlane.net/](http://johnmacfarlane.net/)), so is there a
reason why these contributions aren't part of pandoc proper?

~~~
aaren
This is a fork because making these changes to pandoc itself needs a lot of
consideration.

Internal referencing and attributes on figures are two things that are
currently being discussed for pandoc. The discussion has been going on for
quite a while though - hence people making forks.

Discussion on internal referencing:
[https://github.com/jgm/pandoc/issues/813](https://github.com/jgm/pandoc/issues/813)

Discussion on image attributes:
[https://github.com/jgm/pandoc/issues/261](https://github.com/jgm/pandoc/issues/261)

~~~
_-_-_-_
Thank you for the information.

This raises a couple more questions for me.

First, when searching I was able to find Martin Fenner's very interesting blog
posts about ideas for a "Scholarly Markdown" and, as those issues and the
first link on the scholarlymarkdown.com site reference, he appears to be
associated with a separate "scholmd" project, also called "Scholarly
Markdown," which is apparently a related project that itself is a fork of the
Python markdown science project:

scholmd:

[http://scholmd.org/](http://scholmd.org/)

[https://github.com/scholmd](https://github.com/scholmd)

Markdown Science:

[https://github.com/karthik/markdown_science](https://github.com/karthik/markdown_science)

However, it's unclear what all of the relationships are between all of these
projects and forks.

Secondly, since some (or all?) of the changes are being discussed in the
Pandoc issue tracker, are these changes intended to be submitted to Pandoc in
pull requests? I don't currently see any.

~~~
timtylin
Note: I'm the maintainer of this project

The series of blog posts by Martin (and his efforts with John in getting
citations to work in Pandoc) was the impetus for this project. I reached out
to Martin several months ago for comments, but I've not heard from him since.
I guess he's very busy with his day job at PLOS. If he's willing, I'd very
much like to reconcile this project with his efforts. The goal is, after all,
better authoring workflows for all academics compared to the status quo, and
it's going to take some concerted effort to get us all out of this giant
energy well we've had going for a few decades now.

~~~
mfenner
Scholarly Markdown is very much a group of like-minded people, and we had a
workshop with lots of good discussions in June 2013
([http://blog.martinfenner.org/2013/06/17/what-is-scholarly-
ma...](http://blog.martinfenner.org/2013/06/17/what-is-scholarly-markdown)).
What it has not been until the recent effort by timtylin is a specific set of
tools, or spec.

Everyone seems to have an opinion on how to do this right, and that is part of
the reason why the whole concept is pretty fragmented. Some of my thoughts:

Pandoc is the markdown converter that comes closest to what most people need,
so I am happy to stick with it. I personally don't think that a fork is
viable, things are already hard enough as it is.

Scholarly markdown is a solution for 80% of use cases, people writing math-
heavy texts are probably better off sticking with LaTeX.

Scholarly markdown needs to be a community effort, I don't see any other way
on how this can succeed.

~~~
timtylin
Hi Martin! Thanks for dropping by.

> I personally don't think that a fork is viable, things are already hard
> enough as it is.

I don't think so either. Scholdoc as a fork was always intended to be a
stop-gap measure to quickly test out ideas. Pandoc's use of relatively standard
Parsec makes it easier to hack, and lots of other subsystems like citeproc
remain crucial. Scholdoc changes Pandoc's AST, so any discussion of
re-integration is going to be a non-starter until at least 2.0.

For this kind of workflow to be viable, 95% of the required effort is not
going to be on the syntax/converter anyways. The _real_ hard work is still
ahead.

> Scholarly markdown is a solution for 80% of use cases, people writing math-
> heavy texts are probably better off sticking with LaTeX.

I agree, except I also think that there can be an 80% situation for math. I
work with a lot of applied mathematicians/electrical engineers, and the math
system in Scholdoc is designed with them in mind.

I really think that the ultimate goal is to arrive at many good ways (of which
this may be one) to produce a semantically relevant open interchange format
such as JATS. I assume this is what PLOS is trying to achieve as well? I do know
that several people at PLOS are _vehemently_ opposed to Markdown and what it
stands for.

> Scholarly markdown needs to be a community effort, I don't see any other way
> on how this can succeed

Definitely. The best we can hope for is to stir this pot once in
a while and hopefully something will spontaneously nucleate once the time is
right.

------
ivan_ah
A similar effort of introducing math/figures/refs into markdown is the
`softcover markdown` syntax. It's basically markdown, with latex commands
allowed:

[https://raw.githubusercontent.com/softcover/softcover_book/m...](https://raw.githubusercontent.com/softcover/softcover_book/master/chapters/softcover_markdown.md)

The beautiful idea in markdown is that it allows you to mix (non-container)
HTML tags in with the .md and it just works. Softcover markdown is in the same
vein: it keeps the more readable markdown for the main copy and lets you
intermix LaTeX tags as required. Beautiful if you ask me. Or at least beautifuler than
```math ... ```math.

The "backward compatibility" of ScholarlyMarkdown with basic markdown is a
cool feature, as many tools/platforms exist that "support" .md now, but to
preview you'll still need something that renders the equations, so strictly
speaking ScholarlyMarkdown is a new markup language.

------
ciroduran
It seems that Markdown is starting to get a lot of disparate reference
implementations. Some months ago some important Markdown users got behind
CommonMark ([http://commonmark.org/](http://commonmark.org/)). I'd think that
it would be best to join this conversation rather than splintering Markdown
even more.

~~~
netheril96
I'm in fundamental disagreement with the principle behind CommonMark. They
prioritize standardization over practical usability, and for that reason
people _will_ keep splintering markdown.

For example, they do not plan to add syntax highlighting blocks (the ```some
code``` on GitHub) to their implementation, because they believe that it is
outside the scope of markdown. Then, because a lot of people actually _need_
this feature, they still have to patch or extend or plug-in the functionality
into any implementation of CommonMark, leading to fragmentation again. And
frankly, the reason that I started writing more markdown is precisely because
of the syntax highlighting ability. Oh, and for writing scientific articles,
missing math formulas are a deal-breaker.

They want to have a standard, unambiguous syntax specification, a suite of
comprehensive tests and a cleanly implemented parser. They want to unify the
community of markdown users and developers. All of those are commendable
goals. But at the end of the day, it doesn't satisfy my needs, so I'd rather
use a messy, poorly specified markdown flavor or even just render the markdown
with GitHub's service.

~~~
fiddlosopher
"For example, they do not plan to add syntax highlighting blocks (the ```some
code``` on GitHub) to their implementation, because they believe that it is
outside the scope of markdown."

To correct the record, fenced code blocks have been there from the beginning:
[http://spec.commonmark.org/0.18/#fenced-code-
blocks](http://spec.commonmark.org/0.18/#fenced-code-blocks).

~~~
netheril96
I guess much has changed since I last looked at CommonMark and perhaps the
developers are pragmatic after all. I stand corrected and will try out their
implementations now.

------
JetSpiegel
Why? This really feels like cramming a square peg into a round hole. Anytime
anyone wants something more complex than lightly formatted text they
will have to use another tool (LaTeX), so why not just use LaTeX in the first
place?

~~~
droque
The way I see it, the main feature is being able to plug math formulas in,
rather than formatting documents. It allows you to focus on your content (one
feature of Markdown) while still being able to write mathematics.

~~~
mturmon
"It allows you to focus on your content (one feature of Markdown) while still
being able to write mathematics."

That's what LaTeX does too. Its markup, generally, is semantic.

~~~
sjy
In my experience LaTeX is quite poor at separating content from presentation,
at least when you're unwilling to rely on the default styles. All the layout
commands are context-sensitive, so I often end up writing presentation code in
the middle of my content, or accidentally breaking the layout by moving
content around.

~~~
mturmon
You should be putting your alternate style in a .cls file and not inline. If
you are super picky about format, then none of these tools are for you.

------
stared
This thing is certainly needed (vide this discussion:
[https://hackpad.com/New-scientific-markup-language-
utAjFcYuv...](https://hackpad.com/New-scientific-markup-language-
utAjFcYuvvB)), but what would be really convincing is examples (or am I
missing them?).

As a side note, a lot of Markdown + LaTeX + Code can be done in IPython
Notebook. (Though, there are some things absent, like referencing citations or
other equations).

~~~
lorddoig
I read your link and, per my main comment, I'm pretty sure AsciiDoc answers
99% of your needs.

[http://www.methods.co.nz/asciidoc/#_overview_and_examples](http://www.methods.co.nz/asciidoc/#_overview_and_examples)

~~~
stared
Cool! Any examples of using AsciiDoc for math or scientific notes? (I.e. with
formulae, references...)

~~~
lorddoig
Well here's some formulae:

    
    
        http://www.noteshare.io/section/the-fundamental-class-of-projective-space
        http://asciidoctor.org/docs/user-manual/#using-multiple-stem-interpreters
        http://www.methods.co.nz/asciidoc/latex-filter.html
        http://dblatex.sourceforge.net/example/dblatex/example_mathml.pdf
    

As for references, it has its own lightweight bibliography system out of the
box, but there's a plugin[0] for BibTeX too, and DocBook has full-on support
for BibTeX so it's just a matter of tooling. AsciiDoc gives you DocBook, and
DocBook gives you pretty much everything.[1] The whole thing is completely
extensible at multiple levels (macros, XSL stylesheets), so adding any
essential features it doesn't already have is certainly much simpler than
starting from scratch!

    
    
        [0]: https://github.com/petercrlane/asciidoc-bib
        [1]: http://pub.hdcrd.com/kb/Dev/Documention/LaTeX/Tool/Dblatex%20%28DocBook%20to%20LaTeX%20Publishing%29/0.3/manual.pdf

------
dllu
I created a similar thing for my personal website:

[http://www.dllu.net/programming/dllup/](http://www.dllu.net/programming/dllup/)

which handles math using svgtex instead of clientside MathJax for faster
rendering. It also compiles to both html5 and LaTeX. However, dllup is overall
less polished, missing some features (labelling equations, sections).

------
samatman
Interesting to see this come up. I've been migrating a couple projects to
babel, the emacs org-mode literate programming style. Starting with the hello-
world of babel, my emacs config file.

Babel and Org have clunky syntax but the mode takes care of that. The
combination isn't perfect but it's very powerful, in particular the ability to
chain several languages together in flexible ways.

Coming up with more ways to combine pretty formatting with syntax highlighting
is polishing a pretty smooth surface. I'm dreaming of a really slick syntax
and operating environment, like babel if it wasn't tacked onto org. In the
meantime, using what we have.

------
nikdaheratik
A lot of interesting work, judging from the site, but this still feels like
reinventing the wheel to me. In addition to AsciiDoc, there's MultiMarkdown
([http://fletcherpenney.net/multimarkdown/features/](http://fletcherpenney.net/multimarkdown/features/)
feature list is nearly identical), and probably another 3-4 that I haven't
come across.

Of course, having spent the past few months writing an app that basically
reinvents this same wheel
([http://www.eqeditor.com/writer/](http://www.eqeditor.com/writer/)) I suppose
I'm in good company.

~~~
nikdaheratik
Another thing I'm curious about is how the ScholarlyMarkdown authors would go
about trying to get papers formatted in this way to go into a database like
PubMed, but that's pretty far down the track, and not addressed by any of the
other solutions either.

~~~
timtylin
The goal is eventually to be able to, like what Authorea is trying to do now,
just "slap on" various LaTeX style templates during final rendering (similar
to how static blog engines format plain md posts using templates). It's going
to be PAINFUL but I think it is eventually doable with enough elbow grease.
Markdown is a nice starting point because the features are so spartan that you
actually have _some_ hope of finding the "greatest common denominator"
document model that can pretty much map injectively to all the major journal-
specific LaTeX document-classes.

I'm imagining some kind of online component like ShareLaTeX that will become a
clearinghouse for a number of tested and proven conversion paths, and can
handle compiling a ScholarlyMarkdown document to different formats. This
project won't go anywhere if this can't at least be done for major houses like
Elsevier, PNAS, Phys Rev, etc.

------
xiaq
If you want to write scientific documents with markdown, be aware that pandoc
already supports inline TeX (math with $$ and TeX commands starting with \)
when doing md -> latex conversion. That has been my setup for quite some time.

------
j2kun
If you want me (as an academic) to use ScholarlyMarkdown, then you can't force
my collaborators to use ScholarlyMarkdown as a consequence.

There should be a nice ScholarlyMarkdown -> LaTeX cross compiler for starting
simple documents in TeX and then sharing with collaborators (pick some obvious
defaults or allow a config file to get fancy). But more importantly, if I am
joining a project that already has a bunch of LaTeX wizardry going on, I
should be able to seamlessly and implicitly edit the text parts in
ScholarlyMarkdown without my collaborators knowing.

Can ScholarlyMarkdown do this? If not then I'm not really interested.

~~~
timtylin
I've actually been wondering if this is possible from a theoretical
standpoint. I'm thinking you can use Pandoc's LaTeX to MD conversion mode,
save changes to a copy, wdiff to get the total change set, then somehow
convert that into a LaTeX patch. Whether you can guarantee that this won't
break anything is going to be a really challenging problem, as my head is
already swimming with edge-cases. I guess we'll know the day that some genius
comes up with automatically generated non format-breaking Critic Markup diffs.
At least the problem should be easier than LaTeX --- look how long it took for
the latexdiff tool to be what it is today, and it _still_ breaks often when
you have anything slightly exotic.
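
Just to pin down the "wdiff to get the total change set" step, here is a rough Python sketch. It uses the standard library's `difflib` as a stand-in for wdiff, and the `word_changes` helper and the two sample sentences are made up for illustration. It only finds the changed word runs in the Markdown copy; mapping those runs back onto the original LaTeX source, which is the hard part, is not attempted.

```python
import difflib

def word_changes(old_text, new_text):
    """Return (op, old_words, new_words) for every changed run of words."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only insert / delete / replace runs
            changes.append((op, old_words[i1:i2], new_words[j1:j2]))
    return changes

# Hypothetical "before" and "after" text of the converted Markdown copy
old_md = "The result follows from Theorem 1 and a simple argument."
new_md = "The result follows from Theorem 2 and a short counting argument."

for change in word_changes(old_md, new_md):
    print(change)
```

Splitting on whitespace throws away line breaks and indentation, which is fine for prose but is exactly where the edge cases start once math or environments are involved.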

I'm currently using a workflow in my thesis where I use ScholarlyMarkdown to
write individual chapters for final inclusion into an existing LaTeX book
document. I find that ScholarlyMarkdown works quite well this way, and it
potentially allows collaborators, since individual parts are isolated.

~~~
j2kun
In any case, my point is plain and simple: I collaborate in almost everything
I do, and my collaborators are my primary concern when I choose a tool. If
they use notepad and Dropbox, I need to make sure my tools don't conflict with
that. There's no way in hell I would ask a collaborator to learn to use git or
a new markup language just to work with me.

I'd love a tool that works with this philosophy, and I feel certain anything
like ScholarlyMarkdown won't catch on in my field (theoretical computer
science) without such tools.

~~~
timtylin
As a thought exercise, assuming for simplicity that you can isolate the parts
you're authoring, so that your changes only involve additions to other
people's work: what kind of additional metainfo would you need for this
workflow? All defined labels and macros? Available bib entries? What else?

~~~
j2kun
I think it depends on how I'm viewing the file. If I'm just editing in SM and
building the tex as usual, then I don't need anything extra. If I'm trying to
convert to all sorts of other documents (which I haven't ever really done)
then labels, theorem environments, one-line macros, and bib entries cover
almost every tex-specific thing I use.

Here is an excerpt from a typical paper's macro section [1]. As you can see
they're mostly one-liners to remove the need to keep typing textup and mathbb,
simple mathoperator definitions and such.

[1]: [http://pastebin.com/RL1gejEZ](http://pastebin.com/RL1gejEZ)

~~~
timtylin
I think a good way to build this into SM is to run an existing document
through latex and look at the aux file. This approach would be faster and
almost certainly more robust, similar to how most latex build scripts look at
the fls recorder file for the list of external includes.

The only thing you can't get will be the user-defined macros (and of
course bib entries that don't already exist), but there is already a
consistent mechanism to define your own macros in SM via the "math_def" block.
They "do the right thing" in the sense that if you render to latex snippets
then it wouldn't redundantly include these macros in the output.
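
As a rough illustration of the aux-file idea, the Python sketch below pulls defined labels and cited bib keys out of aux-style text. The sample text is hand-written to resemble what LaTeX typically writes (`\newlabel` and `\bibcite` lines), and `read_aux` is a made-up helper, not part of SM or scholdoc. Packages such as hyperref add extra fields to `\newlabel`, so the regexes only read the first two.

```python
import re

# Hand-written sample resembling a typical .aux file (an assumption, not
# output captured from a real build).
AUX_SAMPLE = r"""
\relax
\citation{knuth84}
\newlabel{sec:intro}{{1}{1}}
\newlabel{eq:energy}{{2.1}{3}}
\bibcite{knuth84}{1}
"""

# \newlabel{<label>}{{<number>}{<page>}...}
NEWLABEL = re.compile(r"\\newlabel\{([^}]*)\}\{\{([^}]*)\}\{([^}]*)\}")
# \bibcite{<key>}{<number>}
BIBCITE = re.compile(r"\\bibcite\{([^}]*)\}")

def read_aux(text):
    """Return (labels, bib_keys) found in aux-file text."""
    labels = {
        m.group(1): {"number": m.group(2), "page": m.group(3)}
        for m in NEWLABEL.finditer(text)
    }
    bib_keys = [m.group(1) for m in BIBCITE.finditer(text)]
    return labels, bib_keys

labels, bib_keys = read_aux(AUX_SAMPLE)
print(sorted(labels))
print(bib_keys)
```

With that list in hand, an editor or converter could offer the existing labels and bib keys for completion without parsing the collaborators' TeX source at all.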

------
bshimmin
I wonder how long it'll be before Gruber complains about the name.

------
Skywing
Part of the beauty of markdown, imo, is its simplicity. If I ever have a
question about markdown syntax, I can glance at the original docs and have the
answer in less than 30 seconds.

------
0942v8653
Looks like an interesting project. I may be missing something, but I think in
terms of math especially it would be hard to use Markdown instead of something
more robust like LaTeX which supports proper mathematical notation.

Edit: yes, I was missing something. Scholdoc supports MathJax (which has
browser, Office, and LaTeX support).

Small issue: The title isn't showing up in Firefox (or Chrome) because there
are two title elements and the first one is blank.

------
mitchi
I actually like the HTML/CSS solution too.
[http://thomaspark.me/2015/01/pubcss-formatting-academic-
publ...](http://thomaspark.me/2015/01/pubcss-formatting-academic-publications-
in-html-css/)

It's more difficult but I feel like there's a lot more potential.

~~~
timtylin
Yep, that's the idea. Published work today should be able to take advantage of
modern screens and their ability to resize, reflow, and animate.

------
zzleeper
This looks very interesting but I have a problem with this. I think MOST of
these features could have been built just with clever uses of Pandoc filters.
That would mean i) it's fully compatible and ii) it's very easy to extend.
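
For a sense of what such a filter could look like, here is a minimal Python sketch that rewrites fenced code blocks carrying a `math` class into display-math paragraphs. It works on the JSON AST layout of recent pandoc versions (an assumption, since the layout has changed over time), the function name and sample document are made up, and this is not how scholdoc itself implements its math blocks.

```python
import json

def math_blocks_to_display_math(node):
    """Return a copy of the AST with `math` code blocks turned into display math."""
    if isinstance(node, list):
        return [math_blocks_to_display_math(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            # CodeBlock content is [[identifier, classes, key-values], code]
            (identifier, classes, attrs), code = node["c"]
            if "math" in classes:
                math = {"t": "Math", "c": [{"t": "DisplayMath"}, code]}
                return {"t": "Para", "c": [math]}
        return {k: math_blocks_to_display_math(v) for k, v in node.items()}
    return node

# Hand-written sample: one `math` code block and one ordinary code block
doc = {
    "pandoc-api-version": [1, 22],
    "meta": {},
    "blocks": [
        {"t": "CodeBlock", "c": [["", ["math"], []], "x^2"]},
        {"t": "CodeBlock", "c": [["", ["python"], []], "print(1)"]},
    ],
}

converted = math_blocks_to_display_math(doc)
print(json.dumps(converted["blocks"][0]))
```

To use this as an actual filter you would read the document with `json.load(sys.stdin)`, write the result with `json.dump(..., sys.stdout)`, and pass the script to pandoc's `--filter` option. Because the Markdown reader is untouched, the input stays ordinary Pandoc Markdown.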

------
2mur
Why not asciidoc?

------
smilekzs
Around 2 years ago I wrote a patch for chjj/marked that supports MathJax.
Despite heated discussion, the author never noticed and it wasn't merged. Just
a glimpse of how fragmented this markdown world has become.

------
kbd
I'm annoyed that there's still no way to underline in markdown.

------
samuell
My spontaneous reaction is: Why not use the existing MediaWiki syntax? It
has gone through the test of years (decades?) of real-life needs, has support
for (latex) formulae, everything you might need around images and references
etc, one of the most powerful template systems I've ever seen, and the list
goes on and on.

Additionally there is a very stable, performant and flexible reference
implementation implemented as a web app (mediawiki) with excellent
import/export in XML format (and the list goes on ...), versioning with
syntax-highlighted diff, etc etc etc.

What have I missed? :)

~~~
duskwuff
MediaWiki syntax is incredibly grody ('''bold''' / ''italic'', tables,
template logic, _etc_ ), and has exactly one implementation in common usage
(MediaWiki itself -- written in PHP, and horribly convoluted), and has no
formal specification. The diff feature you're referencing as a "syntax-
highlighted diff" is a simple diff of the source -- it's entirely ignorant of
formatting, and the XML export is similarly just an export of the raw source.

Nobody likes MediaWiki syntax, not even its own users. It's awful. The only
reason it's still used at Wikipedia is because there's too much content to
reasonably convert.

------
bambax
> _It is a fork of Pandoc and is build upon the same parsing engine._

=> built

