Why Markdown Sucks (2016) (joearms.github.io)
51 points by MilnerRoute on April 25, 2019 | 58 comments

For me it sucks because it ignores single newlines

Almost any place where I have to type markdown feels like it's plain text (typing a readme in an actual plain text editor, a git comment in vim, a plain text box like this one here in hacker news and reddit that also ignores newlines, ...), and plain text is supposed to listen to your newlines

But markdown will happily turn the two things you put on separate lines into a single line, ruin source code or ascii art unless you add special formatting, etc...

It should wrap long lines, but should listen to explicit newlines. The simplest text editor can, and does, do that.
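
For what it's worth, the behaviour asked for here (wrap long lines, keep every explicit newline) is easy to sketch. A minimal Python illustration, assuming a fixed column width; `wrap_keeping_newlines` is a made-up name and not part of any Markdown implementation:

```python
import textwrap

def wrap_keeping_newlines(text: str, width: int = 80) -> str:
    """Soft-wrap long lines but preserve every newline the author typed."""
    # Hypothetical helper for illustration: each source line is wrapped on
    # its own, so an explicit newline always survives as a line break.
    return "\n".join(textwrap.fill(line, width=width) for line in text.split("\n"))

print(wrap_keeping_newlines("first point\nsecond point that is rather long", width=20))
```

Note that this is the opposite of what Markdown does with a paragraph, where single newlines are treated like spaces.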

I'm old school. I like my source files to be at most 80 characters wide. So for plain text, yes a markup language should reformat paragraphs.

Pre-formatted text is different. But that usually requires a separate context to make sure that all space stays in place, not just newlines.

The problem is that in most cases where I have to use markdown, it's not a source file. E.g. if in a git comment I use a single newline to separate some points, it's often a surprise to then find them concatenated on a single line.

Like this very hacker news comment box here. I know it's not really markdown, but it also ignores single newlines. Why?

Markdown is not really presenting itself as being source code, or at least it's not being used as such in many places. If you want source code, why not use something more powerful than something that tries to look like plain text, except it alters your whitespace?

Also, in programming source code it's natural to try to keep it within 80 characters. But markdown is for writing natural language, isn't it a nightmare then to keep all lines 80 characters if you edit previous sentences or insert a word somewhere?

Markdown has a syntax for bullet lists, why not use that?

The nice thing about standards is that there are so many to choose from. It would be nice if there were a single light weight markup language.

In many contexts you need something where text can be reflowed. Hacker News is a good example. It would be bad if all comments with long lines would show up with scroll bars.

So if the output is going to be reformatted anyhow, then it is better to use a light weight markup language that provides a quick way to provide the most common rendering hints.

And at that point your input text is source.

> It would be bad if all comments with long lines would show up with scroll bars.

No it wouldn't. More newlines (by not ignoring single newlines) would make lines shorter on average, not longer.

It should of course auto wrap long lines (and plain text does), horizontal scrollbars suck.

So markdown only makes you choose between ignoring your single newlines or preformatted text with horizontal scrollbars but cannot combine best of both worlds? That makes it even worse.

> The problem is that in most cases where I have to use markdown, it's not a source file. E.g. if in a git comment I use a single newline to separate some points, it's often a surprise to then find them concatenated on a single line.

Git commit messages (I assume that's what you mean by comment) are conventionally fill-paragraph'd with the traditional 72/80 limit, so I would expect this reformatting to occur. Markdown has proper bulleted list support; use that. Or paragraphs. It also supports explicit line breaks (two trailing spaces), but those are usually hard to read.

> Like this very hacker news comment box here. I know it's not really markdown, but it also ignores single newlines. Why?

Well in no small part because it blows chunks.

> But markdown is for writing natural language, isn't it a nightmare then to keep all lines 80 characters if you edit previous sentences or insert a word somewhere?

1. markdown (and most other lightweight markups) grew from email conventions and conventionally formatted text files (check out the text RFCs)

2. in emacs, re-filling a paragraph is an M-q away

> But markdown is for writing natural language, isn't it a nightmare then to keep all lines 80 characters if you edit previous sentences or insert a word somewhere?

No, press M-q, done. But it is a nightmare if the lines are long, or additionally cut in the middle of a word.
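
For readers without emacs, here is a rough Python sketch of what re-filling a paragraph amounts to. It assumes paragraphs are separated by blank lines and uses a 72-column default; `refill` is a hypothetical name, and this is not how M-q is actually implemented:

```python
import textwrap

def refill(text: str, width: int = 72) -> str:
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    refilled = []
    for para in text.split("\n\n"):
        # Collapse single newlines and runs of spaces into one long line.
        joined = " ".join(para.split())
        # Never cut inside a word; an over-long word gets its own line.
        refilled.append(
            textwrap.fill(joined, width=width,
                          break_long_words=False, break_on_hyphens=False)
        )
    return "\n\n".join(refilled)

print(refill("one two three\nfour five\n\nsix seven", width=10))
```

After inserting a word mid-paragraph, running the whole text through something like this restores the column limit without touching paragraph breaks.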

Yes - to add to that, Markdown was based on the ways emails were traditionally formatted. Since breaks at 72-80 characters in that context don't reflect a new paragraph, Markdown helpfully reflows them when converting to a variable line width context. This is actually one of my favorite features of Markdown.

So much so that I wish email clients would treat incoming text emails as Markdown, reflowing and formatting them as needed.

I actually set max-width on Hacker News to 80 characters, so I get this (much easier to read) behavior everywhere. :-)

I really like that I can have the source file broken up over several lines but that it still defines a semantic "paragraph" that is rendered properly when converted to HTML or LaTeX.

If it respected single newlines, then I'd feel compelled to avoid newlines, but I'm trained too strongly on respecting column-formatted text files (though I don't stick religiously to the 72 or 80 char limit), so this would be a problem for me.

yes, it can be unintuitive, but single newline ignorance has its advantages too, for example one can make semantic line breaks[1] for easier diffability.

[1]: https://sembr.org/
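
To make the diffability point concrete, here is a small Python sketch that counts how many source lines a line-based diff would flag after a one-word edit, once with one sentence per line and once with the same text hard-wrapped. The example sentences, the 30-column width and the helper name `changed_line_count` are all invented for illustration:

```python
import difflib
import textwrap

def changed_line_count(old: str, new: str) -> int:
    """Count source lines a line-based diff would flag between two versions."""
    a, b = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )

# Made-up example text: one sentence per line, then a one-word edit.
old_sentences = [
    "Markdown joins wrapped lines.",
    "Writers can break after each sentence.",
    "Then a small edit touches one line.",
]
new_sentences = ["Markdown quietly joins wrapped lines."] + old_sentences[1:]

semantic = changed_line_count("\n".join(old_sentences), "\n".join(new_sentences))

# Same text hard-wrapped at 30 columns: the edit can push words across lines.
wrapped = changed_line_count(
    textwrap.fill(" ".join(old_sentences), width=30),
    textwrap.fill(" ".join(new_sentences), width=30),
)

print(semantic, wrapped)
```

Both source layouts render to the same paragraph, since single newlines are ignored; only the diff noise differs.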

It actually comes from plaintext emails. If you read emails like that a lot in a terminal you might agree a blank line between paragraphs will ease the reading.

Markdown is designed around the assumption that you are writing text for humans to read. Thus for a paragraph break, leave a blank line.

ITT people giving completely contradictory opinions on what features markdown should or should not have.

Markdown is (relatively) simple and opinionated. If you don't like it, use a different markdown flavor or different markup language. But don't expect other people to agree with you on the superiority of that markup language, because they probably would have wanted a different one.

This is an old post. After doing some quick searches I think this rant was a bit premature. I think the problem boils down to Jekyll _config.yml options that should have been specified in the GitHub instructions. Specifying kramdown is not enough, I think you also have to specify the Jekyll defaults for kramdown which includes "input: GFM"


Backquote fencing for code blocks seems to be the default in GFM so it not working for the OP seems like a config issue. I'm not sure if the default "smart_quotes: lsquo,rsquo,ldquo,rdquo" is compatible with the LaTeX quotes the OP uses but there may be a configuration that works for him as well.

Markdown, like all serialization formats, will always have baggage. Once any syntax makes its way into the wild, it is forever unless you are willing to make backwards incompatible changes. This is also true for public/published APIs. SQL is a bit of both serialization-format-DSL and API-DSL. There are all kinds of brain-dead SQL language choices, both standardized and vendor specific, that are supported forever.

I use Markdown and find it quite useful, even with its problems (which the article does a good job of describing -- the main one being "The nice thing about standards is that there are so many to choose from".)

Apropos of GitHub pages (which I use to host my blog), are you aware that it is possible to generate HTML locally from Markdown, and then publish the generated HTML? FWIW, Octopress (http://octopress.org/docs/deploying/github/) does this (I didn't even realize at first that was what was happening), and that happily insulates you from any changes in GitHub's Markdown support -- as long as your local toolchain remains consistent, content should be rendered the same.

People using the term "sucks" seldom have a valid criticism (and never a subtle one).

The problem here was GitHub making an abrupt change (and switching defaults), not Markdown itself.

Which has its problems, but those are not it.

AsciiDoc is pretty good because it's similar enough and standardized better. Also more extensible, yet a bit more complicated to get running.

AsciiDoc is really underrated. It's the only format that is simple and readable like Markdown and yet is powerful enough to write whole books.

That may be because of the vicious cycle of lack of popularity, which leads to worse tool support, which leads to people preferring Markdown-based tools, which leads to increased spread of markdown and less popularity for AsciiDoc.

One lesser known fact of AsciiDoc is that it's a markup syntax for DocBook, so it supports the full feature set of this book-creating toolchain. Markdown can only dream of having support for more than simple web pages and single-page documents.

AsciiDoc is also supported by GitHub, I just use README.adoc instead of README.md.

The IntelliJ plugin is excellent.

And https://asciidoctor.org/docs/asciidoctor-diagram/ is amazing. I use PlantUML in it and love it.

FWIW, this blog is written using TiddlyWiki[1] whose WikiText syntax is markdown-like[2]. (Click on the pencil icon to see the raw text).

So perhaps "This Wiki's syntax" can be added to the list of alternatives that the author likes, given that he's written this whole blog in it :)



The post complains that markdown engines render supposedly-the-same-thing in ways different enough to constitute breakage, and that neither engine in question uses a formal grammar to indicate how the language is parsed, making translation between the two very difficult.

The complaint is not actually about any kind of syntax at all -- he mentions that pen and paper is his favorite alternative -- but rather the language's ability to yield consistent and reliable results across different implementations.

And maybe fitting, the blog entry contains broken formatting at the end:

"LaTeX is better than markdown by far, but it has a fundamental

flaw. The computer decides where the text goes, not me - I want to be in command."

>"The computer decides where the text goes, not me - I want to be in command."

Is that possible for any program to do it in plaintext? The only tools I know that can fully control character positions are things like InDesign, and I don't see anyone wanting to write a paper in that.

> Is that possible for any program to do it in plaintext?

Sure, TeX.

The quote above says the author wants full control of character positions. Is this achievable with plain TeX?

Yes, but it'd be painful.

The main benefit of Markdown is its simplicity and intuitiveness; it is based on email behavior.

There are certainly way better systems, such as reST. The problem is that reST is way harder to learn and less forgiving than markdown.

I used Python Sphinx for project docs and recently moved to mkdocs. With enough customization[^1], I would say that the experience is almost identical. The one thing I like more about Sphinx is that you don't need to depend on document locations on the file system, and as an older platform it has much better plugins.

On the other hand, markdown is everywhere now. IT people usually know it and non-IT people can learn it too, which is valuable in many domains. For example, I ask my BA to write functional specifications in markdown and PlantUML so that they can be put in a feature branch along with other code, and devs can follow how they progress with diffs etc., rather than having one big Word file, which is typical otherwise.

[^1] Here is what I use: https://github.com/majkinetor/mm-docs

I guess this isn't so relevant now that CommonMark and GFM specs and reference parser implementations exist, though I know they don't cover some fairly popular things like footnotes from "markdown extra"...

Oomph. XSL-FO. I used that for a custom PDF templating engine once. I think it was called XSL-FOP. Worst XML nightmare that doesn’t lend itself very well to version control. I would prefer markdown any day (or, indeed, using LaTeX with a non-PDF output). Even plain text is preferable.

FO stands for "Formatting Objects" and FOP is "FO Processor", e.g. Apache FOP. The format is not meant to be user-editable, of course, it's an intermediate step, but as a page description language it's very good. I myself use it when converting reStructuredText sources to PDF.

The author makes the point that a standard is required.

I've heard of various implementations like kramdown and redcarpet behaving differently, so it seems that that is the case.

If I'm not wrong, you can use the br tag directly in your markdown (instead of the invisible double white space) to mark the end of a line. That's enough for me.

Markdown sucks because the lowest level headings, the ones you use most often, require the most typing of #s.

Counterargument: you don't always know when you start writing a document how many levels of heading you're going to end up needing. A rough outline is inherently top-down, not bottom-up, so keeping your granularity options open makes sense.

Counter-counter-argument: you can always dynamically adjust heading levels based on how deep you have headings in your document. So maybe:

#### = h1; # = h4 initially.

If you have ##### (5) anywhere in your document, then ##### = h1 and # = h5.

If you have ###### (6) anywhere in your document, then ###### = h1 and # = h6.
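
A quick Python sketch of the scheme just described, assuming ATX-style "#" headings and a default depth of four; `heading_levels` is a hypothetical helper, and no Markdown implementation I know of works this way:

```python
import re

# One to six '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")

def heading_levels(lines, default_depth=4):
    """Map each heading to an HTML level, with the longest '#' run being h1."""
    matches = [m for m in map(HEADING.match, lines) if m]
    longest = max((len(m.group(1)) for m in matches), default=0)
    # Depth grows only if the document uses more than `default_depth` hashes.
    depth = max(default_depth, longest)
    return [(depth - len(m.group(1)) + 1, m.group(2)) for m in matches]

print(heading_levels(["#### Title", "some text", "# Leaf section"]))
print(heading_levels(["##### Title", "# Leaf section"]))
```

The catch is visible in the two calls: the same "# Leaf section" line is h4 in one document and h5 in the other, so the whole file has to be read before any heading can be rendered.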

Not sure I follow. You're not going to suddenly discover after two days of writing that you need a top-level header for the whole document, but you might very well discover that one of your "leaf" sections needs to be broken up further. The difference being that in your scheme I'd need to change all existing headings in the document to handle that.

Another (minor) thought: if you're the type who starts by outlining the whole doc with headers only, the Markdown approach naturally makes indentation follow structure.

Meaning that if you start off by writing all your top level headings like a normal human being, you'll have to go back and change every single heading once you realise you need a h6?

At least in most common use, markdown parsers aren't that smart. They don't do much lookahead; they parse text in a stream with little buffering.

The most complex behaviour is tables, and even there you can easily spot where the parsers simply go character by character along the line with no difference other than being in "table mode" (or for code fences, in "code mode")

The proof of a markup language is whether you can document it in itself without too many problems.

Neither markdown nor any of the other light weight markup languages mentioned here (so far and likely till expiry) can do numbered lists ;)

Perhaps you could explain that a bit more, please? AFAIK, HTML's "<ol><li>" tags work just fine, as does Markdown's "1." syntax...

I'll add to this one: Markdown is perfectly capable of using numbers as ordinals. What it's not capable of—bizarrely—is using letters as ordinals. You cannot have the following:

a) foo

b) bar

c) baz

You have to do that manually. And it won't indent, or re-number automatically, or anything else. I don't know whether Gruber thought "I'll get to that later" or if he isn't the sort of person that ever writes a/b/c/d/e lists, or if he figured it was too hard to adapt the parser for it, but the result is that you can't do that in vanilla Markdown, which means you also can't do that in documents published in Github.

Which is why we've moved our document chain to Asciidoc. :)


I wasn't counting HTML's ol tag, that's not fair :)

  1. Nissan
  2. Tesla
  3. BYD

To (god forbid!) remove Tesla, you have to write

  1. Nissan
  2. BYD

Now generate a diff:

  -- 2. Tesla --
  -- 3. BYD   --
  ++ 2. BYD   --

Ideally the diff should be:

  -- <some marker> Tesla --

Although I would argue it looks better, you don't have to use sequential numbers when defining ordered lists.

Taken from the Markdown Syntax Documentation (https://daringfireball.net/projects/markdown/syntax):

It’s important to note that the actual numbers you use to mark the list have no effect on the HTML output Markdown produces. The HTML Markdown produces from the above list is:

If you instead wrote the list in Markdown like this:

    1.  Bird
    1.  McHale
    1.  Parish

you’d get the exact same HTML output. The point is, if you want to, you can use ordinal numbers in your ordered Markdown lists, so that the numbers in your source match the numbers in your published HTML. But if you want to be lazy, you don’t have to.

Although not exactly aesthetically pleasing, I actually prefer the 1. 1. 1. layout. Why? Because when you write something it's common to revise and re-order things. A lot. Having 1. 2. 3. means I'd have to go and fix the rest of the list.

I see 1. 1. 1. like appending commas to every field in a struct declaration or an object constructor. It allows me to go back and arbitrarily move things around.

Notice that in HTML5 you do not need to close the li tags

You can use "1." for all items in a numbered list.


    rjp$ diff -u test.md test-notesla.md 
    --- test.md     2019-04-25 10:58:23.000000000 +0100
    +++ test-notesla.md     2019-04-25 10:59:03.000000000 +0100
    @@ -1,4 +1,3 @@
     1. Nissan
    -1. Tesla
     1. BYD
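
Why the numbers don't matter can be shown with a toy renderer. This is only a sketch of the behaviour described in the Markdown syntax documentation quoted above, assuming one "N. item" per line; `render_ordered_list` is a made-up function, not Gruber's Perl script or any real parser:

```python
import html
import re

# A list item: optional indent, digits, a dot, whitespace, then the text.
ITEM = re.compile(r"^\s*\d+\.\s+(.*)$")

def render_ordered_list(lines):
    """Render 'N. item' lines as an HTML ordered list, ignoring each N."""
    items = [m.group(1) for m in map(ITEM.match, lines) if m]
    body = "\n".join(f"<li>{html.escape(item)}</li>" for item in items)
    return f"<ol>\n{body}\n</ol>"

lazy = render_ordered_list(["1. Nissan", "1. Tesla", "1. BYD"])
strict = render_ordered_list(["1. Nissan", "2. Tesla", "3. BYD"])
print(lazy == strict)
```

Since the digits are thrown away, the browser does the counting, which is what makes the one-line diff above possible.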

> To (god forbid!) remove Tesla, you have to write

You don't actually have to do that, markdown will convert numbered lists to an HTML ordered list, the numbers are just numbered list markers and otherwise ignored: https://hackmd.io/EVPo3hJpQBenp08uy4ZEuA

This behaviour is usually annoying.

reStructuredText supports "#" as a numbered list sigil (as well as various numeral styles e.g. roman or alphabetic). It also supports a custom start, however if the list is explicitly numbered it does not support "holes" e.g. 3 4 5 is fine (and will keep this numbering in the output), 2 3 5 is not.

Markdown doesn’t suck.

Markdown parsers suck, because they’re all different.

Markdown has a spec, but nobody follows it because they want it to do more.

Obligatory xkcd: https://xkcd.com/927/

Markdown doesn't have a spec. There's a "syntax description", which leaves a lot of ambiguities, and the original Perl script, which you could consider a reference implementation. But there certainly isn't a spec.

(Or, alternatively, there are several competing specs, like CommonMark, Standard Markdown, and GitHub-flavoured Markdown. But your comment seems to be suggesting that there's one authoritative spec, which there definitely isn't.)

Markdown does have a spec created and posted by the person that invented it, John Gruber[1]. The problem is that people keep adding features to it and no one follows the actual markdown spec.

[1] - https://daringfireball.net/projects/markdown/

That's not a spec, and it doesn't claim to be one. It's the syntax description that I referred to above. A spec needs to unambiguously *spec*ify the grammar.

John didn't invent it for everyone to implement it on everything on the internet. He invented it for him to use on his blog because he's a writer and wanted to focus on his content, not on writing HTML. Just because it's not "specified" that doesn't mean it's not the canonical spec. All of the other additions that aren't on that page aren't based on the canonical version of Markdown.

If your language:

a) covers so little that every implementer feels a need to put their own conflicting extensions on it;

b) cannot be implemented consistently enough that one can be reasonably confident a document will probably render nearly-enough to not be described as 'broken' across engines;

and c) provides a "spec" in the sense of prose describing the author's intended meaning, but not a grammar or other formalization, test suite, etc. that can be considered authoritative for certifying that a given implementation is compliant (where compliant means that engines taking the same input will produce output close-enough to not be considered "broken" when compared to each other);

then your language probably sucks. Maybe it doesn't suck in the theoretical world where everything goes exactly as the language designer fantasizes, but it sucks in the actual world where people live and operate.

Ever tried linting and using prettier with markdown, while working in a team where some use Atom and some use VS Code? Different linters apply different rules. We ended up having to avoid advanced markup features like lists within lists, which is annoying when you are writing a step-by-step readme and need a step 2.b), for example.

This article suggests using XML. XML is based on an older markup language called standard generalised markup language ‘SGML’. Why not go the whole hog and use that?

I found this great little markup language descended from SGML, that is specifically designed for documentation and is great for the use case in this article of blog posts. (The markup language has support for links and is designed to work on screens of any size, all of this was baked in to the language from the start, not bolted on to a scientific paper markup language like LaTeX.)

It’s called hypertext markup language or ‘HTML’.

Also, your document can be styled using a styling language called cascading style sheets or ‘CSS’. [But this is optional]. And you can change the styling of your document years later without changing the text.

For blogging, there is also a popular graphic user interface to ‘HTML’ and ‘CSS’ called Wordpress.

In 2019 it powers 1/3 of the Internet.

I tried it and it's awful. Finger-twisting angle brackets everywhere. You have to balance not just closing brackets but also match the exact formatting "tag", so rather than "* Foo" it's "<li>Foo</li>" (urgh). It's actually even worse than XML because you have to do these "closing tags" even when they don't make sense, so e.g. rather than an embedded image being "![mouseover text](path)" it's "<img alt="mouseover text" href="path"></img>" (yes, really). To make it even more fun, if you "cross tags" like "<b>blah <i></b>blah</i>", rather than giving an error it will silently work most of the time except when it doesn't.

The available formatting tags are a complete mishmash. There are two different ways to do a bunch of things (e.g. "<em>" and "<i>" do the exact same thing but you're supposed to use "<em>" because it's "more semantic" - apparently screenreaders don't understand "<i>" even though it's older??). There are distinct tags for acronym versus abbreviation but no-one uses either of them because they don't display consistently. But for actually structuring your text all you get is paragraph and headings - there's no way to mark a section as a summary. Strikethrough is a mess (early versions had a sensible "<s>" tag, but now there are two different "semantic" tags that no-one can agree on).

Don't even get me started on CSS. "You can change the styling of your document years later without changing the text" - it seems like no-one working on this nonsense format ever realised what a terrible idea that is. Your text might refer to the figure on the left, but the "styling" that defines whether the figure goes on the left or the right is in a completely different part of the source. Better still, you have to use this ridiculous action-at-a-distance "selector" language to define which "tags" the styles actually apply to, so you cut and paste a paragraph to a different place and magically all its formatting changes.

Version 3 of HTML was just about usable - you could do formatting inline and there was only one tag for how to do a given thing, and it usually had a sensible name (e.g. b for bold, i for italic). The CSS-based versions are just bad design at every level and I've no idea why anyone would choose them as a source format over literally anything else.

> This article suggests using XML. XML is based on an older markup language called standard generalised markup language ‘SGML’. Why not go the whole hog and use that?

That is exactly what I have done on several occasions for customers that made use of either DocBook or DITA, with graphical tooling like Oxygen XML, FrameMaker, DITAworks and so on.
