
The sad state of markdown processors - soundsop
http://eigenclass.org/R2/writings/fast-extensible-simplified-markdown-in-ocaml
======
tptacek
Wow does simple_markup ever totally miss the point of Markdown. By replacing
#-headers with !, and replacing 1.-lists with #, and replacing _-emphasis with
__, "Simple" is violating the spirit of the format to make implementation
easier.

The point of Markdown isn't to create a super-flexible text2html replacement.
The point of Markdown is to come up with rules that match the way people write
existing email/Usenet messages as closely as possible. Any time you add a
semantic to the format --- like !-headers --- that is virtually never used in
email or Usenet, you might as well go all the way off the reservation and
implement Textile. I hear Textile does tables, too; I'm sure someone's got an
implementation that will generate HTML FORM tags as well. Knock yourself out!

~~~
rmaccloy
That's interesting, but it doesn't really have any bearing on the thrust of
the article; none of those things are cause for the slowness of current
Markdown implementations.

(Also, numbered lists are one of my pet peeves with Markdown, since
renumbering lists sucks. His way seems more usable.)

~~~
tptacek
You don't have to renumber lists in Markdown; it numbers them independently
anyways.
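
A minimal OCaml sketch of why the source digits don't matter, assuming the original Markdown behavior where an ordered list becomes a plain `<ol>` and the browser does the counting. `strip_marker` and `render_olist` are hypothetical helper names for illustration, not part of any real Markdown implementation:

```ocaml
(* Hypothetical helper: strip a leading "<digits>. " list marker.
   Returns None when the line is not an ordered-list item. *)
let strip_marker line =
  let n = String.length line in
  let rec digits i =
    if i < n && line.[i] >= '0' && line.[i] <= '9' then digits (i + 1) else i
  in
  let d = digits 0 in
  if d > 0 && d + 1 < n && line.[d] = '.' && line.[d + 1] = ' '
  then Some (String.sub line (d + 2) (n - d - 2))
  else None

(* The digits are thrown away; only the item text reaches the output,
   so "1. 1. 1." and "1. 2. 3." render identically. *)
let render_olist lines =
  let items = List.filter_map strip_marker lines in
  "<ol>"
  ^ String.concat "" (List.map (fun s -> "<li>" ^ s ^ "</li>") items)
  ^ "</ol>"
```

Under those assumptions, `render_olist ["1. a"; "1. b"; "7. c"]` and `render_olist ["1. a"; "2. b"; "3. c"]` produce the same string.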

~~~
whughes
Still, I can see that bugging me. I could number a list with all 1.s, but
that's like writing code with no indentation or extra spacing. The style of
source material matters to me.

------
scott_s
The paragraph type in his OCaml program is a compelling example of why OCaml
is well-suited for writing compilers:

    type paragraph =
        Normal of par_text
      | Pre of string * string option
      | Heading of int * par_text
      | Quote of paragraph list
      | Ulist of paragraph list * paragraph list list
      | Olist of paragraph list * paragraph list list

The type definition is stated almost the same as the grammar rule would be.
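
To make that concrete, here is a rough sketch of a consumer of the type: one match arm per constructor, much like one action per grammar production. Assumptions: `par_text` is simplified to a plain `string` (the article's real definition is richer), and `to_html`, `blocks`, and `items` are hypothetical names that do no HTML escaping.

```ocaml
(* Assumption: par_text simplified to string for this sketch. *)
type par_text = string

type paragraph =
    Normal of par_text
  | Pre of string * string option
  | Heading of int * par_text
  | Quote of paragraph list
  | Ulist of paragraph list * paragraph list list
  | Olist of paragraph list * paragraph list list

(* Hypothetical renderer: each constructor maps to one HTML element. *)
let rec to_html = function
  | Normal t -> "<p>" ^ t ^ "</p>"
  | Pre (code, _kind) -> "<pre>" ^ code ^ "</pre>"
  | Heading (n, t) -> Printf.sprintf "<h%d>%s</h%d>" n t n
  | Quote ps -> "<blockquote>" ^ blocks ps ^ "</blockquote>"
  | Ulist (first, rest) -> "<ul>" ^ items (first :: rest) ^ "</ul>"
  | Olist (first, rest) -> "<ol>" ^ items (first :: rest) ^ "</ol>"

(* Render a sequence of paragraphs back to back. *)
and blocks ps = String.concat "" (List.map to_html ps)

(* Render each list item, itself a paragraph list, inside <li>. *)
and items its =
  String.concat "" (List.map (fun ps -> "<li>" ^ blocks ps ^ "</li>") its)
```

If a constructor is left out of the match, the compiler warns that the match is not exhaustive, which is a large part of the appeal for this kind of code.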

~~~
silentbicycle
OCaml is exceptionally well-suited to writing compilers, or anything dealing
with complex structures of tagged data, really. (See, for instance,
<http://flint.cs.yale.edu/cs421/case-for-ml.html> )

It's a neat language. It's got some implementation and usability quirks, and
it seems to have a rather small / quiet community (so there aren't many books,
though this one is quite good, IMHO:
<http://caml.inria.fr/pub/docs/oreilly-book/> ), but it's worth a look.

~~~
scott_s
I've been reading through The Objective Caml Tutorial,
<http://www.ocaml-tutorial.org/>, but I've yet to have an opportunity to
implement anything with OCaml.

I've actually read that essay already about how OCaml is good for compilers,
but it really struck me with this example.

------
rmaccloy
Summarized: parsers are better than heaps of regexes, and idiomatic OCaml is
pretty fast.

~~~
silentbicycle
...and the OCaml version is a tenth as many lines of code as the C, and a
quarter as many as Python, Ruby, and Perl. Also, it uses only a seventh more
memory than the C version.

For what it's worth, the OCaml version is a markup processor of the author's
design, and not _exactly_ Markdown-compatible. He argues that it's close
enough for benchmarking purposes, though. (I agree.)

