

Ask HN: How to develop/plan a document format - mgualt

This is a question about how to approach the design and development of a
document format.

The basic problem is this: in math (and other disciplines), we write a LaTeX
markup file and then compile it to PDF. PDF is basically a paper simulator and
lacks many features which webpages/sites can have.

An alternative would be to have a versatile but still constrained markup
language (e.g. an extension of LaTeX, or a constrained org-mode) which can be
compiled or "exported" to HTML among other formats.

Some of the benefits would include:

- Folding of text: hiding parts of the document until further details are desired

- More sophisticated linking between (parts of) documents

- Nonlinear/hierarchical document structure

- Including media such as animations/tutorials

- Running code in the document

The question is: how to approach the development of the *system*. For example,
many of the features could be implemented, say in jQuery or some such, but the
system should be independent of implementation. How does one proceed in a
future-proof way whereby the choices are not regretted down the line?

Note that this is not about typesetting -- I am aware of the web typesetting
problem. This is about the inadequacy of PDF as a document format for the future.
======
quink
> How does one proceed in a future-proof way whereby the choices are not
> regretted down the line?

What's wrong with HTML that has classnames, or the new data attributes that
HTML5 has? They're not dependent on any one implementation and HTML seems like
it's the most future-proof thing out there apart from plain text.

In order:

- <div class='expando'>Hide/Show this text.</div>

- <a href="#something">Go somewhere</a>

- <a href="otherdoc.html">Hyperlink</a>

- <video src="hello.mp4" controls></video>

- <code>console.log('hello world!');</code>

Almost any JS developer can make these work without any library, using only
what the browser provides. Give the innerHTML to eval, for example. It seems
like the most future-proof way to go, and I'll be happy to be proven wrong and
find out about something I don't know about. But neither Markdown nor any of
its derivatives seems to be good enough.
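
To make the "without any library" point concrete, here is a rough sketch of how the folding and code-running items might be wired up with plain browser APIs. The helper names (toggleExpando, runSnippet) are made up for illustration. It also swaps in textContent and the Function constructor where the comment above says innerHTML and eval, on the assumption that decoded text is what should be executed. Running document code like this is only sensible for trusted documents.

```javascript
// Hypothetical helper: flip the visibility of a foldable block.
// It relies only on the boolean `hidden` property that DOM elements expose.
function toggleExpando(el) {
  el.hidden = !el.hidden;
  return el.hidden;
}

// Hypothetical helper: run the source held in a <code> element.
// Assumption: textContent is read rather than innerHTML, so entities such
// as &lt; are decoded before execution. Only use this on trusted content.
function runSnippet(el) {
  return new Function(el.textContent)();
}

// Wire things up only when a browser DOM is available.
if (typeof document !== "undefined") {
  document.querySelectorAll(".expando").forEach((el) => {
    // A separate button is used so a hidden block can be shown again.
    const button = document.createElement("button");
    button.textContent = "Show/Hide";
    button.addEventListener("click", () => toggleExpando(el));
    el.before(button);
  });

  document.querySelectorAll("code").forEach((el) => {
    // Double-clicking a snippet runs it.
    el.addEventListener("dblclick", () => runSnippet(el));
  });
}
```

The markup itself says nothing about buttons or double-clicks; those live entirely in the script, so the behaviour could be replaced later without touching the documents.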

Of course, you could also write a Markdown parser that has constructs that
will compile to the above too.
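
As a very rough sketch of that idea: the line prefixes below ("?? ", "!! ", "@@ ") are invented for this example and are not part of Markdown or any real dialect. The point is only that a small, constrained syntax can be compiled to the HTML constructs listed above.

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical line-based syntax (an assumption, not real Markdown):
//   "?? text"      -> foldable block
//   "!! code"      -> code snippet
//   "@@ file.mp4"  -> video
//   anything else  -> paragraph, where [label](target) becomes a link
function compileLine(line) {
  if (line.startsWith("?? ")) {
    return `<div class="expando">${escapeHtml(line.slice(3))}</div>`;
  }
  if (line.startsWith("!! ")) {
    return `<code>${escapeHtml(line.slice(3))}</code>`;
  }
  if (line.startsWith("@@ ")) {
    return `<video src="${escapeHtml(line.slice(3))}" controls></video>`;
  }
  const withLinks = escapeHtml(line).replace(
    /\[([^\]]+)\]\(([^)]+)\)/g,
    '<a href="$2">$1</a>'
  );
  return `<p>${withLinks}</p>`;
}

// Compile a whole document, skipping blank lines.
function compile(source) {
  return source
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map(compileLine)
    .join("\n");
}
```

For example, compile("?? Hide/Show this text.\nSee [details](#something)") would produce an expando div followed by a paragraph containing an in-page link.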

> This is about the inadequacy of PDF as a document format for the future.

And I'm guessing, since we're having this discussion in an HTML document, that
HTML will be sufficient instead.

~~~
mgualt
Thanks for the reply.

I don't know about HTML5, so I will look into that. But my concern is more
about how to design the system. Let me explain.

LaTeX is quite a formidable system nowadays, with a well-developed markup
language optimized for mathematics and technical writing, as well as a
compiler which does advanced typesetting (for PDF output, that is). Because of
the way the system is organized, there are thousands of packages available for
LaTeX which essentially extend the markup language and provide new
functionality to the compiler to produce diagrams, for example, or certain
alternate PDF formats, etc. The amount of work done in the LaTeX domain in
terms of development is quite amazing.

In view of this, is there a way of extending LaTeX in such a way that all this
functionality ports over to the creation of more active documents/webpages?

