
Towards LaTeX in the Browser - jxxcarlson
https://hackernoon.com/towards-latex-in-the-browser-2ff4d94a0c08
======
jdleesmiller
(I'm a co-founder at Overleaf.com, which does collaborative 'LaTeX in the
browser' in a different sense.)

I like the idea of a 'sane' subset of LaTeX that is easy to publish to the
web. There are tools like LaTeXML and TeX4ht that try to convert general LaTeX
documents to (X)HTML, but it's a very hard problem.

Some difficulties arise from the fact that TeX is just very hard to parse in
general. Even the first stage of parsing TeX is Turing complete [1]. This
makes it hard to write tooling e.g. for linting (though tools exist, e.g.
chktex) or creating a WYSIWYG editor backed by LaTeX [2]. (edit: or creating a
good LaTeX auto-complete [4])

Others arise from TeX's extensibility --- there are many thousands of packages
that define their own commands and environments for different types of
documents and different disciplines. This extensibility is on the one hand one
of the main reasons that TeX and LaTeX are still actively used some 40 years
after TeX's initial release, but on the other hand a major challenge for
conversion to HTML. The LaTeXML project has many custom bindings [3] for these
packages, but it's far from complete.

I guess the main question is whether we can find the right subset, and this
project looks like a great start.

[1] [https://tex.stackexchange.com/questions/4201/is-there-a-
bnf-...](https://tex.stackexchange.com/questions/4201/is-there-a-bnf-grammar-
of-the-tex-language)

[2] [https://www.overleaf.com/blog/81](https://www.overleaf.com/blog/81) ---
my first attempt at rich text on Overleaf, many years ago

[3]
[http://dlmf.nist.gov/LaTeXML/manual/customization/customizat...](http://dlmf.nist.gov/LaTeXML/manual/customization/customization.latexml.html)

[4] [https://www.overleaf.com/blog/523-a-data-driven-approach-
to-...](https://www.overleaf.com/blog/523-a-data-driven-approach-to-latex-
autocomplete)

~~~
moultano
I really like the idea of adding math support to Unicode via combining
characters. It's more complicated than anything Unicode currently deals with,
but not that much more complicated, and the idea of being able to put math
into anything that currently accepts strings is just so enticing. We should
treat math as its own language, and render it as we would any other human
language with an unusual way of laying out characters.

~~~
Bromskloss
It's an interesting idea. At what point, though, do we draw the line between
what a character set (like Unicode) should handle, and what should be handled
by a higher-level layer? I'm thinking that things like boldness,
italicisation, and superscript aren't really the job of a character set.

~~~
yorwba
Unicode already has 𝐛𝐨𝐥𝐝, 𝘪𝘵𝘢𝘭𝘪𝘤 and ˢᵘᵖᵉʳˢᶜʳⁱᵖᵗ variants of the Latin
alphabet.

~~~
colejohnson66
Maybe it’s just iOS, but the “bold” characters are serifed and the
“superscript” ones aren’t a consistent size between each other.

~~~
yorwba
Unicode only defines the codepoints for characters, it doesn't require anyone
to actually make them look good. Since those characters were specifically only
included to represent mathematical texts where formatting needs to be
preserved, it's unlikely anyone is spending much effort on making them look
good as text.

Regarding serifs: there are actually two variants of both bold and italic (and
bold italic), one serifed and one not. Wikipedia has a chart of the different
options here:
[https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symb...](https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols)
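
To make the serif and sans-serif bold variants concrete, here is a minimal Python sketch that maps ASCII letters onto two of those alphabets. The function name `to_math_style` and the two constants are made up for illustration. The starting code points are those of the Mathematical Alphanumeric Symbols block; only the bold runs are used because some other runs (plain italic, for instance) have gaps and would need special cases.

```python
import unicodedata

# (code point of capital A, code point of small a) for two contiguous runs
# in the Mathematical Alphanumeric Symbols block.
BOLD_SERIF = (0x1D400, 0x1D41A)
BOLD_SANS = (0x1D5D4, 0x1D5EE)


def to_math_style(text, style=BOLD_SERIF):
    """Map ASCII letters to a mathematical alphabet; leave other characters alone."""
    upper_start, lower_start = style
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(upper_start + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(lower_start + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


serif = to_math_style("bold")
sans = to_math_style("bold", BOLD_SANS)
print(serif, sans)

# The character names show which variant each code point belongs to.
print(unicodedata.name(serif[0]))
print(unicodedata.name(sans[0]))

# Compatibility normalization folds the styled letters back to plain ASCII.
print(unicodedata.normalize("NFKC", serif))
```

The NFKC step at the end is one reason these characters are awkward as general-purpose formatting: software that normalizes text for search or comparison may discard the styling.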

------
kovariance
I have found KaTeX to be the best currently-available solution. In particular,
it can be rendered without client-side javascript.

~~~
lilgreenland
Agreed. KaTeX is much much faster than MathJax.

[https://www.intmath.com/cg5/katex-mathjax-
comparison.php](https://www.intmath.com/cg5/katex-mathjax-comparison.php)

------
marvy
Just a bit of historical correction. The article/post says:

"Ten years later, in 1978, his work bore fruit"

This gets things pretty wrong. He got the idea in 1977, and his estimate of
"this will take 6 months" was pretty close, in that the initial version was
finished sometime in 1978. It then took about another ten years to be
"actually done". (Rewrite, add features, fix bugs, create Metafont, create
WEB, etc...)

~~~
jxxcarlson
Thanks very much for the correction!

~~~
marvy
Much better :)

------
DavidSJ
Completely beside the point, but that integral evaluates to sqrt(2pi), not
sqrt(pi).

~~~
curiousgal
I concur. It's easy to verify using the pdf of the standard normal
distribution.
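
A short check, assuming the integral in the article is the Gaussian integral of e^{-x^2/2} over the real line (the thread does not quote the integrand, so that is an assumption). The standard normal density integrates to 1, which fixes the value:

```latex
\[
\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2},
\qquad
\int_{-\infty}^{\infty} \varphi(x)\,dx = 1
\quad\Longrightarrow\quad
\int_{-\infty}^{\infty} e^{-x^{2}/2}\,dx = \sqrt{2\pi}.
\]
```

The value sqrt(pi) belongs to the integrand e^{-x^2} instead. Substituting x = sqrt(2) u turns one integral into the other and supplies the extra factor of sqrt(2).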

------
applecrazy
I wonder if somebody has taken TeX and compiled to the browser in wasm using
emscripten. That would be easier to port but heavy on load times.

Edit: it exists!
[https://github.com/manuels/texlive.js/](https://github.com/manuels/texlive.js/)
is a limited port of LaTeX to JS, rendered to PDF

~~~
TheRealPomax
Classic TeX would be damn near useless in the age of Unicode, so you're
looking at something like XeLaTeX or LuaTeX. The problem is that it's really
easy to implement a really basic form of TeX, but unless you already planned
for the really hard cases, maintaining your implementation is going to become
intractable. TeX's real text typesetting is almost always woefully ignored
even though _everything_ has to typeset beautifully, not just maths. And in
modern versions of TeX, that has to happen without insane syntax just to get a
Unicode character we can already "just write", rather than needing all kinds of
dedicated macros just for diacritics, or for something as simple as mixing two
writing scripts that necessitate two entirely different fonts.

~~~
badsectoracula
> Classic TeX would be damn near useless in the age of Unicode

Unless you want to write in English, which i am going to bet it still has a
somewhat large audience :-P

~~~
PeterisP
As soon as you want to mention names of people, English text often requires
Unicode characters. Looking up some examples, the first random paper I took
from arxiv mentioned three surnames that needed Unicode, the second needed
four, including the name of one of authors herself.

Even if you're talking purely about people in the USA - for example, a page of MIT
faculty [https://www.eecs.mit.edu/people/faculty-
advisors](https://www.eecs.mit.edu/people/faculty-advisors) includes names
like Jesús, Corbató and Tomás.

~~~
badsectoracula
Doesn't TeX already handle that? A quick search shows
[http://vjimc.osu.cz/TeXform.html](http://vjimc.osu.cz/TeXform.html)

FWIW my own name would need Unicode too (Κώστας Μιχαλόπουλος) but i always use
its romanized form (Kostas Michalopoulos) in English. I think that is common
when writing English text and names from languages that do not use the latin
(or derived) alphabet.

~~~
TheRealPomax
An answer here would be way too long, but the short answer is "no". The
technologies that were available at the time of TeX meant that TeX had to do
all kinds of things that in today's world are bizarre.

TeX has seen a lot of improvements over the last 30 years, and modern TeX
engines such as XeTeX and LuaTeX have removed a lot of the insane pain points
that came with traditional TeX, which worked well only because there was
literally nothing better at the time.

A modern TeX engine will let you just write what you want to write, using all
of Unicode as your playground, using modern OpenType fonts, and with real
vector graphics. None of those things can be done with original TeX, not just
"it's hard to", it's literally impossible without rewriting it from the ground
up. Which is why we HAVE modern TeX engines: just because it worked, doesn't
mean it was good. It was merely the best available at the time.

Time moved on.

------
svat
I wonder whether this is the right approach. TeX itself is one of the most
heavily documented programs in existence. Not only are its workings documented
in detail in _The TeXbook_ (and a host of other books by other authors, such
as Eijkhout's _TeX by Topic_ ) but even the program itself has been written in
a “literate programming” style, with pretty formatted source code (with
profuse comments) available in print (Vol B of _Computers and Typesetting_ )
and as a PDF ([http://texdoc.net/texmf-
dist/doc/generic/knuth/tex/tex.pdf](http://texdoc.net/texmf-
dist/doc/generic/knuth/tex/tex.pdf)), there's a detailed history/retrospective
and log of every change that went into the program (see Chapters 10 and 11 of
the book _Literate Programming_ , though the log without explanation is also
available online [http://texdoc.net/texmf-
dist/doc/generic/knuth/errata/errorl...](http://texdoc.net/texmf-
dist/doc/generic/knuth/errata/errorlog.pdf)), and there are even 12 hours of
video of Knuth talking about the internals of the program
([https://www.youtube.com/watch?v=bbqY1mTwrj8&index=12&list=PL...](https://www.youtube.com/watch?v=bbqY1mTwrj8&index=12&list=PL94E35692EB9D36F3)).

So when the article says:

> To reproduce all of LaTeX in the browser is too much to ask

I wonder why? The file _tex.web_ is less than 25000 lines long, much of it
comments, so I'd estimate that TeX itself is only about 20000 sloc (in fact
_tangle_ on _tex.web_ generates a Pascal file _tex.p_ which is only 6115 lines
long). This is not a lot IMO, and it would be a lot better to actually re-
implement this, with additional support for things like getting the parse tree
etc.

------
patte
I was wondering recently if/how it would be possible to piggyback on LaTeX's
gorgeous typesetting (placing the letters) to bring justified text to websites.
I want to do a PoC that absolutely positions all letters of a basic document,
as placed by TeX for my screen size.

Did anyone ever see such an approach?

------
gravypod
Are there any other solutions to document typesetting with latex-like
features? TeX is very obtuse for someone who hasn't been using it for a long
time.

~~~
flother
A common solution is to use LaTeX, but to use it indirectly: write in Markdown
and convert to PDF using Pandoc [1], which uses LaTeX in the background. This
is (part of) the process used in RMarkdown [2], for example. That way, you get
all the benefits of TeX and LaTeX but without most of the pain.

[1]: [https://pandoc.org/index.html](https://pandoc.org/index.html) [2]:
[http://rmarkdown.rstudio.com/](http://rmarkdown.rstudio.com/)
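
As a rough sketch of that Markdown-to-PDF step, the Pandoc call can be wrapped in a few lines of Python. The helper names and the file names are hypothetical, and picking `xelatex` as the PDF engine is just one assumption; Pandoc and a LaTeX distribution need to be installed separately for the conversion to run.

```python
import shutil
import subprocess


def pandoc_pdf_command(source, target, pdf_engine="xelatex"):
    """Build the argument list for converting a Markdown file to PDF via LaTeX."""
    return ["pandoc", source, "-o", target, f"--pdf-engine={pdf_engine}"]


def markdown_to_pdf(source, target, pdf_engine="xelatex"):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not on PATH")
    subprocess.run(pandoc_pdf_command(source, target, pdf_engine), check=True)


# Show the command that would be run for a hypothetical notes.md.
print(" ".join(pandoc_pdf_command("notes.md", "notes.pdf")))
```

Calling `markdown_to_pdf("notes.md", "notes.pdf")` would then produce the PDF, with LaTeX doing the typesetting behind the scenes.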

~~~
curiousgal
I just use Atom with the markdown-preview-plus package for live preview.

------
martyalain
What do you think of this project {lambda way} as an alternative to LaTeX in a
browser: [http://lambdaway.free.fr](http://lambdaway.free.fr)

For instance, from this wiki page
[http://lambdaway.free.fr/workshop/?view=oxford](http://lambdaway.free.fr/workshop/?view=oxford)
I could directly generate a PDF paper,
[http://lambdaway.free.fr/workshop/data/lambdatalk_20170728.p...](http://lambdaway.free.fr/workshop/data/lambdatalk_20170728.pdf),
and slides,
[http://lambdaway.free.fr/workshop/?view=oxford_slides](http://lambdaway.free.fr/workshop/?view=oxford_slides)

Some other pages in this workshop:
[http://lambdaway.free.fr/workshop/?view=factory](http://lambdaway.free.fr/workshop/?view=factory)
[http://lambdaway.free.fr/workshop/?view=NIL](http://lambdaway.free.fr/workshop/?view=NIL)
[http://lambdaway.free.fr/workshop/?view=teaching](http://lambdaway.free.fr/workshop/?view=teaching)
[http://lambdaway.free.fr/workshop/?view=lambdacode](http://lambdaway.free.fr/workshop/?view=lambdacode)

Your comments are welcome.

Alain Marty

------
etaioinshrdlu
I used
[https://github.com/phfaist/pylatexenc](https://github.com/phfaist/pylatexenc)
to convert LaTeX to unicode text, with math symbols and superscripts etc.

It's of course never going to be as good looking as MathJax or something like
that -- but it may be more appropriate to be able to treat it as plain Unicode
text in some cases.

For instance, it works in title fields across the web and search engines will
understand it better than anything else.

------
emeryberger
There is not really a need to modify LaTeX at all to make it run in the
browser. It already exists. Without modifying a single line of code, we have
implemented a full browser-based port of LaTeX as part of our Browsix project,
which makes it possible to run full, unmodified Unix applications inside the
browser. See [http://browsertex.org](http://browsertex.org) and
[http://browsix.org](http://browsix.org) (and
[http://bpowers.net](http://bpowers.net) and
[https://jvilk.com/](https://jvilk.com/) and
[http://plasma.cs.umass.edu](http://plasma.cs.umass.edu)).

------
abritinthebay
I love the output of LaTeX but the language itself (and its dependencies and
packages) is an absolute horror show.

I’ve never understood how people can bear it: writing it is painful,
its tooling is _abysmal_, and it rarely seems to work except on the machine
of the person who wrote it.

We’ve got to be able to do better.

~~~
mkl
> its tooling is abysmal

It seems like you haven't tried many editors. Have you tried TeXStudio
([https://www.texstudio.org/](https://www.texstudio.org/))?

> it rarely seems to work except on the machine of the person who wrote it.

I and many others edit the same documents at the university where I work,
without significant issues. Distributions like TeXLive
([https://www.tug.org/texlive/](https://www.tug.org/texlive/)) provide a
consistent all-inclusive cross-platform solution.

~~~
abritinthebay
TeXStudio would be a perfect example of its abysmal tooling. It’s better than
the CLI tools but it’s an _awful_ editor and highlights how incompatible with
a good writing experience LaTeX is.

Yes, many people produce good work in it - its output is fantastic after all
- but an editor that would have been a substandard user experience in the 90s
is the best LaTeX has in tooling.

That’s exactly what I mean!

~~~
diffeomorphism
Can you try to phrase that more precisely/constructively by including a reason
why it is "awful" or give an example of "good tooling"?

As far as editing goes, latexmk, syntax highlighting and good shortcuts are
all I ever use and am perfectly happy with (emacs+auctex). It is a different
paradigm than WYSIWYG, but different does not say anything about good or bad.

Now writing new latex classes, I agree. That is very unintuitive and would
greatly benefit from simplification, templates and tools.

~~~
abritinthebay
I could go into a long, detailed, breakdown of how bad TeXStudio is but,
frankly, if they want UI/UX work they should pay for it. Which they _clearly_
don’t.

It’s... decent enough in the pack of “open source UI” but that isn’t a high
bar.

Here’s the thing about that (oft repeated) line about WYSIWYG vs WYSIWYW: it’s
bullshit.

There’s no justification for it other than the deficiencies of the tooling and
tool chain. It’s _an excuse_.

------
djuerges
I actually did 'LaTeX in the browser' as a master's thesis in 2014, but never
went on to continue developing it afterwards, be it as an open-source project
or with commercial intent in mind. Although I thought, at that time, I was at
least on par with the few solutions that were out there and solved the task of
instant updates and real-time collaborative work on a document pretty
gracefully.

Some neat improvements would have been versioning and so on, but you know, never
made it that far after picking up a job. Kind of a shame...

[https://github.com/djuerges/cotex](https://github.com/djuerges/cotex)

------
jessriedel
I read the post but I still don't understand: is it possible to define new
commands using \def or \newcommand? At first I thought these are what the
author meant by "macro", but later he says

> We are exploring ways for users to define non-default environment behaviors
> in the browser. The same goes for macros used outside the dollar and double-
> dollar fences.

But I can't use \def or \newcommand to define things that appear _inside_
dollar signs either.

~~~
jxxcarlson
Here is an example:

$$ \newcommand{\bra}{\left<} \newcommand{\ket}{\right>} $$

$$ \bra a | b \ket $$

If you go to
[https://jxxcarlson.github.io/app/minilatex/src/index.html](https://jxxcarlson.github.io/app/minilatex/src/index.html),
press the "Clear" button, then paste the above text, then press "Render", you
should see the macros \bra and \ket properly rendered.

~~~
jessriedel
Oh I see, thanks. For what it's worth, I would definitely include this example
in the demo; it's basically the first thing I wanted to use. Given your
pipeline, it makes sense that the \newcommand definitions _themselves_ have to
appear inside dollar signs (not just when they are used), but for people with
a TeX background it's pretty unintuitive.

Also, you should definitely use \langle and \rangle in place of < and > for
bra-ket notation :)
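
With the angle-bracket delimiters swapped in, the example from the parent comment might look like the following. This is a sketch only: it is standard LaTeX math syntax, but it has not been tried in the MiniLaTeX demo here, so whether the renderer accepts it is an assumption.

```latex
$$ \newcommand{\bra}{\left\langle} \newcommand{\ket}{\right\rangle} $$

$$ \bra a | b \ket $$
```

The `\left` and `\right` pairing is kept from the original so the brackets still stretch to fit their contents.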

------
angarg12
Just for fun here is a little web game I made to look like a maths paper using
MathJax.

[https://angarg12.github.io/TrueExponential/](https://angarg12.github.io/TrueExponential/)

------
jimhefferon
PreTeXt from
[http://mathbook.pugetsound.edu/](http://mathbook.pugetsound.edu/) has gotten
some mindshare.

