MathML Progress (igalia.com)
112 points by ubavic on Aug 12, 2021 | 50 comments

I think the nearly 25 year effort spent on MathML should have been put towards standardizing browser engines natively rendering LaTeX à la MathJax. The initial vision of MathML was that people would write it in a WYSIWYG editor (since it’s far too verbose to write by hand), but that grossly misjudged users’ preferences—LaTeX is the lingua franca of mathematical typesetting, and people much prefer writing (and reading) LaTeX. Realizing this over a decade(!) after MathML was first introduced in 1998, the MathML community has since focused some of their efforts on LaTeX conversion, which raises the question: why not just focus on rendering LaTeX directly, rather than translating it to (and maintaining standards for) a clunky intermediary format no human will ever directly read or write?

TeX is a programming language, and a quite peculiar one at that (it's not possible to parse it without executing it, for example); while MathML is merely descriptions of mathematical expressions.

The idea of having native support for describing mathematics that the browser can render is not so crazy. On the other hand, MathML is sadly based on XML, which is an abomination that can't "decide" whether it should be human- or computer-readable, so it ends up being neither, as opposed to S-expression-like formats which are both. But, in the end, SVG and HTML are already in the same (SGML?) family, so it might not be so bad that MathML belongs there, too.

> it's not possible to parse it without executing it, for example

It’s possible to define useful subsets of LaTeX (e.g. math markup only) for which this is not the case. MathJax does just that, in fact.
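As a rough illustration of that point, here is a minimal Python sketch of a math-only TeX subset that can be parsed without executing anything. The grammar (single letters, integers, a few operators, `{}` groups, `\frac`, `\sqrt`, `^`, `_`) and the names `tokenize`, `Parser`, and `tex_to_mathml` are assumptions made up for this example. It is not how MathJax is implemented, and real TeX input (macros, catcode changes) is rejected rather than handled.

```python
import re

# Hypothetical toy grammar, fixed up front: no macros, no catcode changes.
TOKEN = re.compile(r"\\[a-zA-Z]+|\d+|[a-zA-Z]|[-+=()]|[{}^_]")
OPERATORS = set("-+=()")


def tokenize(src):
    src = src.replace(" ", "")
    tokens = TOKEN.findall(src)
    # If the tokens do not cover the whole input, something is unsupported.
    if "".join(tokens) != src:
        raise ValueError("input is outside the supported subset")
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return tok

    def row(self, stop=None):
        # A sequence of items, wrapped in <mrow> unless there is exactly one.
        items = []
        while self.peek() is not None and self.peek() != stop:
            items.append(self.scripted())
        return items[0] if len(items) == 1 else "<mrow>" + "".join(items) + "</mrow>"

    def group(self):
        if self.take() != "{":
            raise ValueError("expected '{'")
        inner = self.row(stop="}")
        self.take()  # consume the closing brace (raises if input ended)
        return inner

    def argument(self):
        return self.group() if self.peek() == "{" else self.atom()

    def scripted(self):
        # Scripts simply nest (msub inside msup); good enough for a sketch.
        base = self.atom()
        while self.peek() in ("^", "_"):
            tag = "msup" if self.take() == "^" else "msub"
            script = self.argument()
            base = f"<{tag}>{base}{script}</{tag}>"
        return base

    def atom(self):
        tok = self.take()
        if tok == "{":
            self.pos -= 1
            return self.group()
        if tok == "\\frac":
            numerator = self.group()
            denominator = self.group()
            return f"<mfrac>{numerator}{denominator}</mfrac>"
        if tok == "\\sqrt":
            return f"<msqrt>{self.group()}</msqrt>"
        if tok.isdigit():
            return f"<mn>{tok}</mn>"
        if tok.isalpha():
            return f"<mi>{tok}</mi>"
        if tok in OPERATORS:
            return f"<mo>{tok}</mo>"
        raise ValueError(f"unsupported token: {tok}")


def tex_to_mathml(src):
    return f"<math>{Parser(tokenize(src)).row()}</math>"
```

With this sketch, `tex_to_mathml(r"\frac{a}{b}")` returns `<math><mfrac><mi>a</mi><mi>b</mi></mfrac></math>`, while anything outside the fixed grammar, such as `\def`, raises `ValueError` instead of being executed. That refusal is what keeps the subset statically parseable.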

Actually, in SGML (but not the XML subset of SGML) it's common to define character sequences (shortrefs) that an SGML parser replaces with tags, which together with tag inference (also left out of XML) can make it possible to at least type casual and high school math naturally as ASCII math expressions and have those render into presentational MathML. That approach clearly has limitations, though (fraction style, roots, infix expressions, etc.). It's also conceivable to attempt to customize SGML's concrete syntax to recognize LaTeX (but not TeX proper) such that it treats "\bla{...}" as <bla>...</bla>, etc.

But the fact that nobody has bothered probably tells us it's not all that useful.
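A minimal sketch of the "\bla{...}" to <bla>...</bla> mapping described above, written as a plain Python preprocessor rather than an actual SGML concrete syntax. The function name `commands_to_tags` is hypothetical, and the sketch assumes every command takes exactly one braced argument.

```python
import re

# Matches a command name immediately followed by an opening brace.
COMMAND = re.compile(r"\\([a-zA-Z]+)\{")


def commands_to_tags(src):
    out = []
    stack = []  # open command names; None marks a bare "{" group
    i = 0
    while i < len(src):
        match = COMMAND.match(src, i)
        if match:
            stack.append(match.group(1))
            out.append(f"<{match.group(1)}>")
            i = match.end()
        elif src[i] == "{":
            stack.append(None)
            out.append("{")
            i += 1
        elif src[i] == "}":
            if not stack:
                raise ValueError("unbalanced closing brace")
            name = stack.pop()
            out.append("}" if name is None else f"</{name}>")
            i += 1
        else:
            out.append(src[i])
            i += 1
    if stack:
        raise ValueError("unbalanced opening brace")
    return "".join(out)
```

Here `commands_to_tags(r"\bla{x \foo{y}}")` gives `<bla>x <foo>y</foo></bla>`, but a two-argument command such as `\frac{a}{b}` comes out as `<frac>a</frac>{b}`, which shows one of the limitations of a purely syntactic mapping.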

Having MathML being XML based plays rather nicely with both HTML and SVG. And I am saying that while not being exactly a fan of XML-based formats.

> The initial vision of MathML was that people would write it in a WYSIWYG editor (since it’s far too verbose to write by hand), but that grossly misjudged users’ preferences

To be honest, it probably depends heavily on the particular audience. I know people who can barely wrap their heads around the concept of BBCode, and those (although a much smaller group) who would probably advocate for exchanging Microsoft Word for a LaTeX editor. Most people out here are probably completely content with a WYSIWYG editor and don’t care about the inner mechanism of its implementation as long as it gives them the expected result.

In this case though, the people who need to typeset a lot of mathematics are more likely in the LaTeX camp.

You can use a WYSIWYG editor to produce LaTeX (e.g. LyX). Regarding your point about ease of embedding in HTML/SVG, we embed other formats in markup all the time, e.g. CSS/JavaScript/images/etc. with nary an issue.

MathML is the equivalent of having to compile CSS/JavaScript into an XML-based “bytecode,” which then gets parsed and executed by the browser.

You make it sound as if you write HTML5 by hand when entering content in web pages. Do you have the same complaint for SVG diagrams or HTML tables, which are even worse for writing by hand than MathML?

I hope not. The key gain here is that the syntax used for mathematics in HTML needs to be easy to integrate in the web platform of 2021. Can you still use the DOM with MathML? Yes. Can you naturally interleave it with other HTML elements to create multi-modal constructions? Yes. With diagrams? SVG+MathML is possible, yes.

Indeed it is designed for technologists, solving a problem on the world wide web, as well as in other structured formats. And no, no one is asking mathematicians to type XML or HTML by hand in 2021.

An author can keep using latex, asciimath, Word or OpenOffice, MathType, MathQuill, or even Mathematica/Matlab/scipy syntax, as they've done until now. And then have their toolchain of choice prepare MathML to be served on the web, (or epub ebooks, etc), in a uniform manner understood by all vendors. Mere mortals can then finally create web apps that can natively access the math, without writing half-baked tex parsers that have hundreds of awkward and undocumented special cases.
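For a rough idea of what "natively access the math" can look like, here is a small sketch that uses Python's standard `xml.etree.ElementTree` as a stand-in for the browser DOM. The sample expression and the helper names `identifiers` and `numbers` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample: the discriminant b^2 - 4ac in presentation MathML.
SAMPLE = (
    f'<math xmlns="{MATHML_NS}">'
    "<mrow><msup><mi>b</mi><mn>2</mn></msup><mo>-</mo>"
    "<mn>4</mn><mi>a</mi><mi>c</mi></mrow></math>"
)


def identifiers(mathml):
    """Return the distinct <mi> identifiers, sorted alphabetically."""
    root = ET.fromstring(mathml)
    return sorted({el.text for el in root.iter(f"{{{MATHML_NS}}}mi") if el.text})


def numbers(mathml):
    """Return the <mn> literals in document order."""
    root = ET.fromstring(mathml)
    return [el.text for el in root.iter(f"{{{MATHML_NS}}}mn")]


print(identifiers(SAMPLE))
print(numbers(SAMPLE))
```

No TeX parser is involved: the variables and literals are ordinary elements that any XML or DOM API can query.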

TeX syntax has no path to richer integrations into the web platform and keeps mathematics out of reach for all modern ecosystem trends. I certainly think MathML can still be improved quite a bit - mostly by making it smaller and simpler, so that anyone can pick it up in an hour and write a new app in a day.

MathML is supposed to be generated by software and rendered by the browser. You're free to write a JavaScript converter from LaTeX to MathML (and such converters probably already exist). Nobody is going to ask users to write MathML, just like nobody is asking users to write proper HTML.

Supporting MathML instead of LaTeX makes sense, because it's a machine-readable language which is easy to generate and parse. HTML uses XML-like notation, SVG uses XML notation, so it only makes sense for equations to use XML notation instead of a completely different language.

The DOM in most rendered web sites is far removed from any HTML directly read or written by humans. People can use whatever markup language they want and know that if it compiles to MathML, it will render in any browser.

LaTeX support was a really long shot to be standardized and implemented consistently across browsers. It's just too powerful. For LaTeX users, the only thing that MathML standardization changes is that MathML is now part of the target language that a LaTeX implementation compiles to. Maybe that will be useful; maybe not.

MathML dates from a time when XML was seen as the One True Way by most standards people. The fact that TeX is not based on XML was probably enough to discount it at the time. Nevermind that TeX is more convenient to write by far, and was already strongly established as the standard way of encoding mathematical formulas.

I am sad that Unicode Math never got wider acceptance.

It is eloquent and can be picked up in an hour or two by end users.

One of the sadder ignored standards, right up there with Bluetooth media control, which is also super simple and easy to understand, and only ever partially implemented. (Seriously, go through both specs!)

Just tried https://fred-wang.github.io/MathFonts/mozilla_mathml_test/ in Canary with enable-experimental-web-platform-features. As they say, the stretchy characters (e.g. think \left, \middle, \right in LaTeX) all have wrong heights at the moment. Still, that's a lot of progress made.

Edit: Actually, when I select any font other than "Default fonts (local only)", the stretchy characters' heights change in response to content they embrace (though still not correct -- they tend to be taller than they should be), unlike with "Default fonts (local only)" where the characters won't stretch at all. I wonder why this is the case.

Why is the goal to make it look exactly the same as TeX? TeX is great, but its rules are based on Knuth studying existing typesetting, and I'm sure a lot of the rules are fairly arbitrary. When I look at this in Firefox it looks finished. Brace heights don't change the meaning of the mathematics. It's a purely stylistic choice.

Knuth didn’t arbitrarily make stylistic choices, he followed long established conventions in mathematical handwriting and printing where possible. IIRC he explained this at the beginning of The TeXbook. In this specific case though, it’s not hard to realize that parentheses, braces, radical symbols, etc. two ems taller than necessary both waste a lot of vertical space and look like crap.

> As promised, after a short break, we have also renewed our work on upstreaming MathML-Core in Chromium. […] You can try these for yourself in Chrome canary by enabling experimental web platform features.

Finally! The lack of MathML support in Chromium-based browsers was quite frustrating to me. Though it will probably still be quite a while before libraries like MathJax and KaTeX become redundant for basic use cases.

> Though, it will be probably still quite a lot of time before the libraries like MathJax and KaTeX became redundant for basic use cases.

I think that libraries like MathJax and KaTeX will always be used for translating TeX notation to MathML. Still, I look forward to the day when we will have native rendering in all browsers.

As far as I'm concerned, this is great news. Of course I get the criticisms from the "MathML sucks" and "why not use TeX" crowd. And that's fine... I'm not even sure I disagree with those folks. But the big difference, to me, is that MathML-in-the-browser is here (more or less) today and is just shy of being ubiquitous and widely usable by most browser users. That is, IMO, a big deal.

And the people who really, really, want something other than MathML are free to start (or finish?) writing code, raising money to have other people write code, etc., and work with Google, Mozilla, et al, and get their system integrated into browsers. I'd just be surprised if this effort yields any meaningful results in less than 20 years.

So at least now we'll have native MathML rendering to tide us over until the New Math Browser Thing becomes reality.

Why start/finish writing code? There are already plenty of websites, starting [with Wikipedia](https://en.wikipedia.org/wiki/Help:Displaying_a_formula) that can display formulas and all of them are using some versions of TeX. I don't really see a point in working on MathML at this point.

> Why start/finish writing code?

I'm talking about people who want "native in-browser" rendering for something other than MathML. If they want their favorite approach implemented in the browser, they're welcome to make it happen. My point is that MathML native-in-browser rendering is basically already here.

Of course you can use Javascript libraries like MathJax or something similar to implement whatever you want. But for people who care about native rendering, that's neither here nor there.

The existing ways of putting math on the web are all already here too, and have been for a long time. Apart from the old way of generating images by running TeX (still used by Wikipedia, for instance), both of the most common libraries used for typesetting (MathJax and KaTeX) can produce either HTML+CSS or SVG which are well-supported by browsers, and both of them (KaTeX since the beginning, and MathJax since version 3.0) can be run server-side so that no JavaScript needs to run on the user's browser. And regardless of whether or when the "native in-browser" rendering of MathML becomes good enough and widespread enough, the "native in-browser" rendering of HTML+CSS and SVG will continue to exist and be well-supported. (See the last section of http://bit-player.org/2020/mathjax-turns-3-0 for some interesting conclusions.)

I don't disagree that the existing non-MathML approaches have merit. I think I even alluded to that already above. My comment was meant mainly to address the specific case of the people screaming "Why not just implement LaTeX in the browser, instead of MathML in the browser" or whatever. To which I say, again, "go ahead, and let us know when it's ready."

As it stands, every approach that we have for putting math on the web has shortcomings, including MathML. Nonetheless, I think having a well known, widely supported, standardized way of natively rendering math in the browser, without needing extra JavaScript libraries or extra server-side processing, is a Good Thing. Note what I'm very explicitly not saying: "MathML is the be all end all" or "MathML is the very best way to put math on the web now and forever", or "MathML is better than TeX / LaTeX", etc.

I just think this is Good News, and I'm excited to see this move forward. I also hope MathJax continues to develop (there was some noise at one time about MathJax interoperating with native browser MathML rendering, but I'm not sure where that stands), as well as whatever other ideas people find appealing. More clean/elegant ways to put math content on the web are always good, IMO.

MathML, like other XML-syntax languages, is meant more for computers to read & interpret, not for end users to write. I hope it becomes native across all browsers (as it's the best chance of standardisation we currently have). As good as MathJax is, we shouldn't have to rely on a JS polyfill to render maths in a browser...

Ok, but why? What's the advantage of having unwritable and unreadable syntax? Why shouldn't I be able to edit it directly?

This difference in opinion (on a subjective topic) is causing a rift between MathML's goal of a universal math language for browsers and more, and a large chunk of the potential users who hate the syntax.

Accessibility. A fully unambiguous mathematical language like MathML can produce markup for the blind.
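A toy sketch of why an explicit tree helps here: walking the MathML elements can yield spoken text with unambiguous start and end markers. The wording rules and the `speak` function are invented for this example; real screen readers and speech-rule engines are far more elaborate.

```python
import xml.etree.ElementTree as ET

# Hypothetical wording table, just enough for the demo below.
SPOKEN_OPERATORS = {"+": "plus", "-": "minus", "=": "equals"}


def speak(el):
    tag = el.tag.rsplit("}", 1)[-1]  # drop an XML namespace prefix, if any
    kids = [speak(child) for child in el]
    if tag == "mfrac" and len(kids) == 2:
        return f"fraction {kids[0]} over {kids[1]} end fraction"
    if tag == "msup" and len(kids) == 2:
        return f"{kids[0]} to the power {kids[1]}"
    if tag == "msqrt":
        return "square root of " + " ".join(kids) + " end root"
    if tag == "mo":
        text = (el.text or "").strip()
        return SPOKEN_OPERATORS.get(text, text)
    if not kids:
        return (el.text or "").strip()  # leaf such as <mi> or <mn>
    return " ".join(kids)  # containers such as <math> or <mrow>


def speak_mathml(mathml):
    return speak(ET.fromstring(mathml))


print(speak_mathml(
    "<math><mfrac><mrow><mi>a</mi><mo>+</mo><mn>1</mn></mrow>"
    "<mi>b</mi></mfrac></math>"
))
```

The "end fraction" marker comes straight from the element boundaries, so the listener can tell (a + 1)/b apart from a + 1/b without any guessing by the software.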

I have worked in this space, and I think at present this benefit is and will remain primarily aspirational. Here's MathJax in action: https://www.youtube.com/watch?v=6GSgTjorewQ

It's semi-respectable for a computer analysis, but notice that the creator of this video is using TeX to enter the math. You no longer have to write a caption when you paste some math, so you get automated a11y semi-compliance, but does this mean you stop vetting your site with a screen reader? The answer should be no, but I get the feeling that many developers will check the box and move on (see also: "accessibility overlays").

Now, let's turn the situation around: you are blind, you have heard and understood a complex equation, and you must write a response to it. How do you do it? Does MathML help you? The markup, generated from TeX, that can be read aloud by a computer, is still fundamentally read-only. The most tech-savvy of the visually impaired people I've met learned to program and use Python or something similar to do their math. It beats the hell out of handwriting XML or using a very restrictive MathML WYSIWYG editor with a screen reader.

Do I think MathML is a pure negative? No, but it's not even close to a silver bullet, and using potential a11y benefits to shield its flaws doesn't sit right with me.

> advantage of having unwritable and unreadable syntax

There's (in a way) a false dilemma at play here. MathML is unreadable because it's XML, but it is also possible to describe trees in more readable languages like S-expressions.
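To illustrate, here is a small sketch that prints the same tree as an S-expression. The output format (tag name first, leaf text in double quotes) and the function names are arbitrary choices made for this example, not an existing standard, and quotes inside leaf text are not escaped.

```python
import xml.etree.ElementTree as ET


def to_sexpr(el):
    tag = el.tag.rsplit("}", 1)[-1]  # drop an XML namespace prefix, if any
    children = list(el)
    if not children:
        text = (el.text or "").strip()
        return f'({tag} "{text}")'
    return "(" + " ".join([tag] + [to_sexpr(child) for child in children]) + ")"


def mathml_to_sexpr(mathml):
    return to_sexpr(ET.fromstring(mathml))


print(mathml_to_sexpr("<mfrac><mi>a</mi><mi>b</mi></mfrac>"))
```

For the fraction above this prints `(mfrac (mi "a") (mi "b"))`: the same structure as the MathML, minus the closing tags.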

Either way, it would be absurd to add TeX into a Web browser ...

> Either way, it would be absurd to add TeX into a Web browser ...

I agree. Nobody is seriously proposing adding a full LaTeX implementation to a browser, but rather a parseable subset of TeX strictly for math markup, à la MathJax, which incidentally has become the de facto standard for embedding math in webpages.

But who will decide which commands will be part of that subset? There are hundreds, maybe thousands, of commands that are used regularly in TeX documents for displaying various forms of notation, and mathematicians still create new notation and packages for it. Most of the commands in MathJax aren't part of TeX, but of LaTeX or even some popular package (amsmath, etc.). Some things are implemented in more than one package, and those implementations differ (for example, the integral sign in Russian books is different from the one in American books).

If we implement a subset of TeX in a browser, there are two options: 1. If we only include the most used commands, such technology will be useless for professional mathematicians and will probably be obsolete in the future (when some new useful notation appears). 2. On the other hand, if we keep adding new commands for every type of mathematical notation, then we will need longer names for commands (or maybe namespaces) and the notation will not be readable anymore. Not to mention that browser maintainers will need to do a lot of work to implement all that (how long will that take?).

Also, if we implement only a subset of TeX, users will not be able to create new notation easily (or at all). On the other hand, MathML is very flexible in that regard and allows users to be creative.

And, contrary to some other comments in this discussion, I don't think that code for mathematical notation must be human-readable above all. Whatever approach you take (TeX, XML, S-expressions), you will quickly find some examples that are horrendous to code (integral inequalities, steps in a PDE solution, commutative diagrams, proof trees...).

If people hate the syntax, at least it will serve as a stable target to compile other math markup languages to. It might even lead to a lot more experimentation and innovation in that area.

I will quote the venerable Andrew S. Tanenbaum. "XML combines the efficiency of text files with the readability of binary files."

Chrome removed MathML a long time ago and I thought that was the end of the story. But apparently the Igalia folks didn't give up. I really appreciate and respect this level of persistence, and I'd give a bit of credit to the Chrome folks, as at least they aren't rejecting this new initiative.

I'm wondering who is sponsoring this work. Igalia is still a company and I don't think they can afford to do this as a hobby... But maybe they do it anyway?

You can find more information on the project website [1].

[1] https://mathml.igalia.com

Not sure who this is for. The notation looks extremely verbose and unusable, and latex won a long time ago already.

MathML is the only accessible option. Screen readers work with it.

Tools like KaTeX output both HTML and MathML simultaneously (HTML for rendering, MathML for screen readers), leading to massive page bloat. MathJax requires client-side JS to look good, which is a big 'no' for me.

Right now, there is no way to display math on the web that: (1) is accessible, (2) works across all modern browsers, (3) is not slow or bloated in some way. MathML in Chrome fixes basically all these problems.

I am not blind, but I would much prefer hearing a LaTeX formula read word by word (e.g. "a slash pm b" or "slash frac a b") over whatever alternative they came up with. I really doubt you can say long formulas unambiguously in a much more succinct way. Even after having looked at the formula, I can't follow a lecture without looking at the board, and that is with human translation, which should be better than MathML.

Shouldn't the toolmaker's priority be to make a tool its users will use?

This sort of dictating opinions via tech seems common in the browser world, and it honestly makes me thankful we primarily target pdfs and not the web.

MathML has many potential uses. It's been around for a long time, and its advantages over LaTeX are clear if you've spent any amount of time with it.

Why I think LaTeX "won out":

1) Chrome dropped MathML 10ish years ago

2) MathJax got LaTeX support

3) People don't care about accessibility/ambiguity enough to bother with anything else

4) There aren't good, mainstream frontends for MathML other than MathType, making LaTeX more convenient to learn and use

I get the impression this is a tool made by technologists for technologists, rather than for people who spend a lot of time typesetting math. Academics (mathematicians, physicists, etc.) don't have the same concerns as technologists, and the specifics of browser tech in particular aren't relevant.

Latex is basically the mathematical (computer) lingua franca right now. There's an entire cultural and technological ecosystem around latex that's been built up over many years. We even message each other in latex notation when rendering isn't available! There are decades of documentation online. Most important of all, it's dead easy to use for most use cases!

Perhaps MathML is technically superior in some ways, but none that the people typesetting math are concerned with. That this is browser oriented tech in particular is already a disadvantage, as the browser is not a priority to us. In this way mathjax feels like the superior tool for users. It attempts to work with users rather than dictating to them.

I think MathML might not have failed so hard if it weren't so ludicrously verbose and time-consuming to type compared to TeX notation.
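To put a rough number on the verbosity, here is the quadratic formula in both notations. The MathML below is one plausible presentation encoding written by hand for this comparison, not the output of any particular converter.

```python
import xml.etree.ElementTree as ET

TEX = r"x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}"

# Hand-written presentation MathML for the same formula (an assumption of
# this example; a real converter may emit different, often longer, markup).
MATHML = (
    "<math><mi>x</mi><mo>=</mo><mfrac>"
    "<mrow><mo>-</mo><mi>b</mi><mo>\u00b1</mo>"
    "<msqrt><msup><mi>b</mi><mn>2</mn></msup><mo>-</mo>"
    "<mn>4</mn><mi>a</mi><mi>c</mi></msqrt></mrow>"
    "<mrow><mn>2</mn><mi>a</mi></mrow>"
    "</mfrac></math>"
)

ET.fromstring(MATHML)  # raises ParseError if the markup is not well-formed

ratio = len(MATHML) / len(TEX)
print(f"TeX: {len(TEX)} chars, MathML: {len(MATHML)} chars, ratio {ratio:.1f}x")
```

Even without namespaces or whitespace, the MathML string is several times longer than the TeX one, which is the gap a hand-typist feels.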

XML apologists generally propose special editors to make the experience less unpleasant, but that's not what working users of mathematics want.

TeX notation provides a way for us to type formulae in plain text form, anywhere that ascii is accepted, as fast as we can type normal text. That's the bare minimum required for any representation to be relevant to mathematicians and physicists.

I don't know what a more 'web flavoured' notation satisfying this constraint would look like. (Are web standards people even capable of brevity?)

Yeah, that's very true and I didn't think to mention it. I very much appreciate how concise latex is, and that I don't need any special editor for it.

Regarding (4), I suppose you can argue if this counts as "good" or "mainstream" but the equation editor in OpenOffice & LibreOffice supports saving as MathML, AFAIK.

Just ease of installing and standardization would rule out latex. Latex packaging is a nightmare. Verbosity will be solved by tons of JS and Python packages.

The two

> Latex packaging is nightmare.


> Verbosity will be solved by tones of JS and python packages.

seem at odds to me.

Anecdata, but that's not been my experience. I've found latex to be a much smoother experience to use than languages like JS or Python.

The Chromium ticket is https://bugs.chromium.org/p/chromium/issues/detail?id=6606

You can try it with chrome://flags/#enable-experimental-web-platform-features enabled here https://fred-wang.github.io/MathFonts/mozilla_mathml_test/

A lot of people don't seem to get the point of MathML. Well, suppose you have a blog, and you want to use math. If it's going to be consumed on the web, a JavaScript library probably works OK. But if people subscribe over RSS/Atom? Or over email? As far as I can tell, nothing works. Your best bet is to try to figure out how to write your math using Unicode, or (the horror) to render everything to bitmapped images.

Hopefully MathML will help with that. The most important thing is that MathML is limited enough in power that gmail et al. can support it without worrying about security / privacy. (Unlike, say, .svg images now.)

I fail to see how MathML is inherently safer than SVG.

I meant that as an aspiration: "I sure hope that MathML is designed/considered safer than SVG." Why exactly SVG is considered unsafe now, I'm not sure, but I understand that this is why gmail doesn't allow it. It seems like SVG can contain semi-arbitrary HTML and javascript? (I'd love to have a less-powerful but more trusted version of SVG, too.)

I've played with MathML (http://gron.ca/algebra/027.html); it's okish, but LaTeX is better.

A way of displaying math is needed on browsers. So is a better way of citation, and how about a video standard.

Are we at an inflection point? Will traditional math symbols just be replaced by a computer language like Python? And will we end up with 2 languages for doing math: the old paper way and the new computer programming language way?
