MathML in Chromium | 118 points by aplaice 14 days ago | 86 comments

 I really wish browsers would just render formulas directly from TeX. Let me write \sqrt{1+x} or whatever. TeX is the de facto standard for writing mathematical formulas. That browsers don't render it natively just screams of NIH syndrome on the part of browser and web standards developers. MathML still hasn't caught on after two decades, for three reasons: 1) not working in all browsers; 2) even when it worked, the rendering was often buggy or plain ugly; 3) no one wants to write MathML directly. MathJax instantly solved all problems, which made it an overnight success. MathML might be able to overcome 1) and 2), but 3) should not be underestimated. MathJax will be around as long as it is the most convenient solution for showing equations in a browser (no user-side compilation required), and rendering times and network traffic will suffer accordingly.
 MathML is probably the best example of the late-90s, early-2000s XML craze, when everything was going to be XML and it was the best thing ever. https://en.wikipedia.org/wiki/MathML#Example_and_comparison_... The comparison to LaTeX is almost hilarious. It shows well how, in the end, XML managed to combine the properties of binary and text formats: it's slow to parse, like a text format, and hard for humans to read, like a binary format. Personally I'd also really love it if they'd just standardize LaTeX or something similar. Why invent some non-human-readable mess when there is already a perfectly functional and widely used notation available?
 XML is faster and far simpler to parse than TeX. To the extent that you need to (if for whatever reason you don't want to rely on a LaTeX to MathML or Ascii to MathML converter) you can make the quadratic equation MathML slightly more readable, by not using hex entities, but unicode for − and ±, and the named entity for ⁢.[0] Furthermore, you (and I!) are just far more familiar with TeX, which makes the comparison in readability not particularly fair. Finally, much of the invisible, seemingly redundant mark-up, such as ⁢ or ⁡, can help you avoid some of TeX's ambiguities — e.g. is $f(a+x)$ the function $f$ acting on $(a+x)$ or $f$ multiplying $(a+x)$?[1] If you were to omit this mark-up (and if you're converting from TeX to MathML and don't want your converter to engage in guesswork, you have to) the MathML would be even simpler.Using the same format for equations as for the rest of the document (i.e. HTML/XML) is advantageous (in addition to the parsing benefits). In particular, you can use the same mechanisms for styling and transforming elements, as you can for the whole document. For instance, you could easily style parts of an equation, provide pop-ups that explain what each symbol means, when you hover over it, or interactively change the equation. (Much of this hasn't actually been done, outside experiments, because only Firefox properly(-ish) supports MathML, so it would have been wasted effort.)[1] Presentation MathML is still obviously not semantic, but it can be better in this respect than default TeX — there have been proposals for semantic TeX, but none of them have really caught on.
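 To make the f(a+x) ambiguity above concrete, here is a minimal Python sketch. It assumes nothing beyond the standard library; the helper name f_of_a_plus_x is made up for illustration. It builds the same visible presentation MathML twice and changes only the invisible operator between "f" and "(a+x)": U+2061 (function application) or U+2062 (invisible times).

```python
import xml.etree.ElementTree as ET

FUNCTION_APPLICATION = "\u2061"  # invisible "apply function" operator
INVISIBLE_TIMES = "\u2062"       # invisible multiplication operator


def f_of_a_plus_x(invisible_operator: str) -> str:
    """Build presentation MathML for f(a+x) using the given invisible operator.

    Hypothetical helper for illustration; not taken from any library.
    """
    row = ET.Element("mrow")
    ET.SubElement(row, "mi").text = "f"
    ET.SubElement(row, "mo").text = invisible_operator
    inner = ET.SubElement(row, "mrow")
    for tag, text in [("mo", "("), ("mi", "a"), ("mo", "+"), ("mi", "x"), ("mo", ")")]:
        ET.SubElement(inner, tag).text = text
    return ET.tostring(row, encoding="unicode")


as_function = f_of_a_plus_x(FUNCTION_APPLICATION)  # f applied to (a+x)
as_product = f_of_a_plus_x(INVISIBLE_TIMES)        # f multiplied by (a+x)
```

 The two strings look identical on screen but differ in one character, which is exactly the information a screen reader or converter can use and which plain $f(a+x)$ does not carry.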
 XML and readability are orthogonal concepts in the end. Basic HTML is easily editable and readable by humans. There is nothing preventing them from making an actually human-editable XML markup language for maths. And that is the sad part. As MathML is neither human-readable nor editable, it’s effectively an opaque image format that can be manipulated from the JS side and that scales like vector graphics.
 Yep. MathML is quite similar to SVG in that it's really hard to write by hand and you need editors to make it accessible.
 MathML was created as a human-readable interchange format. It was never intended to be written by humans. Much like how HTML is often generated from markdown nowadays. It was specifically created to avoid TeX being used on the web, because TeX is ill-defined, loosely structured and lacks basic functionality such as Unicode support.
 MathML is not human-readable, as the example in the Wikipedia link shows. And TeX is supposed to have Unicode support these days, assuming you have a properly configured modern system, so that's not a good example.
 HTML was meant to be written by humans.
 I'm sorry but Latex is an inconsistent Turing-complete mess. Latex commands are far from intuitive, otherwise Detexify wouldn't be so popular. It's a million macros held together with duct tape, with no consistency in naming and syntax, and always dependent on which age-old package you're referencing. It's good at outputting pixel-perfect printable, non-accessible PDFs, and that's it. I'm excited by this because I hope that with a proper widely-supported system for math in HTML we can eventually write more papers to be digital-first. An HTML document is so much more accessible, searchable, semantically analysable and flexible than Latex and its PDF output. I dream of a world where the standard for papers is not Latex .pdf but .mhtml or .maff.
 If Latex can be rendered in HTML, then the world you dream of could become reality. Nobody wants to learn a new typesetting language. Latex is much less hard to learn and use than you imply. I mean, just look at MathML: https://en.wikipedia.org/wiki/MathML#Example_and_comparison_...
 I think it can be, it's just nobody does it. I don't know what formulas get compiled to though - I would guess PNGs, at which point you lose the benefits over a PDF (at least for equations). With proper browser support for MathML it could always be MathML though, which would render sharp and be readable by a screen reader. It is unfortunate that MathML is not easily hand-written like the rest of HTML. What I find interesting in the link you posted though is that they embed the Latex or StarMath representation inside the MathML. It should be possible to have tooling in text editors such that you only edit that Latex representation and on save the MathML representation is updated automatically.
 While LaTeX as a whole might be hard to learn and inconsistent, its formula syntax is quite consistent and a lot easier to write by hand than MathML.
 This is an interesting, thought-provoking comment!> (though the original MathML proponents perhaps did not)FWIW I'm pretty sure that they did. Arguments to authority are pretty terrible, but if you look at the authors of the MathML 1.0 (earliest) or 3.0 (latest) specs[0][1], and google them, you can see that many of them have backgrounds in science or math and have been active in the LaTeX ecosystem.> but this [quality of the typesetting] has been mostly underestimated/ignored by those advocating MathML.I don't see any evidence for this, not among its designers, implementers or even general proponents.Firefox's output (implemented almost(?) entirely by individual volunteers), for instance, is acknowledged to be still considerably worse than LaTeX output in a pdf, though it is competitive with its web alternatives (superior in some respects, worse in others) — do be sure to install MathML fonts[2] though.> 5. What the result [...] will be, in the web page's DOM.Have you seen the tag soup generated (by necessity) with MathJax or KaTeX?
 When you write web-pages, do you usually write the raw HTML or do you use something like Markdown or Wikitext and have it converted to HTML? If the latter, then why would having LaTeX as part of the input and MathML as part of the output, be any different?Also, directly converting TeX to MathML, even client-side, is much easier and faster than MathJax's many-to-many approach (I'm not criticising MathJax — given the constraints, they're doing the best possible job).[0][1][2] (See also the Ascii to MathML converter[3] that has already been mentioned in another comment.)
 Most people who write for the web probably do indeed write in something like markdown, which means they probably sprinkle in a bit of HTML for the parts which markdown doesn't natively support. I imagine a lot of blog posts written in markdown contain a few elements, for example. Anyone writing math content for the web using markdown will have to either write the mathml directly, or write the math expressions in another language and manually compile it to mathml which is copy/pasted into the document. Maybe the CMS they happen to use will some day add native support for compiling latex, but that sounds rather unlikely.
 Pandoc and Mediawiki have been able to convert embedded LaTeX to MathML, for a while. Once Chromium supports MathML most CMSs will probably start providing suitable converters, and in the meantime MathJax will still work (and better, since MathJax's Native MathML output is faster than its CommonHTML one[0]).
 I do write the raw HTML actually
 Think of MathML more like SVG. You can write it by hand (and in some cases you should), but in most cases you should use a graphical editor (like inkscape), or a library (like D3).
 This is exactly the property that makes it actively worse than TeX notation. It makes equations a second class citizen compared to text because you can’t write them comfortably without external tools. MathML is a failure - because of its ludicrous verbosity. The correct solution may not be TeX notation, but it can’t be this bad a step backwards in usability.
 How's it worse? In either case, you can use TeX as your input, and if you do, you have to convert it, client-side or browser-side, into something usable by the browser; it's just that if the browser accepts MathML the rendering is faster and/or more convenient, plus you get other options.
 Sorry, my claim was about the two notations: that the TeX one is writable, and the MathML is not. I’m not claiming that a JavaScript parser and complete renderer for TeX is better than a JavaScript parser that renders via MathML. The second option may indeed be more efficient - but making the browser parse a sensible notation for maths instead of an XML crapfest would be better than either.
 I don't see why rendering TeX formulas would be slower than rendering MathML. It should be possible to maintain implementations for both variants with similar performance. MathML's parser should be simpler though, cause it's XML, which already has many efficient parsers.
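 As a small illustration of the "it's XML, which already has parsers" point, here is a Python sketch that parses a MathML fragment with the standard library and prints its structure. The fragment and the helper name outline are assumptions chosen for the example (the \sqrt{1+x} formula from upthread, written without a namespace to keep tag names short).

```python
import xml.etree.ElementTree as ET

# Presentation MathML for sqrt(1+x); namespace omitted for brevity.
MATHML = "<math><msqrt><mn>1</mn><mo>+</mo><mi>x</mi></msqrt></math>"


def outline(element, depth=0):
    """Return one indented line per element: the tag name plus any token text."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (f" {text!r}" if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(MATHML))
print("\n".join(tree_lines))
```

 No grammar had to be written: the generic XML parser already yields the tree. Parsing is of course only one part of the cost; layout is a separate question.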
 MathML was created specifically because TeX is not a standard, has no definition, has only one “true” implementation and does not capture enough structure of an equation to be a useful interchange format. Even ignoring syntax, the AST of TeX does not represent the underlying equation particularly well - just enough to use to as input to a typesetting system which behaves exactly the way that TeX does, e.g \sum has no way to specify what is being summed over (it’s irrelevant for typesetting in TeX - though not in general).Math typesetting is hard and stretchy characters in particular are not an easy fit into the browser layout model. It’s not been a priority for browser vendors. Hopefully that will change.
 >\sum has no way to specify what is being summed overI could usually tell you what was being summed over by reading the _ subscripts and ^ superscripts. However the thing written under the sigma is not always formal, nor should it be: oftentimes it will be an abbreviation of a fairly elaborate conditional that is described elsewhere in the text. Please, don't treat typesetting languages like they're programming languages where the computer has to be able to execute the formulas you write, we want to keep our close interweaving with natural language.
 This comment sums up perfectly why it is such a failure. We don't want to run the frigging formulas! Typesetting is the point! Stop making it do everything, it will end up doing nothing!
 There are two variants of MathML: semantic MathML (which does capture enough structure of an equation, and sees very very little use) and presentational MathML (which doesn't, but is much easier to author, and accounts for almost all MathML usage). There's no real difference in usefulness between presentational MathML and TeX.
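 A rough Python sketch of the difference, using x^2 as the example. The evaluator below is a toy written for this illustration and handles only a handful of content elements (cn, ci, apply with power, plus, times); it is not a conforming implementation of anything.

```python
import xml.etree.ElementTree as ET

PRESENTATION = "<msup><mi>x</mi><mn>2</mn></msup>"       # how x^2 is laid out
CONTENT = "<apply><power/><ci>x</ci><cn>2</cn></apply>"  # what x^2 means


def evaluate(node, env):
    """Evaluate a tiny subset of content MathML (toy example)."""
    if node.tag == "cn":
        return float(node.text.strip())
    if node.tag == "ci":
        return env[node.text.strip()]
    if node.tag == "apply":
        operator, *arguments = list(node)
        values = [evaluate(argument, env) for argument in arguments]
        if operator.tag == "power":
            return values[0] ** values[1]
        if operator.tag == "plus":
            return sum(values)
        if operator.tag == "times":
            result = 1.0
            for value in values:
                result *= value
            return result
    raise ValueError(f"no meaning defined for <{node.tag}>")


value = evaluate(ET.fromstring(CONTENT), {"x": 3.0})
```

 Feeding PRESENTATION to the same function raises ValueError, because msup only says "put this above and to the right of that", much like ^ in TeX.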
 The verbosity of MathML makes it completely impractical to type as working mathematician. It’s not just a small difference; it’s completely ridiculous.
 While MathJax is great, the biggest problem is that 'TeX' is a very poorly (read undefined) standard, and it's fairly easy to produce things which TeX renders fine, but MathJax and Katex both bork on.Of course, writing a clean "standard" which covers most of what people expect when they say "TeX", and implementing that, would hopefully produce the best of both worlds.
 Although that does bring up the question of which subset of TeX browsers are willing to support.
 Markup and rendering+typography should be orthogonal issues.Does TeX semantics have something that MathML does not yet have?MathJax works with both MathML and TeX. Is there difference in the rendering when using different markup with MathJax?
 > Does TeX semantics have something that MathML does not yet have?It does not. In fact MathML captures much richer information than TeX.> MathJax works with both MathML and TeX. Is there difference in the rendering when using different markup with MathJax?There shouldn’t be. MathJax internally converts TeX input into an intermediate MathML AST before rendering.
 I write quite a bit of LaTeX, but I shudder at the thought of browsers embedding it. It is one of the more Lovecraftian codebases I've ever witnessed.
 If you hadn't commented, this would be exactly what I'd have written.
 It's very good news that this has finally started. Wikipedia, probably the biggest website that displays formulas, still renders them to SVG images. Igalia had already been improving WebKit's [1] MathML renderer and they had a fundraiser for the Chromium MathML work for a long time. Now they seem to have collected enough to start with it. It's one of the great advantages of open source that a small company like Igalia can just go and improve multiple rendering engines used by billions of people.
 That's probably because MathML support is pretty poor in most browsers and even on Firefox, which has the best support, it isn't really good enough for all uses. MathJax has various backends to render LaTeX formulas, including SVG, MathML and HTML-CSS. Here's what the MathJax documentation says about the MathML backend:"The NativeMML output processor uses the browser’s internal MathML support (if any) to render the mathematics. Currently, Firefox has native support for MathML, and IE has the MathPlayer plugin for rendering MathML. Safari has some support for MathML since version 5.1, but the quality is not as high as either Firefox’s implementation or IE with MathPlayer. Chrome, Konqueror, and most other browsers don’t support MathML natively, but this may change in the future, since MathML is part of the HTML5 specification.""The advantage of the NativeMML output processor is its speed, since native MathML support is usually faster than converting to HTML-with-CSS and SVG. The disadvantage is that you are dependent on the browser’s MathML implementation for your rendering, and these vary in quality of output and completeness of implementation. MathJax relies on features that are not available in some renderers (for example, Firefox’s MathML support does not implement the features needed for labeled equations). While MathJax’s NativeMML output processor works around various limitations of Firefox/Gecko and Safari/WebKit, the results using the NativeMML output processor may have spacing, font, or other rendering problems that are outside of MathJax’s control."
 Those WebKit enhancements have ended up in surprising places. @acabel produced Wittgenstein’s Tractatus Logico-Philosophicus for Standard Ebooks and upgraded the pipeline to use Firefox to render MathML to png for Kindle, etc. Turns out that the Kobo renderer which seems to be WebKit based was able to display MathML directly, so the kepubs we build just have the plain MathML in them now.
 "Wikipedia, probably the biggest website that displays formulas, still renders them to SVG images."Not exactly.
 Even users on Firefox only get to see the SVG images per default.
 You can also use a computer algebra system (CAS) to perform computations and get the output in MathML, for example the free CAS Maxima. wxMaxima is something like Jupyter notebook but developed with wxWindows by a solo developer. ?? is the help command. I just copy-pasted: (%i2) ?? mathml; -- Function: mathml_display ( ) Produces MathML output. (%i1) load("alt-display.mac")$ (%i2) set_alt_display(2,mathml_display); mlabel %o 2 ,done (%o2) true
 Here's an example of why MathML is attractive as a rendering layer (try Firefox vs Chrome, and look at native MathML latency vs MathJax emulation): https://runarberg.github.io/ascii2mathml/ (it helps to have the LaTeX Computer Modern fonts installed locally, which for some reason aren't imported on this page)
 hmm, here is a project I haven’t given nearly as much love as it deserves. ascii2mathml was the first compiler I ever wrote (and have written since). I wrote it because I wanted authors from a non-math background to be able to write short expressions in forums or comment threads without having to know latex. I tried to make it as intuitive as possible. Even going as far as making 1+2 / 3+4 a different expression from 1 + 2/3 + 4. But I know some people started to integrate the library in their notebook apps, mainly used by math students taking notes in lectures in a markdown format (writing the expressions in ascii2mathml). Ascii2mathml might be a better fit than the original asciimath for that purpose, as the original is not expressive enough to capture advanced math expressions. I bailed on it a couple of years ago because it looked like MathML was dying. Chrome wasn’t going to implement it. Also I haven’t been using it for anything either. Perhaps it is now time I revisit it and give it some love.
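 For readers wondering how 1+2 / 3+4 and 1 + 2/3 + 4 can mean different things, here is a toy Python sketch of whitespace-sensitive grouping. It is only an assumption about the general idea, not ascii2mathml's actual algorithm, and the function name group_by_spacing is invented for this example: anything typed without spaces is treated as one tightly bound unit.

```python
def group_by_spacing(expr: str) -> str:
    """Parenthesize every whitespace-delimited chunk that contains an operator.

    Toy illustration only: chunks written without spaces bind more tightly
    than operators that stand alone between spaces.
    """
    grouped = []
    for chunk in expr.split():
        has_inner_operator = len(chunk) > 1 and any(op in chunk for op in "+-*/")
        grouped.append(f"({chunk})" if has_inner_operator else chunk)
    return " ".join(grouped)


wide_fraction = group_by_spacing("1+2 / 3+4")     # whole sums above and below the bar
narrow_fraction = group_by_spacing("1 + 2/3 + 4")  # only 2/3 is a fraction
```

 Under this reading the first input is a single fraction of two sums, while the second is a sum with a small fraction in the middle.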
 Except MathJax isn't the bar to beat, it's KaTeX, which doesn't rely on a measurement of DOM layout.
 I’m getting “main.js:100 Uncaught ReferenceError: MathJax is not defined” in Chrome.
 Not in any way affiliated, but Igalia is a cool company. They pay their employees to work on OSS projects as a part of their job. Andy Wingo work(ed?) there and they financed quite a bit of Guile development as a way to make him develop his compiler and runtime skills.
 My personal opinion is that MathML should die. MathJax is here today and works.
 MathML is not important for being a latex replacement, it's important for being a unified and native way to render mathematical formulas in browsers. With MathML supported by Chrome, you could, instead of the whole MathJax renderer, use only a lightweight "latex/asciimath to mathml" translator and let the browser do the rendering job.
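 To give a feel for how small such a translator can be for a restricted input language, here is a hedged Python sketch. It is a toy written for this thread, not any existing converter: it assumes a tiny TeX subset (numbers, single-letter identifiers, the operators + - = ( ), and \sqrt{...} and \frac{...}{...}) and rejects everything else.

```python
import re

# Tokens of the assumed subset: commands, numbers, single letters, operators, braces.
TOKEN = re.compile(r"\\[a-zA-Z]+|\d+(?:\.\d+)?|[a-zA-Z]|[-+=()]|[{}]")


def tex_to_mathml(tex: str) -> str:
    """Translate a tiny TeX subset to presentation MathML (toy sketch)."""
    tokens = TOKEN.findall(tex)
    if "".join(tokens) != "".join(tex.split()):
        raise ValueError("input contains characters outside the supported subset")
    pos = 0

    def group() -> str:
        """Parse '{' sequence '}' and wrap the result in <mrow>."""
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "{":
            raise ValueError("expected '{'")
        pos += 1
        body = sequence()
        if pos >= len(tokens) or tokens[pos] != "}":
            raise ValueError("expected '}'")
        pos += 1
        return f"<mrow>{body}</mrow>"

    def sequence() -> str:
        """Parse tokens until a closing brace or the end of input."""
        nonlocal pos
        parts = []
        while pos < len(tokens) and tokens[pos] != "}":
            tok = tokens[pos]
            pos += 1
            if tok == "\\sqrt":
                parts.append(f"<msqrt>{group()}</msqrt>")
            elif tok == "\\frac":
                numerator = group()
                denominator = group()
                parts.append(f"<mfrac>{numerator}{denominator}</mfrac>")
            elif tok[0].isdigit():
                parts.append(f"<mn>{tok}</mn>")
            elif tok.isalpha():
                parts.append(f"<mi>{tok}</mi>")
            elif tok in "-+=()":
                parts.append(f"<mo>{tok}</mo>")
            else:
                raise ValueError(f"unsupported token: {tok}")
        return "".join(parts)

    body = sequence()
    if pos != len(tokens):
        raise ValueError("unbalanced '}'")
    return f"<math>{body}</math>"


sqrt_markup = tex_to_mathml(r"\sqrt{1+x}")
```

 Real TeX input (macros, spacing commands, environments) is far messier than this subset, which is the point several other comments make about TeX being hard to standardize.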
 MathJax is a unified way to render mathematical formulas in browsers. So that only leaves "native".Why is it desirable to have "native" implementation? The usual answer is performance, but I am not aware of much performance complaint of MathJax. If not performance, how does it make sense to add more C++ code to browsers to be exploited, when memory safe JavaScript implementation is already available?
 In addition to the already mentioned performance issue with client-side MathJax, having native MathML makes it conceptually far simpler to do more complex things with equations, both for the end user and for the developer.For instance, as a user, if you want to scale the equations by some amount or use a different maths font, it's a couple of lines of CSS, using exactly the same method you'd use to make any other changes to the appearance of a web-page. (Yes, you can easily do the former with MathJax, but I don't think the latter is possible user-side).As a developer, if you'd want to interactively highlight parts of an equation, for educational purposes, it'd be trivial with MathML, but rather hard to do nicely with MathJax (statically coloured elements are possible with MathJax, with the "color.js" extension, but not dynamically coloured ones — and no, swapping out the entire equation to make colour changes is neither nice nor scalable). Alternatively, if you want to embed equations in a diagram or a graph, it's pretty easy with MathML[0][1], but would be difficult otherwise.Obviously, all of the above is in principle possible with JavaScript implementations, but it's far harder. You might argue that this extra effort is worth the smaller attack surface. IMO, given the importance of maths and science, it isn't.Also, why do we, say, have the CSS flexbox layout? After all, we could have used javascript to arrange elements into an appropriate table or even just set the x and y positions of all elements...
 Yes, I am also against "native" flexbox implementation. Yoga showed it's entirely possible to implement in "user level".
 This. There is so much gained from equations being represented as HTML elements, with full support for CSS styling and JS manipulation.
 People regularly recommend other solutions over MathJax due to performance, and at least on complex pages there's a noticeable delay for it finish processing everything on the page. Is it worth it to do a native implementation because of that? maybe not, but there are issues with it.
 I'd much prefer optimizing MathJax over writing new C++ code in Chromium.
 Even the most optimized JavaScript code will be less performant than a fully declarative language that can be directly interpreted by the layout engine. From a user standpoint, it's ridiculous that I need to have JavaScript enabled, so the browser can download and compile a separate runtime, that itself reparses the page, just so I can look at a static document with some math symbols. Lastly, I think a common standard for math representation is valuable for the same reasons "official" and elements are valuable: They offer a common data model that tools, extensions and search engines can work on to provide extra functionality. A "de-facto standard" like MathJax doesn't provide this, because there is no requirement that two different sites use the same representation. The only requirement is that they put up something which the particular version of MathJax they embedded can understand. This makes things a lot harder for tools.
 Pages with lots of MathJax formulas take a lot of time to load.
 MathML is here today and works in Firefox. I use it on Wikipedia[0], which is the only major ("non-niche"[1]) website that provides it, and it's much nicer than the image-based equations (and much, much faster than MathJax would be). [1] not that "niche" websites like nLab[2] should be disregarded, since the web was originally designed to help scientists...
 arXiv.org uses MathJax. https://arxiv.org/help/mathjax
 So, because we have a JS library, we should drop an effort to natively support first class math rendering in the browser? That's why we can't have nice things.
 It’s not first class. It’s cross-browser-inconsistent garbage layout largely unsuitable for anything but the simplest mathematical expressions, and a horrible markup syntax for authors. If MathJax or KaTeX is too slow for some purpose, someone should try to compile a more streamlined TeX renderer to wasm.
 > garbage layout largely unsuitable I dunno. Looking at the math rendering torture test at https://mdn.mozillademos.org/en-US/docs/Mozilla/MathML_Proje... I prefer the MathML version in 15 of the examples and the LaTeX in 11. (No preference in the others.)
 I think the MathML sizing, positioning, and spacing of glyphs is strictly worse in every example at your link, sometimes quite dramatically. In a few cases (like the deeply nested fraction) the LaTeX is also not great. This would be a fairer comparison if they saved the LaTeX as SVG outlines, or as a higher resolution bitmap. As it is the LaTeX version looks fuzzy on my high DPI display.
 First of all, it can be used as a target for any markup syntax if authors don't like it. The syntax is an irrelevant part of the feature (in fact MathJax already supports it).Second, it is first class. "cross-browser-inconsistent" is not an argument that it's not first class, tons of things are inconsistent (JS features, CSS implementations, etc).Third, you missed the whole idea that the proposal is about enhancing the rendering, and also has buy-in from Mozilla people.>If MathJax or KaTeX is too slow for some purpose, someone should try to compile a more streamlined TeX renderer to wasm.That's not even wrong. It's beyond right and wrong, into the realm of crazy.
 It’s confusing to use the term “first class” when what you mean is “mediocre but built in”. The standard English definition of that term is “highest quality”. There are certainly many parts of CSS that I would not consider first class. > The syntax is an irrelevant part of the feature This viewpoint explains a lot about web technology. The syntax doesn’t matter. The visual output doesn’t matter. Practical adoption by users doesn’t matter. All that matters is ticking features down on a checklist somewhere.
 >It’s confusing to use the term “first class” when what you mean is “mediocre but built in”"First class" in computing terms means strictly "built in", "supported as a native object" -- it doesn't say anything about quality (as opposed to e.g. "first class" airplane seats).>This viewpoint explains a lot about web technology. The syntax doesn’t matter. The visual output doesn’t matter. Practical adoption by users doesn’t matter. All that matters is ticking features down on a checklist somewhere.Sounds like a generic lament.What matters here is: (a) performance, which is and always will be better than some plain-js implementation. (b) being native (which means it will eventually be on all browsers, without asking the users to load anything extra, and will mean writers can just depend on it), (c) the visual output will be better (for one, it will be native vector fonts laid out, not a canvas drawing which is not infinitely zoomable or non-math aware SVG where it's just pretty pictures), (d) it will be able to interact with all other browser capabilities better than any pure-JS implementation.The syntax is irrelevant, as it can be a target for any other syntax one prefers. In fact MathJax already delegates to MathML rendering where it can.
 The article is about fixing all those issues.
 Yes, because JS is superior to C++ in terms of security.I am in favor of bundling MathJax's MathML implementation in browsers though.
 >Yes, because JS is superior to C++ in terms of security.Yeah, because moving 1/100th of web rendering (the math rendering part) to JS is going to make things more secure...
 Is it? Keep in mind that the JavaScript is likely running on a C++ interpreter, and has mostly unfettered access to the page's content by design.
 Yes it is. Bundled JS won't add new use-after-free, new C++ code will.
 I've been coding C++ for 15 years and I've never seen a use-after-free in the wild. (I've seen lots of other bugs and security problems, but not use-after-free.) Use-after-free is a C thing, not a C++ thing. Granted, C++ makes it super easy to code in C, but that's an organization problem that is already solved in any sane project.
 Not to mention something that almost any static analyzer will catch on the first run...
 Can you recommend a static analyzer to Chromium developers? They appear to have problems with basic C++ programming; such a pity that Google cannot afford to hire competent developers like otabdeveloper2.
 That's supposed to be a witty retort? Did you bother to read those bug reports you've linked to? They are already tied to static analyzers, which is how they were found. What do you think the "Sanitizer: address (ASAN)" or "Issue 938699: AutotestPrivateApiTest.AutotestPrivate getPrinterList failing on ASAN/LSAN" in the bug reports means?
 I see, you are merely unfamiliar with terminology. The word "static" refers to compile-time; a static analysis reports errors or warnings based only on the source code of the program. Sanitizers are dynamic analysis based on instrumentation. https://github.com/google/sanitizers/wiki/AddressSanitizer "The tool consists of a compiler instrumentation module (currently, an LLVM pass) and a run-time library which replaces the malloc function." In order to detect bugs with sanitizers, you have to find a test input that actually moves program execution towards UB. This is best done with a fuzzing setup like clusterfuzz, and lots and lots of CPUs, which Google fortunately has no shortage of. https://github.com/google/clusterfuzz As Dijkstra said, "Program testing can be used to show the presence of bugs, but never to show their absence."
 Having first-class tex/postscript support in the browser would certainly be better than having to support the MathML monstrosity.
 MathJax is too slow for more appy and interactive use cases. With firefox's native implementation I have seen nothing but "instant" rendering.
 Using MathML notation directly isn't a bright idea and MathJax rendering kind of works, but having a formulae renderer directly in browsers/document viewers is much better. Using a javascript library or even remote http calls (!) to render static formulae on client side is extremely slow, wasteful and limiting way to render mathematical formulae.
 Is MathML even made to be written by hand? Or should we use another tool that translates our formulae into MathML? The syntax is awful and redundant.
 Mathjax is slow and requires client side JavaScript.
 You can render Mathjax offline for non-interactive purposes (ie most use-cases) using 'mathjax-node-page'. I recently implemented this for my website; now pages like https://www.gwern.net/Embryo-selection display math instantly (instead of requiring ~5s - it's a big page), and also no longer require the MathJax JS, just the fonts/CSS.
 Or KaTeX, which I believe is superior to MathJax.
 MathJax requires JavaScript, though; MathML doesn't.
 I hope MathML gets more support. It is a bit unwieldy, but it seems like a simple solution to displaying maths on a simple HTML page, here is my example.
 This is great news. Does anybody know if this will translate to Chromium-based web toolkits (e.g. Qt), and MS Edge (soon to be based on Chromium), and therefore, React-XP [1]?
