
With LaTeX, the document creator must compile the document through a fairly persnickety processor (latex, tetex, or an equivalent). Whilst that may still pass poorly-structured documents and result in poorly-formed output, straight-up illegal syntax won't make it through.

HTML has no such requirements. There's absolutely nothing stopping you from posting that in the first place. Throw something at a browser, and it will try its damnedest to ingest it --- that's the "runtime compiled" element of the Web.
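Not from the thread, but to make the "try its damnedest" point concrete: Python's stdlib `html.parser` behaves like a browser here, ingesting broken markup without raising an error (a minimal sketch; the `TagCollector` class is mine, purely for illustration).

```python
# Browsers -- and browser-style parsers -- accept malformed HTML
# without complaint. A LaTeX-style compiler would reject this input.
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Records every start tag the parser encounters."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

# Unclosed <b>, and a stray </i> that was never opened.
collector = TagCollector()
collector.feed("<p>hello <b>world</i>")
print(collector.tags)  # the parser happily saw both start tags
```

No exception is raised; the parser simply keeps going, which is exactly the leniency being described.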

(It's not a perfect analogy to software compilation, though it can get surprisingly close.)

Essentially: the Web lacks any editor role. That's long been touted as a benefit. At scale, however, and over time ... it becomes problematic.

Moreover, the commercial Web has a selection process and corresponding evolutionary path which has come to be seen as less than optimal for those seeking high-value, high-relevance content. It is in fact hostile to such content, on multiple bases.

Again, LaTeX alone won't fix these problems. It addresses a few issues of structure. The concept of a post-authoring compilation step is also somewhat attractive (ironically, it's already present in many web content-management systems, though for the most part oriented not around the document or content itself but around branding, advertising, and surveillance instrumentation) but ... problematic. A discovery-mechanism scoring penalty might help (and Google's certainly applied that in other areas).




> Moreover, the commercial Web has a selection process and corresponding evolutionary path which has come to be seen as less than optimal for those seeking high-value, high-relevance content. It is in fact hostile to such content, on multiple bases.

That aspect has absolutely nothing to do with the technical format of a document.


That's a large part of my larger point. It wasn't, however, any aspect of the question you'd asked and I'd answered.

That said, despite some quite strong semantic elements in HTML (and HTML5 especially), and the semantic/presentation separation of HTML/CSS, what we're getting, and what the adtech-profiting browser vendor driving Web browser development seems to be encouraging, is not in fact well-structured, meaningful, high-value, high-relevance documents.

You're absolutely correct that the core of the problem isn't the technical format. But the technical format's become infected by that core problem.


"must"? A web4 browser would probably use a "runtime compiled" implementation.

"Whilst that may still pass poorly-structured documents and result in poorly-formed output" - this by the way happens much more often with LaTeX compared to html in my experience.


A LaTeX document is virtually never directly consumed by a reader. It's first converted ("compiled") into some consumable format. Typically PDF or PostScript, though there can be numerous others.

You seem to be unfamiliar with this aspect of the system?


Don't make assumptions about me. A LaTeX document is also virtually never directly consumed by web browsers. In addition to that, we are considering LaTeX for the web, not DVI, not PDF, not something else that you compile LaTeX into.

And well, given the number of people who use Overleaf, I would say that a lot of people (although writers rather than readers) consume LaTeX.

(Btw, you can compile HTML too: try printing it as a PS/PDF document.)


That the Web is principally oriented around HTML is something of an accident of history. Note that any data can be transferred over HTTP(S), including, on occasion, either compiled or uncompiled LaTeX.

I am not making assumptions about what you do or do not know. I'm telling you how you're being perceived. You have the power to alter that perception. You've failed to use it.


"Note that any data can be transferred over HTTP(S), including, on occasion, either compiled or uncompiled LaTeX."

Sure, but I don't see what this has to do with anything.

As for your perception, I don't care :) Keep it to yourself next time, please. You too are being perceived in a certain way, but telling you how would likely be against this site's rules.


We don’t see html raw either dude


Username checks out. It's not the same: HTML is interpreted and often modified on the fly (with JavaScript), with the source a click away, but it's still just HTML. LaTeX documents, on the other hand, are compiled into other formats like PDF, and it's those that get distributed, not the LaTeX source.


Correct, further:

- The compilation means that there's at least a check for syntactic validity before any old crap is published.

- JS-based HTML can be fully dynamic, to the extent that there's no sense of an underlying document at all. There are times when this is useful; those are an exceedingly small minority of the cases in which it is used. The fact that it's often preferable to rerender an HTML document as PDF, simply for readability, let alone archival, should speak volumes.


I talked about the compilation aspect at https://news.ycombinator.com/item?id=29372937. In addition to that, there are various checkers for HTML.
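To illustrate the "checkers for HTML" point: a real project would use something like the W3C validator or HTML Tidy, but a toy well-formedness check can be sketched with the stdlib `html.parser` by tracking tag balance (the `BalanceChecker` class and its `VOID` set are my assumptions, not anything from the thread).

```python
# Toy HTML checker: flag close tags that don't match the most
# recently opened tag, and report tags left open at the end.
from html.parser import HTMLParser

# Void elements never take a closing tag (partial list).
VOID = {"br", "hr", "img", "meta", "link", "input"}

class BalanceChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.errors = []  # mismatches found

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

checker = BalanceChecker()
checker.feed("<p>hello <b>world</i></p>")
print(checker.errors)  # the stray </i> and the now-mismatched </p>
print(checker.stack)   # <p> and <b> were never properly closed
```

Unlike a browser, which silently repairs this, the checker surfaces the mismatches, which is roughly the "compile-time gate" role being argued about upthread.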

There is Lua-based LaTeX and there are JS-based PDFs. Just use HTML without JS.

"The fact that it's often preferable to rerender an HTML document as PDF, simply for readability, let alone archival, should speak volumes."

The fact that it's always preferable to export a LaTeX document as pdf...


Username checks out? Are people who compile markdown, org (or even LaTeX) via pandoc into html somehow "garbage coders"?



