The TeX tuneup of 2014 [pdf] (tug.org)
178 points by yiransheng on Apr 27, 2014 | 69 comments

Sorry, lost the original. I will try to recreate it:

The big advantages of TeX are:

(1) The purpose of TeX was and is to lower the labor and cost of high-quality document preparation of heavily mathematical material, and to permit authors to prepare such documents themselves. For such work, TeX is, and almost from the beginning has been, the very welcome, highly respected, unchallenged international standard. For people writing such documents, TeX is nearly essential. Before TeX, just the 'typing' could be more work than the work the typing was communicating. Before TeX, preparation of documents with mathematical material was grim for authors, typists, publishers, etc.

(2) The TeX software has produced almost exactly the same output from the same input on nearly any computer over decades.

(3) The software is essentially totally free of bugs.

(4) Knuth's documentation is exemplary.

In a sense, TeX is yet another word processing text markup language; the big differences are high quality output, especially of mathematical material, and the macro language.
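A tiny illustration of that macro language, as a hedged sketch in plain TeX (the macro name \vect is my own invention for the example, not a standard control sequence):

```tex
% Plain TeX: define a one-off macro, then use it in display math.
% \vect is a made-up name for illustration only.
\def\vect#1{{\bf #1}}    % typeset the argument in boldface
$$ \vect{x} + \vect{y} = \vect{z} $$
\bye
```

Even such one-liners are the same mechanism the large macro packages (plain, LaTeX) are built from.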

Moreover, TeX was not meant to cut new ground in the graphic arts; instead, TeX essentially looked backwards, not forwards, in that its documents were and are essentially just black marks on white paper, much like the huge number of examples going back 100+ years on the shelves of research libraries. TeX was to ease the production of such documents, not to produce different kinds of documents.

So, TeX is not for everything that can be put on paper, a screen, a cereal box, or a billboard; it is not for animated foils, animated movies, interactive user interfaces, engineering drawings, Web pages, routine text documents, routine e-mail, etc.

TeX is not for average customers of Apple or Microsoft Word or some high end graphical arts software from Adobe.

Criticisms of TeX seem mostly to come from people who don't need TeX. For people who really need TeX, the criticisms range from not very important down to irrelevant.

TeX reminds me of GCC — huge, complex codebase, unhelpful error messages, and designed to just be a compiler/typesetter. I wish someone would write an LLVM-like TeX system: simpler code that could also be used as a library for live syntax checking and formatting, with beautiful error messages and fast typesetting times.

I don't know of any complete rewrites, but there are some cleanup/extension efforts underway, which are increasingly displacing "classic" TeX. The pdfTeX / LuaTeX line is probably the most actively developed codebase. It's 100% C now (no more Pascal or WEB code), and supports modern things like UTF-8 input, the lack of which was one of the more frequent pain points of trying to use classic TeX today.
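As a hedged sketch of what "UTF-8 input" buys you on the newer engines: under LuaTeX or XeTeX, with LaTeX and the fontspec package, a UTF-8 source file needs no encoding declarations at all:

```tex
% Compile with lualatex or xelatex; the accented text below is plain UTF-8.
\documentclass{article}
\usepackage{fontspec}              % font loading for LuaTeX/XeTeX
\setmainfont{Latin Modern Roman}   % an OpenType font with the needed glyphs
\begin{document}
déjà vu, naïve, Schrödinger
\end{document}
```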

> and supports modern things like UTF-8 input

Speaking of plain TeX:

Yes, they support UTF-8; that is, they support it via a command-line switch that is a bit tricky to use. It's not as if TeX can detect the encoding of your file and set things up automatically.

GCC is on par with LLVM performance-wise, and, as of the latest version, has those "beautiful" messages as well.

Do you have a reference for that statement? Does LLVM support the C++11 standard? And does it support OpenMP?

About the error messages in GCC 4.8: http://gcc.gnu.org/wiki/ClangDiagnosticsComparison

The core of TeX (written by Knuth) is actually quite small for what it does: approximately 24 kLOC when translated from Pascal to C. The "TeX systems" around today are huge because of all the other things bundled with that core, which dwarf it, and honestly I don't think everyone needs all or even a majority of that. I know that the parts of TeX I use to turn a .tex into a .ps can fit on a single floppy.

There's also the fact that LaTeX is often bundled in, and that is another "monstrosity" in itself. (I exclusively use plain TeX, so I have no need for that either.)

I've been wanting to look at TeXmacs for a while now. It seems to have similar typesetting abilities as TeX, but the ability to edit the document live and connect with a running program. I haven't had the time to look into it, though, so it may not be what you want.

TeXworks is a simple GUI for editing *TeX, and watching the output at the same time:


You can also click somewhere in the PDF view and 'jump to source'. I've used it a few times when polishing/debugging a LaTeX document; I generally don't use it for writing, though. (I prefer to do all my writing in Vim.)

I've seen TeXworks, but it's different than TeXmacs, which is real WYSIWYG editing with a TeX-like layout engine. TeXworks is still very cool, though.

> real WYSIWYG editing

That's an argument against using it, it seems to me. Having the words jump around as you type them is maddening.

TeXmacs is quite nice, actually. If you know Emacs shortcuts and are willing to learn a few TeXmacs-specific ones, it is the fastest way I know to write mixed text and math.

When I last tried it a few years ago, it was not as complete as LaTeX — at least for me, being much more proficient in LaTeX.

Actually the error messages are really helpful, I wish compilers were that good.

I just got an error message pointing to an aux file, which was automatically generated and had nothing to do with me. Yeah, the error messages are super helpful.

In my experience the TeX engine has decent error messages. The LaTeX set of macros, however, does not. What happens is that LaTeX doesn't do much error checking at its higher level of abstraction and leaves TeX to sort it out. TeX's errors are correct but tend to refer to the lowest level of code — the least helpful pointer for the reader. It's akin to the error messages you see from a C++ compiler about template functions.
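A minimal example of that effect (illustrative only): a user-level slip that TeX reports in its own low-level vocabulary:

```tex
\documentclass{article}
\begin{document}
AT&T was founded in 1885.  % unescaped & in running text
% TeX reports "! Misplaced alignment tab character &."  The message is
% correct at TeX's level (& introduces a table column), but it never
% mentions the actual fix, which is to write \& instead.
\end{document}
```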


"When 36 years old your code is, look as good it will not ehh."

Let's see you do it.

I can't think of anyone else who has been so effective at managing both the biggest abstractions of theory and the smallest concrete elements of practice.

I once attended one of his lectures about MMIX. After he was introduced he fired up Emacs and I remember distinctly that a ripple of voices ran through the audience. I think it was because most of us expected some form of slides or something.

Then he hacked along for an hour or so never leaving Emacs. On every seat of the room there was a little leaflet with the MMIX instruction set which was very helpful. Donald Knuth showed us all kind of stuff you could do with MMIX and it was one of the most down to earth lectures I've ever heard.

I've heard that Knuth gives some amazing lectures and some really dry, boring ones. I've only seen him talk once, and luckily it was one of the former. It was on SAT and game theory and involved seeing whether some very senior computer scientists in the audience could come up with a strategy to beat him in one of the games (which was entertaining).

Anyone know why his Stanford username has apparently changed from "knuth" to "uno"?


redirects to


Because it's shorter and you can type it with one hand?

The Wayback Machine shows his homepage at that URL from 2002 onward.

Knuth is the only developer who believes he can ship a (the?) bug-free piece of software. And he is the only developer I would trust to actually achieve this.

TeX, in my opinion, is intended in his mind to remain untouched after his death, as a legacy to all software devs, showing that:

- you can write bug-free software;

- you can write software that stands the test of time;

- you can write software that can be measured to be correct with regard to the problem it addresses (here: everlasting document typesetting).

It is the only source format/code I have from 1996 that still compiles and gives exactly the same output. And we all claim that software always has bugs and must evolve. Sometimes I think we are just excusing our own lack of dedication and hard work toward a perfect design that is stable and does the whole job correctly from the beginning.

This is the software that makes me realize I am a fraud; every time I print something with LaTeX I wonder how broken «progress» is, when all the bloatware we have fails in so many respects where TeX succeeds.

I share your admiration for Knuth, but let's not get carried away. Knuth spent 10 years working on TeX, and he worked on what was then a fairly unchanging problem domain. As mentioned elsewhere in this thread, many of the needs typesetters have today were (by Knuth's design) pushed into the macro system or the back-ends, thereby making the core of Knuth's work inherently stable.

I think that many good developers today, given 10 years of receiving grants, and an unchanging problem set could write high-quality, very stable, very solid code, especially with a large community providing bug reports.

To be clear, I celebrate Knuth's achievement, but I don't view it as a commentary on how software is written today.

> And we all claim that software always has bugs and must evolve. Sometimes I think we are just excusing our own lack of dedication and hard work toward a perfect design that is stable and does the whole job correctly from the beginning.

To give one example where TeX hasn't evolved, and which I think is a wrong decision rather than "perfect design": its Unicode handling is atrocious. Of course, this is because Unicode didn't exist when TeX was designed. But now it does. In fact it not only exists, but is the default text encoding on most contemporary machines. My locale is set to en_US.UTF-8, for example, and I'd therefore expect my toolchain, from grep through TeX, to be able to handle UTF-8 text. But TeX chokes on it.

(Fortunately, the developers of pdfTeX aren't as averse to adding new features as Knuth is, so I've switched to that.)
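For context, the classic 8-bit engines can be coaxed into reading UTF-8, but only if you say so explicitly. A minimal LaTeX sketch (this was the standard incantation in 2014; recent LaTeX releases assume UTF-8 by default):

```tex
% pdfLaTeX with explicit UTF-8 decoding; without the inputenc line,
% multi-byte characters such as "é" are misread.
\documentclass{article}
\usepackage[utf8]{inputenc}  % decode UTF-8 byte sequences
\usepackage[T1]{fontenc}     % font encoding with accented glyphs
\begin{document}
déjà vu
\end{document}
```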

Sure, you can just limit your project's scope, but essentially you are trading "software can do foo, with bugs" for "software cannot do foo at all". Like it or not, problem domains and the world around them evolve, and no software lives in an isolated bubble. Notorious examples related to typesetting would be Unicode and OpenType.

Knuth's definition of 'bug' is a limited one that seems to be mainly about logic errors. Allowing a user to incorporate bitmap fonts in a document (which very few users actually wish to do in 2014) would be considered a bug in other typesetting software.

This is actually a fantastic illustration of the gap between academic and industry concepts of computing.

When exactly do you think allowing a user to incorporate bitmap fonts in a document became a bug? It certainly wasn't when TeX was first written. Do you imagine it magically became a bug on a particular date?

Also, isn't TeX limited to the .tex -> .dvi transformation? I'm not sure the transformation from .dvi to .ps/.pdf is even part of TeX proper. And that's where the bitmapped fonts come in.

In summary, all of this may just be a fantastic illustration of the gap between Don Knuth and you.

> Do you imagine it magically became a bug on a particular date?

OT, but things can also gradually turn from features into bugs, or (what the parent probably meant) liabilities.

Not having a precise date for when something stopped being a useful feature does not prove it still is one.

> When exactly do you think allowing a user to incorporate bitmap fonts in a document became a bug?

I was about to say: "Good question. At one point, everyone who had access to a computer was a programmer, since using an OS required programming skills. As that slowly phased out, it became more common to count issues that lead users down bad paths, documentation errors, and usability problems as bugs."

Then I saw:

> "Do you imagine it magically became a bug on a particular date?"


> In summary, all of this may just be a fantastic illustration of the gap between Don Knuth and you.

Snark is unwelcome here. See http://ycombinator.com/newsguidelines.html

I apologize for the snark.

But I thought that your first statement was questionable, and that therefore your second statement was rather facile.

> Knuth's definition of 'bug' is a limited one, that seems to be mainly about logic errors. Allowing a user to incorporate bitmap fonts in a document (which very few users actually wish to do in 2014) would be considered a bug in other typesetting software.

Certainly, people sometimes use "bug" loosely, but it seems to me there is a strict sense in which a "bug" means something that was not intended by the writer of the code. And the issue you describe clearly does not fit this description.

I would call what you describe a "design flaw", at best. But as I alluded to, it wasn't even a design flaw at the time it was introduced. So it's really more of an anachronism than anything else.

And again, it's not really clear that it's even an anachronism in TeX, since it's more of an issue with the DVI-to-PS (or DVI-to-PDF) transformation.

And then you went on to conclude, from this highly questionable chain of reasoning, that this "bug" is "a fantastic illustration of the gap between academic and industry concepts of computing". Which I just find galling. Here you're talking about one of the most celebrated computer scientists of all time, someone who is almost certainly a lot smarter than you (or I), and you chalk up a difference between what you and he call a bug to the difference between industry and academia. As if Knuth is just some kind of ivory-tower crackpot who wouldn't understand the real-world exigencies of industry.

So: I think your definition of "bug" is highly questionable, and I think that in the future, when you find yourself in disagreement with some celebrated academic computer scientist, you should perhaps linger a bit longer on the possibility that you might be mistaken, rather than chalking it all up to the difference between industry and academia.

The non-'strict' sense of bug is the one used by almost everyone in our industry. I assumed that most of HN knew that.

You can't handle UTF-8, the default $LANG on almost every OS it is installed on? That's a bug. It needs to be fixed.

Your output quality is poor because it uses a custom font system that pushes users towards non-scalable fonts? That's a bug. It needs to be fixed.

Something not being a flaw at a time a piece of software was introduced is also irrelevant. Unmaintained software is generally considered to be poor software. While bitmap fonts were acceptable until the mid nineties, they really aren't now.

If you still think it's just me that holds these opinions, let's test it: try and argue for the 'strict' definition of bug outside academia and see how far you get.

I didn't criticise Knuth at all. I was fairly careful not to do that, and the original moderation on the comment (+3) reflected that. I merely note that academia simply works differently from industry.

I'm aware of Knuth and his contributions. I'm sure he's a lot better at computer science than I am. That doesn't mean he can't be questioned, and you finding it 'galling' seems very much to be a case of hero worship.

> The non-'strict' sense of bug is the one used by almost everyone in our industry. I assumed that most of HN knew that.

"I assumed that most of HN knew that", huh? This is the same kind of tendentious nonsense as "this is a fantastic illustration of the difference between industry and academia".

Not only do I _not_ know that "the non-'strict' sense of bug is the one used by almost everyone in our industry", I don't think you know it either, because I think you're mistaken. (Maybe you're in a different industry than I am.) The people I deal with (in industry) regularly make distinctions between bugs, design flaws, and possible feature enhancements. And so do most issue trackers (e.g. GitHub, JIRA), for pete's sake. So I really don't think your use of "bug" as a catchall term for all of these things is universal at all, in industry or out of it.

But "Do you imagine it magically became a bug on a particular date?" is a fairly pertinent question, as one of the advantages of TeX is that you can still typeset manuscripts decades after they were published.

I disagree with calling a deprecated feature a bug. Apples and oranges.

Last time I tried TeX was in the early 2000s, but I suspect it hasn't changed that much. At the time metafont pushed people into bitmap fonts, and using scalable formats was quite difficult.

As Knuth's article points out, TeX had a bug, which has been fixed, and the fix could cause a .tex file from 1996 to give different output after the fix.

I've heard Don Knuth speak at Stanford -- he's really quite something. He actually speaks the exact way he writes, very precisely.

I love how he is planning on another cleanup in 2021, when he'll be 83 years old. That's some serious dedication to your software.

The questioner seems to want to place burdens on all users, rather than on the backs of a few macro developers.

Words to live by when making software. The best software doesn't concern itself with the burden to the developers; it worries about the effect on the user.

Apple's original rounded rectangle buttons is another good example.

You're right, but the parent statement is the opposite: the questioner wants to put the burden on users, not developers.

No, he specifically talks about a few examples where as a user he prefers the code to look one way. That is, he thinks it should be on the macro writer to allow the users to write the code as they wish. Not learn a new way to write fractions, for example.

By 'he', I meant the questioner, as did my parent post. Your post uses 'he' to mean Knuth. I've edited the post to clarify.

I'm not sure what the topic is, then. :( Apologies for the downvotes, I suspect it is just a misunderstanding.

I love Knuth for things like this: "although I did delete a few bytes of redundant source code and alter two names."

He knows what he wants it to do, it need do nothing else, and in pursuit of that, he'll refine even the most minuscule of non-problems.

Yet he writes against this in the article: "Any object of nontrivial complexity is non-optimum, in the sense that it can be improved in some way (while still remaining non-optimum); therefore there's always a reason to change anything that isn't trivial. But one of TeX's principal advantages is the fact that it does not change — except for serious flaws whose correction is unlikely to affect more than a very tiny number of archival documents."

I can't reconcile the two quotes, but I've never encountered someone with more attention to detail.

He means "its behaviour does not change", not "its implementation does not change".

I thought that at first, and that could indeed be one thing he's saying. But I also agree with jmount that he probably understands that any tinkering is likely to produce as many bugs as it fixes, and therefore implementation changes are to be avoided as well unless addressing actual problems that people are experiencing.

Yeah, it is interesting. My read is that he is saying any destabilizing change isn't worth it at this point (even if it is a net improvement). Yet we all envision Knuth as an optimizer and tinkerer. Of course, few people have experience maintaining a continuously important and popular tool over such a long interval.

> Users can rest assured that I haven’t “broken” anything in this round of improvements. Everyone can upgrade at their convenience.

If only more developers took this approach instead of dumping out monthly bug-filled "major releases" and force-updating their users.


Does anybody know how the TeXbook source yields a compilation error? I mean to ask, is it enforced by the TeXbook source or is it hard-coded in the TeX source? (Or neither?) Not trying to get around this copyright enforcement, just curious.

You can view the source on CTAN [1]. The first few lines look like this:

  % This manual is copyright (C) 1984 by the American Mathematical Society.
  % All rights are reserved!
  % The file is distributed only for people to see its examples of TeX input,
  % not for use in the preparation of books like The TeXbook.
  % Permission for any other use of this file must be obtained in writing
  % from the copyright holder and also from the publisher (Addison-Wesley).
    \errmessage{This manual is copyrighted and should not be TeXed}\repeat
  \pausing1 \input manmac
  \ifproofmode\message{Proof mode is on!}\pausing1\fi
[1] http://www.ctan.org/pkg/texbook
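So the error is enforced by the TeXbook source itself, not hard-coded into TeX. In the full file, the \errmessage line sits inside a loop, roughly like this (a sketch of the mechanism using plain TeX primitives):

```tex
% \errmessage is a TeX primitive that halts with the given error text.
% Wrapped in \loop\iftrue ... \repeat, the error fires again on every
% attempt to continue, so the file cannot be processed without first
% editing this passage out of the source.
\loop\iftrue
  \errmessage{This manual is copyrighted and should not be TeXed}%
\repeat
```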

This is one of the things that I've always found a little odd: it's one of the simplest DRM schemes ever. My theory is that Knuth put it there just to satisfy his publisher's lawyers, since otherwise they wouldn't let the source be copied freely, much less made public. Being who he is, he could also have made it much harder to bypass. Rather clever of him, and it somewhat fits his style.

Of course I could be wrong and he truly has quite conservative views on IP...

see the "copyright infringement" section near the bottom of http://www-cs-faculty.stanford.edu/~uno/abcde.html

Will the next "TeX tuneup" be in HTML rather than PDF? One can only hope.

This would require that, between now and 2021, it becomes possible to use HTML+CSS to create an exact publishable layout, and someone makes a TeX renderer that uses it. That would be great. I'm not optimistic about it, given that right now we can't even manage to standardize e-books, but yes, one can only hope.

(It's not that the TeX tuneup inherently needs to be formatted like a publication; but that's what TeX is for, and it would be silly to expect that the TeX tuneup would not be written in TeX.)

Why are you not optimistic? It takes time, yes, but 7 years is a long time. I personally am optimistic about what HTML+CSS can do in terms of typesetting (even PDFs): https://authorea.com/users/3/articles/4675/_show_article

Well, we have PDF.js, so it's evidently possible, if a bit tricky.

Different platforms, different designs. The web is not a printed page. TeX was not designed for the web, and HTML is not intended for well-typeset print. Although it would be nice if my HTML compiled to TeX when a user pressed the `print` button in their browser.

Feature request: can someone please work on making TeX error messages more readable? I don't know how the project is structured and don't know where to begin. I compile TeX documents daily but still cringe at the compile output. It's mostly gibberish, unbefitting something that much of the scientific world uses for word processing...

What would a version of TeX reimagined by Jony Ive look like?

Jony Ive isn't a programmer. Your question is about as germane as asking what a version of the iMac reimagined by Keith Richards would look like.

I would buy that, actually.

