
Exploiting LaTeX with CVE-2018-17407 - posix_compliant
http://nickroessler.com/latex-cve-2018-17407/
======
tptacek
This is a story that probably works better with its original title
("Exploiting LaTeX with CVE-2018-17407") than with the editorialized one,
since the editorialized title makes it sound like the vulnerability is what's
good about this story, when in fact it's the writeup about the exploit --- the
vulnerability itself is maybe not that urgent for most users.

~~~
tlb
Changed from "Arbitrary code execution vulnerability discovered in pdflatex",
thanks.

------
svat
This is so cool! If the bug originates in dvips code then it's probably
decades old.

Confession: a couple of years ago I tried running AFL (as the author
here did) on tex, but got nowhere; I couldn't even get it to reach the
interesting parts. (It took ages even to figure out how to compile TeX.) Good call
starting with dvips which works with binary formats... and really cool
exploitation here.

There is a lot of code in a TeX distribution: there's the “core” code written
by Knuth in WEB, and then there's (probably orders of magnitude larger) all
the code of LaTeX (and other) macro packages (written “in TeX”), which are
both quite likely harmless. But there's also a lot of other code that gets
much less attention... from everything that's been written for TeX (and its
extensions) to interface with the system, to common utilities, etc.

------
rixrax
What I find frustrating is that there still has to be an exploit for these
crashes to be taken seriously or be blog-worthy. We know that in C/C++ programs,
input parsing errors carry a high probability of arbitrary code execution.
Instead of just supplying 50 PDFs that seem to crash the program or lib in
unique ways and the author/vendor fixing their code, researchers have to ‘waste’
time writing exploits to really rub it in.

~~~
userbinator
_I couldn’t resist writing an exploit to go along with it_

That doesn't sound like the case here. He wrote an exploit because he wanted
to, not because he needed to convince anyone.

------
jancsika
Reading this makes me think the author could skip the fuzzer altogether, grep
the C FLOSS universe for the set of old-school, free-wheelin' string handling
functions, and then iterate over the results to find the (hopefully) smaller
set which can take arbitrary input for at least one of the arguments.
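
A minimal sketch of the "grep for old-school string functions" idea, written as a C function so it can be dropped into a scanner. The watch list `UNSAFE_FUNCS` and the name `count_unsafe_calls` are hypothetical, chosen for illustration. It is purely textual, so it also counts hits inside comments and string literals, and it says nothing about whether an argument is attacker-controlled; that second filtering step would still have to be done by hand or with a real analyzer.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical watch list of unbounded string functions; extend as needed. */
static const char *const UNSAFE_FUNCS[] = { "strcat", "strcpy", "sprintf", "gets" };

static int is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/* Count textual call sites of the watched functions in `src`.
 * A hit is the function name, not preceded by an identifier character
 * (so "vsprintf" is not counted as "sprintf"), followed by optional
 * blanks and an opening parenthesis. */
size_t count_unsafe_calls(const char *src)
{
    size_t hits = 0;
    for (size_t i = 0; i < sizeof UNSAFE_FUNCS / sizeof UNSAFE_FUNCS[0]; i++) {
        const char *name = UNSAFE_FUNCS[i];
        size_t len = strlen(name);
        for (const char *p = strstr(src, name); p != NULL; p = strstr(p + 1, name)) {
            /* Skip matches that are the tail of a longer identifier. */
            if (p != src && is_ident_char((unsigned char)p[-1]))
                continue;
            const char *q = p + len;
            while (*q == ' ' || *q == '\t')
                q++;
            if (*q == '(')
                hits++;
        }
    }
    return hits;
}
```

Feeding it the contents of each source file and sorting files by hit count would give a rough triage list, not a list of bugs.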

~~~
saagarjha
> grep the C FLOSS universe for the set of old-school, free-wheelin' string
> handling functions

You’ll likely find too many to be useful.

------
userbinator
_It reads them both and concatenates them together into t1_buf_array with a
call to strcat() — but without a bounds check! Oops._

Things like this make me wonder what was going through the mind of the
programmer who wrote the code. I learned C less than a decade after it was
invented, but the lack of implicit bounds-checking wasn't something I ever
forgot. Perhaps it helps that I was using Asm before that. Of course then it
was not thought of as a security thing, but just basic correctness.

It's such a simple concept --- make sure there's enough room --- and there is
a concrete analogy to it in the real world --- that I continue to be
disappointed and amazed at how many times someone manages to get it wrong.
Then again, maybe it's just a bias: no one makes the news for doing it right.

~~~
rtpg
Well you have to _never make a mistake_ to not have issues.

I know my phone won’t follow me magically out the door, I take it with me 99%
of the time. I still leave it at home sometimes.

Of course here the “chain my phone to my pants” solution exists, in the form
of linting rules preventing usage of unsafe APIs, and having a safer API that
enforces checks (for example a strcat variant that requires reporting the
destination container size). Or using checked string libraries instead of raw
char*. Not 100% foolproof but could help things.
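
As a rough illustration of a strcat variant that requires the destination size, here is a minimal sketch. The name `checked_strcat` and its return convention are assumptions made for this example; it is not the fix that was applied to TeX Live, and it is not a standard library function.

```c
#include <stddef.h>
#include <string.h>

/* Append `src` to the NUL-terminated string in `dst`.
 * `dst_size` is the total capacity of `dst` in bytes.
 * Returns 0 on success. Returns -1, leaving `dst` unchanged, if `dst`
 * has no terminator within `dst_size` bytes or if the result plus its
 * terminating NUL would not fit. */
int checked_strcat(char *dst, size_t dst_size, const char *src)
{
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return -1;

    size_t used = (size_t)(end - dst);   /* current string length */
    size_t remaining = dst_size - used;  /* at least 1: room for the NUL */
    size_t need = strlen(src);

    if (need >= remaining)               /* need + 1 bytes would not fit */
        return -1;

    memcpy(dst + used, src, need + 1);   /* copies the terminator too */
    return 0;
}
```

The caller has to check the return value, which is the point: the size becomes part of the interface, and an overflow turns into an error path rather than a silent write past the buffer.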

The biggest difficulty is C’s abstraction ceiling being so low. Hard to do
stuff like this without making code much bigger than it already is.

------
saagarjha
How are you grabbing the address of the place to jump to that calls system? Is
the binary not position-independent?

~~~
nneonneo
The pmap output he shows near the end has

    Address           Kbytes     RSS   Dirty Mode  Mapping
    0000000000400000    2460     832       0 r-x-- pdftex
    0000000000400000       0       0       0 r-x-- pdftex
    0000000000867000       8       8       4 r---- pdftex

which means the core pdftex binary does not have PIE (sadly a very common
occurrence on Linux). pdftex handles certain TeX functions that invoke
external commands, so it has calls to system().
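
For anyone who wants to check a binary without running it: the ELF header's `e_type` field distinguishes a fixed-address executable (`ET_EXEC`) from a position-independent one (`ET_DYN`). Below is a minimal sketch that parses that field from the first bytes of a file. The function names are made up for this example; the offsets and constants are the ones from the ELF specification. Note that `ET_DYN` also covers ordinary shared objects, so it only tells you "not fixed-address", not "definitely a PIE executable".

```c
#include <stddef.h>

enum { ELF_ET_EXEC = 2, ELF_ET_DYN = 3 };  /* e_type values from the ELF spec */

/* Return the e_type field of the ELF header stored in `hdr`,
 * or -1 if the bytes are too short or not a recognisable ELF header.
 * e_ident occupies the first 16 bytes; e_type is the 2-byte field
 * that follows, in both ELF32 and ELF64. */
int elf_object_type(const unsigned char *hdr, size_t len)
{
    if (len < 18)
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;

    /* e_ident[EI_DATA] at offset 5: 1 = little-endian, 2 = big-endian. */
    if (hdr[5] == 1)
        return hdr[16] | (hdr[17] << 8);
    if (hdr[5] == 2)
        return (hdr[16] << 8) | hdr[17];
    return -1;
}

/* 1 if the header describes a fixed-address (non-PIE) executable, else 0. */
int is_fixed_address_executable(const unsigned char *hdr, size_t len)
{
    return elf_object_type(hdr, len) == ELF_ET_EXEC;
}
```

To use it, read the first 18 bytes of the file with `fread()` and pass the buffer in. A result of `ET_EXEC` would be consistent with the 0x400000 mapping in the pmap output, the traditional load address for non-PIE x86-64 Linux executables.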

~~~
saagarjha
Ah, I should have seen that. I guess I'm too used to macOS, where you have to
go out of your way to compile binaries without PIE, so it's basically always
enabled.

------
zzo38computer
I do not use LaTeX or dvips or Type1 fonts, so I suppose I am not affected. I
wrote my own DVI driver that supports PK fonts and converts directly to PBM
without needing PostScript.

Probably, other users who do not add any new fonts also would not be affected,
I suppose.

That article says "This same vulnerable function is used by other tools in TeX
Live: pdflatex, pdftex, dvips and luatex. I only built an exploit for
pdflatex, the most widely used of the vulnerable tools." I only use the
program "tex", not any of those four.

Still, the article is good and interesting and explains the bug well, and it is
good that these bugs get fixed for users who do use these tools.

