
Cryptic Crossword: Amateur Crypto and Reverse Engineering - breadbox
http://www.muppetlabs.com/~breadbox/txt/acre.html
======
bbanyc
The punchline to this story is that a few years ago the Times stopped
scrambling their .puz files, making all this reverse-engineering work largely
irrelevant.

(At the time I was using a modified version of the "xword" program in Debian's
repo, which didn't detect whether the file was scrambled. In other words, it
treated every letter as wrong because it didn't match the enciphered grid. I
ended up hacking in some code to detect these files and disable the
check/reveal features when playing them.)
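
A check like that can be sketched in a few lines. This is only an illustration, not the code that went into xword: it assumes the header layout given in the community-maintained .puz format notes, where a 16-bit little-endian "scrambled" tag sits at offset 0x32 and is zero for unscrambled puzzles. The offset constant and the `is_scrambled` name are assumptions made for this sketch.

```python
import struct

# Assumed offset of the 16-bit "scrambled" tag in a .puz header, taken from
# the community file-format notes; not taken from xword's source.
SCRAMBLED_TAG_OFFSET = 0x32


def is_scrambled(puz_bytes: bytes) -> bool:
    """Return True if the header's scrambled tag is nonzero."""
    if len(puz_bytes) < SCRAMBLED_TAG_OFFSET + 2:
        raise ValueError("data too short to contain a .puz header")
    (tag,) = struct.unpack_from("<H", puz_bytes, SCRAMBLED_TAG_OFFSET)
    return tag != 0


# Synthetic 52-byte headers for illustration (not real puzzle files).
plain = bytes(0x34)
locked = bytearray(0x34)
struct.pack_into("<H", locked, SCRAMBLED_TAG_OFFSET, 4)
print(is_scrambled(plain), is_scrambled(bytes(locked)))
```

A solver could call something like this once at load time and grey out check/reveal when it returns True.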

~~~
breadbox
Very true! I mentioned that fact the first time I gave this presentation, but
it wound up being an anticlimactic ending, so I chose to omit it from the
written essay. (And at this point, the focus is more about the process of
reverse-engineering anyway.)

EDIT: To be precise, there were still a few other crossword publishers using
the scrambling feature. None as important as the New York Times, though, of
course.

~~~
StavrosK
Problem solving is its own reward!

------
davepeck
This is a great and entertaining read about reverse engineering.

It's such a good read that this is almost beside the point... but, as it
happens, I worked on and reverse-engineered this same "encryption" scheme (I
hesitate to use the word) for an iOS app that never shipped. I just dumped the
code (which seems to have been written in late 2008) up on GitHub... it's old,
and messy, but hey, maybe it's fun for someone:

[https://github.com/davepeck/puzfile](https://github.com/davepeck/puzfile)

------
danielpunkass
Very cool rundown of the approach to trying to decode this. FWIW there is also
a significant archive of information about the format here, including
information about the scrambling:
[https://code.google.com/p/puz/wiki/FileFormat](https://code.google.com/p/puz/wiki/FileFormat)

~~~
snori74
Indeed, staggering how patient and determined some people can be. I love this
line when asked by his friend if he would be able to reverse-engineer this
scrambling algorithm:

 _"My response was: maybe. Hard to say, but I'm willing to try. Privately,
though, my reaction was THIS IS MY DREAM PROJECT AND THERE IS NO WAY I'M NOT
SPENDING ALL AVAILABLE FREE TIME ON THIS."_

------
TrainedMonkey
Extremely methodical and determined approach, especially the analysis of the
errors that the partially successful approaches ran into. Well done.

------
mistercow
>In a way this is just a restatement of Occam's Razor, but I like it because
it clarifies why Occam's Razor is a good idea. It's not because simpler
solutions are actually more likely to be true; they usually aren't. It's
because it's almost always easier to improve a simple solution by adding
complexity, than it is to improve a complicated solution by digging out a
simple solution buried within it.

While the second part of that is an interesting observation, the first part is
simply false. It basically comes down to prior probabilities and conjunctions.
Every bit of new information implied by a hypothesis is another "and". It is a
simple fact that P(X) ≥ P(X and Y), so the more conjunctions your hypothesis
implies (the more complex it is) the lower its prior probability.
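
The inequality is easy to check on a toy example. The sketch below is only an illustration with made-up events (two fair dice, nothing to do with the puzzle format): it enumerates the sample space and compares a hypothesis X with the more specific hypothesis "X and Y" using exact fractions.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Exact probability of an event (a predicate over outcomes)."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))


def x(o) -> bool:
    return o[0] % 2 == 0  # X: first die is even


def y(o) -> bool:
    return o[1] == 6  # Y: second die shows a six


def x_and_y(o) -> bool:
    return x(o) and y(o)  # the more specific (more complex) hypothesis


print(prob(x), prob(x_and_y))  # 1/2 1/12
```

Every outcome that satisfies "X and Y" also satisfies X, so adding a conjunct can only shrink the set of outcomes (or leave it unchanged), never grow it.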

------
bluedino
Wouldn't it have been easier to disassemble the program that works with these
files, and analyze the code?

~~~
odin1415
Maybe, but the author mentioned that he wanted to reverse engineer it as a
black box, out of legal concerns. It makes a more interesting challenge this
way too.

~~~
voltagex_
I particularly liked the approach to automation of the application, even
through WINE. Setting the time with LD_PRELOAD is a really neat trick.

------
mistercow
Man, reading little-endian binary formats makes my head hurt. I get why it's
done that way, but what a nightmare for comprehending what you're reading.
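
A minimal illustration of the mismatch, using Python's `struct` module on two made-up bytes: the hexdump shows `34 12`, but a little-endian reader treats that as 0x1234, while a big-endian reader sees 0x3412.

```python
import struct

raw = bytes([0x34, 0x12])  # two bytes in the order a hexdump would show them

(little,) = struct.unpack("<H", raw)  # least significant byte comes first
(big,) = struct.unpack(">H", raw)     # most significant byte comes first

print(hex(little), hex(big))  # 0x1234 0x3412
```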

~~~
userbinator
After enough time spent reading hexdumps you get used to it, and then all of a
sudden big-endian feels _really_ backwards.

~~~
mistercow
Well, assuming you're reading little-endian hexdumps. I cut my teeth on OS X
back in the PPC days, so to me it is just the opposite.

