
Breaking the Zyzzyva encryption - cdelsolar
https://medium.com/@14domino/breaking-the-zyzzyva-encryption-f00360b695d1
======
Hello71
> The real hackers will know that as soon as I found evidence of
> sqlite3_key_v2 in the Zyzzyva dylib file that getting the key was
> inevitable. I don’t actually know the steps for removing debug symbols from
> compiled code off the top of my head, but I bet if this had been done, this
> would have made my job much, much harder.

I'm not entirely sure about OS X, but at least on Linux, system-assisted
dynamic linking (i.e. not mmap(PROT_EXEC)) requires that all required symbols
are exposed so that relocation can be done in the original executable; in
other words, the OS needs to know where the functions in the library are so
that it can tell the program how to call them.

Of course, you could obfuscate the function names, but then tracebacks
wouldn't work properly and at that point you'd be better off just statically
linking the whole program.

Debug symbols are completely different; if you have those, you can simply run
"frame variable" in the debugger, which shows the args with their names.

> Yesss. Time to get out the x86 assembly hats.

You don't even really need to do that. Since you know the function signature,
you can assume (since it is in a separate library) that the function uses the
standard System V AMD64 ABI where "the first six integer or pointer arguments
are passed in registers RDI, RSI, RDX, RCX, R8, and R9" [0], meaning that the
pKey pointer is probably in RDX. I know that the author said that it was in
RAX, but since that is caller-saved, there must have been some copying or
processing done to it inside the function.

[0]
[https://en.wikipedia.org/wiki/X86_calling_conventions#System...](https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI)
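
To make the register reasoning concrete, here is a minimal sketch that maps a function's integer/pointer parameters to System V AMD64 argument registers. It assumes the prototype is `int sqlite3_key_v2(sqlite3 *db, const char *zDbName, const void *pKey, int nKey)`; the helper name `assign_registers` is made up for illustration, and the sketch ignores floating-point and stack-passed arguments.

```python
# Integer/pointer argument registers in the System V AMD64 calling
# convention, in order, as quoted from [0] above.
SYSV_INT_ARG_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]


def assign_registers(params):
    """Map each integer or pointer parameter name to its argument register.

    Hypothetical helper: handles at most six integer/pointer parameters.
    """
    if len(params) > len(SYSV_INT_ARG_REGS):
        raise ValueError("arguments beyond the sixth are passed on the stack")
    return dict(zip(params, SYSV_INT_ARG_REGS))


# Assumed prototype:
#   int sqlite3_key_v2(sqlite3 *db, const char *zDbName,
#                      const void *pKey, int nKey);
KEY_V2_PARAMS = ["db", "zDbName", "pKey", "nKey"]

registers = assign_registers(KEY_V2_PARAMS)
for name, reg in registers.items():
    print(f"{name:8s} -> {reg}")
```

Under those assumptions, a breakpoint at the function's entry should find the key pointer in RDX and its length in RCX, before the function body has had a chance to move anything around.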

~~~
jtchang
I have a pet project trying to reverse engineer an android app if you are
interested :)

------
miles
From _Who Owns Scrabble’s Word List?_ [1]:

"Dictionaries enjoy copyright protection for two main reasons: Their creators
make judgments about what words to include, and entries feature definitions
and other original material. (Just last week, a federal court in Massachusetts
ruled against[2] a plaintiff who wanted to copy and repurpose the bulk of
Merriam-Webster’s Collegiate, including definitions, for his own dictionary.)
But in 1991, in _Feist Publications Inc. v. Rural Telephone Service Co._ [3],
the Supreme Court decided that a phone company wasn’t entitled to a copyright
on its white pages. That’s because the list of names and numbers lacked an
important requirement: originality."

[1]
[http://www.slate.com/articles/life/gaming/2014/09/major_scra...](http://www.slate.com/articles/life/gaming/2014/09/major_scrabble_brouhaha_can_you_copyright_a_list_of_words.html)

[2] [http://www.scribd.com/doc/241384392/Richards-v-Webster](http://www.scribd.com/doc/241384392/Richards-v-Webster)

[3]
[http://scholar.google.com/scholar_case?case=1195336269698056...](http://scholar.google.com/scholar_case?case=1195336269698056315&hl=en&as_sdt=6&as_vis=1&oi=scholarr)

------
leecb
Isn't this a violation of the DMCA's anti-circumvention section? This seems to
be explicitly describing how to circumvent protection measures for a
copyrighted work.

[https://www.law.cornell.edu/uscode/text/17/1201](https://www.law.cornell.edu/uscode/text/17/1201)

~~~
skywhopper
The fact that the DMCA could criminalize the act of inspecting the contents of
an executable file acquired legally and running on your personal computer and
then telling other people about it is pretty good evidence that the DMCA is an
immoral law that should be violated as much as possible. Kudos to the
article's author.

~~~
userbinator
Indeed, I think we should be doing everything we can to stop things from going
further in the direction of the dystopia in Stallman's famous story:
[http://www.gnu.org/philosophy/right-to-read.en.html](http://www.gnu.org/philosophy/right-to-read.en.html)

~~~
black_puppydog
It is sometimes surprising how accurate some of Stallman's dystopian visions
were and it is frightening because some of them have not become true. Yet.

------
madars
It seems that the Linux version has a version of the database as plain text
(including their meanings); wc -l tells me that OSPD4.1.txt has 178378
entries, which seems about right. edit: seems like that's only true for v4 and
below, while the article could be about v5 (it doesn't say)

------
Matt3o12_
How can you copyright a wordlist anyways? I'm really not familiar with
scrabble but doesn't it just contain plain English words without any context?
They could of course copyright the order, but couldn't you just shuffle the
list and publish it?

An explanation of how copyright works in this case would be great.

~~~
Ded7xSEoPKYNsDd
Some countries have a database right besides normal copyright. Copyright seems
like a stretch, but I can understand why hobbyists don't want to take the risk
of getting sued even if the law should be on their side.

[https://en.wikipedia.org/wiki/Sui_generis_database_right](https://en.wikipedia.org/wiki/Sui_generis_database_right)

~~~
Matt3o12_
Thank you very much for the link, but this sounds really stupid. Couldn't I
just create a database with all possible additions ranging from 1,000,000 to
10,000,000 and sue everyone who publishes a book/paper etc. because of the
"investment that is made in compiling a database"? And it wouldn't matter if
they calculated the result themselves, because they could've just used my
database.

This is really stupid. I mean I get copyright but I think copyright should
only apply to "the 'creative' aspect[s]" if the author wants that.

~~~
braythwayt
We have had a similar discussion on Hacker News about a company that claims to
be mechanically generating all possible arrangements of words of a certain
length and copyrighting those, as well as all possible images of a certain
size, and all possible musical melodies of a certain length, and so on.

The reason this is not considered copyrightable is that there must be some
evidence of creative effort. Owning an infinite number of monkeys and
typewriters does not entitle you to copyright everything they generate.

------
fit2rule
Excellent work .. and of course, a salient reminder of why we all,
individually, should copyright our own works, even if it is something done for
free and/or on a volunteer basis with no commercial interest. A right not
exercised is one lost.

I think it's preposterous that someone is able to trademark a word list. I bet
it's not even complete.

~~~
zxc1234
Sick system. In other developed countries you have copyright as soon as you
create something. You don't need to register that right anywhere. Of course it
is more difficult to prove, but that's another story...

~~~
CrystalGamma
When I found out you had to register your stuff for copyright in certain
countries, I was actually surprised ... I just thought having copyright for
your work by default was normal.

~~~
thristian
I believe it's normal in all countries that are signatories to the Berne
Convention, which is... apparently ~160 of the ~190 UN member states.

------
cool-RR
I wonder whether you can bypass copyright for a word list if you feed it into
a bloom filter[1] and then save just the bloom filter.

[1]
[https://en.wikipedia.org/wiki/Bloom_filter](https://en.wikipedia.org/wiki/Bloom_filter)
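
For anyone unsure what "save just the bloom filter" would mean in practice, here is a minimal sketch. The class, its sizing parameters, and the three sample words are illustrative assumptions, not anything from Zyzzyva. It only shows the technical side: the stored bit array answers "is this probably a word?" without containing the list itself, and it says nothing about the legal question.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; num_bits and num_hashes are arbitrary demo values."""

    def __init__(self, num_bits=1 << 16, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, word):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


words = ["AA", "QI", "ZYZZYVA"]  # stand-in for a real word list
bf = BloomFilter()
for w in words:
    bf.add(w)

print("ZYZZYVA" in bf)
```

Note that a filter never gives false negatives but can give false positives, so a word checker built this way would occasionally accept a non-word. Anyone willing to test every short letter string against the filter could also rebuild an approximation of the list.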

~~~
lisper
If I were inclined to twist the copyright tiger's tail, the way I would do it
would be to encrypt the plaintext with a one-time pad and then publish the
ciphertext and the pad anonymously in two different locations (preferably on
two different domains). The key and the ciphertext in a one-time pad are
mathematically indistinguishable, so both publishing parties have plausible
deniability that what they published was the key, i.e. just a string of random
bits, which of course they have every right to do.
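
A minimal sketch of that scheme, assuming a byte-wise XOR pad and a one-word stand-in for the plaintext (the names here are illustrative only). It shows why neither published blob is distinguishable as "the key": XOR is symmetric, so the two halves are interchangeable.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("one-time pad inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"ZYZZYVA"  # stand-in for the real word list
pad = secrets.token_bytes(len(plaintext))  # uniformly random, used once
ciphertext = xor_bytes(plaintext, pad)

# Either blob can be treated as the "key" and the other as the "message".
recovered_a = xor_bytes(ciphertext, pad)
recovered_b = xor_bytes(pad, ciphertext)
print(recovered_a == recovered_b == plaintext)
```

Since the pad is uniformly random, the ciphertext is uniformly random too, which is the property the plausible-deniability argument leans on.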

An even more interesting experiment would be to copyright the resulting key
and the ciphertext, and put in the TOS for getting either one that you will
not sue the publisher for any copyright violations.

That is, if I were so inclined :-)

~~~
kevin_thibedeau
You can't copyright a random number although I suppose you could insert some
deadbeefs for "artistic" effect without compromising the pad.

~~~
lisper
> You can't copyright a random number

Why not?

~~~
Dylan16807
No creative effort.

~~~
lisper
So write a haiku, xor your random key with repeated copies of your haiku, call
the result modern art, and copyright that.

~~~
Dylan16807
And when it serves a functional purpose as a key, you get to use it anyway.

------
cosmicexplorer
so like.......did they put this decrypted database online, or something? true,
we could just perform the same operations they did, but if you're going to go
through the trouble of putting your crack in public, might as well spread its
fruits too

~~~
Buge
I'm pretty sure Cesar wants to avoid such a direct copyright violation. Sure
just breaking the encryption might be considered a violation of the DMCA anti-
DRM stuff, but that is a much more controversial law that many people oppose.
A lawsuit over the anti-DRM would likely pull the EFF in, and make large news,
while a lawsuit over plain spreading copyrighted information would be much
more straightforward and likely for him to lose.

By only publishing the steps, he gets the benefit of the publicity of breaking
the encryption. Then anonymous people can easily break it themselves and
spread the actual list, free from worry of being sued.

------
userbinator
My guess about the origins of the key, upon seeing its length, is that it is
the SHA256 of something - could it be one of the words in the list?
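
That guess is cheap to test once you have the key and a candidate list. Here is a minimal sketch, assuming the key is available as a 64-character hex string and that the hashed input, if it is a word, is plain ASCII in some letter case. The function name and the demo key are made up for illustration; the real key is not reproduced here.

```python
import hashlib


def find_sha256_preimage(key_hex, candidates):
    """Return the first candidate (in any tried case) whose SHA-256 matches."""
    target = key_hex.strip().lower()
    for word in candidates:
        for variant in (word, word.lower(), word.upper()):
            if hashlib.sha256(variant.encode("utf-8")).hexdigest() == target:
                return variant
    return None


# Demo with a stand-in key built from a known word, not the real key.
demo_words = ["aa", "qi", "zyzzyva"]
demo_key = hashlib.sha256(b"ZYZZYVA").hexdigest()

match = find_sha256_preimage(demo_key, demo_words)
print(match)
```

A miss would not rule out SHA-256 entirely, since the input could be salted, a phrase, or something unrelated to the list.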

------
opcvx
Wouldn't it be easier to just dump the entire process after the database is
loaded and decrypted in memory, and pick out the words?
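
A rough sketch of the "pick out the words" half, assuming you already have a raw memory dump as bytes from some other tool and that the words sit in memory as uppercase ASCII runs. Both are assumptions, as is whether the whole database is ever resident at once. `extract_words` and the fake dump are hypothetical.

```python
import re


def extract_words(dump, min_len=2, max_len=15):
    """Collect runs of uppercase ASCII letters that are not part of longer
    alphabetic runs, roughly like `strings` with a stricter filter."""
    pattern = re.compile(
        rb"(?<![A-Za-z])[A-Z]{%d,%d}(?![A-Za-z])" % (min_len, max_len)
    )
    return sorted({m.group().decode("ascii") for m in pattern.finditer(dump)})


# Fake dump: binary noise with a few words embedded in it.
fake_dump = b"\x00\x01AA\x00junk\xffZYZZYVA\x00\x00QI\x7f"

found = extract_words(fake_dump)
print(found)
```

On a real dump this would also pick up plenty of unrelated uppercase strings, so the output would need filtering against what you expect the list to look like.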

------
hit8run
Rest In Protein Zyzz.

~~~
n3on_net
we all gonna make it brah

