
Computer scientists develop 'mathematical jigsaw puzzles' to encrypt software - fdb
http://newsroom.ucla.edu/portal/ucla/ucla-computer-scientists-develop-247527.aspx
======
ColinWright
It's been interesting seeing this story submitted repeatedly over the past
week or so, each time getting no traction, no attention, no comments, and no
love. Nice to see it finally hit the front page.

The actual paper is here:
[http://eprint.iacr.org/2013/451.pdf](http://eprint.iacr.org/2013/451.pdf)

Here are some of the submissions of alternate write-ups of this story, in case
you want to see if some other source gives more details:

[https://news.ycombinator.com/item?id=6125268](https://news.ycombinator.com/item?id=6125268)
(phys.org)

[https://news.ycombinator.com/item?id=6126234](https://news.ycombinator.com/item?id=6126234)
(sciencedaily.com)

[https://news.ycombinator.com/item?id=6129664](https://news.ycombinator.com/item?id=6129664)
(rdmag.com)

[https://news.ycombinator.com/item?id=6132772](https://news.ycombinator.com/item?id=6132772)
(ucla.edu)

You can read other submissions on HN about homomorphic encryption by following
this search:

[https://www.hnsearch.com/search#request/all&q=title%3A%28hom...](https://www.hnsearch.com/search#request/all&q=title%3A%28homomorphic+encryption%29&sortby=create_ts+desc&start=0)

There are lots of them, too.

~~~
andrewcooke
to save anyone else looking, all the reports seem to be generated from the
same press release - there's no significant new info in any (compared to the
link here) that i could see.

the paper is a monster :o(

------
huhtenberg
Reads like marketing junk to be honest. I don't doubt that they have
invented something interesting, but this is _not_ a way to announce it.

> _This is known in computer science as "software obfuscation," and it is the
> first time it has been accomplished._

No, of course not. See Fravia, see skype.exe.

> _" The real challenge and the great mystery in the field was: Can you
> actually take a piece of software and encrypt it but still have it be
> runnable, executable and fully functional"_

A mystery? Is this edited for an O magazine? Again, Fravia, Skype,
Carberp/bootkit.

> _According to Sahai, previously developed techniques for obfuscation
> presented only a "speed bump," forcing an attacker to spend some effort,
> perhaps a few days, trying to reverse-engineer the software._

Uhm, no? Again, see Skype that withstood reverse-engineering attempts for
several years with its incremental decrypting loader and other tricks that it
was stuffed with to the brim.

~~~
Nimi
They're talking about an obfuscation that's _provably_ strong - i.e. there is
_no_ way to de-obfuscate the code, not even in a million years.

You're talking about an obfuscation that's "good enough for the real world" -
i.e. it will take a very skilled person several years to de-obfuscate.

Also, their result is very theoretical, and too slow to be deployed in the real
world, at least for the next couple of years. See my other comment if you're
interested.

BTW - I think by "mystery" they meant "open computational problem". Kind of a
mystery if you think about it :-)

~~~
freyr
"there is no way to de-obfuscate the code, not even in a million years."

Perhaps you're just being hyperbolic, but can you qualify this statement? Is
it not possible in a million years _given today's technology_ (this seems
unlikely, even the marketing material suggested 100 years)?

Or are you claiming there is mathematical proof that advances in computational
power, a growing body of knowledge about de-obfuscation, etc., will not break
this approach?

~~~
Nimi
The latter - there is mathematical proof that advances in computational power,
a growing body of knowledge about de-obfuscation, etc., will not break this
approach.

Also, this is not "marketing material" - nobody is selling this technology,
and likely nobody will in the foreseeable future. They're just trying to
interest the general public in a major theoretical CS breakthrough.

~~~
freyr
I wasn't aware such a proof existed, that's very cool.

But it's certainly marketing material. They're selling UCLA's engineering
department to potential students, potential sponsors, and the general public.
That's not a bad thing. Universities would be stupid not to promote their
brand and their professors' achievements.

~~~
Nimi
Minor caveat: the proof says something along the lines of "to de-obfuscate
this code, you must be able to find the prime factors of insanely huge
numbers, and we assume no one knows how to do that". So it's a bit inaccurate
to say that "no advancement will solve this".

------
Nimi
My humble understanding is this is a very theoretical result, unlikely to
result in malware that can't be reverse-engineered, or improvements in the DRM
near you.

Here's a brief summary (obviously, I might be missing a lot of things):

1\. "They (researchers in 2001, some of whom are authors of this new paper)
showed that there exist unobfuscatable functions – a family of functions {f_s}
such that given any circuit that implements f_s, an efficient procedure can
extract the secret s; however, any efficient adversary given only black-box
access to f_s cannot guess even a single bit of s with non-negligible
advantage."

That result still holds - not every function can be obfuscated, and this is
proven.

2\. "indistinguishability obfuscation: An indistinguishability obfuscator iO
for a class of circuits C guarantees that given two _equivalent_ circuits C_1
and C_2 from the class, the two distributions of obfuscations iO(C_1) and
iO(C_2) should be computationally indistinguishable."

Note that this works _only for equivalent circuits_.

3\. "Using indistinguishability obfuscator for NC^1 together with any
(leveled) fully homomorphic encryption (FHE) scheme with decryption in NC^1
(e.g. [Gen09b, BV11, BGV12, Bra12, GSW13]), we show how to obtain an
indistinguishability obfuscator for all polynomial-size circuits".

Again, this is an indistinguishability obfuscator, which works only for
equivalent circuits. Also, FHE is _very_ slow nowadays; AFAIK there are no
actual deployments of that concept, because of the prohibitive slowness (e.g.
a single AES encryption taking days).

4\. "Using indistinguishability obfuscator for polynomial-size circuits,
together with injective one-way functions, public-key encryption, and a novel
variant of Sahai’s simulation-sound non-interactive zero knowledge [Sah99]
proofs, we show how to obtain functional encryption schemes supporting all
polynomial-size circuits."

This is awesome and sounds like it _can_ obfuscate malware or be used to make
actual DRM, but again, the indistinguishability obfuscator is likely so slow
as to not be practical these days. Maybe in a few decades?

Obviously I'm not writing this to take anything away from this huge
theoretical result - just saying this is likely not what other commenters
think it is. And again, my reading of this is very possibly inaccurate.

~~~
venomsnake
So their use of FHE is a bit like someone saying - we could colonize Jupiter
if we just get enough antimatter.

And how could that be used as DRM? This is the part I don't get.

~~~
Nimi
This isn't that far-fetched - I wouldn't bet my life that this technology will
still be undeployed in, say, 50 years. Academic research is about those
discoveries too - even everyday stuff that is now taken for granted, like
memory garbage collection, was considered impractical in the beginning.

Regarding DRM: "In Functional Encryption, ciphertexts encrypt inputs x and
keys are issued for strings y. The striking feature of this system is that
using the key SK_y to decrypt a ciphertext CT_x = Enc(x), yields the value
F(x,y) but does not reveal anything else about x".

You can take x to be the code of the computer game you wrote, and F to be a
code simulator. This sounds like the type of DRM game manufacturers want.

~~~
venomsnake
Once again I cannot see how it can be used as DRM - what prevents someone from
installing it in a VM and sharing it around, or writing a wrapper around the
executable that intercepts system calls and just gives it what it wants, etc.?

~~~
Nimi
This is too complicated to answer in a forum post.

The general idea is that no number appears in the code as-is, instead all
numbers appear encrypted. You have to have a way to do "encrypted
multiplication" - take two encrypted numbers, and get the encryption of their
product, without decrypting them in the process. The same goes for addition.
This is called fully homomorphic encryption (finally discovered several years
ago, by one of the authors of this paper). This paper builds upon that result.
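
To make "encrypted multiplication" concrete, here is a minimal sketch using textbook RSA, which happens to be homomorphic for multiplication only. This is not fully homomorphic encryption (there is no encrypted addition) and it is not the scheme used in the paper; the tiny primes, the lack of padding, and the names `encrypt` and `decrypt` are assumptions made purely for illustration.

```python
# Toy textbook RSA: tiny primes, no padding. Illustration only, NOT secure.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < n with the public key."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, d, n)


a, b = 7, 6

# Multiply the two ciphertexts without ever decrypting them.
c_product = (encrypt(a) * encrypt(b)) % n

# Only the key holder can check that this is an encryption of a * b.
assert decrypt(c_product) == (a * b) % n
```

A fully homomorphic scheme supports both operations on ciphertexts, which is what lets arbitrary circuits be evaluated on encrypted values.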

Edit: also, see here for a simpler technique that seems to work:

[https://news.ycombinator.com/item?id=6160742](https://news.ycombinator.com/item?id=6160742)

------
MarcScott
Maybe I'm being naive (so please correct me if I'm wrong), but isn't this a
dream-come-true for malware authors?

~~~
sarreph
How so? Would there be no way to 'fight fire with fire' against said malware,
though?

------
gizmo686
Amazing, an article actually links to a research paper! What the paper claims
is to have invented an indistinguishability obfuscator. This means that
for a program X, you can consider the set of all source codes (of equal size)
which generate a program with equivalent behavior to X. The obfuscator can be
used to draw a member of this set, without revealing anything about the
original source code. As the research paper mentions, although this meets the
technical definition of best possible obfuscation, it is not necessarily good
obfuscation. For example, if the obfuscator generates the most human-readable
version of the input, then it would still qualify, as it reveals nothing about
the original source.
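
A rough sketch of that point, under heavy assumptions: for tiny boolean circuits, an "obfuscator" that simply outputs the full truth table trivially satisfies indistinguishability, because equivalent circuits map to the identical output, yet it hides nothing about the function itself. This is not the paper's construction, it takes time exponential in the number of inputs, and the names `canonical_form`, `run`, `xor_a`, `xor_b` and `and_gate` are made up for this example.

```python
from itertools import product
from typing import Callable, Tuple


def canonical_form(circuit: Callable[..., bool], n_inputs: int) -> Tuple[bool, ...]:
    """Toy 'obfuscator': replace a circuit by its truth table.

    Equivalent circuits produce the same tuple, so the output reveals
    nothing about which equivalent implementation was the input.
    Cost is 2**n_inputs evaluations, so this only works for tiny circuits.
    """
    return tuple(
        bool(circuit(*bits)) for bits in product((False, True), repeat=n_inputs)
    )


def run(table: Tuple[bool, ...], *bits: bool) -> bool:
    """Evaluate the 'obfuscated' program by looking up the truth table."""
    index = 0
    for bit in bits:
        index = (index << 1) | int(bit)  # first input is the most significant bit
    return table[index]


# Two different implementations of the same function (XOR).
def xor_a(x: bool, y: bool) -> bool:
    return (x or y) and not (x and y)


def xor_b(x: bool, y: bool) -> bool:
    return x != y


# A different function, for contrast.
def and_gate(x: bool, y: bool) -> bool:
    return x and y


# Equivalent circuits become indistinguishable...
assert canonical_form(xor_a, 2) == canonical_form(xor_b, 2)
# ...but the function computed is still fully visible.
assert canonical_form(xor_a, 2) != canonical_form(and_gate, 2)
assert run(canonical_form(xor_a, 2), True, False) is True
```

The output tells an observer exactly what is computed; the only thing lost is which of the equivalent implementations was fed in.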

The bigger problem with using this to encrypt software is that software is
over-specified. Every external call your system makes, whether or not it _actually_
does anything, is still considered part of the behavior of the program, and
would therefore leak information about the structure of your program.

While this is definitely much stronger than many previous obfuscation
techniques, in order for it to be most effective you would need to keep all
side-effect-generating code separate and as isolated as possible.

EDIT: From the paper: "Now that we have constructed an indistinguishability
obfuscator, we are faced with the question: what good is an
indistinguishability obfuscator? The definition of indistinguishability
obfuscation does not make clear what, if anything, an indistinguishability
obfuscator actually hides about a circuit. In particular, if the circuit being
obfuscated was already in an obvious canonical form, then we know that the
indistinguishability obfuscator would not need to hide anything...we will
use indistinguishability obfuscation by constructing circuits that inherently
have multiple equivalent forms"

Also, the application to software obfuscation is largely an afterthought in
the paper. And I don't see any analysis of the efficiency of software
generated by this, so my guess would be that this is infeasible to use for
obfuscation.

------
tehwalrus
I quite like this - nothing is going to stop companies wanting to secure their
code on customers' machines, and this is at least an elegant way to do it.

I do see problems though, e.g. people trying to make malware could use this
very effectively: make a small pointless change, reencrypt, repeat, you then
have a pseudo-unique malware signature across as many PCs as you can reach, so
antivirus is useless.

~~~
VMG
You don't even need to change the source, you just use a different key.

~~~
VLM
A different salt might be another way to phrase it.

------
rainforest
My naive understanding of this approach is that there will be clear units that
do small bits of work. With a tracing framework and some analysis, like that
recently presented at BH [1], I wonder if blocks of code could be extracted to
remove some of the obfuscation - if the blocks are meaningful (or can be
simplified).

Does anyone have any ideas if this sort of analysis looks feasible in response
to this kind of obfuscation?

[1] : [PDF] - [https://media.blackhat.com/us-13/US-13-Raber-Virtual-Deobfus...](https://media.blackhat.com/us-13/US-13-Raber-Virtual-Deobfuscator-A-DARPA-Cyber-Fast-Track-Funded-Effort-Slides.pdf)

------
anologwintermut
The abstract of the paper is far more informative than the article.
Functional encryption allows one to decrypt a ciphertext c with a function
(there can be many) f and get f(c), without revealing anything else about c.

This was previously doable where f was public in a way that could be plausibly
efficient. This was an interesting result because it might lead to things like
efficient searchable encryption without using very slow fully homomorphic
encryption (FHE). A lot of these applications, however, required f to be
obfuscated.

This paper achieves hiding f, for a somewhat weak notion of obfuscation, and
more crucially, by using fully homomorphic encryption (FHE). Given that FHE is
effectively (very real but very inefficient) cryptographic pixie dust, it's
not too surprising you can do this. However, for a lot of the applications for
functional encryption, you could do it with FHE in other ways.

------
norswap
I have a hard time figuring this out. If the code executes, then it means the
processor reads instructions. Surely there is some way to access those
instructions?

~~~
tenfingers
The same results can be derived from different instructions. To simplify:
you can perform multiplication by repeated addition, or combinations of both,
or even more clever tricks which would hardly make the original intent of
simply performing a multiplication obvious.
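
As a small illustration of the idea (nothing here comes from the paper, and the function names are hypothetical), the three functions below return the same result for non-negative `b`, but only the first states its intent openly:

```python
def mul_direct(a: int, b: int) -> int:
    """The obvious implementation."""
    return a * b


def mul_repeated_add(a: int, b: int) -> int:
    """Multiply by adding a to itself b times (b must be non-negative)."""
    total = 0
    for _ in range(b):
        total += a
    return total


def mul_shift_add(a: int, b: int) -> int:
    """Multiply using only shifts and additions (b must be non-negative)."""
    total = 0
    while b > 0:
        if b & 1:        # lowest bit of b set: add the current multiple of a
            total += a
        a <<= 1          # double a
        b >>= 1          # move on to the next bit of b
    return total


# All three agree, even though the instruction streams look nothing alike.
assert mul_direct(12, 13) == mul_repeated_add(12, 13) == mul_shift_add(12, 13)
```

Real obfuscators go much further than this, but the principle is the same: many different instruction sequences compute the same function.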

Running the code in a simulator would give you the final output, but would do
very little to explain how the code itself is working, which is what you
usually are after.

~~~
norswap
Okay, so this is simply an advanced form of obfuscation. A hypothetical
uberman can still reverse engineer it. I don't expect anyone to, but the
article seems to claim that there is some mathematical impossibility to do so.

~~~
tjaerv
Read up on homomorphic encryption:
[http://en.wikipedia.org/wiki/Homomorphic_encryption](http://en.wikipedia.org/wiki/Homomorphic_encryption)

------
farseer
I wonder if this would hit software performance?

~~~
mistercow
It's really hard for me to imagine that it wouldn't. No matter how you look at
it, at the end of the day, it can't be putting the same machine code into
memory, or an attacker would just look at that and disassemble it.

But cryptography is full of things that I wouldn't have imagined, so maybe?

------
VMG
PDF:
[http://eprint.iacr.org/2013/451.pdf](http://eprint.iacr.org/2013/451.pdf)

------
6d0debc071
Did this just effectively mission-kill all copyrights on algorithms?

------
VLM
An entertaining way to spend time on HN articles about theoretical work might
be to provide a "practical" example.

My interpretation of the paper and the "state of the art" is that it was
previously possible to submit "bunchadata" "iamakey34335" and "clearance=T OR
NOT fired=F" (edited) to a decryption algo and your key and boolean stuff mix
together to decrypt the data. The point of the paper is it's now possible to
submit a more complicated program than just the boolean like
"clearance_level+seniority_years>0x10".

The entertaining part of the discussion would be if I got the basic concepts
right or wrong, not so entertaining to debate if the greater-than symbol was
proven or other tiny details like that.

------
pornel
Do I understand correctly that it allows the owner of the key to easily write
Malbolge[1]-like programs?

[1]
[http://en.wikipedia.org/wiki/Malbolge](http://en.wikipedia.org/wiki/Malbolge)

------
kunai
Yeah, well, OS X has been using this system to encrypt loginwindow,
Finder.app, Dock.app, and iTunes.app, so this doesn't seem like much to write
home about.

------
jonahx
does this mean that unbreakable DRM will now be possible?

~~~
pornel
Is there anything that prevents running such a program in a VM?

