

Surreptitiously Tampering with Computer Chips - timw6n
https://www.schneier.com/blog/archives/2013/09/surreptitiously.html

======
tptacek
This serves as your daily reminder that modern CSPRNG designs don't rely on a
single entropy source, but rather mix many sources with hashes or block
ciphers into entropy pools.

It would indeed have been bad if Linux (insanely) had converted to relying
entirely on RDRAND as a replacement for the devrandom kernel CSPRNG. But no OS
I know has done anything like that.

Even at 32 bits of entropy (which is an extremely unlikely conspiracy theory,
as Schneier says), mixing the output of RDRAND into the rest of the entropy
available to a kernel would still be a marginal improvement. For instance, it
would mitigate some concerns about cold-start entropy for solid-state machines
without real entropy sources.

The fact that OS CSPRNGs were unlikely ever to have trusted RDRAND directly
(even if wasn't backdoored, what if someone screwed up? That too is somewhat
unlikely, given that Paul Kocher's Cryptography Research reviewed the Intel
HWRNG. But still...), and the fact that it's such a high-profile place to put
a hardware backdoor, leads me to conclude that this is an unlikely vector for
hardware backdoors.

------
kwantam
Note that there _are_ tests that would catch this, but Intel specifically
chooses not to test the hardware RNG using them, according to the paper. (This
is ostensibly to guard against attacks on the RNG using scan mode. That could
very well be true; I don't have enough domain knowledge in designing secure
hardware to know whether this is a common practice or not.)

Detecting stuck-at faults is one of the main applications of scan testing[1].
Generally coverage isn't perfect, but by spending more time and making
particular implementation decisions one can make coverage pretty close to
arbitrarily good.

Moreover, I can always choose to randomize my scan chains if I think this is a
threat, and then an attacker will have to assume that he's only allowed to
modify gates that can't be covered by any possible scan chain. Remember, the
attacker has to modify the mask, so that means I'm making a large number of
chips that all have the same flaw. So if I detect stuck-at faults using
randomized scan chains on any wafer in a lot and I'm paranoid, I can re-run
the failing scan chain on other chips from the same lot to determine whether
it's systematic or random failure.

Lest you think this is an inconvenient test to perform, almost all digital
chips are scanned because it's one of the fastest ways to detect manufacturing
defects. Unlike rewiring a chip, generating stuck-at faults is a really bad
way of being sneaky if the target circuit is being scanned, because as I said
above that's a type of error against which scan is quite good.

[1]
[http://en.wikipedia.org/wiki/Scan_chain](http://en.wikipedia.org/wiki/Scan_chain)

~~~
archer174
This happens to be my area of expertise and is exactly true. Basic DFT would
prevent this from happening without a customer knowing. Also, anything below
65nm and the chip is almost certain to have stuck@ scan testing at the
minimum. The chip would be marked bad during production and tossed out. The
only way in todays market this could get through is if the design was
intentionally done this way, otherwise the test vectors would not match the
silicon. EDIT: After reading more about their decision to use LBIST for this
logic I could see a possible back door, but it does up the complexity of this
hack. In my opinion this is a very low probability attack vector. It would be
much easier to just pay off a disgruntled designer.

------
kens
The designers of the Z-80 did the same thing (modifying dopant levels so
transistors would look like normal transistors but function differently) in
order to avoid competitors from cloning the chip.

See the very interesting interview about the Z-80 (page 13):
[http://archive.computerhistory.org/resources/text/Oral_Histo...](http://archive.computerhistory.org/resources/text/Oral_History/Zilog_Z80/102658073.05.01.pdf)

Faggin (Zilog CEO): "Yes, we were concerned about others copying the Z80. So I
was trying to figure what we could do that that would be effective, and that’s
when I came across an idea that if we use the depletion load the mask that
doesn’t leave any trace, then I could create depletion load devices that look
like enhancement mode devices. And by doing that we could trick the customer
into believing that a certain logic was implemented, when it was not."

Shima (Z-80 designer): "I placed six traps for stopping the copy of the layout
by the copy maker. And one transistor was added to existing enhancement
transistors. And I added a transistor looks like an enhancement transistor.
But if transistors are set to be always on state by the ion implantations, it
has a drastic effect"

I think you could detect dopant tampering (and these traps) with scanning
capacitance microscopy, but I don't have any actual experience with that.

------
makomk
I remember seeing the original journal article a while ago on Slashdot
(somehow it never made it to the front page on HN). It's really ingenious -
even inspection under a microscope can't spot the difference between the
tampered hardware RNG and a honest implementation, it passes all the tests,
and yet it actually has so little entropy that essentially anyone can break
otherwise-secure crypto that uses it.

~~~
pavel_lishin
> yet it actually has so little entropy that essentially anyone can break
> otherwise-secure crypto that uses it.

Isn't this a testable thing?

~~~
makomk
Possibly, if you know what you're doing, but as someone in the comments at
Bruce Schneier's blog points out the changes are small enough that it's
feasible to only backdoor selected batches of chips (for example, just the
ones shipped to countries of interest to the US or companies like Google).

~~~
tlb
In fact, it can also be designed as a marginal circuit so that a small
fraction of the chips fail due to manufacturing variance. You can test for the
weakness and ship the bad chips to people whose privacy you don't respect.

~~~
oakwhiz
The circuit in question could also be enabled or disabled by using a high-
energy laser pulse to blow out a tiny fuse on the ICs as they sit on the
wafer, before sending the wafer out to be cut into individual ICs which are
then subsequently packaged into the seemingly ordinary chips that we are all
familiar with.

------
jstalin
I don't understand how an RNG can be compromised to the point that it's got
very little entropy, yet still be able to pass randomness tests. Can anyone
explain?

~~~
ReidZB
Randomness tests mean _extremely_ little for cryptographic PRNGs [1]. Passing
a statistical test basically just means that your RNG isn't horribly, awfully
broken. Such tests mean almost nothing in terms of cryptographic security.

A statistical test just determines if the output of a PRNG "looks random" from
a very, very narrow point of view, typically not algorithm-specific and often
predefined ahead of time. Cryptographic security, on the other hand, requires
defending against an intelligent adversary who actively seeks to subjugate and
compromise the unpredictability of the specific PRNG in question.

As an example, the Mersenne Twister passes many of the popular statistical
tests with flying colors, but learning 624 consecutive values from it allows
an attacker to reconstruct its internal state and predict all future values.
(That is, it's totally worthless when it comes to cryptographic security.)

[1]
[http://security.stackexchange.com/a/32416/1373](http://security.stackexchange.com/a/32416/1373)

------
PeterisP
Can modern crypto systems conveniently support manual, verifiable entropy?

What I mean - even if I don't trust any of the computers RNG's, if I want, I
can generate a 2048 bit random key manually with a piece of paper and a few
somewhat fair dice. Sure, it'll take time, but it's feasible.

~~~
acqq
4 dices thrown around 200 times are needed for 2048 bits.

~~~
auctiontheory
"Dice" is actually the plural form of "die."

~~~
acqq
I've learned something new. So, dice is the plural of die exactly like mice is
of... err... Anyway, I really didn't know about "die" and dice as singular and
plural, thanks! I've had to check the Oxford's dictionary.

------
scott_wilson46
Any modern ASIC being built would be tested using Automatic Test Pattern
Generation. This involves connecting all the registers in the chip into a
chain and feeding patterns in and out of the chains and stimulating the logic
between registers in a defined manner. These tests will at one level be
looking for Stuck and 0 and Stuck at 1 faults and typically nobody would tape
out a chip without >99% coverage. It would be very difficult to fiddle with
individual gates as the author describes without this being detected with an
ATPG test.

------
Qantourisc
There are tools to test the quality of your random generator. But I forgot the
name.

