
Packets of Death - quentusrex
http://blog.krisk.org/2013/02/packets-of-death.html
======
guylhem
_That_ is great HN content!

Debugging deep down the rabbit hole, until you find a bug in the NIC EEPROM -
and the disbelief many show when hearing a software message can bring down a
NIC.

I for one would enjoy reading more content like this on HN than what, at best,
qualifies as a Friday-night hack.

~~~
brazzy
> the disbelief many show when hearing a software message can bring down a
> NIC.

Shouldn't be a surprise to anyone. Firmware is just software, and it
necessarily deals with raw bytes. Not really surprising that it can contain
bugs that are triggered by certain byte patterns.

~~~
voidlogic
It would be interesting to see what the actual error in the NIC firmware
source was.

This invalidates my assumption that a shop like Intel probably uses formal
verification in firmware development.

It's also scary to consider how many very important (nuclear/damn control, etc.)
systems, while themselves perhaps formally verified, are dependent on services
of lower-level software (OS, drivers) and hardware (firmware) that are not...

~~~
fnordfnordfnord
Dam control, or damn control? I teach a control systems course, so I know
about both kinds.

PS You should be very frightened. The industrial control systems industry is
slow to adopt new tech/new best practices/anything else new, and even more
reluctant to touch anything that "works." Tons of vulnerable equipment
controlling dangerous things with default passwords (if any at all). I could
go on for days.

~~~
kevinherron
Too true... (I work at a SCADA/HMI software company :)

------
ChuckMcM
Makes me wonder if this is related to in-band management? One of the
interesting things about working at NetApp, which had its own "OS", was that
every driver was written by engineering. That allowed the full challenge of
some of these devices to be experienced first hand.

One of the more painful summers resulted from a QLogic HBA which sometimes,
for no apparent reason, injected a string of hex digits into the data it
transmitted. There is a commemorative t-shirt of that bug with just the string
of characters. It led NetApp to put in-block checksums into the file
system so that corruption between the disk and memory, which was 'self
inflicted' (and so passed various channel integrity checks) could be detected.

Here at Blekko we had a packet fragment that would simply vanish into the
center switch. It would go in and never come out. We never got a satisfactory
answer for that one. Keith, our chief architect, worked around it by
randomizing the packet on a retransmit request.

The amount of code between your data and you that you can't control is, sadly,
way larger than you probably would like.

------
jerdfelt
I ran into a similar problem with an Intel motherboard about 10 years ago.

We had problems when some NFS traffic would end up getting stalled. Our NFS
server would use UDP packets larger than the MTU and they would end up getting
fragmented.

Turns out the NIC would not look at the fragmentation headers of the IP packet
and always assume a UDP header was present. From time to time, the payload of
the NFS packet would have user data that matched the UDP port number the NIC
would scan for to determine if the packet should be forwarded to the BMC. This
motherboard had no BMC but it was configured as if it did have one.

It would time out after a second or so but in the meantime drop a bunch of
packets. The NFS server would retransmit the packet but since the payload
didn't change, the NIC would reliably drop the rest of the fragments of the
packet.

Of course Intel claimed it wasn't their bug ("it's a bug in the Linux NFS
implementation") but they quickly changed their tune when I coded up a sample
program that would send one packet a second and reliably cause the NIC to drop
99% of packets received.

While it turned out to be a fairly lame implementation problem on Intel's part
(both by ignoring the fragmentation headers and the poor implementation of the
motherboard) I have to say it was very satisfying to solve the mystery.
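
The essence of the failure mode can be sketched in a few lines (the port and
offsets here are illustrative: 623 is the standard RMCP/IPMI port a BMC filter
would typically watch, not necessarily the exact port this NIC scanned for):

```python
import struct

# Port a NIC-to-BMC filter would typically watch (623 = RMCP/IPMI);
# illustrative - not necessarily the exact port this NIC scanned for.
BMC_PORT = 623

def naive_bmc_match(fragment_payload: bytes) -> bool:
    """Mimic a NIC that blindly reads bytes 2-3 of every IP fragment's
    payload as a UDP destination port, ignoring the fragmentation
    headers that say only the *first* fragment carries a UDP header."""
    if len(fragment_payload) < 4:
        return False
    _src, dst = struct.unpack("!HH", fragment_payload[:4])
    return dst == BMC_PORT

# First fragment, genuine UDP header to port 2049 (NFS): no diversion.
first_frag = struct.pack("!HH", 50000, 2049) + b"nfs data"
print(naive_bmc_match(first_frag))   # False

# Later fragment whose user data happens to contain 0x026f (= 623)
# where a UDP header would sit: falsely diverted to the BMC.
later_frag = b"\x10\x00\x02\x6f" + b"more user data"
print(naive_bmc_match(later_frag))   # True
```

And since a retransmitted NFS fragment carries the same payload bytes, the
false match repeats every time, which is why the drops were so reliable.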

~~~
EvanAnderson
Reading about the OP's issue got me to a doc from Intel
([http://www.intel.com/content/dam/doc/application-
note/sideba...](http://www.intel.com/content/dam/doc/application-
note/sideband-technology-appl-note.pdf)) re: the "NC Sideband Interface",
which sounds like the place where the bug that bit you "lives". Reading over
that doc made me shudder a few times, thinking about the complexity and, thus,
potential bugs that could be lurking there. I wonder if the OP's bug was
related, too.

Having the NIC inspecting incoming frames and potentially diverting them to
the management controller sounds like a scary proposition. I'd almost rather
just have dedicated Ethernet hardware for the management controller. The
decrease in switch ports needed is certainly seductive, but I wonder if it's
worth the risk.

(Do you happen to recall which Intel motherboard this bit you on? I was just
getting out of whitebox Intel motherboard-based server builds about the time
you're describing, but I'm just curious if only for the nostalgia.)

~~~
jevinskie
"IPMI operates independently of the OS and allows administrators to manage a
system remotely even without an OS, system management software, and even if
the monitored system is powered off (as long as it is connected to a power
source). IPMI can also function after an OS has started, offering enhanced
features when used with system management software."

Yikes! Sounds like system management mode in a BIOS!

~~~
EvanAnderson
It's worse than that. It's not BIOS-- it's a freestanding computer.

You'll enjoy this (or be horrified by it): <http://fish2.com/ipmi/itrain.html>

~~~
rosser
But it's a freestanding computer that means I don't need to go to the data
center at two in the morning to bring up a box that has kernel panicked.

Yeah, be careful with it. Firewall it silly. But recognize that it's a tool
that can be very useful.

~~~
EvanAnderson
They're definitely useful tools-- don't get me wrong about that. The fear is
that they're controlled, essentially, by an in-band signaling mechanism.

Being grafted onto the same NICs on the server computer that you might
potentially expose to the Internet makes firewalling them a more difficult
proposition. I get a lot of peace of mind from having management interfaces on
an out-of-band control network whenever possible.

~~~
rosser
If you use your BMC's "pass-through" capability to push its traffic through
your regular NICs, you're Doing It Wrong. In fact, if you're not running a
separate physical network (not merely separate VLANs) that is, to the extent
possible, air-gapped from the rest of the world for your IPMI traffic, you're
probably still Doing It Wrong.

------
EvanAnderson
I've always had mixed emotions about NICs with hardware-assisted offload
features. I welcome the decrease in CPU utilization and the increased
throughput, but the NIC ends up being a complex system in which very subtle
bugs can lurk, versus a simple I/O device that a kernel driver controls.

If there's a denial of service hiding in there, I wonder what other
security bugs might be lurking. It's scary stuff, and pretty much impossible
to audit yourself.

Edit:

Also, I'm a little freaked-out that the EEPROM on the NIC can be modified
easily with ethtool. I would have hoped for some signature verification. I
guess I'm hoping for too much.

Edit 2:

I wonder if this isn't the same issue described here:
<https://bugzilla.redhat.com/show_bug.cgi?id=632650>

~~~
jevinskie
Be very afraid of PCI device firmware. You can insert rootkits there that have full
access to RAM. An IOMMU can mitigate this threat.

~~~
EvanAnderson
It sounds like, in this case, the OP is talking about the EEPROM holding code
executed by the embedded coprocessor on the NIC (or, at least, lookup tables
that the coprocessor uses) rather than a PCI option ROM that will be executed
by the host computer's CPU. Depending on how the access to the EEPROM is
performed (i.e. if such access is facilitated by the co-processor versus being
read out directly from the EEPROM) I'd think an attacker could even implement
"stealth" functionality to allow the compromised EEPROM to appear to be benign
when audited.

Depending on what functionality is being offloaded to the NIC (are there still
NICs that do IPSEC and crypto offload?) there's the possibility for
information disclosure vulnerabilities in the NIC itself. Yikes.

~~~
JoachimSchipper
> OP is talking about the EEPROM holding code executed by the embedded
> coprocessor on the NIC

Yes, but this coprocessor has access to the host's PCI bus. That is enough to
totally pwn the machine, since it gives read/write access to all memory (well,
all memory below 4GB, IIRC, but that's enough.)

------
wglb
Very good detective work. However, a small suggestion, given:

 _I’ve been working with networks for over 15 years and I’ve never seen
anything like this. I doubt I’ll ever see anything like it again._

This is an excellent case for fuzz testing. My thinking is that you want
to whip up your Ruby, get your EventMachine and Redis going, and run a constant
fuzz with all sorts of packets in your pre-shipping lab.

The idea is that you _want_ to create the conditions where you do see it, along
with the handful of other lockups that are there that you haven't yet seen.
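
To be concrete, the frame source for such a fuzzer can be tiny (this is a
sketch of the generator only; actually transmitting raw frames needs root and a
raw socket, and belongs on an isolated lab segment):

```python
import os
import struct

def random_frame(payload_len: int = 1500) -> bytes:
    """Build one raw Ethernet frame with a fully random payload.
    Every payload offset gets a uniformly random byte, so a single
    frame probes all byte positions at once for single-byte triggers."""
    dst = b"\xff" * 6                        # broadcast - lab use only!
    src = os.urandom(6)
    ethertype = struct.pack("!H", 0x88B5)    # IEEE local experimental
    return dst + src + ethertype + os.urandom(payload_len)

frame = random_frame()
print(len(frame))   # 1514 bytes - comfortably covers offset 0x47f (1151)
```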

~~~
Jabbles
Surely that's the manufacturer's job.

Since it's caused by a specific byte at a specific place, surely you'd only
need to fuzz an average of 256 packets (of the required length) to find it...
which suggests it wasn't done at all... zero...

~~~
arnsholt
That's assuming you know the magical position. If you need to test all
positions, it's 256 to the power of the number of bytes in the message.

~~~
DuskStar
But couldn't you test all the positions at once, by having the entire packet
be random data? Unless the packet has to be a particular length for this to
happen, which I didn't notice in the article.
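
Back-of-the-envelope, assuming a single-byte trigger at one unknown position: a
fully random packet gives every position an independent 1/256 chance of holding
the magic value, so you don't need to know the position at all:

```python
# Chance that a given position holds one specific byte value: 1/256.
# With fully random packets, 256 tries gives roughly a 63% chance of
# having hit the trigger value at the (unknown) vulnerable position.
p_miss = 255 / 256
n_packets = 256
p_hit = 1 - p_miss ** n_packets
print(round(p_hit, 3))   # 0.633 - about 1 - 1/e, as expected
```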

------
TapaJob
Fantastic article, fantastic find. Well done.

As a telecoms engineer predominantly selling Asterisk for the last 4 years,
with Asterisk experience extending back to 2006, it's shocking to see this
finally put right. For so many years, I have avoided the e1000 Intel
controllers after a very public/embarrassing situation when a conferencing
server behaved in a weird manner, disrupting core services. Not having the
expertise the author has, I narrowed it down to the Ethernet controller,
immediately replaced the server with IBM hardware with a Broadcom chipset, and
resumed our services providing conferencing to some of the top FTSE 100
companies.

Following this episode, I spent numerous days diagnosing the chipset, with many
conference calls with Digium engineers debugging the server remotely. In the
end there was no solution, just a recommendation to avoid the e1000 chipset,
and we moved on.

~~~
TapaJob
brings back memories....

<http://lists.debian.org/debian-isp/2009/06/msg00018.html>

------
engtech
As someone who works with FPGAs/ASICs, this isn't that weird.

Everything gets serialized/deserialized these days, so there's all kinds of
boundary conditions where you can flip just the right bit and get the data to
be deserialized the wrong way.

What's more interesting is that it bypasses all of the checks to prevent this
from happening.

Here is the wiki page on the INVITE OF DEATH which sounds like the problem you
hit:

<http://en.wikipedia.org/wiki/INVITE_of_Death>

~~~
huhtenberg
> _Everything gets serialized/deserialized these days, ... and get the data to
> be deserialized the wrong way._

Can you elaborate? I recognize the words, but not the meaning.

~~~
bigiain
Anybody else waiting for him to reply with something like:

"Oh yeah, I used to work at Intel - that nic's got a YAML parser in it"…

------
jacquesm
Persistent bugger.

"With a modified HTTP server configured to generate the data at byte value
(based on headers, host, etc) you could easily configure an HTTP 200 response
to contain the packet of death - and kill client machines behind firewalls!"

That's worrisome, I'll bet there are lots of not-so-nice guys trying to figure
out a way to do just that. There must be tons of server hardware out there
with these cards in them.
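
How easy is "easily"? Just arithmetic, assuming a plain Ethernet II + IPv4 +
TCP encapsulation with no IP or TCP options (real connections often carry TCP
options, which would shift this):

```python
ETH_HDR = 14   # Ethernet II header
IP_HDR  = 20   # IPv4 header, no options (assumption)
TCP_HDR = 20   # TCP header, no options (assumption; timestamps add 12)

TRIGGER_OFFSET = 0x47F   # 1151: frame offset of the byte of death

# Offset within the TCP payload (HTTP headers + body) the server
# must control to land its byte at frame offset 0x47f.
payload_offset = TRIGGER_OFFSET - (ETH_HDR + IP_HDR + TCP_HDR)
print(payload_offset)    # 1097
```

So a malicious server only has to pad its response so that byte 1097 of that
segment carries the trigger value, which is presumably why the author calls it
easy.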

~~~
thechut
I read the whole thing and that is the line that stuck out most to me. This
could be very scary. It could be used to bring down a webserver.

~~~
EvanAnderson
It sounds like the vulnerability could be used to bring down any machine you
can send an arbitrary Ethernet frame to. (I immediately wonder if it works for
broadcast frames? Sounds like a way to take down a LAN full of machines
quickly if it does.)

Edit: Per <http://www.kriskinc.com/intel-pod> it does work on broadcast
frames. Yikes!

~~~
samstave
Heh, yeah and he pretty much gives all the info on how to create the attack.

The test would be to find out how common that particular NIC is out there -
and grab a few and test out his method.

Looks like it would be fairly trivial to set up duplication given the author
did all the heavy lifting in finding this. Putting a small AWS bot-net up that
just sweeps massive IP blocks would be easy - heck you could do it really
easily from a single machine, it would seem.

If you can take out a machine with a single packet....

~~~
jacquesm
Just to be on the safe side I just checked all my machines; the majority have
Broadcom cards, and the three that have Intel cards are all another type. I
think I'll sleep soundly tonight, but I felt compelled to check.

~~~
samstave
That was a very responsible thing to do. :)

~~~
jacquesm
Professional paranoia. Stuff like this is really no fun at all.

off-topic:

A long time ago we found that a certain ping packet would be dropped about 30%
of the time, which in turn triggered a monitoring system to register 'server
down' when enough packets in a row were missed.

This would happen about once every day or so, leading to an operator being
paged (usually at 3am). Very annoying problem and incredibly hard to debug.
We'd replaced just about every piece of hardware except for a stupid little
T-connector. My buddy Jasper and I looked at it and we both more or less at
the same time said 'it can't be'. We swapped out the T-connector, problems
solved.

It took the better part of a day to nail that one, I still remember the
hostname (chopper) of the SGI box that the thing was connected to (SGI
Challenge, an Indy sold as a server with one of those silly thinnet adapters
dangling off the back, even though it had a UTP connector too).

Some bugs... I can't say I'm mourning the demise of coaxial ethernet and the
bus topology.

~~~
samstave
Heh, I have had experiences like that with T-connectors.

Heck, even just a few months ago I spent 2 hours troubleshooting a 10G fiber
connection on all brand new gear before swapping out the brand-new Cisco 10G
SR SFP module, which was DOA.

Thinking the same exact thing after swapping out everything else, including
all patch cords on both ends: "It can't possibly be this SFP module."

Yup.

------
cheeseprocedure
I've been unable to reproduce this on systems equipped with the controller in
question. I'd love to see "ethtool -e ethX" output for a NIC confirmed to be
vulnerable.

/edit Ah, I spoke too soon; the author has updated his page here with diffs
between affected and unaffected EEPROMs:

<http://www.kriskinc.com/intel-pod>

------
lifeisstillgood
Can anyone remember the source of the quote:

    Sometimes bug fixing simply takes two people to lock themselves
    in a room and nearly kill themselves for two days.

Reminded me of this

------
0x0
So is it only the byte at 0x47f that matters? Could you just send a packet
filled with 0x32 0x32 0x32 0x32 0x32 to trigger this? (Like, download a file
full of 0x32s?) Or does it have to look like a SIP packet?

You'd think the odds of getting a packet with 0x32 in position 0x47f are almost
1/256 per packet? So why aren't these network cards falling over everywhere
every few seconds?

~~~
wvenable
Probably because there is a 2/256 chance of getting sent the inoculation
value. But it's a good question.

~~~
caf
Later in the article it states that any value other than 0x31, 0x32 or 0x33
acts as an "inoculation value", so that would be a 253/256 chance for each
packet of at least 1151 bytes.

~~~
jes5199
is 1151 an unusually large packet? Because otherwise how are these NICs not
getting inoculated as soon as they're online?

~~~
caf
It depends on what the server is doing - packets that size wouldn't be unusual
in a bulk data transfer (like an HTTP response), but they're larger than you'd
see in a typical DNS query/response.

------
elasticdog
Before actually testing this with the real payload, is there a better way of
determining if you have a potentially vulnerable driver than something like
this?

    # awk '/eth/ { print $1 }' <(ifconfig -a) | cut -d':' -f1 | uniq | while read interface; do echo -n "$interface "; ethtool -i $interface | grep driver; done
    eth0 driver: e1000e
    eth1 driver: e1000e

~~~
minaguib
This is not about the particular Linux driver, but about a particular chipset,
and even then, only sometimes...

The Linux e1000e driver supports many chipsets, so the fact that it's in
service on your box doesn't necessarily mean you're running the suspect
chipset, or that it's vulnerable.

Check with lspci -v, and confirm with the concrete test (cold boot + magic
packet) that others and the OP have posted.

------
quentusrex
Updated with more specific info: <http://www.kriskinc.com/intel-pod>

------
drucken
Intriguing.

Intel 82574L ethernet controller looks to be popular too. Intel, Supermicro,
Tyan and Asus use it on multiple current motherboards and Asus notably on
their WS (Workstation) variants of consumer motherboards, e.g. the Asus P8Z77
WS (socket LGA 1155) and Asus Z9PE-D8 WS (dual CPU, socket LGA 2011).

~~~
dfox
It's quite popular because, while it has a large number of weird quirks
(usually specific to silicon revision/configuration), it still works, and in
many cases works better than other comparable chipsets.

------
shawndumas
<http://computer.yourdictionary.com/truck-roll>

------
sc68cal
I'm not surprised - firmware for Ethernet controllers has grown quite complex
with the addition of new features that allow the hardware to do more work on
behalf of the kernel.

Could this be a bug in the code of the EEPROM that handles TCP offloading, or
one of the other hardware features that are now becoming more common?
(<https://en.wikipedia.org/wiki/TCP_offload_engine>)

------
devicenull
Wow, I've run into what seems to be the same problem with this controller
before. We "fixed" it by upgrading the e1000 driver.

------
corford
My servers all have the affected cards (two per machine - yikes!) but so far I
can't reproduce the bug (yay).

There are subtle differences between the offsets I get when I run "ethtool -e
interface" versus those in the article that indicate an affected card (but
they're quite close).

Mine are:

0x0010: ff ff ff ff 6b 02 69 83 43 10 d3 10 ff ff 58 a5

0x0030: c9 6c 50 31 3e 07 0b 46 84 2d 40 01 00 f0 06 07

0x0060: 00 01 00 40 48 13 13 40 ff ff ff ff ff ff ff ff

Output of "ethtool -i interface" (in case anyone wants to compare notes):

driver: e1000e version: 1.5.1-k firmware-version: 1.8-0

I tested both packet replays by broadcasting to all attached devices on a
simple Gbit switch and no links dropped.
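
For anyone else comparing dumps by hand, here's a quick way to index into that
hex output (note: reading the `d3 10` at offset 0x1a as the little-endian
82574L PCI device ID 0x10d3 is my own interpretation of the dump, not something
from the datasheet):

```python
# Parse "ethtool -e" style hexdump lines into an offset -> byte map.
dump = """\
0x0010: ff ff ff ff 6b 02 69 83 43 10 d3 10 ff ff 58 a5
0x0030: c9 6c 50 31 3e 07 0b 46 84 2d 40 01 00 f0 06 07
0x0060: 00 01 00 40 48 13 13 40 ff ff ff ff ff ff ff ff"""

eeprom = {}
for line in dump.splitlines():
    off_s, bytes_s = line.split(":")
    base = int(off_s, 16)
    for i, b in enumerate(bytes_s.split()):
        eeprom[base + i] = int(b, 16)

# The 16-bit little-endian word at 0x1a matches the 82574L's PCI
# device ID (8086:10d3) - my assumption about what this field is.
word = eeprom[0x1B] << 8 | eeprom[0x1A]
print(hex(word))   # 0x10d3
```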

~~~
mrb
You need to shut down, boot up the server, and do a test right away. The very
_first_ packet of 1152 bytes or more that it receives after a cold boot
determines if the NIC is going to be affected or "inoculated" (until next cold
boot).

~~~
corford
Thanks mrb, I missed that a cold power-up was needed. I'm going to try again
now, but it's a bit tricky as the affected machines are in a different country
and I don't have access to full remote power cycling (I can only reset the
machines). Hopefully the data centre staff will be accommodating (after all,
if my machines are affected, hundreds of their other clients likely are too,
as I'm using dedicated servers provided by them).

EDIT: it's difficult to tell definitively when doing it remotely, but I still
can't reproduce the bug after a cold boot.

------
grego
I had something similar in my home network, but my network-fu is not good
enough and I did not have time to debug for days and weeks.

Basically one linux box with NVidia embedded gigabit controller could take
down the whole segment. It would only happen after a random period, like after
days when the box was busy. No two machines connected to the same switch would
be able to ping each other any more after that. I suspected the switch, bad
cables, etc. In the end I successfully circumvented the problem by buying a
discrete gigabit ethernet card for the server in question.

------
noonespecial
Kielhofner is a pretty awesome guy. I met him a couple of times "back in the
day" at Astricon conferences when he was hacking together Astlinux.

He was instrumental in taming the Soekris and Alix SBC boards of old and
creating Asterisk appliances with them. If you've got a little asterisk box
running on some embedded looking hardware somewhere, it doesn't matter whose
name is on the sticker, its got some Kielhofner in it.

I live about a mile from Star2Star. I ought to pop in one of these days and
see what they're up to.

------
astangl
This seems much more serious than the much-ballyhooed Pentium FDIV bug.
Hopefully Intel will be on the ball with notifying people and distributing the
fix.

------
lukego
Cool!

I'm currently working on an open source project where we are chasing "hang
really hard and need a reboot to come back" issues with _exactly_ this same
ethernet controller, the Intel 82574L. I wonder if it's related!

Our Github issue: <https://github.com/SnabbCo/snabbswitch/issues/39>

------
jws
Well this hurts. I have a critical machine with a dual NIC Intel motherboard.
I had to abandon the 82579LM port because of unresolved bugs in the Linux
drivers, and the other one is a 82574L, the one documented in this post.

I suppose I can send just the right ICMP echo packet to the router to make it
send me back an inoculating frame.

~~~
cdvonstinkpot
Good luck

------
altcognito
<http://en.wikipedia.org/wiki/Ping_of_death>

~~~
huhtenberg
That was an OS-level bug, it's far less exciting.

~~~
altcognito
I'll agree it's more interesting in that the end result was a box that
required a hard boot, but still, these two issues aren't that distantly
related: it affected routers and many, many OS platforms, so it's not as if it
was related to some implementation detail that MS left out of Windows.

Correct me if I'm wrong (no, seriously) -- aren't both "packets of death" just
poor handling of said malformed packets? Violations of their respective
protocols? (TCP/SIP)

~~~
eps
You probably want to re-read the linked article a bit more closely.

------
sriramnrn
Reminds me of my own adventures with systems hanging on PXE boot when a
Symantec Ghost PreOS Image didn't boot up completely, and went on to flood the
network with packets. See <http://dynamicproxy.livejournal.com/46862.html>

------
spitfire
This somehow reminds me of the SQL Slammer worm. A single, simply formed
packet caused a tsunami across the Internet.

Personally, I am not at all surprised that this sort of thing exists. I'm sure
there are lots more defects out there to be found. Turing completeness is a
cruel master.

------
meshko
I have mixed feelings about the write-up. I think it becomes clear pretty
early on that the issue is in the NIC hardware, at which point it is time to
stop wasting your time investigating a problem you can't fix and start
contacting the vendor.

~~~
jerdfelt
In my experience dealing with a similar bug (see my other post in the thread),
the vendors will immediately assume it's not their problem.

They spent a long time "showing" us that a different version of the Linux
kernel didn't exhibit the problem so it must be a Linux kernel bug. Turned out
the different version just sent data differently so it didn't trigger the same
bug with the same data. Other data would have triggered it.

I wouldn't be surprised if the majority of "bugs" they receive reports on turn
out not to be bugs in their hardware. There are probably parallels with
reports of compiler bugs; most end up not being bugs in the compiler.

The unfortunate truth is that the responsibility of proving it's the vendor's
bug falls on the customer.

I had to write a proof-of-concept "exploit" to show the problem was with their
hardware, effectively troubleshooting most of the problem for them.

~~~
homosaur
THIS.

It's always someone else's testing procedures, someone else's hardware... The
thing is, though, most of the time it is. Tech support at the lower levels
especially is used to dealing with people who have bad configurations or are
using the products incorrectly. The annoyance comes when you as a customer
narrow a problem down but can't get anyone on the phone who can help you at
that level.

------
viraptor
It's like a reverse example of a broken packet... You can see a number of
interesting samples and stories in the museum of broken packets:
<http://lcamtuf.coredump.cx/mobp/>

------
X4
Congrats, sir, you've just discovered the Internet kill switch!

The "red telephone" used to shut down the entire Internet comes to mind.

You've discovered how to immunize friends and kill enemies in cyberwars.

Do governments have an Internet kill switch?

Yes; Egypt and Syria are good examples. We know China is engaged in cyberwar;
they are beyond kill switches.

Techcrunch: [http://techcrunch.com/2011/03/06/in-search-of-the-
internet-k...](http://techcrunch.com/2011/03/06/in-search-of-the-internet-
kill-switch/)

Wiki: <http://en.wikipedia.org/wiki/Internet_kill_switch>

We know governments deploy hardware that they can control when needed.
Smartphones are the best examples of government-issued backdoors, next to some
Intel hardware (including NICs).

------
Garbage
The author mentioned a custom packet generator tool, "Ostinato". I met the
author of this tool 2-3 months back. A lone guy working on it as a side
project. Amazing work. :)

------
quentusrex
It appears to work if you send the packet to the network broadcast address.
Quick way to detect if any of the machines are vulnerable (they won't respond
to the second ping).

~~~
baq
you conveniently omitted the part in which you walk to the racks and reboot
them all.

~~~
wglb
Right! Except it sounds worse--sounded like you needed to cycle power to bring
them back.

~~~
caf
Luckily there's usually a convenient switch for power-cycling a whole suite of
racks at once ;)

------
anabis
Great diligence! I had 1G hubs lock up with an Intel 82578DM. I was too lazy
to track it down, so I just dropped the speed to 100M, which made it work.

