
Filter all ICMP and watch the world burn - jonchang
https://rachelbythebay.com/w/2015/05/15/pmtud/
======
contingencies
For those who are new to this issue: ICMP is the 'signalling' protocol for IP
itself. ICMP provides a few services (eg. classic ping), but critically, when
issues occur within the IP layer elsewhere on the internet while a packet is
being delivered, ICMP messages are usually sent in response. However, for
roughly the last 15-20 years, badly configured firewalls have blocked all ICMP
or certain types of ICMP, which can cause communication difficulties. In this
case, type 3 code 4 ("fragmentation needed but DF set") was blocked. More info @
[http://en.wikipedia.org/wiki/Internet_Control_Message_Protoc...](http://en.wikipedia.org/wiki/Internet_Control_Message_Protocol)
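
For illustration, the "fragmentation needed" message is an ICMP type 3
(destination unreachable), code 4 packet whose header carries the next-hop MTU.
A minimal Python sketch of recognizing one (field layout per RFC 792/RFC 1191;
the example bytes are hand-built, not a captured packet):

```python
import struct

def parse_icmp_frag_needed(icmp: bytes):
    """Return the next-hop MTU if this ICMP message is type 3 code 4
    ("fragmentation needed but DF set"), otherwise None."""
    # ICMP header: type (1 byte), code (1 byte), checksum (2 bytes),
    # unused (2 bytes), next-hop MTU (2 bytes) -- per RFC 1191.
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type == 3 and code == 4:
        return mtu
    return None

# A hand-built example reporting a next-hop MTU of 1492 (a PPPoE link).
msg = struct.pack("!BBHHH", 3, 4, 0, 0, 1492)
parse_icmp_frag_needed(msg)  # -> 1492
```

When a firewall drops these messages, the sender never learns the smaller MTU
and keeps retransmitting packets that can never fit, which is exactly the
failure mode the article describes.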

~~~
avodonosov
Thanks. Why does the source host set the Don't Fragment flag?

~~~
contingencies
According to the original IPv4 RFC @
[http://tools.ietf.org/html/rfc791](http://tools.ietf.org/html/rfc791) page
#25...

 _If the Don't Fragment flag (DF) bit is set, then internet fragmentation of
this datagram is NOT permitted, although it may be discarded. This can be used
to prohibit fragmentation in cases where the receiving host does not have
sufficient resources to reassemble internet fragments.

One example of use of the Don't Fragment feature is to down line load a small
host. A small host could have a boot strap program that accepts a datagram
stores it in memory and then executes it._

What it seems to mean is things like PXE[1] or BOOTP[2].

Basically, DF was built to allow the sender to optionally force zero
fragmentation by intermediate hosts en route that are connected to networks
with a smaller MTU than the packet size originally emitted by the sender. This
was _originally_ intended to be of use because the sender was somehow made
aware of limitations in the recipient's network stack.

 _Probably_ 20 or more years ago it was used for a while as a latency-related
hack for certain applications (VOIP, video, low-latency finance, certain
scientific experiments generating vast amounts of data, etc.) mostly on
UDP[3], though we have better methods for those now that operate through other
IP headers (QoS).

Theoretically, intermediate nodes could also use it as part of path selection
during routing, though I have no idea if this has been done or is encouraged -
eg. a packet from node A reaches node B en-route to node C. Node B has two
routes to node C. The lower-cost route is available with a smaller MTU than
the packet size, and a higher-cost route is available with a large enough MTU
to accommodate the packet size. The DF flag could be used by routing logic at
node B to automatically shuffle the packet across that higher-cost route.

I believe the _Path MTU discovery_ [4] feature of modern Linux kernels uses
this mechanism to optimize long-lived traffic flows: packets are sent with DF
set, and the ICMP "fragmentation needed" errors returned by intermediate
routers are used to shrink the packet size until it fits the path. (Short
TTLs, ie. maximum hop counts, similarly cause ICMP errors to be returned when
they expire, and are the basis of _traceroute_ [5], itself an unintended hack
built on capabilities of the IP protocol's specification.)

There must be other edge-case or optimization-related wishy-washy reasons to
set it, too. The main thing to remember is: none of these were really intended
by the authors of IPv4. This whole historic pile of what-if edge-case
hackyness has been thrown out in favor of a better system in IPv6[6].

[1]
[http://en.wikipedia.org/wiki/Preboot_Execution_Environment](http://en.wikipedia.org/wiki/Preboot_Execution_Environment)

[2]
[http://en.wikipedia.org/wiki/Bootstrap_Protocol](http://en.wikipedia.org/wiki/Bootstrap_Protocol)

[3]
[http://en.wikipedia.org/wiki/User_Datagram_Protocol](http://en.wikipedia.org/wiki/User_Datagram_Protocol)

[4]
[http://en.wikipedia.org/wiki/Path_MTU_Discovery](http://en.wikipedia.org/wiki/Path_MTU_Discovery)

[5]
[http://en.wikipedia.org/wiki/Traceroute](http://en.wikipedia.org/wiki/Traceroute)

[6]
[http://en.wikipedia.org/wiki/IPv6#Simplified_processing_by_r...](http://en.wikipedia.org/wiki/IPv6#Simplified_processing_by_routers)

~~~
beneater
The reason is that you never want routers to have to fragment your packets,
ever. Fragmentation is really inefficient. So any modern stack will always set
DF and listen for ICMP unreachables. In other words, PMTU discovery.
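
As a rough, Linux-specific sketch of what "set DF and listen" looks like from
userspace (the socket option values are Linux's, copied from the kernel
headers; the loopback address and discard port 9 are just placeholders):

```python
import socket

# Linux-specific socket option values (from the kernel's <linux/in.h>);
# these numbers are not portable to other operating systems.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DO = 2   # always set DF; never fragment locally
IP_MTU = 14          # read back the kernel's cached path MTU

def cached_path_mtu(host: str, port: int = 9) -> int:
    """Connect a UDP socket with DF forced on, and return the path MTU
    the kernel currently has cached for that destination."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
        s.connect((host, port))  # UDP "connect" just fixes the peer address
        return s.getsockopt(socket.IPPROTO_IP, IP_MTU)
    finally:
        s.close()
```

With `IP_PMTUDISC_DO` set, an oversized send fails with `EMSGSIZE` once the
kernel learns (via the ICMP errors) that the path can't carry it, which is how
applications participate in PMTU discovery.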

None of the other reasons you mention are really relevant.

~~~
avodonosov
So, the sender uses Don't Fragment as a tool to detect the optimal packet size.

Alternative idea: when a router needs to fragment a packet, it passes the
fragments through, but also sends an ICMP message to the source:
"fragmentation happened". (Or the party which reassembles the fragmented
packet, maybe the receiver, sends this "fragmentation happened" message.)

So the IP packet does not need to be resent, the source can optimize its
packet size for the future, and the network doesn't break if ICMP is disabled
somewhere.

We don't want to constantly send these "fragmentation happened" ICMP messages
(if they don't reach the source and it keeps sending large packets), so the
router sends the ICMP not for every packet, but only for the first 3
fragmented packets from that source in each 10 minutes.

[I am just thinking, it's not a real proposal.]
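
The throttle in that thought experiment ("only the first 3 notifications per
source each 10 minutes", numbers taken from the comment above) could be
sketched as a per-source sliding window:

```python
import time

class FragNoticeLimiter:
    """Allow at most `limit` "fragmentation happened" notices per source
    address within any `window`-second period (a toy model of the idea)."""

    def __init__(self, limit: int = 3, window: float = 600.0):
        self.limit = limit
        self.window = window
        self.sent = {}  # source address -> timestamps of recent notices

    def should_notify(self, src: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Keep only timestamps still inside the window.
        recent = [t for t in self.sent.get(src, []) if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self.sent[src] = recent
        return allowed
```

A router using something like this would stop wasting cycles on senders whose
ICMP path is black-holed, while still giving well-behaved senders the signal.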

------
Twirrim
Unfortunately, almost every time I've dealt with PCI-DSS compliance auditors,
they raise the fact that I haven't got ICMP completely filtered.

It's always an annoying, long argument with them about why ICMP exists, why it
shouldn't be completely filtered, and what the potential side effects of
filtering it are.

~~~
vidarh
Not PCI-DSS, but I dealt with security auditors for a client that insisted
that we shouldn't allow ping... To the address of the public website...
Because we might reveal there was something there.

My passive aggressive response was to point out, while copying the client,
that a number of more prominent security auditors sites responded to ping, as
well as the websites of any number of intelligence organizations, banks and
similar.

~~~
MichaelGG
Just had an audit with one of the biggest telcos, done by a large "security"
firm. They insisted that the public website not respond unless the right Host
header was there. Stupid, but OK, I can see it on a checklist for intranet
apps.

But the real kicker: The site was TLS only, so connecting to the IP will still
leak the hostname, from the cert.

Edit: This was a really big security firm, too. Totally worthless audit. They
actually complained that a site admin could "include iframes in the HTML,
which could be a malware vector" when uploading content. Ignoring that they
could also upload scripts and arbitrary binaries.

After their weeklong "penetration test" concluded, I found some serious XSS
(public user->admin, which could easily turn into system takeover) with about
5 minutes of looking. Are most audits this useless?

~~~
strayptr
_After their weeklong "penetration test" concluded, I found some serious XSS
(public user->admin, which could easily turn into system takeover) with about
5 minutes of looking. Are most audits this useless?_

Hiya! I started at Matasano/NCC back in February. Part of the reason I joined
was to find out whether or not everything tptacek has been saying for years is
true. Turns out it's pretty much all true. Some of that is awesome, like the
hiring process. Some of that is scary, like the fact that someone of moderate
skill level can usually break into most production apps.

My experience is limited. That said, put me on an audit and the first thing
I'll check for is XSS and SQLi. The second thing I'll check for is authz: log
into an admin account, note a URL to perform an admin action, log in as a
normal user, try to access that URL. Third thing I'll check for is if there
are any upload forms, because that's a common way to get RCE: upload a file
and try to trick the app into executing it. Etc. Stuff that matters.

It's a point of pride to ensure that our assigned app has been pentested
thoroughly by the conclusion of an audit. A thorough pentest doesn't
necessarily mean finding every possible vulnerability, because time is often
limited, but it does mean finding the serious ones.

If we include any findings in the final report which could be called "trivial"
(there are occasionally some), they're marked as informational findings, i.e.
their severity level is less than low. The reason we include them is because
even though the finding doesn't necessarily pose any security risk, a client
will often get another pentest from another security firm and diff the
results. If the other firm points out something we thought was too trivial to
include, the client will rightfully ask why we didn't find it. (We try to be
pretty clear in the report about each finding, though, so you're not going to
come away with the impression that we're saying you need to address something
trivial.)

I don't know enough about your experience with that security firm you worked
with to comment directly, but communication is one of the most important
aspects of the job. If we find some flaws but don't communicate well to the
client, then nobody was served by the audit. So if you're feeling like the
whole process was a waste of time, you might want to shop around. There are
several good security firms, not just Matasano/NCC, so you may want to give it
another shot.

For what it's worth, the fact that you had a bad experience with one of the
firms is actually painful to me. It's only recently that people started to
care about security in a significant way, and it's a tenuous position. The
more people who get a "security audit" and end up feeling like it was a waste
of time, the more likely we are to end up back in a situation where people
know there are probably serious security problems but feel like there's
nothing they can do to find or fix them. There is: Give us a test environment
and two weeks. We'll find what matters, and we'll give you a report explaining
each issue and how to fix them.

~~~
smu
Thank you for your comment! I want to add to it as a former pentester: it's
absolutely painful to read about these nitpicky "the world is going to burn if
you don't modify trivial security setting X that will destroy user
experience". Not because these shouldn't be included in the report, but
because the focus is wrong.

Like strayptr, I would also include the trivial issues as "informationals" in
the report as you do want your clients to know about these for a number of
reasons. However, most of my time would go to hunting for severe issues,
where I defined "severe" using a mental ranking based on difficulty to
exploit, potential impact, and so on. In addition, these issues were also
where most of my attention went afterwards, because you need to explain and
educate development, testing and business on the issues, why you think they
are important, and how to best and most quickly fix them.

In my opinion, building up relationships and having empathy for your client is
very important. I would always try to have a chat with
development/test/business to get a feel of where their heads were at. That
would help me both while testing (what is important to them? how did they
develop it? what is their maturity?) and while reporting issues (they would
actually believe me, I could help them rank the issues and they would allow me
to brainstorm how to best mitigate the issues for their environment).

------
benjojo12
We got hit massively by this at CloudFlare (we were not explicitly filtering
ICMP, but changes we made to our infra meant that PMTU packets got lost). We
wrote a blog post about this too: [https://blog.cloudflare.com/path-
mtu-discovery-in-practice/](https://blog.cloudflare.com/path-mtu-discovery-in-
practice/) and the fix for our change:
[https://github.com/cloudflare/pmtud](https://github.com/cloudflare/pmtud)

~~~
michh
Interesting!

I'd think that, in theory, the ECMP router could keep track of the MTU on a
per-IP basis (rather than per TCP connection) based on having received the
ICMP unreachable packet, and from that moment on send a spoofed ICMP packet
back whenever one of the servers it's routing for sends a packet the router
knows won't reach the host.

But even if that works, I'm by no means a network engineer, your solution of
simply broadcasting the packets is probably more efficient in the real world.
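
The per-IP cache idea above could be sketched roughly like this (a toy model
of the bookkeeping, not real router code; the 1500-byte default is just the
usual Ethernet assumption):

```python
class PathMtuCache:
    """Toy model of a router caching the lowest known path MTU per
    destination IP, learned from ICMP "fragmentation needed" messages."""

    DEFAULT_MTU = 1500  # assume Ethernet until told otherwise

    def __init__(self):
        self.mtu = {}  # destination IP -> lowest MTU seen so far

    def learn(self, dst_ip: str, reported_mtu: int) -> None:
        """Record the next-hop MTU from an ICMP type 3 code 4 message."""
        current = self.mtu.get(dst_ip, self.DEFAULT_MTU)
        self.mtu[dst_ip] = min(current, reported_mtu)

    def needs_spoofed_icmp(self, dst_ip: str, packet_len: int) -> bool:
        """Should the router synthesize a "frag needed" back to its server?"""
        return packet_len > self.mtu.get(dst_ip, self.DEFAULT_MTU)
```

The hard part in practice is that the ICMP error may arrive at a different
ECMP node than the one holding the connection, which is why broadcasting the
ICMP packets to all nodes is the simpler fix.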

------
simon_vetter
icmp in ipv6 does much more than its ipv4 counterpart and most importantly:

1) L2 address resolution (neighbor discovery), which ARP used to do in ipv4,

2) full network autoconfiguration (global scope addresses, default route(s),
DNS resolver), which DHCP used to do in ipv4 (although DHCPv6 is still an
option),

3) multicast group management (MLD), which igmp used to do in ipv4,

4) path mtu discovery (through the 'packet too big' messages this article
references). In ipv4, routers fragment packets exceeding the link MTU; in
ipv6, they instead notify the source of the lower MTU.

ping, TTL exceeded, destination (host, route or port) unreachable and
parameter problem were mostly carried over from ipv4.

Blocking 1, 2 (and to some extent 3) on a local network will most likely break
ipv6 connectivity entirely while blocking the others will only break it in
subtle, hard to debug ways (especially with ECMP and traffic engineering where
multiple routes for a given destination can be used).
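
That breakdown can be written out as a filter policy (type numbers are from
RFC 4443, RFC 4861 and the MLD RFCs; which ones count as "essential" is the
judgment made in the comment above):

```python
# ICMPv6 types that a firewall should not block if it wants working IPv6
# (type numbers per RFC 4443 / RFC 4861 / RFC 2710 / RFC 3810):
ESSENTIAL_ICMPV6 = {
    1,                    # destination unreachable
    2,                    # packet too big (path MTU discovery)
    3,                    # time exceeded
    4,                    # parameter problem
    130, 131, 132, 143,   # MLD: multicast listener query/report/done
    133, 134,             # router solicitation/advertisement (autoconf)
    135, 136,             # neighbor solicitation/advertisement (ND)
}

def should_allow(icmpv6_type: int) -> bool:
    """True if blocking this ICMPv6 type risks breaking connectivity."""
    return icmpv6_type in ESSENTIAL_ICMPV6
```

Notably, echo request/reply (types 128/129) are not on the essential list:
blocking ping is merely annoying, while blocking neighbor discovery or
"packet too big" breaks IPv6 outright.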

I've found that explaining this before asking network admins to unblock icmpv6
filters is a good way to succeed (although it can be hard, i'll give you
that).

People aren't used to filtering ARP or link-local broadcast in ipv4 (which
DHCP uses), so telling them that they need to allow icmpv6 to let stations
merely _configure_ themselves is a bit of a mentality change.

At the same time, developers of firewall management tools like ufw understood
this problem a while ago and insert a good, tried-and-tested icmpv6 accept
list as the first rule, which you can't mess with.

Telling people to use ufw is usually much better than teaching them
ip[6]tables.

~~~
smkelly
We use a load balancer product that is Linux-based. It defaulted to blocking
all IPv6 ICMP, including neighbor discovery. This made IPv6 not work at all.
It was a struggle to get them to fix it. And I don't think they've released
the update with the fix yet either.

------
cperciva
Last time I checked, EC2 defaulted to filtering ICMP packets, with the
predictable bad results:

[http://www.daemonology.net/blog/2012-11-28-broken-
EC2-firewa...](http://www.daemonology.net/blog/2012-11-28-broken-
EC2-firewall.html)

~~~
acdha
As a bonus, you can't enable them for things like ELBs where you don't control
the box.

~~~
jsmthrowaway
You can in VPC just fine. The ELB has a regular security group.

------
ChuckMcM
It really, really annoys me when people filter ICMP. Sure, I get that some
people don't like to respond to pings, or that you can DDoS some routers by
flooding them with 64 byte packets (old routers, btw), but ICMP is a critical
part of making the network work correctly.

------
jvdh
Discovering Path MTU black holes on the Internet using RIPE Atlas:
[https://www.os3.nl/_media/2011-2012/courses/rp2/p57_report.p...](https://www.os3.nl/_media/2011-2012/courses/rp2/p57_report.pdf)

A Master thesis research report from 2012 which examined this very problem on
a global scale, using the RIPE Atlas monitoring network.

------
js2
Eh, seen this so many times...

My favorite related problem was about 12 years ago when I had a Mac and a
Linux box side-by-side and the Mac could connect to a Verizon site (a paging
gateway) while the Linux box never even got a response to its SYN. I
eventually figured out the Linux box had ECN enabled. Probably an out-of-date
firewall at Verizon's end didn't like such exotic TCP options. Disabling ECN
on the Linux box fixed the issue.

(I believe I was working on an email-to-page script at the time.)

~~~
mcguire
ECN was a bit of a special case, and a major pain in the ass: it uses a
previously-unused bit in the header and "security conscious" network hardware
developers forgot the "liberal in what you accept", set-it-to-zero-when-you-
create-a-packet-and-ignore-it-otherwise proper default behavior for unused
bits.

------
KaiserPro
Yup, I've had this: "ICMP is a security hole" lets turn it off.

such a tedious conversation to have with the networks(!) team

~~~
georgerobinson
Can you explain how ICMP is insecure? Is it just ping being exploited?

~~~
scurvy
ICMP can be used to affect routing on insecurely configured hosts, which might
be an exploit vector. Circa 1995-97 there was the ping of death... but that
was 20 years ago.

Most of these security best practices are like building code. They get written
down and are never updated. For example, showers need a 2-inch drain pipe but
tubs only need 1.5 inches. The theory was that a backed-up drain would flood a
bathroom with a lower shower rim very quickly; it would take much longer with
a tub (higher side wall). No one ever bothered to update the regulations once
we stopped allowing 5 gpm showerheads. There's no reasoning behind it anymore.
Just "nope, it says 2 inches required because we've always required 2 inches."
"Why?" "Because that's how it is."

Same way with computer security.

~~~
gonzo
"Code is not prescription."

------
dap
When a client tries to send packets too big for the network (as when the
client is configured with jumbo frames but the network isn't), this can be
really painful to debug. The worst is that many things will work because small
packets get through. For example, an "scp" connection may successfully
connect, and then just hang when it starts transferring real data.

------
taspeotis
This is a well known problem. Windows (since at least 2000) can detect this
scenario and mitigate it [1].

[1] [https://technet.microsoft.com/en-
us/library/cc960465.aspx](https://technet.microsoft.com/en-
us/library/cc960465.aspx)

------
scurvy
I'd probably be saying "Eureka!" if this were 1997. But it's 2015. This is
super basic stuff. Do people these days just blindly assume an MSS of 1460 is
going to work on the Internet? Or do they think that the Internet is composed
entirely of Ethernet links?

Or has the use of cloud providers and reliance on higher level programming
languages produced a generation of ops people who don't understand the
mechanics of how things work?
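
For reference, the 1460 figure mentioned above is plain header arithmetic (the
header sizes assume IPv4 and TCP with no options):

```python
IPV4_HEADER = 20  # bytes, no options
TCP_HEADER = 20   # bytes, no options

def tcp_mss(link_mtu: int) -> int:
    """TCP maximum segment size for a given link MTU (IPv4, no options)."""
    return link_mtu - IPV4_HEADER - TCP_HEADER

tcp_mss(1500)  # Ethernet -> 1460
tcp_mss(1492)  # PPPoE (8 bytes of extra framing) -> 1452
tcp_mss(576)   # IPv4 minimum reassembly size -> 536
```

Any link in the path with less than Ethernet's 1500-byte MTU (PPPoE, tunnels,
VPNs) makes the blind 1460 assumption wrong, which is where PMTU discovery, or
its failure when ICMP is filtered, comes in.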

~~~
scurvy
Everyone who downvoted my comment needs to pick up and read a copy of Comer's
"Internetworking with TCP/IP". That and Stevens' TCP/IP Illustrated are the
best sources for networking out there. You won't find the info on any blog. It
won't be in something on ServerFault or StackOverflow and definitely not HN.
Buy, read, learn.

Yes, I know the US publisher has messed with the pricing for current versions
but you can find the previous ones used pretty cheap. Other than dropping the
IPng chapter, I doubt much has changed.

------
X-Istence
The minimum MTU on IPv6 is 1280, which means that even if we go with the
minimum MTU just to get traffic to flow, it's not as terrible as IPv4's
minimum MTU of 576.

~~~
e12e
That just means you'll have to encapsulate ipv6 in ipv4 to get across those
really poorly configured routers...

~~~
vacri
Would those routers support 6 in the first place?

~~~
mobiplayer
If it's encapsulated it doesn't matter :)

------
znep
Brings back memories from my past...
[http://znep.com/~marcs/mtu/](http://znep.com/~marcs/mtu/)

Woefully out of date and wrong, but helped some folks out.

------
junto
We have what I think is a long-running MTU problem on Rackspace hosting for a
customer. We are losing parts of HTTP requests between the F5 load balancer
and the customer's web servers (which are running IIS).

The header of the request reaches IIS and then the content body of the request
fails to turn up causing a 500 error on the server.

Issue is mostly seen on POST requests where the content of the request is
going to be split over more packets. It's been driving us nuts.

I wonder if we should also check the ICMP blocking too?

~~~
mobiplayer
Hey! Ex-Racker (NetSec) here. The truth is on the wire, get captures and check
where the packets are dropped.

Do you have the F5 in front of the webservers, or on a different interface in
one-arm mode? In the second case you're going through the firewall, so I guess
that's your scenario.

------
lloeki
> ipv6

> someone blackholed the very important packets which say "fragmentation
> needed but DF set"

IIRC IPv6 never fragments and uses MTU path discovery (via ICMPv6)

~~~
feld
IPv6 does have fragments. Routers will not do the fragmentation. They just
drop the packet and force the client to do it. It can be horrible if you're
using a tunnel broker and don't lower your MTU.
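
The tunnel-broker case is simple subtraction: the encapsulation header eats
into the outer link's MTU. A quick sketch (the overhead figures are the
standard header sizes for each encapsulation):

```python
# Common encapsulation overheads in bytes (standard header sizes).
OVERHEAD = {
    "6in4": 20,   # IPv6 carried inside an IPv4 header (protocol 41)
    "gre": 24,    # IPv4 header (20) + basic GRE header (4)
    "pppoe": 8,   # PPPoE (6) + PPP (2) framing
}

def inner_mtu(outer_mtu: int, encap: str) -> int:
    """Largest inner packet that fits through the tunnel unfragmented."""
    return outer_mtu - OVERHEAD[encap]

inner_mtu(1500, "6in4")  # -> 1480: what a 6in4 tunnel client should use
```

Since IPv6 requires a minimum link MTU of 1280, a 6in4 tunnel over a
1500-byte link (1480 inner) is fine, but stacking tunnels can get close to
that floor quickly.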

------
sliken
Heh, back when google was just a white page with a search bar, I noticed that
the front page would come up, but the results didn't. Turns out I was on a
home network connection with PPPoE, which slightly decreases the maximum MTU.

I opened a ticket with google and an SRE called me back (to my surprise) and
we tracked it down. Google had a new firewall that was blocking MTU negotiation.

------
bgilroy26
Educational Stack Overflow from the Google results for 'mtu':

[http://serverfault.com/questions/43866/whats-the-best-mtu-
se...](http://serverfault.com/questions/43866/whats-the-best-mtu-setting-for-
a-web-server)

------
mjankowski
Damn, I had this issue a while ago but I was not able to figure out the cause.
I posted to Stack Overflow but nobody pointed me in this direction. Now I went
back and answered my own question :) Thanks! @jonchang

------
tzakrajs
Some application protocols require ICMP _and_ another TCP or UDP port and
won't send their TCP or UDP packets until the ICMP ping packet has
successfully been responded to.

------
lamontcg
It's really kind of depressing that this is 'news' enough that it gets 245+
upvotes here. It can't be older than the internet itself, obviously, but it's
damn close...

------
feld
Read the title and immediately knew it would be PMTU.

~~~
Swannie
Ditto. Yet I still have this conversation with self-styled "network
architects" who want to blanket-block all ICMP. It's depressing.

