
Masscan: scan the entire Internet in under 6 minutes, 10 million packets/second - nvk
https://github.com/robertdavidgraham/masscan
======
aortega
What's impressive about this or zmap? They do not magically invent bandwidth.
Zmap requires a 1 Gbps connection or more; this will require 10X the
bandwidth. Those projects are basically dumb packet generators; what I find
really impressive is the Internet routing infrastructure that can accept those
outputs.

~~~
erlichmen
Having bandwidth is one thing, utilizing it to its fullest potential is another
thing. Just the problem of randomizing target IPs and ports in real time without
hurting the throughput of the scan in the process is one tough nugget. I
suggest that you read the readme before making judgment about the complexity
of this project.
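
To make the randomization problem concrete, here is a minimal sketch of one way to visit every (IP, port) pair exactly once in a scrambled order without storing a shuffled list. It uses a plain affine map `x = (a*i + b) mod range`, which is a bijection whenever `a` is coprime to the range. This is an illustrative assumption, not masscan's actual algorithm; the names `pick_multiplier` and `targets` are hypothetical, and an affine map is a much weaker shuffle than a real scanner would want.

```python
from math import gcd


def pick_multiplier(range_size, start=2654435761):
    """Return the first integer >= start that is coprime to range_size."""
    a = start
    while gcd(a, range_size) != 1:
        a += 1
    return a


def targets(ips, ports, seed=0):
    """Yield every (ip, port) pair exactly once, in a permuted order.

    The index i runs sequentially; the affine map scrambles it, and the
    scrambled value is split into an IP index and a port index.
    """
    total = len(ips) * len(ports)
    if total == 0:
        return
    a = pick_multiplier(total)
    for i in range(total):
        x = (a * i + seed) % total      # bijection on [0, total)
        yield ips[x // len(ports)], ports[x % len(ports)]


if __name__ == "__main__":
    for ip, port in targets(["10.0.0.1", "10.0.0.2"], [22, 80, 443], seed=5):
        print(ip, port)
```

Note the one modulo and one division/modulo pair per packet: at millions of packets per second, that per-packet arithmetic is exactly where the cost shows up.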

~~~
aortega
>Having bandwidth is one thing, utilizing it to its fullest potential is
another thing.

It's the same thing if you stop coding in javascript

------
plainOldText
The Readme file is very detailed and well laid out. While reading it, it felt
like I was having a conversation with the author. I wish more repos could
have that.

------
twodayslate
> To get beyond 2 million packets/second, you need an Intel 10-gbps Ethernet
> adapter and a special driver known as "PF_RING DNA" from
> [http://www.netop.org](http://www.netop.org).

~~~
WestCoastJustin
Robert Graham (masscan author) gave a talk on scaling to 10 million packets
at Shmoocon 2013 - C10M Defending The Internet At Scale [1, 2]. You basically
bypass the kernel's network stack and its robust features, and use a special
purpose-built driver (i.e. _"PF_RING DNA"_), which you heavily customize. You
also need to write your application in a way that can assign CPU and memory
for your tasks. This is not just a case of "run X and get 10 million
packets/second"; it is a very carefully planned exercise.

[1]
[http://www.youtube.com/watch?v=73XNtI0w7jA](http://www.youtube.com/watch?v=73XNtI0w7jA)

[2] [http://c10m.robertgraham.com/](http://c10m.robertgraham.com/)

~~~
3327
So are you telling me this does not have a wonderful GUI where there is a big
massive button that says "SCAN INTERNET" ?

~~~
taspeotis
I'll create a GUI interface using Visual Basic. See if I can track all the IP
addresses.

~~~
nnnnni
Then you can zoom, rotate (in 3D), and enhance the results!

------
andrewljohnson
Similar to: [https://zmap.io/](https://zmap.io/)

And the recent HN discussion on ZMap:
[https://news.ycombinator.com/item?id=6226105](https://news.ycombinator.com/item?id=6226105)

------
X-Istence
I doubt you can scan the entirety of IPv6 address space in under 6 minutes ...

~~~
Posibyte
I'd really like to know what the answer would be, so I thought I'd do the
math. So, anybody please correct me, but my math comes up to something like
3.4x10^21 times the age of the universe given the rate above.

~~~
weazl
2^128 ip-addresses / 10 million packets a second = 7.8 * 10^13 times the age
of the universe [1]

To actually scan the entirety of ipv6 address space in under 6 minutes you
would need to send 1 billion billion billion billion packets a second [2], or
100 octillion times faster than 10 million packets a second. [3]

[1]
[http://www.wolframalpha.com/input/?i=2%5E128%2F10%5E7+second...](http://www.wolframalpha.com/input/?i=2%5E128%2F10%5E7+seconds)

[2]
[http://www.wolframalpha.com/input/?i=2%5E128%2F360](http://www.wolframalpha.com/input/?i=2%5E128%2F360)

[3]
[http://www.wolframalpha.com/input/?i=10%5E36%2F10%5E7](http://www.wolframalpha.com/input/?i=10%5E36%2F10%5E7)
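
The same arithmetic can be checked without Wolfram Alpha. A short sketch, assuming an age of the universe of 13.8 billion years (that constant is my assumption, not something from the links above):

```python
ADDRESSES = 2 ** 128                 # size of the IPv6 address space
RATE = 10_000_000                    # packets per second, from the headline
SIX_MINUTES = 360                    # seconds

# Assumption: 13.8 billion years, 365.25-day years.
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600


def scan_seconds(addresses, rate):
    """Time to send one packet per address at the given rate."""
    return addresses / rate


def required_rate(addresses, seconds):
    """Packets per second needed to cover all addresses in `seconds`."""
    return addresses / seconds


if __name__ == "__main__":
    ages = scan_seconds(ADDRESSES, RATE) / AGE_OF_UNIVERSE_S
    rate = required_rate(ADDRESSES, SIX_MINUTES)
    print(f"universe ages needed at 10 Mpps: {ages:.2e}")
    print(f"rate needed for six minutes:     {rate:.2e} packets/s")
    print(f"speed-up over 10 Mpps:           {rate / RATE:.2e}x")
```

This lands on roughly 7.8 * 10^13 universe ages, a required rate just under 10^36 packets a second, and a speed-up of just under 10^29.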

~~~
AaronFriel
Assuming ideal conditions, a 10GbE adapter uses about 20W to send at most
about 15 million packets per second. Assuming no improvements in efficiency
(unlikely), the network adapter that could do this would use 3500 times the
power of the sun for that six minutes[1], which would be an amount of energy
comparable to the kinetic energy of the Earth orbiting the sun (1/6th) [2].
That amount of energy would be enough to more than boil the oceans[3], it
would practically liquefy the Earth. It would be enough energy to ionize a
ball of water the size of the earth into plasma, according to some sources,
but I'm skeptical that 10000 Kelvin water at that volume would remain a plasma
for long.

tl;dr: We won't be scanning the IPv6 address space any time soon. And
hopefully not on Earth.

[1]
[http://www.wolframalpha.com/input/?i=2%5E128%2F360+*+%2820+w...](http://www.wolframalpha.com/input/?i=2%5E128%2F360+*+%2820+watts+%2F+15000000%29)

[2]
[http://www.wolframalpha.com/input/?i=2%5E128%2F360+*+%2820+w...](http://www.wolframalpha.com/input/?i=2%5E128%2F360+*+%2820+watts+%2F+15000000%29+*+6+minutes)

[3]
[http://www.wolframalpha.com/input/?i=%282%5E128%2F360+*+%282...](http://www.wolframalpha.com/input/?i=%282%5E128%2F360+*+%2820+watts+%2F+15000000%29+*+6+minutes+*+%281+degree+Celsius+*+milliliter+%2F+4.190+J%29%29+%2F+%281.332%C3%9710%5E18+m%5E3%29)

[4]
[http://www.wolframalpha.com/input/?i=%282%5E128%2F360+*+%282...](http://www.wolframalpha.com/input/?i=%282%5E128%2F360+*+%2820+watts+%2F+15000000%29+*+6+minutes+*+%281+degree+Celsius+*+milliliter+%2F+4.190+J%29%29+%2F+%281.083%C3%9710%5E21+m%5E3%29)
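
For anyone who wants to redo the power estimate by hand, here is a small sketch. The 20 W and 15 million packets/second figures come from the comment above; the solar luminosity, Earth's mass and Earth's orbital speed are standard textbook values that I am supplying as assumptions.

```python
ADDRESSES = 2 ** 128
SCAN_SECONDS = 360                    # six minutes

NIC_WATTS = 20.0                      # per adapter, from the estimate above
NIC_PPS = 15_000_000                  # packets/second per adapter, same source

# Assumed physical constants (not from the thread):
SUN_WATTS = 3.828e26                  # nominal solar luminosity
EARTH_MASS_KG = 5.972e24
EARTH_ORBIT_SPEED = 29.78e3           # metres per second


def scan_power_watts():
    """Power drawn if energy per packet stays at NIC_WATTS / NIC_PPS."""
    packets_per_second = ADDRESSES / SCAN_SECONDS
    return packets_per_second * (NIC_WATTS / NIC_PPS)


def earth_orbital_kinetic_energy():
    """Kinetic energy of Earth's orbital motion, 1/2 * m * v^2."""
    return 0.5 * EARTH_MASS_KG * EARTH_ORBIT_SPEED ** 2


if __name__ == "__main__":
    power = scan_power_watts()
    energy = power * SCAN_SECONDS
    print(f"power:  {power:.2e} W ({power / SUN_WATTS:.0f}x the sun)")
    print(f"energy: {energy:.2e} J "
          f"({energy / earth_orbital_kinetic_energy():.2f} of Earth's orbital KE)")
```

With these constants the result is a few thousand solar luminosities and a bit under a fifth of Earth's orbital kinetic energy, the same ballpark as the figures quoted above.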

------
wmf
Dupe:
[https://news.ycombinator.com/item?id=6388222](https://news.ycombinator.com/item?id=6388222)

~~~
biot
Plus the original story addressing the abuse complaints caused by scanning the
entire internet which also included the link to masscan on github:
[http://news.ycombinator.com/item?id=6383562](http://news.ycombinator.com/item?id=6383562)

------
pbsd
I haven't checked what exactly masscan is doing to randomize the IP and port
sequences, but if reduction modulo some runtime constant is that much of a
problem (according to the Readme at least) perhaps you should consider
replacing the modulo and division operations by multiplications?

The canonical reference is Granlund and Montgomery [1]. Luckily, there are
ready-made libraries for this, like libdivide [2], which would probably lower
the reported 90 cycles into something more palatable (and pipelineable).

[1] [http://gmplib.org/~tege/divcnst-pldi94.pdf](http://gmplib.org/~tege/divcnst-pldi94.pdf)

[2]
[https://github.com/ridiculousfish/libdivide](https://github.com/ridiculousfish/libdivide)
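
A rough sketch of the idea, for unsigned 32-bit dividends: precompute a "magic" multiplier once for the runtime divisor, then each division becomes a multiply and a shift. This is the simple round-up variant, written in Python so the wide intermediate values are not a concern; it is not libdivide's API or its exact algorithm, and the function names are hypothetical.

```python
N = 32  # width of the dividend in bits


def magic(d):
    """Precompute (multiplier, shift) for dividing N-bit values by d.

    With l = ceil(log2(d)) and multiplier = ceil(2**(N + l) / d),
    (n * multiplier) >> (N + l) equals n // d for every 0 <= n < 2**N.
    """
    if not 1 <= d < 2 ** N:
        raise ValueError("divisor out of range")
    shift = N + (d - 1).bit_length()
    multiplier = -(-(1 << shift) // d)   # ceiling division
    return multiplier, shift


def fast_divmod(n, d, multiplier, shift):
    """Quotient and remainder of n / d using one multiply and one shift."""
    q = (n * multiplier) >> shift
    return q, n - q * d


if __name__ == "__main__":
    d = 641
    m, s = magic(d)                      # done once, outside the hot loop
    for n in (0, 640, 641, 123456789, 2 ** 32 - 1):
        print(n, fast_divmod(n, d, m, s), divmod(n, d))
```

In C the multiplier can need N + 1 bits, which is why real implementations add a fix-up step or pick a smaller shift when possible. The sketch only shows that the hardware divide can be taken out of the inner loop.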

------
javert
This is impressive, but is there any reason someone would _want_ to scan the
entire Internet?

In other words, is that a feature, or is it just a performance metric?

~~~
dthunt
There are plenty of reasons to want to perform that kind of scan on the
internet. Sometimes, if all you're looking for is a sample of how many
people are running version X of a product versus version Y, you really don't
need to map the whole internet. You can just sweep for your port of interest
across a few million random IPs and call it a day.

Sometimes, though, if you are building a directory, or if you just want a
really fantastically beautiful map, you scan the whole shebang.

Doing it all in a few minutes is a little unnecessary, though, even if it is
neat how much better scanners have been getting lately. A scan that merely
takes a few days is plenty fast.

~~~
malandrew
The shorter time scale means that it is now getting easy enough for everybody
to do it. With that in mind I feel like the responsible thing is to only
release the source as an educational resource and also maintain a public data
store of any results gathered by this tool so that people can perform analysis
without having to fire so many packets at every machine on the Internet. Not
providing a datastore in tarball format to mitigate the actual use is a tad
irresponsible.

~~~
adestefan
It's really not that easy for someone to do it. The problem is getting someone
to give you a connection that can actually put that many packets onto the
Internet. Even the author has issues with this point. See
[http://blog.erratasec.com/2013/09/masscan-entire-internet-in...](http://blog.erratasec.com/2013/09/masscan-entire-internet-in-3-minutes.html)

> The problem is that I don't have a 10-gbps network to test on. My ISP let's
> me go out to 100,000 packets/second as long as I deal with the abuse
> complaints, but that's around 44-mbps.
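
The packet-rate and bandwidth numbers can be related with a quick calculation. The frame sizes below are my assumptions, not something the quote states: 54 bytes is a bare TCP SYN (14 bytes Ethernet + 20 IPv4 + 20 TCP, no options), and 84 bytes is a minimum 64-byte Ethernet frame plus 20 bytes of preamble and inter-frame gap on the wire.

```python
SYN_BYTES = 54          # assumed: Ethernet + IPv4 + TCP headers, no options
MIN_WIRE_BYTES = 84     # assumed: 64-byte minimum frame + preamble + gap


def mbps(packets_per_second, bytes_per_packet):
    """Bandwidth in megabits per second for a given packet rate."""
    return packets_per_second * bytes_per_packet * 8 / 1e6


def line_rate_pps(link_bits_per_second, wire_bytes_per_packet):
    """Maximum packets per second a link can carry at that packet size."""
    return link_bits_per_second / (wire_bytes_per_packet * 8)


if __name__ == "__main__":
    print(f"100k pps of bare SYNs: {mbps(100_000, SYN_BYTES):.1f} Mbps")
    print(f"10M pps on the wire:   {mbps(10_000_000, MIN_WIRE_BYTES) / 1000:.2f} Gbps")
    print(f"10GbE ceiling:         {line_rate_pps(10e9, MIN_WIRE_BYTES):,.0f} pps")
```

Under those assumptions, 100,000 packets/second works out to about 43 Mbps, close to the quoted 44, and a 10 Gbps link tops out a little under 15 million minimum-size packets per second.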

------
wpnx
Well written, well documented code. Kudos.

------
metalruler
This could be handy for joining a fledgling peer-to-peer network, where there
are connected peers forming a network, but for one reason or another new nodes
cannot find them - just do a masscan of the default listen port to find IPs to
attempt connections to.

------
radikalus
Impressive engineering; I'm learning a lot going through the source -- thanks!

------
iancarroll
Wasn't there a similar project on HN earlier this month?

~~~
thejosh
zmap!

