The Internet Map (internet-map.net)
182 points by m0th87 on July 28, 2012 | 46 comments

That is not the Internet, it's the Web. I'm all for telling non-technical people that the internet ends with the Web, lest we get politicians wanting to regulate TCP ports or establish police server patrols.

A map of the internet would probably show ASes (autonomous systems) and the peerings between them.

This is the Internet map a few months after I first got on from Harv-10 in 1972:

http://www.flickr.com/photos/walkingsf/7257339850/lightbox/ .

Brings back great memories of sitting by the Harvard IMP (router) late at night, and getting a call from BBN on the phone asking me (anyone) to reboot it manually...

Here are a bunch of other ARPANET maps from 1969-77:


By the way, as someone from outside the US, I wonder why the West Coast and California had more nodes even back then.

I mean, the West Coast is less populated, and if you look at a night-lights map it's practically invisible compared to the East Coast. Why does tech tend to be there anyway?

Most of the work on the ARPANET was happening in California. Between defense contractors (one of the 'goal' clients), the number of universities (Stanford, USC, UCB, etc.), primary contractors (SRI, BBN, etc.), and the relatively low cost of slinging wires through the Central Valley to try out protocols over long distances, it got more investment early on.

Back then it was likely easier to lay the groundwork than in more populated cities. In addition there were great institutes of higher learning focusing on technology.

Note this is just a small theory and in no way has any evidence to back it.

But then, why did all those technology institutes happen to grow there? Back when there weren't many people, or infrastructure, or tech?

That's a good question and I'd be very interested in reading about why so many popped up out here. I posed a question about this on Quora so hopefully I can report back soon with some good answers.

"Back when there weren't many people, or infrastructure"

What time frame are you thinking of here? I'm not sure it is accurate - there was plenty of infrastructure and people in the post-WW2 Western US.

When those technological institutions were founded.

Maybe there were plenty; still, I guess the East Coast had much more. Am I wrong?

You'd probably have to go to before WW2.

On the earlier end, Stanford, UCLA, UCSB and Berkeley all trace their roots back to 1860s-1890s.

But you'd probably have to go back to before the Gold Rush to find a time when there weren't many people in the Bay Area: San Francisco had about 1,000 residents in 1848 and 25,000 by 1849, was the 10th most populous city in the US by 1870, and had 300,000 by 1890.

Los Angeles blew up with the discovery of oil and the rise of the entertainment industry, breaking into the top 10 in the 1920s and the top 5 in the 1930s.

You aren't wrong: the East Coast had a lot more people, and it still does today.

Both H (Bill Hewlett) and P (Dave Packard) were at Stanford prior to WW2. And Silicon Valley pre-dates the Internet era. See Fairchild Semiconductor, the Traitorous Eight, etc.

Steve Blank gave a very interesting Google tech talk about it, back in 2007:


William Shockley is a notable contributor to the birth of Silicon Valley (along with many other factors).

"Shockley's attempts to commercialize a new transistor design in the 1950s and 1960s led to California's "Silicon Valley" becoming a hotbed of electronics innovation."[1]

[1] https://en.wikipedia.org/wiki/William_Shockley

Those were just the early lucky ARPA fundees, mostly because they had excellent computer science departments (or were well-connected politically, like Harvard) or were research institutes like SRI or RAND.

Plus the usual military nodes early on, since all the ARPA grants were sponsored by the military, and they had to keep their fingers on things.

What do the two different symbols represent?

The red circles are IMPs (interface message processors--full routers connected to hosts) and the circled T's are TIPs (terminal interface processors--dialup routers).

I'd love to see a "how this was made" blog article... especially how all of the data was found/processed.

The about page has a fair amount of detail, but a rough outline:

Basically, they took a web crawler like Heritrix (archive.org) or Scrapy (a handy Python implementation good for prototyping) and just started fetching web pages.

Eventually, they have a database of 350,000 websites, along with two million links between these domains. Any set of web pages within a given domain may have hundreds of hyperlinks to a dozen other domains, but a link from any page in one domain to any page in another domain becomes a relationship between two domain nodes in a graph. Presumably they used something like neo4j.org to store these relationships (cf. jokes about relational databases being bad at storing relationship information).
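To make the page-to-domain collapsing concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not the site's actual pipeline; the function names `domain_of` and `domain_edges` and the sample URLs are made up, and stripping a leading `www.` is a simplifying assumption.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def domain_edges(page_links):
    """Collapse (source_page, target_page) pairs into weighted domain edges."""
    edges = Counter()
    for src, dst in page_links:
        a, b = domain_of(src), domain_of(dst)
        if a and b and a != b:  # skip links that stay inside one domain
            edges[(a, b)] += 1
    return edges


# Hypothetical crawl output: four page-level links.
page_links = [
    ("http://a.com/1", "http://www.b.com/x"),
    ("http://a.com/2", "http://b.com/y"),
    ("http://a.com/1", "http://a.com/2"),
    ("http://b.com/x", "http://c.org/"),
]
edges = domain_edges(page_links)
```

However many pages link across, each ordered pair of domains ends up as a single edge, with the count kept as a weight.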

Then the actual hard part comes in. They link to a high-level paper on rendering a visualization of that much information, and they used a similar algorithm to determine the placement of each node. The size of each node is presumably the number of links in/out and the color coding is geographic (and probably not considered in this algorithm).

So now they have a database describing all these nodes and relationships, and an algorithm to draw a gigantic image of all of them in 2D space. They used GPU-based parallel processing techniques (probably with NVIDIA's CUDA platform) to crunch all the numbers to generate the final image.

Finally, the image ends up being pretty large at a reasonable zoom. A scaled map of the Earth zoomed a bit above street level would still be about 125 miles on a side. So they use the Google Maps API to manage small chunks of the image at various zoom levels. (They also end up rerunning that algorithm a few times to generate smaller images for each zoom step, including one at good old 1024x768).

Pretty neat. Would love to see their writeup.

According to this: http://habrahabr.ru/post/148351/ they are just using data from Alexa and visualizing it.

What is it visualizing? Pages or links? Because if it's links, there's no way that Facebook is so large and Wikipedia is so small.

The about section on the site says it's pageviews. The sites arrange themselves around one another according to links, but the size is about views.

According to other comments on HN, it seems they are counting outside links to a given domain.

That's what I'm saying, it's just not possible. Pageviews would explain Facebook being so much larger than Wikipedia, but think about it - how many links from outside Facebook point to Facebook vs how many links from outside Wikipedia point to Wikipedia?

What about all the Facebook embedded widgets? Wouldn't those be links to facebook.com?

Not necessarily. Most of those things are iframes, so technically it's FB linking back to FB.
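A rough sketch of the distinction, assuming a crawler that only counts `<a href>` as a link: the class `EmbedCounter` and the sample markup (including the plugin URL) are illustrative, and nothing here is known about how the site's crawler actually treats iframes.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class EmbedCounter(HTMLParser):
    """Count <a href> links and <iframe src> embeds that point at one domain."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain
        self.links = 0
        self.iframes = 0

    def _targets_domain(self, url):
        host = urlparse(url or "").hostname or ""
        return host == self.domain or host.endswith("." + self.domain)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and self._targets_domain(attrs.get("href")):
            self.links += 1
        elif tag == "iframe" and self._targets_domain(attrs.get("src")):
            self.iframes += 1


# Hypothetical third-party page with one real link and one embedded widget.
html = (
    '<a href="https://www.facebook.com/somepage">Find us</a>'
    '<iframe src="https://www.facebook.com/plugins/like.php"></iframe>'
    '<a href="https://example.org/">Elsewhere</a>'
)
counter = EmbedCounter("facebook.com")
counter.feed(html)
```

Under that assumption the widget shows up in `counter.iframes`, not `counter.links`, so it would not add an inbound hyperlink to the graph.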

A couple of these are very strange. Instagram is right beside yousendit as well as almost every Irish web property. Still an awesome visualization, but what is it visualizing?

This is actually a very useful advertising tool - if you're considering placing ads on any sites, it's a great way to see how they compare to other sites in terms of traffic and visitor flow.

Agreed, but it's also funny to find inexplicable little outliers in the graph. Amishamerica.com, for instance, seems to be nestled right in the middle of the pornoverse (http://internet-map.net/#12-157.30947875976562-183.961929321...).

For instance it tells me that HN primarily receives its non-direct traffic from Twitter, and that softwarebyrob and randsinrepose both have roughly the same traffic - both driven primarily from HN.

Apparently it uses Google Maps for display, and I wonder, does the GMaps API allow use of custom map data, or does it mean that this visualization is made by Google?

The Maps API works with all sorts of custom map types [1]. You can augment or replace the standard Google map tiles.

This map uses a google.maps.ImageMapType [2] along with a custom EuclideanProjection map projection [3] which replaces the standard spherical Mercator ("Google Mercator") projection.

[1] https://developers.google.com/maps/documentation/javascript/...

[2] https://developers.google.com/maps/documentation/javascript/...

[3] View source of http://internet-map.net/
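For anyone wondering what a Euclidean projection amounts to, here is a minimal sketch in Python of a purely linear mapping between pseudo lat/lng and flat pixel coordinates. This is not the site's code: the 256-pixel world size and the function names are assumptions, and the real projection object lives in the page's JavaScript.

```python
WORLD_SIZE = 256.0  # assumed side length of the zoom-0 world, in pixels


def latlng_to_point(lat, lng):
    """Map lat in [-90, 90] and lng in [-180, 180] linearly onto a flat square."""
    x = WORLD_SIZE * (0.5 + lng / 360.0)
    y = WORLD_SIZE * (0.5 - lat / 180.0)  # y grows downward, like image pixels
    return x, y


def point_to_latlng(x, y):
    """Inverse of latlng_to_point."""
    lng = (x / WORLD_SIZE - 0.5) * 360.0
    lat = (0.5 - y / WORLD_SIZE) * 180.0
    return lat, lng


center = latlng_to_point(0.0, 0.0)
top_left = latlng_to_point(90.0, -180.0)
round_trip = point_to_latlng(*latlng_to_point(30.0, 45.0))
```

Unlike spherical Mercator, nothing is stretched toward the poles, which is what you want when the "map" is just a big flat image.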

Thank you for the detailed reply. I definitely agree with [3], and wouldn't have asked if I'd had enough experience with modern web technologies.

The API lets you load a custom set of map tiles, managing up to 16 (I think) levels of zoom. It doesn't have to be specifically tied to Earth's GPS/cartographic conventions.

You should be able to go to zoom level 23 or 24.
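To get a feel for how fast a tile pyramid grows with zoom, here is a back-of-the-envelope sketch in Python. It assumes the common scheme of 256-pixel square tiles with the tile count per side doubling at each zoom level; the tile size this particular map uses is not stated in the thread, and `pyramid` is a made-up helper name.

```python
TILE_SIZE = 256  # assumed pixels per tile side


def pyramid(max_zoom):
    """Return (zoom, tiles_per_side, image_side_px, total_tiles) for each level."""
    levels = []
    for zoom in range(max_zoom + 1):
        per_side = 2 ** zoom  # tile count per side doubles with each zoom step
        levels.append((zoom, per_side, per_side * TILE_SIZE, per_side ** 2))
    return levels


levels = pyramid(16)
for zoom, per_side, side_px, total in levels:
    print(f"zoom {zoom:2d}: {per_side} x {per_side} tiles, "
          f"{side_px} px per side, {total} tiles")
```

Each extra zoom level quadruples the number of tiles, which is why the image has to be served in small chunks.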

In their "About"

> The Internet map is a non-commercial project. You can share our expenses and let more people see beauty of the Internet.

I don't think it's a Google project.

Some questions: [1] It would be nice to have some details about a site when clicking on it. [2] Is there any meaning to the colors? If so, what is it?

Color is country, it's written somewhere in the about page. Russia is red, China is yellow-ish, etc.

Here is a physical map of the Internet: http://www.cablemap.info/

And here is a logical map of the Internet: http://www.peer1.com/map-of-the-internet

A map of websites doesn't really make that much sense, as a website can easily be in multiple places.

Thanks so much for the cablemap.info site.

I can't believe reddit is this[1] small!

Also you are here[2].

[1]: http://i.imgur.com/impUR.jpg

[2]: http://i.imgur.com/7WKaF.jpg

Very awesome, it's cool how each color is a different country and you can see where countries integrate with other countries and where they clump together on their own.

It's a bit strange to see wordpress.com alone in the top left of the map.

Agreed. Maybe an abnormality?

Extra points for really tidy and readable JS and markup.
