
The Top 1 million URLs in the world [warning: 44MB text file] - r721
https://twitter.com/mikko/status/348029544952897536
======
weinzierl
From what I understand, the list is part of the ChromeBot tool, a sub-project
of Chromium.

    
    
       ChromeBot is a distributed crash detection system.
    
       Chrome is run on a list of URLs to check for crashes.  A 
       pool of machines is used to distribute the workload, 
       each machine might potentially launch several instances 
       of Chrome. Crashes detected are symbolicated and saved 
       in database.
    

[http://src.chromium.org/svn/trunk/tools/chromebot/README.txt](http://src.chromium.org/svn/trunk/tools/chromebot/README.txt)

------
paulasmith
Hard for me to believe Orkut is so high on the list.

~~~
weinzierl
The tweet said only "The Top 1 million URLs in the world", not that the list
is ordered in any particular way. I doubt the list is sorted.

For example, the first URL with a German domain (.de), if we ignore google,
is

[http://suchen.mobile.de/fahrzeuge/showDetails.html](http://suchen.mobile.de/fahrzeuge/showDetails.html)

Never heard of that one; spiegel.de and amazon.de should be much more popular.
The reason google shows up among the first entries might be that about 2.3% of
all URLs in the list are google or facebook domains.
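
For anyone who wants to check both numbers themselves, here is a rough sketch.
It assumes the list is a plain text file with one full URL (scheme included)
per line; the helper names are made up for illustration, and matching
"google" or "facebook" as a host label is only a crude heuristic, so the
result may not come out at exactly 2.3%.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase host of a URL, or '' if it cannot be parsed."""
    try:
        return (urlsplit(url.strip()).hostname or "").lower()
    except ValueError:
        # urlsplit raises ValueError on some malformed hosts (e.g. bad IPv6).
        return ""


def brand_share(urls, brands=("google", "facebook")):
    """Fraction of parseable URLs whose host has one of `brands` as a label.

    Assumption: 'www.google.de' counts as google because 'google' is one of
    its dot-separated labels; 'googleusercontent.com' does not.
    """
    total = hits = 0
    for url in urls:
        host = host_of(url)
        if not host:
            continue
        total += 1
        labels = host.split(".")
        if any(brand in labels for brand in brands):
            hits += 1
    return hits / total if total else 0.0


def first_with_suffix(urls, suffix=".de", skip=("google",)):
    """First URL whose host ends with `suffix`, ignoring hosts in `skip`."""
    for url in urls:
        host = host_of(url)
        if host.endswith(suffix) and not any(s in host.split(".") for s in skip):
            return url.strip()
    return None
```

Both functions take any iterable of lines, so an open file object can be
passed directly and the 44MB file never has to be held in memory at once.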

