

Ask HN: Efficient way to generate Internet censorship statistics? - crypt1d

I think it would be interesting to create a tool that takes a set of popular websites and checks whether each one is reachable from two or more geographic locations. The idea is to estimate what percentage of the internet is censored in countries like Iran, China, Syria, etc. I'm trying to figure out how to do this automatically while still detecting the various censorship techniques properly. There are many ways to block access to a website, so an algorithm that fetches the same page from each location and measures the difference would make sense. But since most websites these days are full of dynamic content, I'm not sure how reliable such an algorithm would be.
Does the HN community have any ideas or pointers? Also, would anyone be interested in teaming up for this project?
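
To make the comparison idea concrete, here is a rough Python sketch of one possible approach: ignore the text entirely and compare only the sequence of HTML tags, on the assumption that dynamic content changes the words on a page more than its structure. The names (TagSkeleton, structural_similarity, looks_censored) and the 0.7 threshold are made up for illustration and untuned; fetching the pages from each location is left out.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagSkeleton(HTMLParser):
    """Collects the sequence of opening tag names, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def skeleton(html):
    """Return the list of opening tag names found in an HTML string."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    return parser.tags


def structural_similarity(html_a, html_b):
    """Return a ratio in [0, 1]; 1.0 means the tag sequences are identical."""
    matcher = SequenceMatcher(None, skeleton(html_a), skeleton(html_b), autojunk=False)
    return matcher.ratio()


def looks_censored(html_control, html_probe, threshold=0.7):
    """Flag the probe copy if its structure differs a lot from the control copy.

    The threshold is an assumed cut-off, not a measured one.
    """
    return structural_similarity(html_control, html_probe) < threshold
```

Two copies of the same page with different headlines would score 1.0 here, while a short "this site is blocked" page would score much lower. A site that serves a different layout per region would be a false positive, so this is only a first filter.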
======
nmc
Have you heard of www.blockedinchina.net?

Maybe you could ask them for advice.

~~~
crypt1d
From what I can see on their website, they are just checking whether the
address is resolvable via a Chinese DNS server. I'm more interested in
detecting filters that redirect you to a custom page stating that the website
is blocked, or to a 404 page. I suppose detecting such filters would require
some technique that compares the pages returned at two different locations.
Thanks for the suggestion though, I'll try to get in touch with them.
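
Roughly what I have in mind, as a Python sketch: record what an uncensored control location and the probe location each saw for the same URL, then label the difference. The Observation fields and the label strings are hypothetical, and collecting the observations (DNS lookup, HTTP request) is not shown.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Observation:
    """What one vantage point saw for a URL (hypothetical record layout)."""
    resolved: bool             # did DNS return an address?
    status: Optional[int]      # final HTTP status, None if no response arrived
    final_url: Optional[str]   # URL after following redirects, if any


def classify(control, probe):
    """Label the probe result relative to the control result."""
    if not control.resolved or control.status != 200:
        return "inconclusive"        # the site looks broken even without a filter
    if not probe.resolved:
        return "dns-blocked"
    if probe.status is None:
        return "connection-blocked"
    control_host = urlsplit(control.final_url or "").hostname
    probe_host = urlsplit(probe.final_url or "").hostname
    if probe_host != control_host:
        return "redirected"          # e.g. sent to a custom block page elsewhere
    if probe.status != 200:
        return "http-error"          # e.g. a 404 served only at the probe location
    return "reachable"
```

A "reachable" result would still need a content comparison, since a block page can be served with status 200 from the original hostname.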

