
A list of domains emitting [Terry Pratchett] clacks headers - jonathanmh
http://clacks.jonathanmh.com/
======
brudgers
GNU Terry Pratchett

 _In Terry Pratchett's Discworld series, the clacks are a series of semaphore
towers loosely based on the concept of the telegraph. Invented by an artificer
named Robert Dearheart, the towers could send messages "at the speed of light"
using standardized codes. Three of these codes are of particular import:_

        G: send the message on
        N: do not log the message
        U: turn the message around at the end of the line
           and send it back again

_When Dearheart's son John died due to an accident while working on a clacks
tower, Dearheart inserted John's name into the overhead of the clacks with a
"GNU" in front of it as a way to memorialize his son forever (or for at least
as long as the clacks are standing.)_

[http://www.gnuterrypratchett.com/](http://www.gnuterrypratchett.com/)

------
Jaruzel
Hmm, I wondered a while back if anyone was keeping track of these. It seems to
be a manual list so far, so I added my main server, as that's been transmitting
the clacks overhead since he died. Can't wait for the crawler to go live and
then we'll see how many people did add it, including any big name sites.

~~~
jonathanmh
Hi Jaruzel, the crawler is live :D At the bottom you can see which page it is
crawling right now.

It basically scrapes all the links on every page it hits and tests whether the
response headers contain the clacks value.
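
A rough sketch of that loop, for anyone curious. This is not the crawler's
actual code (that one is Node.js); it is a Python standard-library
approximation. The header name `X-Clacks-Overhead` is the one commonly used
for this memorial and is not spelled out in this thread, and the function
names (`extract_links`, `clacks_value`, `crawl_step`, `fetch_with_urllib`) are
made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed header name; HTTP header names are case-insensitive.
CLACKS_HEADER = "x-clacks-overhead"


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute link targets found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def clacks_value(headers):
    """Return the clacks header value from a header mapping, or None."""
    for name, value in headers.items():
        if name.lower() == CLACKS_HEADER:
            return value
    return None


def fetch_with_urllib(url):
    """Fetch a page over the network; returns (headers, html)."""
    with urlopen(url, timeout=10) as resp:
        return dict(resp.headers.items()), resp.read().decode("utf-8", "replace")


def crawl_step(url, fetch=fetch_with_urllib):
    """Visit one page: report its clacks value and the links to visit next."""
    headers, html = fetch(url)
    return clacks_value(headers), extract_links(html, url)
```

A real crawler would wrap `crawl_step` in a queue with a visited set, rate
limiting, and error handling; passing `fetch` as a parameter just makes the
step easy to try without touching the network.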

I added the form so people can submit pages to speed up the development of the
list, even though I believe eventually the crawler would get to their pages :D

~~~
Jaruzel
Cool! What software are you using for the crawler?

~~~
jonathanmh
I built a not very advanced one with node.js and mongodb :)

Mainly in use in the crawler:

* request
* cheerio

