

A Simple Web Scraper in Go - gschier
http://schier.co/blog/2015/04/26/a-simple-web-scraper-in-go.html

======
natch
>In my day job at Sendwithus, we've been having trouble writing performant
concurrent systems in Python.

I'm curious, at a company that does email campaigns, what use do you have for
a web crawler?

(I realize that you've called this a scraper, but the code does have the
underpinnings of a crawler, and that is reflected in the naming used in the
code).

Which version of Python are you using? Did you try asyncio?

~~~
gschier
This exercise was more about learning Go, and not so much writing something
directly work-related. We don't have a specific use case for a web crawler. We
do, however, have a need for parsing and manipulating HTML (for dynamic
templating), making HTTP requests, and writing concurrent code, so this
exercise actually taught me a lot of relevant things.

I started off with a crawler in mind (hence the function named "crawl" -- good
catch), but I stopped a bit short so that I could write a comprehensible
tutorial. I find that writing tutorials about things I've just learned really
helps my understanding.

I haven't looked at asyncio yet, but it does look interesting. We've mostly
been using gevent (Python 2.7) and have found it a bit too magical (i.e. it
monkey-patches core libraries), which makes specific bottlenecks hard to
track down.

