
Spidermon: Scrapinghub’s Now Open Sourced Spider Monitoring Library - Ian_Kerins
https://blog.scrapinghub.com/spidermon-scrapy-spider-monitoring
======
hartator
That's awesome, great work from the ScrapingHub team. We had a similar
approach at SerpApi. We write open source rspec tests and run them daily.

e.g.:

\- [https://github.com/serpapi/test-knowledge-graph-desktop/tree...](https://github.com/serpapi/test-knowledge-graph-desktop/tree/master/spec)

\- [https://travis-ci.org/serpapi/test-knowledge-graph-desktop](https://travis-ci.org/serpapi/test-knowledge-graph-desktop)

\- [https://github.com/serpapi/test-organic-results-desktop/tree...](https://github.com/serpapi/test-organic-results-desktop/tree/master/spec)

\- [https://travis-ci.org/serpapi/test-organic-results-desktop](https://travis-ci.org/serpapi/test-organic-results-desktop)

\- [https://github.com/serpapi/test-news-results-desktop/blob/ma...](https://github.com/serpapi/test-news-results-desktop/blob/master/spec/news_results_trump_spec.rb)

\- [https://travis-ci.org/serpapi/test-news-results-desktop](https://travis-ci.org/serpapi/test-news-results-desktop)

Producing reliable scrapers and parsers is very hard. Testing as much as
possible is the only way to go. Spidermon's use of JSON Schema is smart, too.
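
To illustrate the schema idea, here is a minimal Python sketch of checking a scraped item against a declared shape. It is a hand-rolled stand-in written in the spirit of JSON Schema, not Spidermon's API or a real JSON Schema validator; the schema layout, the field names (`title`, `link`, `position`), and the `validate_item` helper are all hypothetical.

```python
# Hand-rolled check in the spirit of JSON Schema; NOT Spidermon's API.
# The schema layout and field names below are hypothetical.
ITEM_SCHEMA = {
    "required": ["title", "link", "position"],
    "types": {"title": str, "link": str, "position": int},
}


def validate_item(item, schema=ITEM_SCHEMA):
    """Return a list of error strings; an empty list means the item passed."""
    errors = []
    # Every required field must be present.
    for field in schema["required"]:
        if field not in item:
            errors.append(f"missing required field: {field}")
    # Fields that are present must have the declared type.
    for field, expected in schema["types"].items():
        if field in item and not isinstance(item[field], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(item[field]).__name__}"
            )
    return errors


good = {"title": "Example", "link": "https://example.com", "position": 1}
bad = {"title": "Example", "position": "1"}

print(validate_item(good))
print(validate_item(bad))
```

Running a check like this on every scraped item, and alerting when the error list is non-empty, would catch a parser that silently starts dropping or mistyping fields after a site changes its markup.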

------
Excluse
As a Jamaican, I got a unique chuckle out of this. Clever branding!

~~~
robk
Sim simma!

------
bryanrasmussen
How does this handle sites that are rendered entirely or nearly entirely by
JavaScript on the client side? That's my only real concern when looking at a
new scraping solution; almost everything else can be worked with.

~~~
landyman
I haven't used Spidermon yet, but Scrapinghub has a JavaScript rendering
service called Splash which, via the scrapy-splash plugin, can be used to
render JS pages within a Scrapy spider.

------
faitswulff
Great, now to figure out some way to use this with Batman.js ...

------
martinlaz
Cool name :)

