Hacker News
Ask HN: Online web crawling services that are affordable and support UTF-8
3 points by anfractuosity on June 17, 2017 | 1 comment
Hi,

I'm just wondering if anyone can recommend any web crawling services that support UTF-8.

When I used 80legs some years ago it didn't support UTF-8, so unfortunately the results I got back were unusable.

Ideally I would be looking to pay less than £100 to crawl a site with a large number of pages.

Any advice would be most appreciated!

Cheers



If your budget is that low, perhaps you just want to use wget. If you use a fairly recent version of wget (1.14 or newer), you can even produce a WARC file, which is great for archiving: it captures your requests as well as the responses, with full headers.
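For anyone who wants to try this, here is a rough sketch of driving such a crawl from Python. It is only one way to do it, not a tested recipe for any particular site: the function names, the default WARC name "crawl", and the one-second delay are my own assumptions, and it expects wget 1.14 or newer on your PATH. wget stores the response bytes as the server sent them, so UTF-8 pages should come through unchanged.

```python
import shutil
import subprocess


def build_wget_command(start_url, warc_name="crawl", wait_seconds=1):
    """Build a wget argument list for a recursive crawl that also writes a WARC.

    warc_name and wait_seconds are illustrative defaults, not recommendations.
    """
    return [
        "wget",
        "--recursive",               # follow links found in downloaded pages
        "--level=inf",               # no depth limit
        "--no-parent",               # stay at or below the starting directory
        f"--wait={wait_seconds}",    # pause between requests to be polite
        f"--warc-file={warc_name}",  # also record requests and responses as WARC
        start_url,
    ]


def crawl(start_url, warc_name="crawl"):
    """Run the crawl and return wget's exit code."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget was not found on PATH")
    # wget exits non-zero (for example 8) if any server returned an error
    # response, so the code is returned instead of being treated as fatal.
    return subprocess.run(build_wget_command(start_url, warc_name)).returncode
```

Calling crawl("https://example.com/") should leave the mirrored pages in a directory named after the host, plus crawl.warc.gz next to it (wget compresses WARC output by default).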



