
I've written scrapers and web automation bots going as far back as 1998 (using VB5 and VB6 to automate IE).

Nowadays I mostly write custom scrapers using PHP (Guzzle, Curl, etc.) and Python (I was introduced to Beautiful Soup by the "Python for Secret Agents" book).
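
For context, a minimal sketch of what that Python side usually looks like (the URL and CSS selector below are placeholders, not a real target):

    import requests
    from bs4 import BeautifulSoup

    # Fetch a page and fail loudly on HTTP errors.
    resp = requests.get("https://example.com/listings", timeout=30)
    resp.raise_for_status()

    # Parse the HTML and pull out the bits we care about.
    soup = BeautifulSoup(resp.text, "html.parser")
    for item in soup.select("div.listing h2"):
        print(item.get_text(strip=True))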

I've tried several commercial scraper tools and services over the years, but few stuck, for various reasons.

UbotStudio had great potential, but in the end it was buggy and painful, almost abandonware.

Scrapinghub.com is decent enough but a bit expensive for my projects.

80legs.com is cool for massive scale, but it was overly restrictive about robots.txt at the time I tried it (for what I was scraping), and I don't like the syntax.

A scraper colleague likes using WinAutomation; however, it's no longer for sale separately because Microsoft acquired the company and rolled it into their Power Automate SaaS (RPA/RDA focused).

There is a new tool called RTILA that I used for the very first time nine days ago, and it is actually the easiest way to create and run scrapers that I have found in 20 years.

The RTILA software currently has minimal documentation (apparently it is being worked on now), but new features are being developed fast; all releases are here on GitHub (note the frequency of releases): https://github.com/IKAJIAN/rtila-releases/releases

Another user of the software has produced several video tutorials showing how it works: https://www.youtube.com/channel/UCH6ov8LnB8-4ZF0yraxjw8Q/vid...

You download it from GitHub, but you still need to buy a license key here: https://codecanyon.net/user/ikajian

The RTILA home page is here: https://rtila.ikajian.com/

I am a genuine customer and have no connection in any way with the solo founder developing it (beyond a few emails and support forum messages). This is a genuine recommendation for writing bots more easily.




Having written a few scrapers, including the browser kind, I found the trick was to specialize.

Each website responds differently: some can be scraped with plain requests, others need Selenium with a pre-existing browser profile. Developing a common utility is almost futile.
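
A rough sketch of those two cases (the URLs and profile path are placeholders, and reusing a Chrome profile is just one way to carry over an existing session):

    import requests
    from selenium import webdriver

    # Case 1: the site serves plain HTML, so a simple GET is enough.
    html = requests.get("https://example.com/simple-page", timeout=30).text

    # Case 2: the site needs a real browser, e.g. a pre-existing
    # Chrome profile that is already logged in.
    options = webdriver.ChromeOptions()
    options.add_argument("--user-data-dir=/path/to/chrome/profile")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://example.com/needs-js-or-login")
        html = driver.page_source
    finally:
        driver.quit()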

I do have useful building blocks, but for each individual thing I want to scrape I scale out with project-specific code. It's never too slow either: the time it would take to fill in all of the required bits in a do-it-all tool would be about the same.
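
In practice that split looks something like this (fetch is the kind of reusable block I mean; the parsing function and its target are hypothetical):

    import requests
    from bs4 import BeautifulSoup

    def fetch(url, session=None):
        # Shared building block: GET with a timeout and error checking.
        s = session or requests.Session()
        resp = s.get(url, timeout=30)
        resp.raise_for_status()
        return resp.text

    def scrape_prices(url):
        # Project-specific code, written fresh for this one target.
        soup = BeautifulSoup(fetch(url), "html.parser")
        return [td.get_text(strip=True) for td in soup.select("td.price")]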



