
Ask HN: What's a modern way to parse data from webpages at an interval? - wohnung
I'm doing a personal project where I'd like to get different data from maybe 100 or so webpages every day (or maybe more often) and store it for future use/processing.

It's been a while since I did any programming (I used to do PHP/JS in another lifetime), so I thought I'd ask here: what's the modern way of doing this that's also easy to implement?

I could run it on my own PC, but even better would be somewhere online where I'd set it up and forget about it for a while. Serverless seems to be a thing these days - is there a simple (newbie-friendly) way to set this up with JS snippets somewhere? I've been reading about Cloudflare Workers - are they a good fit for this? Could you retrieve a list of pages at an interval, get their DOM, and send the parsed data somewhere?
======
AznHisoka
The easiest way in 2019 is the same way you'd do it in 2009, or even 1999: set
up a cron job that calls a simple PHP or Ruby script, which runs curl on a set
of URLs and saves the HTML/extracted data into a MySQL database, all on a LAMP
server.

It's not sexy, but you asked for the easiest way. No need for all this
serverless or Cloudflare Workers nonsense :)
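
The cron approach above can be sketched in a few lines. Here's a hedged example in Node rather than PHP/Ruby (since the OP mentioned JS) - the URLs and the title-extraction logic are placeholders, not a drop-in solution, and a real scraper would use a proper HTML parser (e.g. cheerio) instead of a regex:

```javascript
// scrape.js - a minimal daily-scrape sketch for Node 18+ (global fetch).
// Schedule it from crontab, e.g.:  0 6 * * * node /path/to/scrape.js

// Hypothetical list of pages to fetch each day.
const urls = [
  "https://example.com/page-1",
  "https://example.com/page-2",
];

// Toy extractor: pull the <title> out of raw HTML.
// Returns null when no title is found.
function extractTitle(html) {
  const m = html.match(/<title[^>]*>([^<]*)<\/title>/i);
  return m ? m[1].trim() : null;
}

// Fetch every page and hand the extracted data to storage.
// Here it just logs; swap in a MySQL insert (or a file write) as needed.
async function main() {
  for (const url of urls) {
    const res = await fetch(url);
    const html = await res.text();
    console.log(url, "->", extractTitle(html));
  }
}

// To run from cron, invoke it:  main().catch(console.error);
```

The whole thing is one process with no infrastructure beyond cron and a database, which is the point of the comment above.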

------
onion2k
AWS has a scheduler for doing exactly this with Lambda functions and
CloudWatch -
[https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/S...](https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html)

