
With XHTML 2.0 and related tools like XQuery, it could've been a matter of course. Hell, XQuery is still a much better tool for this job than SQL, but no one cares.

I mean

> string_between(content, '<title>', '</title>') as title

really?




I think the main point here is that you can get data from many different places without having to run crawlers, like in the eTLD example. Tbh, I too want to see better DOM handling (string_between is not the best function for HTML parsing, lol), but the main value prop is pretty impressive.
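To make the string_between complaint concrete, here is a rough Python sketch, not anything from the article. The `string_between` below is my own stand-in for the SQL function quoted above (I am assuming it does a plain substring search), and `TitleParser` / `parsed_title` are hypothetical helpers built on the standard library's `html.parser`. The sample page is made up.

```python
from html.parser import HTMLParser


def string_between(text, start, end):
    """Naive substring extraction; an assumed stand-in for the quoted SQL function."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)
    j = text.find(end, i)
    return text[i:j] if j != -1 else None


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TITLE> arrives as "title".
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def parsed_title(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() or None


# Made-up page: uppercase tag plus an attribute, both valid HTML.
page = '<html><head><TITLE lang="en">Example Domain</TITLE></head></html>'

print(string_between(page, "<title>", "</title>"))  # exact-match search finds nothing
print(parsed_title(page))                           # the parser still finds the title
```

The substring version only works when the markup matches the literal `<title>` byte for byte; an attribute or different casing is enough to break it, which is the kind of thing real DOM handling would cover.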


> you can get data from many different places without having to run crawlers

Is that really the case? Can you really avoid crawling before doing that? The article is unclear.



