
Interesting idea, but a waste of bandwidth and harder to maintain than a separate API IMHO, even though it allows the responses to be cached for normal and API usage together.

Might as well go back to serving XML and letting the browser transform it (http://www.w3schools.com/xsl/xsl_client.asp). That would be more elegant for such a task, even though it never caught on due to limited browser support.

Content negotiation would seem the best approach, except that most content-driven web pages carry a lot of content on every page that isn't interesting to the scraper, so a well-designed API with very specific queries will be faster and more useful. Whatever content is on each page is also a moving target for the scraper, even if she gets it as JSON; that is not the case with a straight API.
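
For anyone unfamiliar with what content negotiation means here, a rough sketch: the same URL returns HTML or JSON depending on the client's Accept header. This is only an illustration under simplifying assumptions. The names `PAGE`, `preferred_type` and `render` are made up, wildcards like `*/*` are ignored (they just fall back to HTML), and a real framework would handle the header parsing for you.

```python
import json

# Hypothetical page data; the field names are invented for illustration.
PAGE = {"title": "Example post", "body": "Hello, world."}


def preferred_type(accept_header, offered=("text/html", "application/json")):
    """Pick the offered media type with the highest q-value in the Accept header.

    Simplified: wildcards are not matched, and the first offered type
    is the fallback when nothing in the header matches.
    """
    best, best_q = offered[0], 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        q = 1.0  # q defaults to 1.0 when the client does not give one
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if media_type in offered and q > best_q:
            best, best_q = media_type, q
    return best


def render(accept_header):
    """Return (content_type, body) for the same resource, depending on the client."""
    if preferred_type(accept_header) == "application/json":
        return "application/json", json.dumps(PAGE)
    # No HTML escaping here because PAGE is fixed sample data.
    html = "<h1>{title}</h1><p>{body}</p>".format(**PAGE)
    return "text/html", html
```

Note that the JSON here mirrors whatever the page happens to contain, which is exactly the moving-target problem: change the page and the scraper's data changes with it.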




I once built a very tiny web application with the structure of the GUI in an XML file that was transformed (on the server) with XSLT. I probably did one or two things wrong, but it was a complete misery. It could only be worse if the transformation were done in the uncontrollable environment of the client.


On the client side, there were lots of bugs; some are discussed in this SO thread:

http://stackoverflow.com/questions/274290/any-big-sites-usin...

You'd have to do hacky browser XSLT detection as detailed here: http://www.informit.com/articles/printerfriendly.aspx?p=6779...

XML in this context apparently pretty much died because of too many badly written parsers, libraries and browsers - and to some extent probably also because dominant search engines would not index XML files properly (see http://news.oreilly.com/2008/06/why-xslt-support-in-the-brow...).

But nowadays at least there are some reasonably portable solutions like http://archive.plugins.jquery.com/project/Transform (but that's unnecessarily slow).

On the server side, it should be fairly simple, unless you hit bugs/deficiencies (like wasting memory / leaks) in libraries ...



