

Providing APIs for content-driven websites - aaronpk
http://aaronparecki.com/2012/281/article/1/providing-apis-for-content-driven-websites

======
tommorris
So, much as I'm inclined to keep my head down given I sparked all this
silliness off with a stupid joke that got out of hand, I'll say this... In
addition to using microformats, RDFa, microdata and other structured-data-in-
HTML techniques, Rails does the whole content negotiation and suffix fallback
thing the right way out of the box.
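To make "out of the box" concrete, here is a rough sketch, in plain Ruby rather than actual Rails internals, of the order Rails applies: an explicit format suffix on the URL (/articles/1.json) wins, and only then is the Accept header consulted. The method name and the set of recognized types are my own illustration, not Rails API.

```ruby
# Sketch of suffix-fallback-then-content-negotiation (illustrative only).
def negotiate_format(path, accept_header)
  # 1. Suffix fallback: an explicit extension always wins.
  case File.extname(path)
  when ".json" then return :json
  when ".xml"  then return :xml
  when ".html" then return :html
  end

  # 2. Content negotiation: first recognized type in the Accept header.
  (accept_header || "").split(",").each do |entry|
    case entry.split(";").first.strip
    when "application/json"            then return :json
    when "application/xml", "text/xml" then return :xml
    when "text/html"                   then return :html
    end
  end

  :html # default representation
end
```

In an actual Rails controller none of this is hand-written; a `respond_to do |format| ... end` block is all it takes.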

It astounds me that people use Rails and end up doing all the craziness with
making custom APIs when Rails already does it the right way out of the box.
That they then go on to pat themselves on the back and call what they've
done "RESTful", when they've made an API that's less RESTful than what comes
out of the box with Rails, is bizarre.

In Java-land, Apache Jersey also does it the right way (content negotiation)
and makes suffix fallback trivially easy to set up.

------
overbroad
But why even place the burden on the content producer to create these formats?
Why not just establish a practice of raw feeds and let consumers write their
own parsers? (I can imagine consumers writing parsers and putting the code on
GitHub.) Clearly, consumers are willing to do some work with respect to the
data they seek; they will be doing some bulk manipulation of it anyway, or they
would not be ready to jump through hoops like "API keys".

My favorite data is raw files on FTP servers, i.e. bulk data. For me, it is the
easiest to work with. I can translate it to XML or JSON if I want to, but many
times I do not even need to go through that intermediary step to get what I
want. If others wanted the data in, say, JSON, I'd be happy to share my
parsers with them. I'd bet others would too.

I think Tom's point was "Cut the BS, and just give us the damn data, already."
(Correct me if I'm wrong, Tom.) And I think he is spot on. This API nonsense
has gotten out of hand.

------
brennannovak
This makes a tremendous amount of sense on many levels. Having to write screen
scrapers for the vast array of HTML / content layouts on the web is a very
tiresome task. Think of all the cool services like Instapaper, Evernote,
etc. that would have been considerably easier to build had something like
this existed!

------
adelevie
The FCC uses (and open sourced[0]) a Drupal module[1] that provides API access
to much of the site's content.

[0] <http://www.fcc.gov/encyclopedia/content-api-drupal-module>

[1] <http://www.fcc.gov/developers/fcc-content-api>

------
lazyjones
Interesting idea, but a waste of bandwidth and harder to maintain than a
separate API IMHO, even though it allows the responses to be cached for normal
and API usage together.

Might as well go back to serving XML and letting the browser transform it
(<http://www.w3schools.com/xsl/xsl_client.asp>), that would be more elegant
for such a task (even though it never caught on due to limited browser
support).
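For reference, the client-side approach mentioned above amounts to serving XML with an xml-stylesheet processing instruction, so the browser fetches the stylesheet and renders the transformed HTML itself. A minimal sketch; the file names and element names here are hypothetical, not from the article.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="articles.xsl"?>
<!-- articles.xml: a supporting browser fetches articles.xsl and
     applies the transform locally; an API client can just consume
     the raw XML and ignore the stylesheet. -->
<articles>
  <article>
    <title>Providing APIs for content-driven websites</title>
    <url>http://aaronparecki.com/2012/281/article/1/providing-apis-for-content-driven-websites</url>
  </article>
</articles>
```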

The content negotiation method would seem best, except that most content-driven
web pages will contain much content that isn't interesting to the scraper on
every page, so a well-designed API with very specific queries will be faster
and more useful. Whatever content is on each page is also a moving target for
the scraper, even if she gets it as JSON; not so with a dedicated API.

~~~
eCa
I once built a very tiny web application with the structure of the GUI in an
XML file that was transformed (on the server) with XSLT. I probably did one or
two things wrong, but it was a complete misery. It could only have been worse
if the transformation had been done in the uncontrollable environment on the
client.

~~~
lazyjones
On the client side, there were lots of bugs, some of which are discussed in
this SO thread:

[http://stackoverflow.com/questions/274290/any-big-sites-usin...](http://stackoverflow.com/questions/274290/any-big-sites-using-client-side-xslt)

You'd have to do hacky browser XSLT detection as detailed here:
[http://www.informit.com/articles/printerfriendly.aspx?p=6779...](http://www.informit.com/articles/printerfriendly.aspx?p=677911)

XML in this context apparently pretty much died because of too many badly
written parsers, libraries and browsers - and to some extent probably also
because dominant search engines would not index XML files properly (see
[http://news.oreilly.com/2008/06/why-xslt-support-in-the-brow...](http://news.oreilly.com/2008/06/why-xslt-support-in-the-browse.html)).

But nowadays at least there are some reasonably portable solutions like
<http://archive.plugins.jquery.com/project/Transform> (but that's
unnecessarily slow).

On the server side, it should be fairly simple, unless you hit
bugs/deficiencies (like wasting memory / leaks) in libraries ...

