Show HN: Given an API, Generate client libraries in Node, Python, PHP, Ruby (github.com/pksunkara)
268 points by pksunkara on Jan 3, 2014 | 73 comments



Wouldn't it be wiser to choose a hypermedia format for the API, and then use a generic hypermedia client on whichever platform you like? Then you write just as many client libraries (zero), but the problem of pushing updates to your clients is solved as well.

Full disclosure: I wrote such a library for Ruby, called HyperResource. https://github.com/gamache/hyperresource


I've read most of the book RESTful Web APIs, which advocates Hypermedia as a way to stop having to create client-specific libraries for each and every Web API. But the book doesn't say much about creating such clients.

Is there such a generic Hypermedia client for Python?

How does this affect performance? Navigating through the maze of a Hypermedia API requires many more requests than just hitting a known endpoint, doesn't it? Do hypermedia clients provide client-side caching?

My other concern is that those APIs might be harder to learn than an ad hoc fiat standard. The reason RESTful-ish APIs have been successful is that they don't add any overhead on top of the things they're representing. At first glance, stuff like JSON-LD, Siren, HAL, etc. seems to be bringing back the complexity of things like SOAP that people have fled from.


The client space is slowly filling in. There are a bunch of clients which still leave the plumbing exposed (a couple good ones are HyperClient.rb[1] and HyperAgent.js[2]) -- it still feels like you're making HTTP requests.

One of the things I like most about hypermedia is that the hyperlinks can represent a complete set of functions which can be applied to an object. In other words, each object contains its own method list. This fits well in languages which get to implement catch-all methods, like Ruby, and I couldn't resist coding up a client that worked that way.
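To make the catch-all idea concrete, here is a rough Python sketch using `__getattr__`. It is not how HyperResource is implemented; the `Resource` class, the `fetch` callable, and the sample documents are all made up for illustration, and the network is faked so the snippet runs on its own.

```python
class Resource:
    """Wraps a hypermedia document so that link relations act like methods."""

    def __init__(self, doc, fetch):
        self._doc = doc        # parsed JSON body, as a dict
        self._fetch = fetch    # hypothetical callable: (href, params) -> dict

    def __getattr__(self, name):
        # Python only calls this when normal attribute lookup fails.
        doc = self.__dict__.get("_doc", {})
        links = doc.get("_links", {})
        if name in links:
            href = links[name]["href"]
            # A link becomes a method that follows it and wraps the result.
            return lambda **params: Resource(self._fetch(href, params), self._fetch)
        if name in doc:
            return doc[name]   # plain data attribute
        raise AttributeError(name)


def fake_fetch(href, params):
    """Stand-in for an HTTP GET; a real client would make a request here."""
    pages = {
        "/users/1/posts": {"count": 2, "_links": {"self": {"href": "/users/1/posts"}}},
    }
    return pages[href]


user = Resource(
    {"name": "alice", "_links": {"posts": {"href": "/users/1/posts"}}},
    fake_fetch,
)
posts = user.posts()           # follows the "posts" link
print(user.name, posts.count)
```

The point of the design is that the set of callable names comes from the response itself, so a new link on the server shows up as a new method without a client release.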

(There's room for this sort of trickery in near-future ECMAScript too, with Proxy[3]. I would really like someone to do this and I would kind of like it to not be me.)

When you lay out your API according to that philosophy, and cache a few "stepping-stone" objects you'll be traversing often, hypermedia APIs don't seem so inefficient at all.

As to your last point: take a look at HAL[4]. It sits alongside a "traditional" API layout very nicely, essentially adding "_links" and "_embedded" which can be safely ignored by non-hypermedia clients. The HAL spec is extremely sane.
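A small Python sketch of what "safely ignored" looks like in practice. The order document below is invented for illustration; only the reserved `_links` and `_embedded` keys come from HAL, and the two helper functions are hypothetical names, not part of any library.

```python
import json

# An invented HAL-style response: ordinary fields plus the two reserved keys.
body = json.loads("""
{
  "id": 42,
  "total": 30.0,
  "_links": {
    "self": {"href": "/orders/42"},
    "customer": {"href": "/customers/7"}
  },
  "_embedded": {
    "items": [
      {"sku": "A1", "_links": {"self": {"href": "/items/A1"}}}
    ]
  }
}
""")


def plain_fields(doc):
    """What a non-hypermedia client sees: everything except the reserved keys."""
    return {k: v for k, v in doc.items() if k not in ("_links", "_embedded")}


def link_href(doc, rel):
    """What a HAL-aware client adds: look up a link by its relation name."""
    return doc["_links"][rel]["href"]


print(plain_fields(body))
print(link_href(body, "customer"))
```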

[1] https://github.com/codegram/hyperclient

[2] http://weluse.github.io/hyperagent/

[3] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[4] http://stateless.co/hal_specification.html


Thank you and Steve for your replies.

You both seem to recommend HAL over other Hypermedia formats. In their book, Richardson & Amundsen mention that HAL provides no way to tell the client which HTTP methods to use when doing state transitions. They say that HAL is therefore only suited for read-only APIs. What do you think of that?


Link relations can be designed that indicate which HTTP methods are allowed. HAL is heavily dependent on conveying semantics via link relations, which is something that some people don't like doing.

Consider the oauth2-token link relation defined here[1]. The definition of this link relation refers to RFC 6749 [2], which states that it is necessary to pass an application/x-www-form-urlencoded body using POST.

[1] https://tools.ietf.org/html/draft-wmills-oauth-lrdd-07#secti...

[2] https://tools.ietf.org/html/rfc6749#section-4.1.3


Formally, they're absolutely right. Information like that is out of band with HAL.

In practice, I deal with it by sticking to HTTP verb conventions and specifying what to do in the documentation.

It's not automatic -- it'd be e.g. rsrc.somelink.post(params) instead of rsrc.somelink(params) in HyperResource's case -- but it works, and any human who knows what the 'somelink' rel is supposed to do might also be expected to know how to use it.


Yup, this is largely the year of the client.

> Python?

There are some libraries; expect more to spring up soon.

> How does this affect performance?

This kind of question is incredibly broad. In ways it's more efficient, in ways it's less. It Depends.

That said, as you alluded to, caching should be very prevalent, so that helps.

Also, it's not like you have to make 5 requests any time you want to do anything: the point is that you follow the application's state along. Just one request per transition. A maze is a pretty decent analogy, actually...

> harder to learn than an ad hoc fiat standard

They may _seem_ harder to learn, but you have to re-learn every single ad-hoc standard over and over and over and over. If you learn HAL, you can speak to any number of APIs that use HAL. Plus, you say 'complexity,' I say, 'no surprises.' Everything is actually enumerated for you, so it should be easier to learn. No hidden assumptions.

Furthermore, as you're more familiar with the format, the details fade into the background. I'm sure you don't read RFC 4627 every time you want to deal with a JSON-based API, either.


I agree it's nice to do it that way. But I found that not all APIs use a hypermedia format. So, I did it this way. :)


Ah, that makes more sense. I was under the impression that Alpaca was targeted at people writing the APIs.

Neat tool!


At risk of writing a vacuous comment but as a supporter of more HN positivity...

Damn, HyperResource looks sexy.


I appreciate it -- that was kind of the point. :)


Author here. I covered almost everything in the documentation. And there is also a small example which is hosted at https://github.com/alpaca-api.

Please ask if you have any questions. Thanks


How do you differ from Swagger, Blueprint, I/O Docs, RAML and all these API endpoint descriptions?


I tried searching for a program which generates client libraries in different programming languages and failed.

Could you please link if something does it? From what I gather, the above-mentioned sites only provide methods of describing the API and/or automating the API on the server side. Nothing is said about client libraries.

Thanks



I think that creates an API, then exposes it in numerous languages. This writes wrappers to existing APIs. Not the same thing. (Lots of code similarity under the hood, though.)


No, it actually does both - which is why I was so interested in it initially.

from the above link: "generates code to be used to easily build RPC clients and servers"

TBF, I haven't used it as a developer, but I did contract work with a company that had invested in it heavily. From what I saw of their workflow, it ended up being a pretty nice bootstrap tool and little more.

edit: Oh, I see what you're saying. Yes, you're correct - thrift does not wrap existing APIs.


Why create a new format for describing APIs instead of using something like RAML? The client library generation is very useful, but it would be nice if there wasn't yet another format needed to describe the API.


There's this utility called SWIG (http://www.swig.org/) which lets you create wrappers for C/C++ libraries in a number of different target languages (dynamic and static), though languages often offer an FFI of their own.

The code generated by this tool is rather heavyweight though. Also I'm not quite sure if this is what you're looking for.



https://github.com/wordnik/swagger-codegen

does javascript, scala, java, objective-c, php, python (3), ruby, android, and even flash

Also has the benefit of a commercial backer (https://helloreverb.com) and a real app depending on it


Hey, I noticed you mentioned that you were interested in having client library generation in Obj-C, Java, and Scala, yet all of the existing client lib support is for dynamic languages. I think you may find that some fundamental changes need to be made to support this, unless every datatype is just going to be a string.


If you are talking about the API definition, there need not be many changes, since api.json does not define any objects.

If you are talking about the code of the program, it won't be changed, because the program just compiles and executes the templates.

If I didn't answer your question properly, could you please elaborate?


Just tried this out and Alpaca is an awesome tool. However, I'd never want to release this in production without tests. Alpaca doesn't generate tests, so you're back to maintaining the tests for your N different platforms/languages. But you'd have the same problem with its competitors, too; Thrift [0], for instance, doesn't generate tests either.

Overall, I'm not sure that the time savings is as big as it first appears, but I think it's great for quick projects.

[0] http://thrift.apache.org/


Tests are one of the main priorities and I am planning to do them soon.

This current program is just a small step in the right direction. :)

EDIT: Currently, I have a test suite at https://github.com/pksunkara/alpaca/tree/testing which tests the generated client libraries of an example API with the respective server.


That sounds cool! Is there a way to automatically generate BDD-style tests? Have a look at http://funkload.nuxeo.org/intro.html


If you have programmatically generated code, generating tests is almost completely useless (especially if the same code is generating the tests as well).

A bunch of passing tests shouldn't make you feel comfortable about releasing something to production. Proper code review and monitoring of client failures on your servers should make you confident.


I'm confused by this comment. What would the tests test?

My thinking is that, if Alpaca is trusted to output a library that matches the API spec, there is no need to test that API. However, that does require that Alpaca is well-tested enough to be trusted - would you be happy with tests in Alpaca itself, or should it generate tests too?


I guess it depends on whether you view Alpaca as (a) "it's a starting point for generating your API", or (b) "it's the authoritative source and the only way to generate your API". If you use it like (b) you don't need tests, but Alpaca does. If you use it like (a) Alpaca still needs tests, but the buck stops with you.


Ah, using it as a starting point, that you then want to test. That makes sense, thanks :)


I like the SPORE (Specification to a POrtable Rest Environment) approach better. You create a description file in JSON and each native language client can use that file to access the HTTP API. https://github.com/SPORE/specifications

SPORE already has clients for Clojure, Javascript, Lua, NodeJS, Perl, Python, and Ruby. I have used SPORE in a few projects and I was not disappointed. Another approach to solving the cross language library problem.
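For anyone who hasn't seen the approach, here is a rough Python sketch of a client that reads a description at runtime instead of generating code from it. The description is only loosely modeled on SPORE: the field names, the `:id` placeholder style, and the `DescribedClient` class are assumptions for illustration, so check the specification for the real format. It builds the request rather than sending it, to stay self-contained.

```python
import json
from urllib.parse import urlencode

# Hypothetical description file contents (not a verified SPORE document).
DESCRIPTION = json.loads("""
{
  "base_url": "https://api.example.com",
  "methods": {
    "get_user": {"method": "GET", "path": "/users/:id", "required_params": ["id"]},
    "list_users": {"method": "GET", "path": "/users", "required_params": []}
  }
}
""")


class DescribedClient:
    """Turns entries of a description into (HTTP method, URL) pairs."""

    def __init__(self, description):
        self._desc = description

    def build_request(self, name, **params):
        spec = self._desc["methods"][name]
        missing = [p for p in spec["required_params"] if p not in params]
        if missing:
            raise TypeError(f"{name} is missing required params: {missing}")
        path = spec["path"]
        query = {}
        for key, value in params.items():
            placeholder = ":" + key
            if placeholder in path:
                path = path.replace(placeholder, str(value))  # path parameter
            else:
                query[key] = value                            # query parameter
        url = self._desc["base_url"] + path
        if query:
            url += "?" + urlencode(query)
        return spec["method"], url


client = DescribedClient(DESCRIPTION)
print(client.build_request("get_user", id=7))
print(client.build_request("list_users", page=2))
```

The trade-off against code generation is that updating the description file updates every client, at the cost of losing per-endpoint methods that an editor or a reader can see.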


Looks promising.

Too bad the last commit was 2 years ago...


Maybe it is finished?


This is cool. I would suggest that it would be very useful to have this kind of thing for JSON Schema [1], which is what I use with Python code to validate incoming JSON. (I was originally hesitant to use it, but since getting into it, I have yet to run into a use case which it cannot handle.)

There is also an RFC for "JSON Hyper Schema" which is intended to describe REST APIs. It doesn't have much library support in much of anything, but I am surprised that it hasn't taken off!

I like that this library is fairly opinionated (options for how to authenticate, supported formats, etc.), though I worry that this creates a bit of inflexibility: as for what exactly "oauth" means, there are always vagaries.

Neato!

[1] http://json-schema.org/


Am I right in thinking about this like a WSDL, but based on JSON?


Being a WSDL is not the aim here. Generating client libraries in multiple programming languages is.

But to do that, I need a format in which people can define their API. I used JSON and only supported the elements that I needed to generate the client libraries.
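A toy Python sketch of the general idea: a JSON-like description goes in, templates are filled, client source comes out. The description layout below is invented for illustration and is not Alpaca's actual api.json, and the generator is not Alpaca's code. The generated methods only return the request they would make.

```python
from string import Template

# Hypothetical API description; field names are assumptions for this sketch.
API = {
    "name": "Example",
    "base": "https://api.example.com",
    "endpoints": [
        {"function": "get_user", "method": "GET", "path": "/users/{id}", "args": ["id"]},
        {"function": "delete_user", "method": "DELETE", "path": "/users/{id}", "args": ["id"]},
    ],
}

CLASS_TMPL = Template('class ${name}Client:\n    base = "${base}"\n')

METHOD_TMPL = Template('''
    def ${function}(self, ${args}):
        return ("${method}", self.base + "${path}".format(${kwargs}))
''')


def generate_python(api):
    """Render Python client source from the description, one method per endpoint."""
    out = [CLASS_TMPL.substitute(name=api["name"], base=api["base"])]
    for ep in api["endpoints"]:
        out.append(METHOD_TMPL.substitute(
            function=ep["function"],
            method=ep["method"],
            path=ep["path"],
            args=", ".join(ep["args"]),
            kwargs=", ".join(f"{a}={a}" for a in ep["args"]),
        ))
    return "".join(out)


source = generate_python(API)
print(source)
```

Supporting another language would then mean adding another pair of templates rather than touching the description.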


Right, the format affords generic description documents from which to generate client code. This seems very similar in spirit to WSDL. No?


WSDL is not a method in itself for actually generating the client code, though. It's simply a (barely) machine readable description of the API from which clients decide how to deal with it.

EDIT: Since I was down-voted for whatever reason, I'd like to rephrase. WSDL is a standardized language for defining web services. It's not a tool in itself for generating client code, and what I said was meant to point out that that's the obvious difference between this and WSDL. A description written in WSDL can both be used to generate client APIs and skeleton code for the server itself, but isn't the generator itself, just like C isn't gcc.


You're being downvoted because this project is a generator program which, as input, takes a specification of the API in JSON format. This JSON specification is what is being compared to WSDL because they serve the same purpose: to define the endpoints, arguments, and return values for a given API.


I agree with that, but I am correct in pointing out that the scope of this project covers things that WSDL does not, and in that sense it isn't like WSDL. When someone asks a question that can easily be answered in terms of differences I think it's completely reasonable to point those out.

Perhaps my parent intended for the question to be rhetorical, with the implied answer being "Yes, I'm silly for reinventing the wheel", in which case the differences between this project and WSDL are also an obvious defense of it.


The comment you replied to said:

  > Right, the format affords generic description documents
  > from which to generate client code. This seems very
  > similar in spirit to WSDL. No?

The word "this" in the second sentence clearly refers to "the format". To reword the parent:

  > The JSON format used by this project seems very similar
  > in spirit to WSDL. No?

So while you're correct that the entirety of the project (JSON format + code generation) is greater in scope than just WSDL, you took exception to a claim the parent did not make. You could equally have said:

  > The JSON format used by this project is not a method in
  > itself for actually generating the client code, though.
  > It's simply a (barely) machine readable description of
  > the API from which [the project's code-generating client]
  > decides how to deal with it.

I don't think anyone thought otherwise.


It looks like a WSDL-like spec (although not as complete) with a code generator as well.

I wonder if we'll see JSON versions of DISCO and whatnot as well.


I think this is fantastic. @sunkarapk, great job. That it's written in Go makes using it so much easier. Wonderful.


I have to admit, this is very reminiscent of "Add Service Reference" in Visual Studio, a capability which I have grown to despise over the years. The code was almost always incomprehensible. I cannot tell you how much I loathe seeing the comment at the top of a file "This was generated by a tool".

Having said that, this tool does look interesting. I hope that a goal is to always make sure that the generated code is as readable, and maintainable, as possible. Also, as mentioned by others, adding generated tests to the generated client libraries is extremely important.


You can go through the generated code examples at https://github.com/alpaca-api.

I tried to make them a lot more readable. :)


Describing an API is not hard, but the API authentication method is. How do you think you will do it?

Edit : If you don't make oauth consumption simpler, you don't really solve the problem


Currently, I am supporting 3 authentication strategies. Basic, Token in header and OAUTH.

These will cover most of the APIs. And I am open to including other kinds of authentication strategies too.

https://github.com/pksunkara/alpaca#authorization-strategies

EDIT: OAUTH consumption is not the main priority here. The problem I intended to solve is automating the development and maintenance of API client libraries in several different languages.

That said, I intend to add support for OAUTH consumption too.


Very cool, but why come up with a new API schema rather than use an open standard like OData (http://www.odata.org/)? Then Alpaca would be compatible with a bunch of APIs that already exist today. In fact, something like this (generating client libraries from APIs) may exist for OData already, but if it does, I've only seen it for .NET and OData (Visual Studio 'Add Service Reference').

This is actually pretty similar to a side project I've been working on called Gargl (Generic API Recorder and Generator Lite) (https://github.com/jodoglevy/gargl). Haven't gotten around to doing a Show HN post yet, but would love any feedback or to combine efforts. Basically it lets you generate an API for websites that don't have APIs publicly available, by looking at how a web page / form submission of that website interacts with the web server. You record web requests you make while normally browsing a website via a Chrome extension, parameterize them as needed, and then output your "API" to a template file. Then this template file can be converted to a client library in a programming language of your choosing.


So, this is markedly similar to the project I've been working on in grad school, only without static typing. http://research.cs.vt.edu/vtspaces/realtimeweb/ Also, mine is explicitly geared towards educational purposes. I'm about one third of the way through version two, but I wonder if we can cross-pollinate our code bases to get something even better.


Can we please stop calling Web Services APIs?


Or probably put "Web Services API" in the title: "Given a Web Services API, Generate ...".


Or just "Web API", short but specific.


Why? What else do you call APIs?


Libraries have APIs too.


Win32 API for example


Libraries and operating systems have APIs.

Web sites have Web Services using REST, XML-RPC, SOAP, whatever as the communication protocol.


Does this use the JSON Schema spec or have you reinvented the wheel?


JSON Schema is too complex for this project. I just chose the fields which need to be populated so that I can generate the code.


Not that I would defend JSON Schema, but yes, you did just reinvent the wheel, and it was also not a new idea. I thought that despite the verbosity of XML, WSDL and associated technologies solved this problem rather better. I have the misfortune to develop in PHP, so pardon the example, but the client-side code went something like this:

  $PayPal = new SoapClient($WSDLLocation);
  $PayPal->SomeMethod();

Yes, you can get away with having a less verbose description language if your needs are simpler, but that to me just ends up moving the problem when your needs become less simple. It's not like you can avoid validating your requests in some way, and required/not required is where that starts -- but, I maintain, not sufficient. I most definitely fail to see client code generation as being a necessary step.

You could say I've made this mistake myself; the thought process that leads to code like this must not be unique. Web services need documentation if they are to be used, and since that code will only ever talk to other code, it makes sense to have a machine-parseable description. Language agnosticism follows as a matter of course. I'm willing to entertain that XML was a bad idea and JSON Schema is not an improvement, but I still feel that if one must reinvent the concept of a language-agnostic machine-parseable web services description language, one should thoroughly understand the prior art. It may be complex for a reason, and like it or not all that SOAP stuff actually tends to work pretty well. I'm sure I have every reason for wanting to see a better technology suite which is JSON based, and I wish this were it. At the moment I don't think you're headed in quite the right direction.


Regarding bug reports:

> Guaranteed reply within a day.

That seems difficult to achieve--wonder how they're doing that? (Also, why??)


I am just a single guy. I always make a point to reply back to issues/pull-requests on my open source projects within 24hrs.

Yes, sometimes I fail to make it. But it's a rule I want to live by.


Good luck. I found a six month old PR on one of my projects the other day....


"Thanks for your report. I will read it later."

Done.


That looks fantastic, and seems to be in alignment with some things I have been doing lately (generating the server side controller of the API in Clojure + a set of documentation, from a set of definitions of API methods). Good job!


Well done!

Know what I think would be really neat? If it could be pointed at an instance of Swagger-UI, or use the same discoverUrl that Swagger-UI would use, and spit out the libraries from that.

If you're not familiar.. https://github.com/wordnik/swagger-ui


I did something similar for Go and Java. It's simpler if you don't need the whole API, but of course not as powerful. https://github.com/bashtian/jsonutils


fwiw, I experimented along this line, dynamically generating python wrappers from yaml: https://github.com/reklaklislaw/rest_easy

It lacks documentation, a bunch of features, and parts smell pretty bad, but since the topic came up I thought maybe someone would find it interesting, if only vaguely.


Useful, well-documented, TODOs right in the readme, and fast response for pull requests. I wish every open source project was like this.


Great idea! Including Obj-C would be very helpful.


I've been enjoying http://apiblueprint.org


The web page was very pretty, but I still have no idea what this is/does. Help?


oh man you just killed a major feature of mashape.com


Client library generation is something that mashape supports, but currently they generate libraries for:

Java, Node, PHP, Python, Objective-C, Ruby, and .NET

It looks like alpaca supports:

Java, Go, Perl, Clojure, Scala, Obj-C



