Libcloud: One Interface To Rule Them All (apache.org)
136 points by shrikar on Feb 7, 2014 | 50 comments



I hope libcloud gets some traction.

The problem is that it's not really mature enough yet. The whole idea is to use one interface so you don't end up writing special cases for different IaaS providers. But it simply isn't fully featured, and the interface differs between platforms even when that isn't really necessary.

I've been using it to interact with AWS and OpenStack, but still end up with lots of special cases and even falling back to boto for some AWS features (e.g. CloudWatch).

I really hope they fix the interface issues and cover more features of specific platforms in future.


In the Ruby ecosystem, Fog has existed for a long time and offers a similar service: http://fog.io/

Edit: I should also say it's neat to see an Apache-supported project in this space; more tooling around these services is very useful.


jclouds recently became part of Apache: https://jclouds.apache.org/


Link to GitHub https://github.com/jclouds

jclouds is good for Java and Clojure projects.


I'm curious how fog stands feature-wise in comparison to Apache libcloud.


Each cloud has its own semantics. Unless you want to unify the semantics, you'll end up with either lowest-common-denominator APIs or cloud-specific methods.

What is needed is a standard for discoverable RESTful interfaces. With that, there could be a single client library for each platform/programming language to access any compliant API.

The metaphor here is that I don't need a different browser to connect to Citibank's and Bank of America's online banking services. A single browser works just fine. This is the deeper meaning of RESTful, something which is overlooked by many people.


The Link header [1] works very well for this. It exports an index of links with the same semantics as the <link> element.

  <link rel="stylesheet" href="/styles.css">
But in the header, which means API consumers don't have to search the response body for links, and can specifically request the index with a HEAD request.

  Link: </styles.css>; rel="stylesheet"
The reltype is pretty powerful - it enables semantic standards and lets a client make assumptions about endpoints. Since the Link header is a standard format of, you know, links, it's also very discoverable.
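To make the discoverability point concrete, here is a minimal sketch of how a client might turn a Link header into a lookup table keyed by relation type. It is a simplified parser, not a full RFC 5988 implementation: it assumes parameter values contain no commas or semicolons, and the `/servers` URL with `rel="collection"` is an assumed example entry, not something taken from a real API.

```python
import re


def parse_link_header(value):
    """Parse a Link header value into a {rel: url} mapping (simplified)."""
    links = {}
    # Each entry looks like: <url>; param="value"; param="value"
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)', value):
        url, params = match.group(1), match.group(2)
        rel = re.search(r'rel\s*=\s*"?([^";,]+)"?', params)
        if rel:
            # One rel value may list several space-separated relation types.
            for rel_type in rel.group(1).split():
                links[rel_type] = url
    return links


# Assumed example header: the first entry is from the comment above,
# the second is hypothetical.
header = '</styles.css>; rel="stylesheet", </servers>; rel="collection"'
links = parse_link_header(header)
print(links["stylesheet"])
```

A client that knows what a given reltype means can look up its URL this way instead of hard-coding paths. It still has to know what to do with the resource once it gets there.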

1. http://tools.ietf.org/html/rfc5988

EDIT: hm. No idea why I got downvoted, but I tightened it up to get more to the OP's point.


Your first and second sentences contradict each other in a way which illustrates the challenge: interface discovery is trivial, but that only helps with the lowest-level part of a client library.

There's a great Python library called UniversalClient which illustrates this nicely: it provides some helpful scaffolding for working with REST APIs and structuring your code in a maintainable manner but ultimately a programmer still has to provide the service-specific understanding:

https://universal-client.readthedocs.org/en/latest/basics.ht...


> What is needed is a standard for discoverable RESTful interfaces. With that, there could be a single client library for each platform/programming language to access any compliant API.

I don't get how that would work magically. Imagine that I want to update my phone number in my account data on two different sites. If I point this magical library at the two sites' APIs, how would it decide which resources to update under which paths?

It seems to me that you would still need to implement the different concepts, resource names, and so on across different sites, and you wouldn't really win anything here.


You mean SOAP and WSDL? Or perhaps Rails ActiveResource? Or perhaps http://xkcd.com/927/?


No, you don't need a different browser, but you need to know how to use the sites of both banks. You can do that because you have a brain. How should code know how to navigate the APIs of different providers without knowing the tasks to accomplish? It cannot decide for itself.


I have seen this example before, but to me it neglects the role of human intuition in understanding how to, for example, send a wire through Citibank's site vs BofA's site. Without being argumentative, I want to understand how this applies to APIs. (i.e. How could Citibank's hypothetical RESTful API teach my code how to send a wire transfer through their system if I had been using BofA for all previous transactions?) Or am I just flat-out missing a bigger point here?


I've been working on something very similar to what you are describing: http://www.cosmic-api.com I would really appreciate some feedback :)


Looks nice. To really evaluate it, I need to see what the HTTP endpoints and traffic look like. My concern is what experience a non-cosmic client would have.


Some banks work only with IE...


Hm really? I can't imagine a bank in 2014 getting away with this, at least for consumer-facing services. They are just going to write off all their mobile customers?

If you're talking about B2B or other more back-end type stuff, yeah some of this is annoyingly outdated. My bank's "business bill pay" website looks like it was written in 1998 and not changed since.... it does work in FF and Chrome though, since it's just bare-bones html forms and almost no javascript.


No, consumer facing websites. I work closely with a Service Desk department. I've dealt with having to get major bank websites up and going one too many times!


I have an even better strategy, instead of dealing with all this bullshit manufactured complexity and artificial scarcity, throw all that shit away and use basic web sockets like everyone already does, and save a fortune in time and money in the process. On top of that, you don't even have to support bullshit metadata protocols that do nothing but add overhead and implementation complexity, but are sold on being dynamic and secure, when the only thing that is dynamic is their product support and latency and the only thing that is secure is how fucked you are when your account is canceled.


I'm using salt-cloud, which uses apache-libcloud. I've blogged about it a few times:

This post shows how to stand up a fleet of servers with Digital Ocean and salt-cloud:

http://russell.ballestrini.net/create-your-own-fleet-of-serv...

This post shows how I quickly set up a Sensu monitoring test environment:

http://russell.ballestrini.net/sensu-salt/


Not sure I'm a fan of wrapping libraries like that one. It tries to abstract things away from you, but on the other hand, you need to know the implementation details of each provider to understand the behavior of the library.

Maybe I misunderstood what it does?


Most wrappers exist to provide a fast and easy way to execute common methods. They likely make it trivial for most people to implement something quickly. Those who need functionality beyond what they provide probably don't care about it being easy to implement and probably won't use it. It's a choice.


With this new website, the libcloud community has done a great job of improving the documentation. The list of supported providers has also grown. We are glad to be part of the adventure with exoscale: https://libcloud.apache.org/blog/2014/01/27/libcloud-0-14-an...


I agree that the result of these APIs can be a dumbing down of your cloud infrastructure. IMHO, the major challenge inherent in having multiple cloud providers with differing APIs is absolutely not writing code to use them. Rather, it is:

- infrastructure management
- workflow process security
- credential security
- offline workflow (local emulation) support
- cost comparisons
- performance observation / guarantees
- SLA
- recognition of disparate legal jurisdictions where required
- managing nontrivial inter-service build and live dependencies
- complex or non-standard network topology support (i.e. not "single default route to the internet" or "single default route to the internet, secondary route to cloud provider specific renamed VLAN concept")
- embedding real time failover

Keeping all of the above out of the way of regular programmers who just want to write cloud provider neutral services is the real challenge. None of these libcloud/libvirt type solutions ever target the above, which mostly border on operations concerns.

I am now working on the second functional prototype of a system that I believe goes a long way toward resolving these issues by taking a broader, more operations-centric perspective on the evolving norm of multi-provider cloud infrastructure within companies that may include remote developers, require offline development support, and still need to maintain higher standards of trust and security.

Those curious can browse the presently rather obtuse documentation at http://stani.sh/walter/cims/. I am considering asking my company to allow me to open source it.

Right now we have storage providers for LVM and ZFS, and support for an internal, high availability corosync/pacemaker cluster + LXC based cloud provider in addition to a range of popular commercial ones. Very interested in feedback and/or experienced/motivated help.


No Azure support? Really? But has support for various minnow providers...


The Azure API is much more complex than any other provider's: implementing it without introducing a dependency on the Azure Python SDK [1] is going to be complex. Just have a look at its source code (e.g. [2] and [3]) to get an idea.

All drivers available in Apache Libcloud have been implemented without introducing a dependency on a custom SDK. Why make an exception for Azure? Note that the complexity does not come from the Azure Python client API but rather from the Azure REST API, which is, in my opinion, much more complicated than any other provider's.

Azure has some specific idiosyncrasies (such as deployments, certificates management, VM images that first need to be transferred to a blob container) that make it harder to fit in the simple and unified Libcloud API.

[1] https://github.com/WindowsAzure/azure-sdk-for-python

[2] https://github.com/WindowsAzure/azure-sdk-for-python/blob/ma...

[3] https://github.com/WindowsAzure/azure-sdk-for-python/blob/ma...


Also, some lesser-known providers such as Brightbox decided to contribute directly to Apache Libcloud rather than writing their own custom Python client SDK.


Someone working on or using Azure would be more than welcome to implement an Azure driver. Contributions typically come from the providers themselves, or the users of those providers.


Yeah....I mean Apache and MS have always been so tight, hard to imagine them getting left out ;)


Microsoft is a platinum sponsor of the Apache foundation nowadays:

http://www.apache.org/foundation/thanks.html

http://www.apache.org/foundation/sponsorship.html


Oddly enough, Apache and MS have a pretty good relationship nowadays. There was definitely tension in the past, but MS is a very different company today than it was in the '90s and early '00s.

I wouldn't be surprised, now that it is highlighted, to see MS submit a patch to libcloud with what is needed to support Azure.


> "Latest stable version: 0.13.2"

Does that mean that they suggest I can use this in production (since it's "stable") or that I can't, since it's not at version 1 yet?


We just released 0.14, and the next will move to 1.0. The general feeling is that it is stable, and people really like seeing "1.0".


It is just a versioning scheme. A version is declared "stable" when tests pass or confidence in the code at that release is high. They could have easily started with version 1.0.0 and currently be on version 1.13.2. The version number doesn't directly dictate stability.


Lots of projects stay at a stable 0.x for years (then all of a sudden they go from 0.6.7 to 7.0, which is kind of funny!)


Honestly the one interface should be the AWS API. Eucalyptus has already embraced it and set the tone. OpenStack is partially there but political battles in the community have forced a stalemate.

If you standardize on the lowest common denominator, you lose many of the key features of AWS such as VPC and autoscaling.


The AWS API is an INTERFACE to a proprietary system which runs closed source code. The code that runs the API endpoints is also closed source. Whether or not the SDKs for the AWS API are open or not is irrelevant to this discussion (not that you said something to the contrary, but I'm covering my bases here).

Eucalyptus has no more or less 'embraced' the AWS API methods than OpenStack has - save for a bare few methods that have different connotations in OpenStack land than they do with Euca: https://wiki.openstack.org/wiki/Nova/APIFeatureComparison.

The 'political battles' you are citing in OpenStack are actually a bunch of cage rattlers who are promoting their own interests in their respective service offerings. Going into a company that uses AWS for infrastructure and trying to pitch them on OpenStack is a long, drawn out process. Selling new technology is hard.

The lowest common denominator benefit that OpenStack provides is an open and transparent code base that can be modified and utilized by anyone. It's that openness that allows the few people using features like VPC and autoscaling to write some simple code to do deployments on whatever they like. Honestly, it's a handful of lines of code at the most for any given method. Saying you 'lose' feature functionality is simply a rationalization for attacking the OpenStack movement.


Rackspace pitched OpenStack hard to competitors when they released it. I mean hard. They were practically begging various folks to give them a moment to discuss switching entire hosting services over to OpenStack. The strategic play was, and remains, fairly obvious: if you remove feature differentiation the market competes on other differentiators (Rackspace wants to compete on support, most likely). Given that historical context of OpenStack's origin and not-so-subtle strategy, it's interesting to see such a dismissal of hosters that still want to compete on features.

Think about it. If you launch a hosting shop today and adopt OpenStack, you will be able to offer a similar feature set to Rackspace right off the bat. You then have to begin work on how to differentiate yourself feature-wise, while meanwhile not really competing with Rackspace because it's the same offering. OpenStack tells us that Rackspace wants all hosters to look largely equivalent technically.

I'm pretty sure that's where the project came from. Now, however, it appears to be evolving into a community effort to compete with Amazon, which is smart on Rackspace's part: by making a community effort, Rackspace (and everybody who runs OpenStack) get to capitalize on the work of the community to build an Amazon-like system, since such a system would require a significant investment on Rackspace's part and they likely don't have the people to do it. The danger, however, is that we'll end up with a hosting industry with dozens of AWS clones.

Regardless, in the broad, virtualization was our crutch until Linux containers caught up to what zone/jail admins have known for decades: there are far more efficient ways to share a machine than virtualizing an entire OS. The smart minds are focusing on containers now. I know OpenStack just implemented Docker support in Nova, but I don't think OpenStack models container computing well.


This is a long-standing argument in OpenStack. The root problem is that the AWS API exposes only what AWS exposes (trivial example: no live-migration.) The AWS API precludes features better than AWS, which is a big driver for some private clouds.

But there's no stalemate in OpenStack: it supports both APIs, although the EC2 API definitely gets less love. If you improve the EC2 API implementation, those improvements will get merged subject to the normal code review. If you want to do an ecosystem project that proxies EC2 calls to OpenStack, there are obvious questions over code duplication, but there's not profound opposition. And if you do want to do a proxy, go right ahead; you don't need OpenStack's buy in at all: build it, make it better than what is there already, and you'll win the argument.

The root problem (as with all open source) is that there are a lot more people saying they want others to build better EC2 compatibility than there are people saying they want to build it.


If you standardize on AWS you lose many key features of other systems, and you put AWS in control of your future.


> If you standardize on the lowest common denominator, you lose many of the key features of AWS such as VPC and autoscaling.

That's not the level where libcloud is aiming for LCD - it's not at a provider level, but at a product level. As long as a couple of providers support autoscaling, and AWS and Rackspace do, there can be support for it. It's on my todo list to implement, by finding the common thread between AWS and Rackspace APIs, and developing an abstraction that aims to cover the features of both.
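A rough sketch of that distinction, assuming a made-up capability table: only the fact that AWS and Rackspace both offer autoscaling comes from this thread, and `small_host` and the other entries are hypothetical.

```python
# Hypothetical capability table. Only the AWS and Rackspace autoscaling
# entries come from the discussion; everything else is assumed.
FEATURES = {
    "aws": {"compute", "autoscaling", "vpc"},
    "rackspace": {"compute", "autoscaling"},
    "small_host": {"compute"},
}


def provider_level_lcd(features):
    """Features that every single provider supports."""
    return set.intersection(*features.values())


def providers_supporting(features, feature):
    """Providers that could sit behind an abstraction for one feature."""
    return sorted(name for name, caps in features.items() if feature in caps)


print(provider_level_lcd(FEATURES))
print(providers_supporting(FEATURES, "autoscaling"))
```

A provider-level LCD would drop autoscaling because one provider lacks it. A product-level one keeps it for the providers that have it.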

[Rackspace employee, libcloud committer]


I regret that I have but one downvote to give.

The AWS API is convoluted, overwhelmingly complicated, and often yields no insight to the actual behavior taking place in the Amazon system. I can think of countless scenarios wherein my API call succeeded but nothing happened, with no reporting of the reason why anywhere. A later run of the exact same call again works. No explanation.

Now, arguably, I'm discussing the implementation of the specific API, but even on an architecture level the AWS API tries to handle absolutely every case in the most complex manner possible. This is because AWS tries to handle every case in their systems; where other providers trim things for simplicity, AWS adds as much as it can. This flexibility can be useful if you need it but to someone getting started, it is a giant wall. They've gotten much better about guiding people through setup, but the API still reflects the kitchen sink philosophy. Everybody I've ever known who loves the AWS API has only ever used boto. As an exercise, launch an instance with ephemeral and EBS storage mounted using the bare API on a clean account. You'll see what boto hides from you.

The correct approach is something like libcloud with the ability to interact with platform-specific things in a way that the library does not anticipate. However, there's a school of thought that if you dig yourself into platform-specific features, your ability to move becomes a significant line item of technical debt, one that could leave you completely stuck and unable to hire enough people to get out.
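One way to picture that approach is a common interface plus an explicit escape hatch for provider-specific calls. This is only a sketch: every class and method name below (`ComputeDriver`, `native`, `FakeAwsDriver`, `native_attach_to_vpc`) is hypothetical and is not libcloud's actual API, and the drivers are in-memory fakes that never touch a real cloud.

```python
from abc import ABC, abstractmethod


class ComputeDriver(ABC):
    """Hypothetical common interface (not libcloud's actual base class)."""

    @abstractmethod
    def create_node(self, name):
        """Create a node and return its identifier."""

    @abstractmethod
    def list_nodes(self):
        """Return the names of all known nodes."""

    def native(self, operation, **kwargs):
        """Escape hatch for provider-specific operations the interface omits."""
        handler = getattr(self, "native_" + operation, None)
        if handler is None:
            raise NotImplementedError(
                f"{type(self).__name__} does not support {operation!r}"
            )
        return handler(**kwargs)


class InMemoryDriver(ComputeDriver):
    """Fake provider that keeps nodes in a list instead of calling an API."""

    def __init__(self):
        self._nodes = []

    def create_node(self, name):
        self._nodes.append(name)
        return name

    def list_nodes(self):
        return list(self._nodes)


class FakeAwsDriver(InMemoryDriver):
    """Fake provider with one extra, provider-specific operation."""

    def native_attach_to_vpc(self, node, vpc_id):
        return f"{node} attached to {vpc_id}"


def provision(driver, names):
    """Portable code: relies only on the common interface."""
    for name in names:
        driver.create_node(name)
    return driver.list_nodes()


aws = FakeAwsDriver()
print(provision(aws, ["web1", "web2"]))
print(aws.native("attach_to_vpc", node="web1", vpc_id="vpc-123"))
```

Every `native(...)` call site is easy to grep for, so the platform-specific debt at least stays visible and countable.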

As a former hosting employee it is immeasurably frustrating that everybody mentally defaults to AWS these days without even a second thought. All of Silicon Valley is basically wired to funnel venture capital directly to Amazon. The VC firms might as well just invest in Amazon and skip the startup risk, since inevitably most of the cash burn of a startup will go directly to AWS (because it takes a genius of reservations, flexible scaling, and other tricks to not spend six figures a month on your hosting). AWS was designed for flexible scaling, not running your 700 instances 24x7 regardless of load, but nobody operates that way. You think Amazon is going to tell you that?

With Amazon, the API is full of issues, and that's even before speaking about the service itself as well as the support; even with paid support, 12-hour ticket responses were the norm for me including in an outage situation. My personal favorite was a six-month ticket that culminated in "we don't know, so we're closing this ticket," when reporting buggy behavior with their API. There are many aspects of AWS that are dreadful, the API being only a small example, and the suggestion that we should standardize on the AWS way of doing things as an industry is very harmful. I'd contend we need a viable alternative, but Amazon has made themselves a Kitchen Sink Provider and other companies can only bite at little pieces of the pie.


Very different experience with AWS/OpenStack/running our own here. We indeed had a largish AWS bill ~50k/month. Decided to move dev/test in house into an OpenStack cluster I built with the help of Mirantis/FuelWeb. Decent experience standing up the cluster but spurts of trouble thereafter. No real win with anything API-like at all with OpenStack. Feels half-baked. Getting help is damn near impossible.

Meanwhile in AWS-land, things are humming along. Using more and more of the AWS infrastructure and removing pieces we had previously built in-house. Getting lots of mileage out of boto/Java/awscli and starting to see a real paradigm shift in how to run ephemeral servers instead of the 700 instances 24x7 approach you refer to above.

I came from running my own servers, floundered for a bit on AWS, built an OpenStack cluster and strongly prefer AWS for simplicity and sheer flexibility of the API. I'm a year in running on AWS and haven't had a single issue requiring any support.

The largest benefit has come from the concepts and projects put out by Netflix. If I didn't see companies like them leading the way, I'd likely be a bigger OpenStack fanboy.

See generally: http://www.cloudscaling.com/blog/cloud-computing/openstack-a...


You could run eucalyptus cloud in-house and have AWS compatibility.



No. It should be a marketplace, not a monopoly.


Did we learn nothing from LOTR?


Cool. How does this relate to Packer and Vagrant? This is basically the trend toward an abstract machine interface, so you can boot an image in several places and you won't notice. This will matter in terms of price and consumer power.


Sounds like a really ambitious project to me. I can't do anything other than wish the dev team good luck! (And use boto while the product matures :))



and in the darkness ...



