It doesn't help that the Neutron data model, at the time I was working on it (say, 12 months ago or so), was terrible and basically impossible to scale or make performant.
Inevitably you were then stuck with the deprecated and janky nova-network interface, which, while efficient and fast, was also old and missing tons of stuff - meaning more monkey patching and janking around. Not to mention that, because of its deprecation, many completely ridiculous bugs befell it in later releases (Grizzly onwards, basically).
TBH I am so disillusioned with the project I hope I don't have to work in or around it again.
You're not the first I've heard this from, and I suspect you won't be the last.
The problem isn't that the code is bad as much as it is that the climate often makes it impossible to fix it. Review queues are weeks or months long. The article makes a good point about the necessary man hours to work on OpenStack. I've seen code removed not because it didn't have a maintainer, but because 200 lines of code didn't have 3-5 full time developers. Insanity persists and money talks.
Looking back, I'd say that OpenStack Nova in the beginning was never this bad. It may not have been the best thing ever, because it wasn't, but no code needs to be terribly great in the beginning. The beginning of a project needs good process more than it needs good code, and OpenStack didn't establish this well enough, early enough.
OpenStack never had a solid, centralized architectural vision. Anyone who attempted to contribute architecturally was essentially ejected. Those who flushed millions into controlling the process, and millions more into building ad hoc features, got their way. I mistakenly advocated early on for wresting control from Rackspace. The increased influence gained by individual contributors was quickly dwarfed by large corporate influences.
I'm still involved with OpenStack, but far less than I had been in the past. Mostly, I prefer to see myself peripherally involved where I might improve the lives of those trapped in that ecosystem, either to help them deal with the pains they've inflicted upon themselves, or to escape them entirely.
Lots of the code isn't great either.
That said, I'm not suggesting it should not be considered.
I'm guessing that a lot of people make the same mistake of thinking OpenStack is just as easy as Linux to get running. It's really not. But it does provide 95% of the groundwork to get you started; often that remaining 5% is either your secret sauce or security overhead. And unfortunately, the details of how to do that are not open to the public... yet.
Also, the slow pace of landing changes in OpenStack makes many projects carry their changes as custom patches, and once it's working there really isn't much incentive to push it upstream.
>>You see, physical switch operating systems leave a lot to be desired in terms of supporting modern automation and API interaction (Juniper’s forthcoming 14.2 JUNOS updates offer some refreshing REST API’s!).
This. Network hardware vendors have no incentive to make their devices more easily automated, and in fact face a disincentive to do so.
Does anyone remember the excitement and promise around Google App Engine when it was first announced, before they changed the pricing model to per-instance? The ability to put your app on the cloud, scale up within the free tier, then out of the free tier onto a paid plan if that's what you needed.
That model entirely disappeared. I miss it. Is anyone doing that now?
>> This. Network hardware vendors have no incentive to make their devices more easily automated, and in fact face a disincentive to do so.
There is actually a relatively established roadmap for the solution to this in "bare metal" / "white box" switches that essentially just talk OpenFlow to a controller. Google moved their entire international internal backbone (more traffic than public facing) to this model.
The issue at the moment is that there aren't many OS options, and consequently very little hardware support. Google developed their own hardware (despite preferring to have bought it), and my understanding is they wrote their own software too.
With regard to "bare metal" virtualization, I'd expect to see a lot more in the next 12-18 months. On the network you need dynamic path configuration and traffic encapsulation/isolation. That should be OpenFlow and VXLAN/NVGRE. On the host hardware you'll want I/O virtualization (SR-IOV/MR-IOV) and possibly hardware encap as well. Substantial progress is being made on both fronts.
Edit: although it's great to have two encap options, I think they're incomplete at best. All of the hard work has been punted to the centralized controllers, and the RFCs have nothing useful to contribute there. Some of the RFC behavior is also insane/laughable; multicast for broadcast and MAC/tunnel-endpoint discovery, ORLY? I'll be very surprised if there are any large VXLAN/NVGRE deployments which aren't bespoke.
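For what it's worth, the encapsulation itself is the simple part. Here's a minimal sketch (per RFC 7348, not taken from any of the deployments discussed here) of the 8-byte VXLAN header that gets prepended to the inner Ethernet frame; all the hard problems the edit above complains about - endpoint discovery, broadcast handling - live outside this header, in the control plane:

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN (RFC 7348)

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags byte (I bit set),
    3 reserved bytes, 24-bit VNI, 1 reserved byte."""
    assert 0 <= vni < 2 ** 24, "VNI is a 24-bit field"
    flags = 0x08  # 'I' flag: the VNI field is valid
    return struct.pack("!B3s3sB", flags, b"\x00" * 3, vni.to_bytes(3, "big"), 0)

hdr = vxlan_header(5000)  # tenant segment 5000 (0x001388)
print(hdr.hex())          # 0800000000138800
```

The full encapsulated packet is then outer Ethernet/IP/UDP (dst port 4789) + this header + the original frame - which is exactly why the data plane was easy to standardize while endpoint/MAC discovery got punted to multicast or a controller.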
I've found there's lots of OSS around the controller and virtual switches for testing/lab use, but the only serious OpenFlow agent designed for hardware switches I've found is Big Switch's Indigo, and it has very limited hardware support.
I see experimental support in OpenWRT - this is very interesting as it opens up a shed load of hardware options.
> That model entirely disappeared. I miss it. Is anyone doing that now?
Just about every PAAS (including AppEngine) does this. What am I missing?
Edit: further, in the original GAE pricing model, the customer paid for specific services, usually by volume. Maybe the accounting was prohibitive?
Plenty of PAASs do autoscaling.
I was a little incredulous when I read that they started writing IP-manager code, but then I remembered this article about Amazon AWS's scale:
see section titled "The Network Is A Bigger Pain Point Than Servers" in this article about AWS scale: http://www.enterprisetech.com/2014/11/14/rare-peek-massive-s...
Quite realistic, as there are many bare-metal offerings at this time, as a quick Google search will attest to. What exactly the packet.net people mean when they say 'premium', however, is unclear.
> That model entirely disappeared. I miss it. Is anyone doing that now?
Heroku still offers a free-tier to start.
An OSS project isn't really supposed to be about "I can freeload on the works of others for some investment in comprehension and customization", which is how I felt the author framed his situation at times.
The underlying failure seems to be that the author decided it was easier to maintain his own proprietary platform than to modify OpenStack for their needs and contribute back to the community. The latter would have let others pick up their work down the road, potentially reducing the maintenance burden (at the expense of exposing any secret sauce you feel you might have).
The deeper failure is in the incentives for Rackspace to withhold key commits on Ironic from the community because they feel they're secret sauce. (I am taking the OP's version of the tale at face value.) They're one of the flagship supporters of OpenStack, and their behavior is perceived as a big reason for its failures to date.
The limitations of Neutron without a product like VMware NSX underneath are well known. Production-grade virtual networking at scale is hard, and also mostly secret sauce (for now).
OpenStack seems effectively to have become the OMG and CORBA 1.0 with a reference implementation - it's cloud-vendor kabuki instead of distributed-objects square dancing. You need vendor help to get going, the portability is very limited, and you'll get some value out of what's been done, but at great effort. It also seems to be a useful commons for network and storage vendors to help drive interoperability with the side modules (Cinder and Neutron). If anything, OpenStack is how the industry is desperately brute-force learning what Amazon Web Services has accomplished before they swallow the universe, which is valuable but messy.
OpenStack seems to be the only "I want to run a general-purpose cloud" game in town today - CloudStack exists but doesn't seem to have a lot of momentum. Google, Azure and DigitalOcean are the only competitors to AWS of note, and they don't open-source their stuff. CoreOS on PXE or Ubuntu MAAS might work, but needs much more mature cluster scheduling, network and volume management. Or perhaps the real next generation will be "none of the above".
I'm an Ironic core reviewer and work on OnMetal at Rackspace.
At Rackspace, we run ahead of Ironic trunk. It's true that we haven't been super vigilant about upstreaming our patches into Ironic; this is not because it's "secret sauce", nor because we don't care. Priorities are hard, both upstream and downstream.
OpenStack moves slowly compared to a team developing proprietary software. This is a well-known fact. We do our best to upstream our patches as quickly as the project allows, but they often need to be improved to work with other hardware/drivers/etc.
For example, when we launched in July, we already had support for "cleaning" a server - erasing disks, flashing firmware, etc. The "spec" for the new feature was first posted upstream on June 25, 2014. It finally landed on January 16, 2015.
Our work on improving network support in Ironic has been similar; the project hasn't been ready for it (again, priorities). It's been done in the open, but the code is not in Ironic trunk yet.
We've been extremely open about what we're doing since we joined the Ironic project almost a year ago; I'm curious which patches the article has in mind.
As an Ironic developer, this article bums me out a bit, but it's a good pointer as to what we're doing poorly. /me starts writing better docs
That was a huge, unnecessary leap on your part. I did not read it like that. To me, it was more like "we were going to leverage the existing projects, add to them, and give back (as they said they would), but we could not, because the underlying projects are not mature, so right now it is more work to fix than to start from scratch."
I don't know whether what they did is advisable or not. All I am saying is that yours is an unnecessarily aggressive conclusion, attacking someone who just spent a lot of time warning the community about a lot of the issues under discussion.
Also, freeloading is, by far, how most people and organizations use open source, so it's not exactly a unique situation.
This is a beautiful and evocative metaphor.
It was too easy to break core functionality--for example, I literally never saw resizing an instance work properly. It does this crazy hack where under the hood it SCPs the VM image to another host and then tries to bring it up. It could have been a quirk of our installation but it would break every time. I saw similar breakage with Cinder operations where volumes would get "stuck" on VMs. Again, it could be a bad installation but it goes to show you how easy it is to break OpenStack if you aren't an expert in the codebase.
My current thinking is that a container-centric (as opposed to VM-centric) infrastructure is the way to go--that way I can just throw CoreOS or whatever on the bare metal nodes and migrate containers as needed.
"As we finalize our installation setup for CoreOS this next week (after plowing through Ubuntu, Debian and CentOS)"
Pity he doesn't elaborate on that. I understand that CoreOS is his choice, but it would be nice to know why the other distros aren't.
Do I want to work out and fix all the issues to get this distro working, or do I want it to work and move on to the other gotchas?
A fellow developer tried to get me into OpenStack a little over three years ago, and when I looked, it was far too enterprise for my tastes - but I care more about code than devops and managing servers.
So yeah, as you suggested, if you install Linux on your computer, the Linux Kernel is running on the bare metal.
Another example is deploying hypervisors: Test suites for OpenStack are run against many different versions by using OpenStack to deploy the systems to test. HP's OpenStack distribution uses it as a deployment mechanism, taking over and managing the nodes of the OpenStack cluster from a small initial cluster.
Someone tell me if I'm wrong :-)
This would mean he couldn't give two hoots about virtualisation; I guess his concern would be automated deployment and network allocation, along with monitoring.
Edit: why not dedicated? ...because you can have containers running on "bare metal"
Bare metal implies dedicated hardware.
...as I read through the article, it sounds like it was probably around bare-metal needs - still, elaboration would be nice here :)
The Ironic guys are amazing, really great people to work with. The guys in IRC are good at working with us.
I just hope we can provide some value to the project as well, to return the favor.
From what I recall the documentation left a ton to be desired. Just trying to figure out how Neutron and their "VPC" equivalent was supposed to be implemented left more questions than answers :|
Given that's the offering, it doesn't surprise me a bit they didn't go with OpenStack. That said, I guess they think running containers on bare metal is a better way to roll.
OpenStack really isn't appropriate for this type of scenario, unless their original goal was to use KVM machines to add some extra security / multi-tenancy.
OpenStack no longer behaves like a nimble startup and may no longer be the right option for someone looking for a quick, iterative development process. I'd question if any startup should really be a consumer of OpenStack at this point.
If OpenStack doesn't solve the startup's future needs right now, the startup's future needs will come sooner than the features needed in OpenStack. Contributing upstream will have too great an opportunity cost. The only legitimate options for such companies are not to use OpenStack, or to maintain their own fork.
Right now, given the rate of innovation and improvement in OpenStack and the processes necessary for participating in the community, I'd argue that if a startup consuming OpenStack has resources to dedicate toward upstream development and babysitting that process, then they're either A) not a startup, or B) a failing startup.
No startup should be rolling their own cloud, any more than they should be putting together their own Linux kernel. Go public, or if you MUST be on your own metal, use a turnkey solution like Metacloud or Nebula (and let them manage it for you).