Hacker News
AWS is inappropriate for small teams because its complexity demands a specialist (smashcompany.com)
45 points by lkrubner on June 11, 2016 | 62 comments

I'll chime in with another "AWS is fine, as long as..." comment:

AWS is fine, as long as you don't rely too much on the web interface.

The AWS web interfaces are inscrutable, clunky, and slow. A lot of the trial and error you want to do to figure out how things work in AWS is very difficult to do in that interface. Often, the interface isn't even a win: it will just dump you into a textarea where you have to fill in some structured command-line style text anyways.

Instead of trying to deploy through the AWS web interface, build some simple tooling (use the AWS cli, or Python boto, or whatever) and then do trial-and-error using those tools. A lot of things in AWS make a lot more sense when you work directly with the API, rather than the web interface.
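For instance, launching a throwaway instance from a script is only a few lines with boto3 (a minimal sketch; the AMI ID and key name below are placeholders, not real values):

```python
# Minimal sketch: launch and tag an EC2 instance with boto3.
# The AMI ID, key name, and instance type are illustrative placeholders.
def launch_instance(ami_id, key_name, instance_type="t2.micro"):
    import boto3  # imported lazily so the function is easy to stub out
    ec2 = boto3.resource("ec2")
    (instance,) = ec2.create_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        KeyName=key_name,
        MinCount=1,
        MaxCount=1,
    )
    instance.create_tags(Tags=[{"Key": "Name", "Value": "trial-and-error"}])
    instance.wait_until_running()
    return instance.id
```

Once something like this works once, it is repeatable and diffable in a way clicking through the console never is.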

I would go further and say that—despite its many flaws—the "primary" interface to AWS isn't (or shouldn't be considered to be) the base API, but rather CloudFormation. The simplest way to understand how AWS "goes together" is to write a top-down description of your entire VPC and everything in it, in a single file, and get it validated as a sensible or invalid configuration, iterating and tweaking and debugging just like when programming against a compiler. That's what writing a CloudFormation template gives you for free.
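As a toy illustration of that top-down style (the resource names and CIDR blocks here are invented; a real template would describe the whole VPC):

```python
import json

# A toy CloudFormation template: a VPC and one subnet, described
# top-down in a single document. Names and CIDRs are illustrative.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        },
        "AppSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "AppVpc"},  # intra-template reference
                "CidrBlock": "10.0.1.0/24",
            },
        },
    },
}

body = json.dumps(template, indent=2)
# boto3.client("cloudformation").validate_template(TemplateBody=body)
# would then catch structural mistakes before anything is created.
```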

+1 for boto (use boto3/botocore for latest stuff, not boto2).

I usually click around in the web interface to learn what the options are, then automate the same procedure using a boto script. It's not trivial, but then you never have to do this task manually again. Also, I find the scripts work wonderfully to document the company's IT infrastructure.

AWS at the lowest level is just a virtual server in the cloud (EC2) and files in unlimited storage (S3). If AWS feels too difficult, maybe you're making it too difficult by using services that are overkill for where you are. You can slowly add services as you need them. A couple of web machines, a load balancer, and a database are as easy to set up on AWS as they are locally, and AWS has many more availability and security safeguards than any local server.

AWS's success comes largely from startups and from Amazon making it easy and fast to get set up. Yes, it can get complex, like anything, but by default it is pretty simple. They do have many, many services, but you only really need a handful, or even a couple, depending on the project. Maybe they need to separate those huge lists of features and systems into buckets: simple, personal, small-to-medium, enterprise, etc.

For someone who has never used AWS (because there are much cheaper options), every time I look at even setting up a simple server, AWS seems insane. With a VPS server it's trivial. You go to the website of your choice, you pick whatever specs you need, you rent the server, you get exactly that: RAM + cores + disk space. With AWS it looks like you have to rent storage separately and somehow connect it to your instance.

Look at the pricing page, it's ridiculous for someone new:


I have to google what "ECU" is, I have to figure out what "Variable" ECU means, I have to figure out what "EBS Only" means. There's not even a tooltip to explain what any of that is.

Even after googling "ECU", and going to Amazon's FAQ page, it's still not clear exactly what I'm paying for.

And that's just the purchasing part.

You're overthinking things. Give the Free Tier a shot and you'll see how easy it is to get an instance up and running. Sure, there is a lot of customization available if you want it, but you can make do with the defaults most of the time.

And you have to figure out what a "reserved instance" is, because that saves you 70%. If you miss that small point, you are wildly overpaying. But who can learn all these details in a short amount of time?

A reserved instance is just an instance you don't pay hourly for. Lots of people use AWS specifically so they can pay hourly, turning up servers dynamically in response to load, and then turning them off when spikes subside, without penalty.

However, most people's load doesn't regularly go to zero - and in fact it still might be advantageous to use reserved instances even if it did, because they're so much cheaper.

The situation is analogous to base-load vs. peak-load on the power grid. Base-load power (reserved instances) is very cheap but cannot respond to transient spikes in load. Peaking plants (on-demand instances) can respond to transient spikes quickly, but are expensive to run.

A naive model is to find your minimum usage and buy enough base-load capacity to cover it, with peaking capacity beyond that. However, since peaking is so much more expensive than base-load, this is not necessarily optimal; instead, you may want to buy some extra base-load capacity that isn't fully utilized during your off-hours, because it offsets expensive peaking capacity during your high-demand hours.
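The trade-off can be sketched with a toy cost model (all prices and demand figures here are invented for illustration, not real AWS rates):

```python
# Toy cost model for the base-load vs. peak-load argument.
HOURLY_ON_DEMAND = 0.10  # $/instance-hour, hypothetical
HOURLY_RESERVED = 0.04   # effective $/instance-hour after reservation

def monthly_cost(reserved, demand_by_hour):
    """Reserved instances bill every hour; on-demand covers the excess."""
    cost = 0.0
    for demand in demand_by_hour:
        cost += reserved * HOURLY_RESERVED
        cost += max(0, demand - reserved) * HOURLY_ON_DEMAND
    return cost

# 12 off-hours needing 2 instances, 12 peak hours needing 10, for 30 days:
day = [2] * 12 + [10] * 12
month = day * 30

# Covering only the minimum (2 instances) with reservations:
naive = monthly_cost(2, month)
# Over-reserving (6) wastes some base-load off-hours, but offsets
# enough expensive peaking capacity to come out cheaper overall:
better = monthly_cost(6, month)
```

With these made-up rates, the over-reserved configuration beats the "reserve only the minimum" one, which is the point of the analogy.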

You almost certainly would not want to have zero reserved instances.

.... kind of. A reservation is all about planning capacity. If you know (e.g.) that you need two web servers "hot" all the time to serve a base level of traffic, you can pay for that up front either partially or in total (a reservation) at some pretty significant savings. You can add hourly (or spot) capacity as needed - but paying for what you'll be using up front can make AWS very economical.

If you're paying for X1 servers for an extended period of time on an hourly basis, you should be entitled to a thank you note from Jeff Bezos.

I used a GPU instance today and it cost me $1. I think you can discover the best setup in a week or so, with just a few dollars to waste. It's just a small "learning tax".

I had the same problems years back when I first tried AWS; I mostly just learned everything by rote.

Recently, though, I tried setting up a simple home "VM lab"—basically a two-node cluster of regular workstations running bare-metal hypervisors (e.g. Xen, VMware ESXi, etc.).

When you do this, suddenly everything IaaS providers do comes into stark clarity.

For example, you realize that with even two nodes of scale, that sticking a lot of storage into each VM node is expensive, especially if you're not going to use it all on each node; it's a lot cheaper to have diskless VM hosts (save for an SSD for swapfiles), and stick the disk volumes on a SAN, connected to the VM hosts by something like iSCSI. (This then gives you other benefits, like disks that are copy-on-write clones of other "template" disks on the SAN; the ability of VMs to "fail over" to another VM host (and thus the ability of the host to gracefully restart without killing its VMs—at least, from a user perspective); "free" snapshots with rollback; etc.)

EBS is exactly such a setup—the perfectly-sensible setup for any VM cluster operating at scale. It's the other setup—local disks in the VM hosts—that's insane and doesn't scale; everyone is just familiar with it because most people's experience of providers is with the ones just getting off the ground who are before the ROI intercept.

Or, for another example, when running a heterogeneous VM cluster, you immediately realize that the unit of CPU resource commitment on a VM host is a "vCPU"—which is not exactly one reserved physical core (since it can just represent a hyperthread, and can be overcommitted when other VMs aren't using their resources) and is not the same on each host, since each host's cores run at different speeds.

The closest you can really get to getting a useful unit of CPU allocation, then, is by rounding off the host differences to integer multiples—such that you could say that one vCPU on host A is (roughly) equal to two vCPUs on host B. That unit, whatever you call it, will then scale up over time, as you phase out previous generations of VM host hardware. That's an ECU.
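That rounding-off idea fits in a few lines (the benchmark scores and host names below are invented):

```python
# Sketch of the normalization behind an "ECU"-style unit: benchmark each
# host's core, then express it as an integer multiple of a fixed
# reference core. All scores here are hypothetical.
REFERENCE_SCORE = 1000  # throughput of the original baseline core

def compute_units(core_score):
    """Round a host core's benchmark score to whole reference units."""
    return max(1, round(core_score / REFERENCE_SCORE))

hosts = {"host_a": 2100, "host_b": 950}  # per-core benchmark scores
units = {name: compute_units(score) for name, score in hosts.items()}
# One vCPU on host_a counts as ~2 units; one on host_b as ~1, so
# capacity stays comparable as hardware generations are phased out.
```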

Smaller hosting companies can get away with telling you what CPU you'll be running on, because smaller hosting companies are new enough to have homogeneous hardware, and small enough not to attempt anything like live VM migration. A company that has heterogeneous hardware and must migrate VMs across it can't make any such claims.

That means using AWS is just as complicated as using an API in Python or JavaScript. You have to read it carefully, understand their model of thinking, and master the subset of functionality that you need. Nothing new for IT people. But I love DigitalOcean much more. If only they had GPU instances.

  "I can handle basic Linux devops. But AWS is not Linux
  devops. AWS has become its own specialty. With over 70
  services offered, only a full-time AWS specialist can
  properly understand how to build something in AWS."
Truer words were never spoken.

Meh? AWS offers 70+ services, but you don't even need to understand S3 to deploy on AWS the way you would on Rackspace. The number of services on AWS seems like an extremely weak argument against it.

That's true as long as you ignore the fact that your server's IP address might change when you restart it. Once you understand that, you realize that you have to rethink how you're using AWS, and that it's quite different than using Rackspace. So you get sucked into the complexity, and start thinking about availability zones, auto-scaling... it's a slippery slope into devops hell.

You need to know what an elastic IP address is. You do not need to know about Redshift, or, for that matter, auto-scaling.
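Pinning a stable address amounts to a couple of API calls, e.g. this boto3 sketch (the instance ID would be a placeholder for one of your own):

```python
# Minimal sketch: keep a stable public address across restarts by
# reserving an Elastic IP and binding it to an instance (boto3).
def attach_static_ip(instance_id):
    import boto3  # lazy import keeps the function easy to stub out
    ec2 = boto3.client("ec2")
    alloc = ec2.allocate_address(Domain="vpc")  # reserve an EIP
    ec2.associate_address(
        InstanceId=instance_id,
        AllocationId=alloc["AllocationId"],  # binding survives stop/start
    )
    return alloc["PublicIp"]
```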

This doesn't have to be the case.

AWS documentation is absolutely terrible (incomplete, out of date, etc.), the APIs are often inconsistent or confusing, official SDKs are half-baked (weird error messages, no error messages, etc.), and there are tons of overlapping products.

If Amazon actually made some effort on making AWS more user-friendly, it wouldn't be nearly as time-intensive for a small team to figure it out.

We embraced AWS nearly 6 months ago. We had used it occasionally before then; this time, however, we went all in. We learnt about VPC and proper multi-zone availability, made use of a dozen services, etc. It took about a week to grasp it all and apply it in Terraform (the AWS and Hashicorp docs were helpful and thorough). We were all extremely pleased with the outcome. That was until it came down to managing all those interconnected components.

We are now looking for a devops person simply because the AWS stack in its entirety is daunting. Yes, from experience it is possible to manage with just the software guys. However there's just too much going on. The more you use of AWS, the more difficult and demanding it becomes.

For all that was said, we don't regret moving to AWS (were previously on Rackspace).

> That was until it came down to managing all those interconnected components.

You might find BOSH[0] useful at this level. It was developed to deploy and upgrade Cloud Foundry, which itself is a complex distributed system with heavy HA requirements. It's a large part of what has made it possible to run CF on OpenStack, vSphere, AWS, Azure and GCP. (There are more coming).

There's a hump to get over, but once you do, BOSH is essentially magical. At Pivotal we upgrade our entire public cloud every week or two and mostly, nobody notices.

Disclaimer: I work for Pivotal, we do almost all the core engineering on BOSH. Microsoft and Google do some too for Azure and GCP respectively.

[0] http://bosh.io/

What was your win that you don't regret the move to AWS? From what you described it sounds very expensive (and requires another full time employee!) with no upside.

Not doubting your decision, just curious for more info.

Fair question. Before moving to AWS we were running our own infrastructure on Rackspace and a lot of it was dedicated hardware, not cloud. That was expensive. Fanatical support by RS also dropped dramatically in quality, taking days to resolve issues.

AWS has a lot of redundancy baked in. The notion of "replace, don't fix" also made our codebase more resilient to failure. This is also true for most cloud suppliers.

With AWS we don't need a PostgreSQL admin, it's managed. Redis is managed. Storage is cheap and can be replicated across the globe. So far we've reaped all the benefits without a devops person in-house.

They're moving from Rackspace.

It was already expensive.

If you're using Terraform, why are you having problems with the interconnectedness? That's exactly the benefit of Terraform; parameterizing all the configuration dependencies is its brilliance. Nothing I've come across is as simple and easy to use and understand while still being able to fully manage an entire infrastructure.

I wouldn't say we were having problems using Terraform for interconnectedness. The complexity of services used at this point warrants a person who can evaluate the architecture we've created. Can we optimise for cost, did we make the right MZ decisions, etc.

Terraform was what made AWS possible for us. I would not have used the GUI.

AWS is fine, as long as you don't try to treat it like a cheap colocated server.

Build for redundancy, replace resources when they fail (rather than necessarily trying to fix them), and you're good. AWS has some learning curve in the form of its ACLs, APIs, and how the pieces fit together, but the result is that you have excellent fine-grained control over your stack.

You can outsource that knowledge to a platform provider, but you're gonna pay for it. Rackspace won't even sell you cloud hosting without a service contract.

Chris told me he was running a 62 node ElasticSearch cluster, which stores over 100 terabytes of data. No doubt, when you are working at that scale, then AWS is very useful. But you need someone with the incredible skills of Chris Clarke, and that is a rare thing. And specialized. Small startups don’t normally have/need/want a devops guy of that skill level, because the devops needs of small startups tend to be minimal.

I call bullshit. Startups have/need/want experienced and skilled devops engineers. AWS launched in 2006: there are available, talented devops engineers who understand it.

The devops needs of small startups aren't "minimal". Even if you go with a devops engineer not intimately familiar with AWS, he or she can get up to speed on AWS platform operations just as quickly as learning any other new stack the startup may be using... If you're planning on hiring for the lowest common denominator for devops in your startup, you'll be in trouble.

First of all, Chris is great. I'd know: I hired him. (I'm Parse.ly's CTO.)

But I will mention that we were not always on AWS. Indeed, I hired Chris at a moment in our company's life when we were growing a bit too fast and still had a "snowflake server" setup in Rackspace Cloud. Served us well for the first year of revenue, but not so well given our growth rate. One of his first jobs was to move us from snowflake to proper config management. Then we made an 18 month pitstop in colo before adopting AWS.

I think what the blog post fails to recognize is that there is "commodity AWS use" and then there's "AWS platform use." For example, at Parse.ly we use Route53, Cloudfront, ELB, EC2, S3, RDS, and EMR.

The first 5 are basically commodity services. DNS, CDN, load balancer, VMs, backup.

Whether you use Rackspace, DigitalOcean, or your own colo, you'll need to figure something out in each category to deploy a real webapp. (Or, accept the devil's bargain of a PaaS.)

The last two services, RDS and EMR, are basically AWS-specific value-adds that save our team time. We used to run our own Postgres EC2 node, but meh, why bother. Our SQL DB is small, and RDS handles a lot of things for us. We also used to run our own Spark clusters (using spark-ec2), but in June 2015, Spark got EMR support, so again, that saves us time/money. We actually breathed a sigh of relief when that came out, since we were essentially building Spark-on-EMR ourselves with boto.

Spot instances and reserved instances are very neat cost optimizations at scale. But complex.

We adopted RDS/EMR hesitantly because of the "lock-in" they represent. We could definitely run without them. I don't think I have interest in the other 70 AWS services, mainly because I hate lock-in, I like open source, and I want to Keep Things Simple. (So Chris stays sane!)

AWS is definitely more complex than DigitalOcean. But I think running production web apps in the cloud is "simply complex", no matter which provider you use.

I also think the OP has a good point, which amounts to, "premature scaling is the root of all evil". Parse.ly may run on 200 EC2 nodes across multiple regions & AZs today, but our first prototype ran in a single 1U rackmounted server I built myself and snuck into a friend's server cage (2009). It's important not to waste time on "devops" when you are still figuring out what the heck your startup is even supposed to be doing. If I wasted time on that in the early days, we simply wouldn't be here.

>Parse.ly may run on 200 EC2 nodes across multiple regions

DevOps has become "developers doing operations", but that's a misapplication of the term. DevOps started as developers and operations working together.

If you're running this many EC2 nodes and you don't have an operations person or team, then you're running on the razor's edge.

>At the time of our conversation I had crashed an AWS instance and we were having trouble fixing it. Among the points I made to Sean was that the problem was specific to AWS. If we’d been in the Rackspace cloud, I simply would have rolled back to yesterday’s image. A problem that would have taken 5 minutes to fix instead dragged on for 2 weeks.

Honestly you have already been given the sign post that this could go very badly for you. Imagine the case where "crashed instance" was expensive critical infrastructure. How many hours or days of outage before your business is irreparably harmed?

What about Heroku or Lambda or Cloud Foundry or all the other companies that claim that startups don't need any devops at all?

As a nitpick: Cloud Foundry isn't a company, it's an opensource project.

Lambda isn't, uh, quite as fully featured as Heroku or CF yet. And it ties you very closely to AWS. With Heroku you have a reasonable shot at migrating to Cloud Foundry or vice versa.

Disclaimer: I work for Pivotal, we're the major donors of engineering to Cloud Foundry. We sell a commercial distribution (PCF) and have a public service (Pivotal Web Services).

Good point. If startups are willing to pay extra for a solution with vendor lock-in, they can certainly follow that path.

His key example seems...odd.

Yeah, spinning up an Elasticache instance with Redis has some complexity...it's a managed service. You have to tell it some things for it to be managed for you.

If you don't want to learn about it, you just spin up an EC2 instance (which itself is not really any more complex than spinning up a VM on any other virtualized platform) and install Redis onto it. If your experience is in Linux sysops, you can *totally do that*.

That's really the key thing about most of these cloud providers: the barrier to entry really is just learning how to spin up a basic VM. All the other offerings, as complicated and unintuitive as they may be, are opt-in things designed to solve common needs. You want a scaling NoSQL DB? You can spin up EC2 instances and install your favorite flavor; it's then on you to ensure it scales properly. Or you can read up on Dynamo, determine whether it fits your needs, do the grunt work of setting it up in AWS, and let Amazon manage it for you. Etc. Complaining that taking advantage of *all* the offerings is a huge mental burden seems silly; of course it is. THERE ARE SO MANY OF THEM. But nothing is stopping you from just using EC2 to start, and adopting those other services only as you find the time and drive to learn about them and determine that you can benefit from them.

For those of you saying, "just use EC2 like any other VPS, it's not that complicated," you're missing the point. You can find better-performing VPS offerings for the same money at other providers. AWS isn't the best option if that's all you're looking for.

For those of you saying, "it makes getting started dead simple and gives you room to grow," you're also missing the point. Something like Heroku or LSQ is probably a better option for 90% of such use cases.

For those of you saying, "oh, just use Amazon ABC for x feature, DEF for y, GHI for z, etc. and tie it all together with Amazon foo," you're also missing the point. Yes, that's the proper way to use AWS, but it requires very specific (i.e. not common Linux sysadmin) knowledge that early-stage startups may not possess and may not have the resources to acquire. This is the main takeaway from the article that I'm not sure many folks actually read...

Honestly, if they just chose descriptive names, that would be about half the battle. Right now everything has a hard-to-decipher code name: RDS (Amazon's database service), EC2 (server instances), S3 (key/value file storage).

Once you do learn their names come the inconsistencies. Why are Dynamo and Redshift not part of RDS while Aurora is? There's just no rhyme or reason to any of it. No unifying vision. Someone at Amazon needs to take the time to organize it into logical collections.

You do realize that the cited code names are all acronyms for simple descriptive names, right? (Relational Database Service. Simple Storage Service. Elastic Compute Cloud). It should surprise nobody that simple descriptive names are often inconvenient in day-to-day use, and hence get turned into abbreviations.

> Why are Dynamo and Redshift not part of RDS while Aurora is?

Because Aurora is a relational database (see above), and Dynamo and Redshift are not?

Of course, it would seem that some marketing weasels managed to get into the organization; new names are fizzy opaque brands instead of the old descriptive-ish naming.

Isn't Aurora just accelerated MySQL RDS?

Dynamo and Redshift are database services, but they're not much like RDS.

Of all the AWS database services and permutations, and there are many, RDS is the only one you really should know.

DynamoDB is a sharded NoSQL thing with eventually consistent secondary indexes that is nothing like an RDBMS. RDS is very similar to Master-Slave RDBMS servers you would manage yourself in EC2s.

According to Amazon, the "R" in RDS is for "Relational". Dynamo is not a relational database, so it doesn't go in the RDS group.

Dynamo and Redshift have much different use cases than any of the RDS services.

I agree with the sentiment of this post. I have also worked at small startups in the past for which AWS is overkill but they insist on being on AWS because they use it as a selling point to investors (I know, this sounds silly), i.e. they are using it to name drop "cool" tech, something investors seem to "love".

My perception of AWS is that it is an infrastructure environment onto which you can build your own hosting environment. AWS isn't just designed for web based applications, even though most of the AWS users will use it for that purpose. It's a box of parts with which you can piece together the environment you need and that requires skills, knowledge and a good grasp of how it all works (on an AWS specific level).

Another issue I keep running into is that developers use some of the AWS services as integral parts of their application, SQS is a great example of that. I'm convinced that SQS is not for developers to use on an application level but it is a service that allows infrastructure engineers to build complex and scalable environments on AWS.

The other issue I always struggle with when using AWS is cost. You never quite know how much you're going to pay at the end of the month. For small startups the difference between $500 and $1,000 is massive, whereas for larger organisations the difference between $15,000 and $20,000 is less of a problem.

I agree with this sentiment. AWS can be an overwhelming challenge for small teams.

Convox (YC S15) is addressing this head-on and helping lots of small teams immediately get the best out of AWS.

It's open source, built by a core team and a growing community of AWS specialists.

It fully automates AWS setup, deploys, and infrastructure updates so you and your team don't have to.


Disclaimer: I work at Convox full time

I work at a company producing a competing product, but I don't see why you were downvoted. Convox fills a particular niche for pure-AWS shops.

As with so many other things, it depends on what you want to do. If you just want to serve static web pages with high availability and relative immunity to large traffic surges, it's pretty hard to beat S3. A simple script will push any changes you make locally to your S3 bucket. Yeah, you do have to figure out how to write the script, but that is a write-and-forget task that's orders of magnitude less complexity than playing sysadmin for a dedicated server (also, unless you're made of money the dedicated server is going to have far less tolerance for traffic spikes).
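A minimal version of such a push script, assuming boto3 (the bucket and directory names are whatever you use; the content-type helper is plain stdlib):

```python
import mimetypes
from pathlib import Path

def guess_content_type(filename):
    """Content-Type S3 should serve for a file; default to binary."""
    ctype, _ = mimetypes.guess_type(filename)
    return ctype or "application/octet-stream"

def push_site(local_dir, bucket_name):
    # Hypothetical push script: upload every local file to the bucket,
    # keyed by its path relative to the site root.
    import boto3
    bucket = boto3.resource("s3").Bucket(bucket_name)
    for path in Path(local_dir).rglob("*"):
        if path.is_file():
            key = str(path.relative_to(local_dir))
            bucket.upload_file(
                str(path), key,
                ExtraArgs={"ContentType": guess_content_type(key)},
            )
```

Run it after each local edit and the bucket mirrors your working tree; that is the whole deploy story.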

I have some static pages that I used to host on a shared hosting service. I moved them to S3 years ago and haven't looked back. Zero sysadmin hassle, and the cost is actually less than I was paying for the shared hosts (which would invariably fall over if I got a link from a high-traffic site).

Edit: yes, at some traffic level it's going to become more economically efficient to run your own server, but that level is pretty damned high. Note how many gigs of S3 traffic you can purchase for the cost of one sysadmin salary (hint: a lot of them). And if you're doing it yourself (the "small team" that the article is talking about) your time is probably better spent working on your product than on spending your days monitoring security mailing lists and screwing around with apt-get or whatever.
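The back-of-envelope math, with assumed numbers (the salary figure is hypothetical; $0.09/GB is roughly the 2016 first-tier egress rate):

```python
# How much S3 outbound traffic does one sysadmin salary buy?
# Both figures below are assumptions for illustration.
SYSADMIN_SALARY = 100_000    # $/year, hypothetical
S3_TRANSFER_PER_GB = 0.09    # $/GB egress, roughly the 2016 rate

gb_per_salary = SYSADMIN_SALARY / S3_TRANSFER_PER_GB
# On these assumptions, over a million GB of traffic per year.
```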

I don't know much about AWS; I assume it's not that different from Azure, which I do know. If I were asked to create an architecture like that with a small team on Azure, I'd create an IoT hub, connect it to Stream Analytics, put my logic in there, and dump the results to a database (SQL or Hadoop or anything), adding some Power BI on top for nice charts and graphs that would be embedded in my service or app, all of it secured using Azure AD and monitored with Operations Manager.

That would be a total vendor lock-in architecture, but I'd never have to think about configuring Redis clusters, failover, backup, or anything. I would have no idea what technology stack is below it. It scales, and it just works. The only devops work would be to create a deployment script that automates a deployment from source control. The initial setup of the whole infrastructure would cost a few days maximum, and none of the hard operations problems would be mine. I'd click boxes like "Geo-redundant availability: yes/no".

I think that's the benefit of these expensive cloud services. If you do low level stuff yourself on expensive cloud providers like AWS or Azure, you are doing it wrong.

Rackspace Cloud tries to balance delivering the power of cloud with the simplicity and constructs of traditional infrastructure. It doesn't have the complexity of AWS, but that strength is also a weakness, since it lacks the features and sophistication of AWS. So, it depends on what is desired. The AWS complexity can absolutely be challenging for small teams. That's one of the reasons Rackspace created Fanatical Support for AWS (http://www.rackspace.com/aws). While AWS provides economies of infrastructure, Rackspace Fanatical AWS provides "economies of expertise" via management tools and hundreds of AWS devops engineers, architects, etc. The two services are very complementary, and even small teams can tap into expertise at scale to overcome the AWS complexity hurdle. One option worth considering for those who want the power of AWS and are concerned about complexity. [Full disclosure: I work for Rackspace. I helped build the Rackspace Cloud and now work on our Fanatical AWS team.]

From a technical perspective, I'm a moron compared to most of the people here.

Still, I was able to get my first app up and running on Elastic Beanstalk and a MySQL database up via RDS without much trouble. It's not rocket science.

Given the breadth and depth of available services, I can't imagine how you can declare "AWS is inappropriate for small startups" with a straight face.

Yep, second this. Company I work for uses S3, EC2 + Opsworks + VPC, RDS (aurora), elasticache, route53, cloudwatch (for logs) for our stack, I found it easy enough with the docs and various online guides.

I cannot understand how any half-competent startup dev team would find it difficult, most of it can be done with point and click UI, especially for setting up your basic load balancer + web servers + database stack.

AWS is fine as long as... you do not want to control it with shell scripts and without a large scripting language.

(No Perl, Python, Ruby, Go, etc.)

I tried this when I first experimented with AWS after I read the story behind it, i.e., the directive Bezos allegedly gave to disparate groups within Amazon to make their data stores accessible to each other.

The AWS documentation claimed everything could be controlled via HTTP. Great. I know HTTP. Sign me up.

I have no trouble interacting with servers via HTTP using the Bourne shell and UNIX utilities, without using large scripting languages. I have been doing so for many years.

But after a few hours trying to get AWS to work using UNIX it was such a PITA I gave up. And I do not give up easily.

But it turned out there were small errors in the documentation, so even if one followed their specification to the letter, things still would not work.

The Amazon developers in the help forums would just say use the Java programs they had written.

Of course AMZN had a "web interface" from Day 1. But I have little interest in another hosting company with a web GUI.

At the time all Amazon offered for anyone interested in the command line was Java. Installing OpenJDK and a hefty set of "Java command line tools" just to send HTTP requests? This did not inspire confidence.

Then came Python. Everyone loves AWS. How can anyone criticize it?

I concluded that if AWS was well designed (according to Bezos's alleged directive), then it would be possible to interact with it without having to use a large scripting language and various libraries.

I guess I am either too stupid or I set the bar too high.

AWS, as I understood it back then (before the massive growth), is a wonderful idea but I am not sure the implementation was/is as wonderful as the idea.

As someone who has to build things solo most of the time, AWS does not need a specialist.

Oh yes, to squeeze the most out of it, I am sure it does.

But you can get along with EC2, Route 53 etc. pretty fine.

Personally, I found the multiple terms, the (outdated) docs, and the stories of people being billed huge amounts pretty intimidating, but when it came down to it, it wasn't as scary as I'd made it out to be, and I've set things up at about half the cost of DigitalOcean and the others.

If you're in the tech startup game and find AWS too complicated, you have bigger problems.

We've been interviewing external "AWS specialist" firms to help with the complexity, review our architecture, leverage new AWS services, etc.

Any recommendations? Have spoken with LogicWorks, 2ndWatch and RackSpace so far.

I strongly recommend that you talk to the same Sean Hull who was referenced in the article. He consulted with us at OpenRoadMedia.com and he was extremely talented and knowledgeable.


Thx Lawrence. Appreciate it.

If using Ruby specifically I would highly recommend Reinteractive: https://reinteractive.net/

Though the boys there also have experience with many systems.

Disclaimer: Previously employed there.

What you need is some kind of abstraction layer so that you can expose only the functionality that non-technical users need exposed. Check out www.parkmycloud.com for example.

If I'm a one-man show, doing my own DevOps, what's my best option?

I'm technically proficient, but I want easy.

If your app can fit the model, then PaaS, including "serverless" is a fine fit that eliminates many of the classic DevOps challenges. With "serverless" there are still many operational problems, however, so I'd only suggest this for those comfortable being early-adopters.

For microservices architectures, having just gone through this, my choice was Google Cloud with Google Container Engine (Kubernetes). You build Docker images, deploy them, specify how those resources are connected, how many copies of each container should be deployed, etc.

I didn't have to mess with Terraform, Chef/Puppet/Salt/Ansible, deploying Kubernetes, Swarm, or any other container management system. Because it's just Kubernetes, it's easy enough to bring on-premise or deploy on AWS.

The main draw toward Amazon is all of the other services that they offer, of course. I see a lot of users of Kinesis, Redshift, etc. If you need them, then you need AWS, but deploying and managing your own apps brings many more barriers, IMHO.

I work at Pivotal, so I'm partial to Cloud Foundry.

My workflow for deploying a new app for my own projects is to cd into the source directory and type

    cf push
After a while I get tired of that, so I set up a Concourse pipeline and let it type `cf push` on my behalf, after going through the testing pipeline.

Interestingly, I used to be much more bullish on moving from the buildpacks model to a container model a la everyone else in the PaaS space. To the point that I presented a case internally that buildpacks should essentially be pushed aside in favour of that model.

But once I started actually writing Dockerfiles I changed my mind. It's a bloody inconvenience. Composition is not really A Thing in Docker, you can only stack. For all the talk about elegance (and it's elegant as an implementation), you're left with what is essentially a single inheritance hierarchy with no clean way to compose containers.

The ability to get your app running immediately is something Heroku pioneered and Cloud Foundry continued (although, OK, if you insist on hurting yourself that way, you can push docker images too). Once you hand off the frankly boring business of automatically building a container to the platform, containers as the unit-of-deployment lose all of their magic.

But again. I'm biased. I've only seen both alternatives up close in my day job and in my personal projects.

Google App Engine + their other no-ops services. Google is way ahead in this area.


The design in this article is barely using AWS at all, basically just using it for server hosting, which is causing them to do way more devops than needed. There are parts of this design that could much better leverage AWS and remove a lot of the devops. Sure, this means you're tying yourself to AWS, but you could re-implement on another system. The whole point of using something like AWS is to remove all of the "extra" work that is not central to your system. Things like redis cluster management, database management, etc. I didn't see what they were doing with mongo and whether they needed its access patterns. But I'll assume they do. So I'll keep that part of the design. The rest of the system would look like this in an AWS design:

Ingestion is either timed jobs in lambda or an elastic beanstalk set of processes doing the same. I'm not sure what the ingestion is so I can't say much here.

Data gets tossed on a kinesis queue. If possible have the data being ingested tossed on the kinesis queue. Hook lambda workers to the kinesis queue or an elastic beanstalk layer. Kinesis here removes the need to manage a redis cluster. One layer of devops removed.
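A minimal sketch of that producer step in Python. The stream name, the record shape, and the choice of partition key are assumptions for illustration; the actual `put_record` call (commented out) needs boto3 and AWS credentials:

```python
import json


def make_kinesis_record(event, stream_name="ingest-stream"):
    """Build the kwargs for a Kinesis put_record call.

    Using the event's 'id' as the partition key is an assumption;
    pick whatever field gives you a good shard distribution.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event["id"]),
    }


# In the ingestion process you'd then do something like:
#   import boto3
#   kinesis = boto3.client("kinesis")
#   kinesis.put_record(**make_kinesis_record({"id": 42, "payload": "..."}))
```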

Next you have the workers, the easiest is to toss them in lambda... they'll be invoked to read from the kinesis stream. They'll scale up and down. You don't need to run a supervisor, it's baked into the system. Removing lots of dev ops here, no complicated deploys, no cluster management, etc. If you need 2 different sets of workers you can set a split join set of kinesii... or I believe you can have 2 readers from a kinesis stream they just don't like it if you have lots.
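A sketch of such a Lambda worker. Kinesis-triggered invocations hand the function a batch of records whose data arrives base64-encoded; what you do with each decoded payload (here, just collecting and counting them) is the application-specific part:

```python
import base64
import json


def handler(event, context):
    """Lambda entry point for a Kinesis-triggered worker.

    'event' follows the Kinesis event structure: a 'Records' list where
    each record carries its payload under ['kinesis']['data'], base64-encoded.
    """
    payloads = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    # ... application-specific work: write payloads to mongo/RDS/dynamo ...
    return {"processed": len(payloads)}
```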

[Another variant is just a cluster of Elastic Beanstalk hosts scaling on the size of the kinesis stream]

They'll shove data into mongo db... you keep this because you want it. Otherwise use RDS or dynamo if it's suitable. I can't tell.

Of course you pay for some of this removal of dev ops. Some of it is free. Lambda limits some of your language choices and is mostly stateless, so you may hit external services harder for things like lookups.

But the code that is running on lambda can get moved into worker fleets managed by you later if you need more control.

The point here is that the interesting parts of your app become the actual code dealing with your specific problem, not the devops. Eventually, if you have to, you build up more devops and more custom solutions to meet the scale or price part of the problem. But the solution above will go bigger than most people need, and lets you avoid spending hundreds of hours of real engineer time on all that devops.

There are costs everywhere, money, time. Just decide how you want to use them.
