
AWS mistakes to avoid - hellomichibye
https://cloudonaut.io/5-aws-mistakes-you-should-avoid/
======
ddevault
Full disclosure: I work for Linode and I am super biased. This is my own
opinion etc etc

Perhaps the first AWS mistake you might make is... using AWS? Even before I
started at Linode, I thought it was terrible. It's extremely, unreasonably
pricey. The UI is terrible. Their offerings are available elsewhere. I started
MediaCrush, a now-defunct media hosting website, on AWS. After a while, we
switched to dedicated hosting (really sexy servers). We were looking at $250 a
month and scaled up to millions of visitors per day! I ran our bandwidth and
CPU usage and such numbers through the AWS price calculator a while ago - over
$20,000 per month. AWS is a racket. It seems to me like the easiest way to
burn through your new startup's seed money real fast.

Edit: not trying to sell you on Linode, just disclosing that I work there.
There are lots of options, just do the research before you reach for AWS.

~~~
nicobn
Since you're dealing in absolutes, let me return the favor; you're being a
little silly.

Yes, Linode beats AWS when it comes to price. On the other hand, Linode's
offer is incredibly basic and simplistic. AWS offers service after service
that Linode simply doesn't and realistically cannot.

Sure, you can emulate a subset of these services (and a subset of their
features) using open-source software but at what price? That's the major flaw
in your reasoning. Getting to the point where your installations are as stable
and reliable as AWS', given a large stress on the system, will cost you a lot
of money and time. Directly comparing the cost of hardware access is ignoring
other costs and headaches that are not very easy to estimate.

There's a market for a service like Linode and there's a market for a service
like AWS. You've simply never worked on a project/system that works better on
AWS than on Linode. I know I couldn't run the systems that I currently operate
on Linode without multiplying the workload that is needed to maintain them.

Linode and AWS are competitors but there's space in the market for both; they
simply fill different niches. Establishing one as absolutely superior to the
other is silly and closed-minded. A lot of people chose AWS; go and ask them
why (feel free to reach out to me at nick at nasx dot io - I'll be more than
happy to talk to you).

~~~
Sir_Cmpwn
I'm really not trying to compare Linode to AWS here. I just felt that it was
necessary to mention that I work for Linode because there _is_ some overlap.
I'm really also not trying to sell anyone on Linode in particular. I'm
pointing out that from my own personal experience, AWS is grossly overpriced
and worse than the alternatives for a large number of use-cases.

~~~
Naomarik
AWS is insanely overpriced. It's weird to me that everyone is jumping on AWS
when some of its customers never come close to attracting the internet
traffic that would justify setting up CloudFormation. I think if you're
starting a webapp out of your own pocket, without millions of dollars from
investors, it makes sense to save where you can.

Using locust.io, I've seen that my current site on two $10/month Linodes can
scale up to approximately 300k people/day and increasing that substantially
just means I press a button and upgrade my app/database servers.

If it came to a point where I was growing at a pace I didn't want to manage
and money was flowing in, and I was out of ideas on software optimization,
only then would I consider spending tens of thousands/month on AWS.

I'm not denying the great benefits AWS gives, I honestly would love to use it
now and just be done with most of my devop headaches, but the costs are
prohibitive.

~~~
albertoleal
What are some alternatives to AWS? Are there realistic alternatives?

~~~
jasode
The "alternatives" depend on your situation.

Picture a continuum between brain-dead simple websites and business-critical
complex websites:

    
    
      simple:  static website, WordPress blog
      moderate:  small business CMS, etc
      complex:  Netflix, AirBNB
    

If you're running a simple WordPress blog, the AWS prices are absurdly
overkill. For this use-case, there are a zillion alternatives. Linode, Digital
Ocean, Rackspace bare metal, etc, etc.

On the other end of the spectrum, you want to run a high-availability website
with failover across multiple regions like Netflix. You need the value-added
"services" of a comprehensive cloud provider (the "_I_" and "_S_" in
"_IaaS_" as in "_Infrastructure Services_"). For that scenario, there are
currently 4 big competitors: AWS, MS Azure, Google Compute Cloud, and IBM
SoftLayer. However, many observers see that Google and IBM are not keeping
pace with AWS and Azure on features so at the moment, it's more of a 2 horse
race than a 4.

Keep in mind that the vast majority of cost comparisons showing AWS to be
overpriced are based on comparing Amazon's EC2 vs bare metal. The EC2
component is a small part of the complete AWS portfolio.[1] If you're doing
more complicated websites, you have to _include the costs_ of Linux admins +
devops programmers to reinvent what AWS has out of the box. (The non-EC2
services.) Even if you use OpenStack as a baseline for a "homegrown AWS",
you'll still need extensive staffing to configure and customize it for your
needs. It may very well turn out that homegrown on Linode is cheaper but most
articles on the web do not have quality cost analysis on the more complicated
business scenarios. Anecdotes yes! But comprehensive unbiased spreadsheets
with realistic cost comparisons?!? No.

[1][https://aws.amazon.com/products/](https://aws.amazon.com/products/)

~~~
randomflavor
Digitalocean is a great alternative for simple things.

I use it for dev / early projects and as things get complex or need more
redundancy I make the production spend on AWS.

------
kevindeasis
You know what I've realized is really important? More AWS tutorials are
really needed. There are numerous new programmers who want to learn AWS but
can't finish building anything because they get buried in documentation.

I find there are a lot of high-level abstracted tutorials, but for the new
services, there aren't a lot of detailed tutorials.

For instance, implementing a cognito->gateway->lambda->dynamodb pipeline is
really hard for a newbie to do.

~~~
tedmiston
I definitely agree that better tutorials and/or a simpler interface would
make AWS more accessible and user friendly.

A YC S15 startup, Convox ([http://convox.com/](http://convox.com/)), aims to
"make AWS as easy as using Heroku." It looks really promising.

~~~
nzoschke
Convox member here. Thanks for the shout out.

This is definitely a goal of Convox: to remove as much AWS complexity as
possible.

Our approach matches this guide to a tee. We are using CloudFormation to set
up a private app cluster, as well as to create and update (deploy) apps. We
are also using ASGs.

The instance utilization point is spot on too. The first thing Convox does to
make this easy is a single command to resize your cluster safely (no app
downtime).

Coming next is monitoring of ECS and CloudWatch, plus Slack notifications if
we detect over- or under-utilization.

I strongly believe that these AWS best practices can and should be available
for everyone. For anyone starting from scratch or migrating apps off a
platform or EC2 Classic onto "modern" AWS.

~~~
davidbarker
I hadn't heard of Convox before, but it sounds interesting.

I'd like to use AWS more, but each time I tried to get into it I felt
overwhelmed. I currently use PagodaBox a lot, which is great (most of the
time) because it handles a lot of the complexity for me, but it can often be
expensive. How does Convox compare to PagodaBox?

~~~
nzoschke
I've never used PagodaBox but it looks like a nice PaaS.

Convox has the same goal of a PaaS: to give you and your team an easy way to
focus on your code and never worry about your infrastructure.

One big difference with Convox is that we accomplish this with single-tenant
AWS things. You and your team's deployment target is an isolated VPC, ECS (EC2
container service), and ELB (load balancers).

If you're asking for a cost comparison, we're building Convox to be extremely
cost competitive by unlocking AWS resource costs for everyone.

It's easiest to compare the cost of memory across platforms, though not always
apples to apples...

The base Convox recommendation is 3 t2.smalls which is 6 GB of memory which
costs about $100 / month. If your app can be sliced up into 512 MB processes,
you can easily run 10 processes, which could be 2 to 5 medium traffic PHP apps
on the cluster.

I'm finding PagodaBox pricing calculator a bit confusing but 6 512 MB
processes, so 3 GB of memory, is $189.
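
To make the memory comparison concrete, here is a rough sketch of the arithmetic using only the figures quoted above (3 t2.smalls with 6 GB for about $100/month, against 3 GB on PagodaBox for $189). The function name is purely illustrative, and the prices are the approximate ones from this comment, not current list prices.

```python
def cost_per_gb(monthly_cost_usd: float, memory_gb: float) -> float:
    """Monthly cost in USD per GB of memory."""
    return monthly_cost_usd / memory_gb


# Figures quoted in the comment above (approximate, not current list prices).
convox_cluster = cost_per_gb(monthly_cost_usd=100, memory_gb=6)  # 3 x t2.small
pagodabox = cost_per_gb(monthly_cost_usd=189, memory_gb=3)       # 6 x 512 MB

# Upper bound on 512 MB processes in 6 GB, before OS and agent overhead.
max_processes = (6 * 1024) // 512

print(f"Convox cluster: ${convox_cluster:.2f} per GB-month")
print(f"PagodaBox:      ${pagodabox:.2f} per GB-month")
print(f"512 MB slots in 6 GB: {max_processes}")
```

By these numbers the cluster works out to roughly $17 per GB-month against $63, and the 10 processes mentioned above leave about 1 GB of headroom out of the 12 theoretical slots.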

~~~
tedmiston
So eventually Convox will have to make money, but I'm not sure I see the
path... given that I have free access to the software and the infra is
provided by AWS.

Do you plan to eventually charge a monthly fee for using the command-line
tool?

~~~
nzoschke
Thanks for your question. A much more thorough pricing page is in the works.

The most straightforward model, and where we are already making some money, is
running Convox as a managed service.

In this setup you and your team get Convox API keys. Convox installs, runs and
updates everything for you in our accounts. You get a monthly bill that's your
AWS resource costs plus a percentage to Convox for management.

We will be tweaking this model to sell packages so bills are really easy to
understand.

Some other experiments we're doing...

We sell support packages and professional services for app setup, migration
and custom feature development.

We have a per-seat model for productivity features. Private GitHub repos and
Slack integrations are $19 / user / month. There are more closed SaaS tools
like this coming.

Infra is trending to commodity prices industry wide.

We'll be selling SLAs, support, productivity tools on top of that infra.

You'll get a cutting edge private platform without hiring and managing your
own devops team to build and maintain it.

Open source users will help grow the user base and make the platform better
without us running a freemium platform.

------
narsil
_> There is no reason - beside manually managed infrastructure - to not
decrease the instance size (number of machines or c3.xlarge to c3.large) if
you realize that your EC2 instances are underutilized._

CPU/Memory aren't the only measures of underutilization. If you require high
instantaneous bandwidth throughput, then the networking capacity available to
your instance roughly increases with the size of your instance. This includes
both EBS as well as other Network traffic.

Table with Low/Medium/High:
[https://aws.amazon.com/ec2/instance-types/#instance-type-mat...](https://aws.amazon.com/ec2/instance-types/#instance-type-matrix)

Example benchmark with c3 instances:
[http://blog.flux7.com/blogs/benchmarks/benchmarking-network-...](http://blog.flux7.com/blogs/benchmarks/benchmarking-network-performance-analysis-of-c3-instances-using-iperf-tool)

If you're more concerned with just EBS network throughput, check out the table
on this page instead:
[https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-ec2-...](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-ec2-config.html)

~~~
thebigjc
This is super important - I've run a number of 'oversized' instances just to
get better networking. AFAIK none of the cloud providers offer a way to create
a 'network' optimized instance, though containerized deployments should help
with under utilizing the rest of the instance.

~~~
zbjornson
This was a big issue for me too. Google Compute Engine seems to have no
throttling like AWS does, or at least it's a much higher cap. On an
n1-standard-1 I routinely get 250+ MB/s, and near 1 GB/s for n1-standard-4 and
larger. The only high bandwidth AWS instances I've found are the few expensive
ones that spec 10 Gbit.

~~~
vgt
According to this page, you can get ~14Gbit networking on Google Cloud's
n1-standard-8

[http://googlecloudplatform.blogspot.com/2015/11/bringing-you...](http://googlecloudplatform.blogspot.com/2015/11/bringing-you-more-flexibility-and-better-Cloud-Networking-performance-GA-of-HTTPS-Load-Balancing-and-Akamai-joins-CDN-Interconnect.html)

------
elwell
I'd like to add, if using Elastic Beanstalk, don't directly attach an RDS
instance when creating the environment. If you do, you won't be able to
destroy your environment without also deleting the RDS instance. Instead,
create the RDS instance separately, and just add the proper security group for
the environment to be able to access the host. Then you can easily create a
new eb environment with any config changes (there are some config changes you
can't make to an eb environment without creating a new one from scratch) and
then connect to your existing db.

~~~
softdev12
interesting. i was under the impression you could just create a snapshot
(before destroying the environment) and then restore as needed. am i missing
something?

Edit:
[http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_R...](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html)

~~~
sdroy
If you can take the database instance offline and if you do not need to access
it from outside of the beanstalk app then that is fine.

------
misiti3780
Another, somewhat obvious one:

Be extremely careful when using public customized AMIs: a lot of the time
~/.ssh/authorized_keys contains public keys, and this is obviously a huge
security problem.
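
A quick way to audit this on a freshly launched instance is to list what is actually in the file. A minimal sketch, assuming the standard OpenSSH `authorized_keys` layout (optional options, key type, base64 key, optional comment); `list_authorized_keys` is a hypothetical helper name, not part of any existing tool, and the whitespace split is naive about options that contain quoted spaces.

```python
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def list_authorized_keys(text: str) -> list[tuple[str, str]]:
    """Return (key_type, comment) for each key entry in authorized_keys content."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tokens = line.split()
        for i, token in enumerate(tokens):
            if token.startswith(KEY_TYPE_PREFIXES):
                # tokens[i + 1] is the base64 key; anything after it is the comment
                comment = " ".join(tokens[i + 2:])
                entries.append((token, comment or "(no comment)"))
                break
    return entries


sample = (
    "# keys left behind by the image author\n"
    "ssh-rsa AAAAB3Nza user@host\n"
    'from="10.0.0.1" ssh-ed25519 AAAAC3\n'
)
for key_type, comment in list_authorized_keys(sample):
    print(key_type, comment)
```

On a real instance you would feed it the contents of each user's `~/.ssh/authorized_keys` and remove any entry you cannot account for.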

~~~
kbar13
no, public keys are not a security problem.

~~~
joseph
They are if you don't know who has the corresponding private key.

~~~
kbar13
oh right when using public AMIs. I thought parent meant publishing AMIs. derp.

------
peterwaller
I like CloudFormation. Unfortunately it is very unwieldy to write
CloudFormation templates directly, and we're not about to start using the AWS
CFN GUI editor!

It seems like the assembly language of the AWS ecosystem.

Does anyone else have a favourite hammer for this particular nail? I'd love to
have something better than our home-baked solution, but I'm yet to find
anything which doesn't introduce other flaws, such as an incomplete
implementation (missing parameters or resource types) or ultimately making a
leaky abstraction on top of CloudFormation somehow.

I ended up brewing a reasonably straightforward solution using Python as a
(minimal) DSL which emits JSON. Its primary purpose is to support _the whole_
of the CFN ecosystem (not just implement some small part of EC2, for instance)
while also not trying to be too clever.

It has about 50-100 lines of python which implements helper functions such as
ref(), join() and load_user_data(), and not many other things. There is an
almost 1-to-1 correspondence between the generated CFN configuration and the
python source. As a bonus it checks for a few common mistakes like broken refs
or parameters which aren't used.

I have heard that similar solutions have been reinvented in a few places,
including the BBC. But I'm yet to see a good public solution!
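
For readers wondering what such a thin layer might look like, here is a minimal sketch. It is not the code described above: `ref()` and `join()` are reconstructed from the description only, `find_broken_refs()` and `find_unused_parameters()` are hypothetical names for the checks mentioned, and the S3 bucket is just a placeholder resource.

```python
import json


def ref(name):
    """CloudFormation reference to a parameter or resource by logical name."""
    return {"Ref": name}


def join(delimiter, *parts):
    """CloudFormation Fn::Join over the given parts."""
    return {"Fn::Join": [delimiter, list(parts)]}


def find_refs(node):
    """Yield every logical name used in a {"Ref": ...} anywhere in the template."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str):
                yield value
            else:
                yield from find_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_refs(item)


def find_broken_refs(template):
    """Names that are referenced but not declared as a parameter or resource."""
    declared = set(template.get("Parameters", {})) | set(template.get("Resources", {}))
    used = set(find_refs(template))
    # Pseudo parameters such as AWS::Region are always available.
    return sorted(n for n in used if n not in declared and not n.startswith("AWS::"))


def find_unused_parameters(template):
    """Parameters that are declared but never referenced."""
    return sorted(set(template.get("Parameters", {})) - set(find_refs(template)))


template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {"BucketSuffix": {"Type": "String"}},
    "Resources": {
        "LogBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": join("-", "logs", ref("BucketSuffix"))},
        }
    },
}

print(json.dumps(template, indent=2))
```

The point of keeping it this small is that the Python stays almost 1-to-1 with the emitted JSON, so every resource type and property is available without the wrapper needing to know about it.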

~~~
andystanton
If you'd consider something other than CloudFormation, there is also
Hashicorp's Terraform. It has an AWS provider
([https://terraform.io/docs/providers/aws/index.html](https://terraform.io/docs/providers/aws/index.html))
which creates resources and maintains the state in a file that you can store
in version control
([https://terraform.io/docs/state/index.html](https://terraform.io/docs/state/index.html)).

~~~
eropple
Terraform, as an idea, is brilliant. Mitchell and company isolated a hugely
important need and tried to fill it, and I give them all the credit in the
world for that. Cross-platform cloud provisioning? Gimme. But I cannot in good
conscience not relate what a disastrous experience Terraform has been for me
at both jobs and clients.

Writing reusable code in Terraform is an exercise in frustration due to the
extreme clumsiness of HCL (which, I understand, was used because "YAML is
complicated"--well, that's true, but YAML isn't a good solution either,
you're _HashiCorp_, you wrote _Vagrant_, you already know how to do this!).
The application architecture is reckless and full of race conditions; your
state _will_ be hosed if one resource errors out at the wrong time, while
other resources are being successfully updated--the resources that return
successfully after the failed resource will on many occasions fail to be
persisted to state. What's more, application testing seems to be at best an
afterthought: there have been regressions in the providers that will break
your _existing_ states.

I would under no circumstances use Terraform if I didn't have clients who had
selected it before I was working with them. If in AWS, I would use
CloudFormation, with a tool like Cfer[1] (which is excellent, reliable code)
or SparkleFormation[2] (which is more full-featured but I hope you never need
to debug it) to provision my stuff.

(Full disclosure: I'm building a much, much better provisioner for multi-
provider cloud infrastructure. Neither of the projects I recommend is mine;
mine's not done yet.)

[1] -
[https://github.com/seanedwards/cfer](https://github.com/seanedwards/cfer)

[2] - [http://www.sparkleformation.io/](http://www.sparkleformation.io/)

~~~
jacques_chester
If you're writing your own, you might also look to BOSH[1] for inspiration.

It's older than CloudFormation and Terraform (born 2010). It can manage
anything that someone's written a driver for. So far that includes AWS, Azure,
vSphere/vCloud, OpenStack, VirtualBox, Google Compute Engine, Apache
CloudStack and there might be others I missed.

It stores state in a database. It is able to recover from mismatches between
the state of the world and the desired state. Cloud Foundry users have been
using it for years to deploy and update CF installations. Pivotal Web Services
(I work for Pivotal, in a different division) has been upgrading to the most
recent CF release every few weeks, live, without much fuss, for years.

For any kind of heavily stateful infrastructure, BOSH is a strong candidate.

[1] [https://bosh.io/](https://bosh.io/)

~~~
eropple
Augh, how did bosh slip my mind? I've never used it in production, but I've
used it to roll out a CF environment for testing and was impressed to dig into
it a little more (most of a year ago now, I think your mention of CF was
actually what kicked that off). From (admittedly limited) experience I'm not
crazy about its developer-facing feel, but I appreciate the significant and
_responsible_ effort in it.

------
alttab
Also, don't use Dynamo unless you have a really good reason. "It scales better
than MySQL" is not a good reason.

Have fun migrating data and re-indexing constantly!

~~~
greenleafjacob
Are local secondary indices and lower operational costs than Cassandra good
reasons?

~~~
wsh91
Operational costs are in the eye of the beholder. We evaluated DynamoDB for a
new use case but we're going with Cassandra because the storage costs alone
more than made up for ops overhead. At least IMO, Cassandra's support for
multicolumn range queries (via clustering columns) and the cheaper storage you
can use obviate local secondary indices. (Cassandra has secondary indices as
well but it seems like most folks prefer further denormalization--at least,
that's what we're doing.)

------
nodesocket
My huge recommendation is to put production instances in a completely separate
region from development and staging instances. I actually just discovered that
you can limit IAM API keys to a specific region, you just need to create a
custom policy. The following policy is an example:

    
    
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SOME-ID-HERE",
                "Effect": "Allow",
                "Action": [
                    "ec2:*"
                ],
                "Condition": {
                    "StringEquals": {
                        "ec2:Region": "us-west-2"
                    }
                },
                "Resource": [
                    "*"
                ]
            }
        ]
    }
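
If you need the same policy for several regions, it can be generated rather than hand-edited. A small sketch in Python that emits the same document for a given region; the function name and the default `Sid` value are placeholders, and only `ec2:*` actions are covered, as in the example above.

```python
import json


def ec2_region_policy(region: str, sid: str = "AllowEc2InOneRegion") -> dict:
    """Build an IAM policy document allowing EC2 actions only in one region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Action": ["ec2:*"],
                "Resource": ["*"],
                # Requests are only allowed when they target this region.
                "Condition": {"StringEquals": {"ec2:Region": region}},
            }
        ],
    }


print(json.dumps(ec2_region_policy("us-west-2"), indent=4))
```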

~~~
eropple
Why would you do this instead of using multiple AWS accounts? Different
regions have different feature sets (available instance types, beta
eligibility, etc.). I strongly recommend using multiple AWS accounts instead
and keeping them in the same region.

That said, since you should be using an infrastructure provisioning tool like
CloudFormation, the tagging solution should not be a particularly big
obstacle.

~~~
kbar13
yes, use multiple accounts. you can use STS to grant permissions between the
accounts if needed.

~~~
nodesocket
STS works, but I feel like multiple accounts linked is more of a hack than
using a single AWS account. Of course, if you use a service that is not
available in both regions, use STS.

~~~
hueving
How are multiple accounts a hack? It's the correct solution to isolation. Using
regions is just stupid because now you've attached extra meaning to regions
and can't bring up production stuff in other regions for better latency.

------
dantiberian
The biggest mistake I've seen with AWS (and committed myself), is not reading
the manuals for the services you're using. While some people complain about
the AWS manuals not being complete, there is still a lot of good information
in there that you might miss if you're just clicking through the console.

------
tomglindmeier
I wonder if AWS is a good location to run a VoIP server. VoIP is a real time
application that is very prone to jitter, latency and packet loss. I'm
concerned about "noisy neighbors" and decreased network performance at AWS.

Does anybody have experience with running a VoIP server (e.g. Asterisk) on AWS?

~~~
emerongi
I sell an application built around VoIP. I dislike running my own servers, so
just lately I moved them to AWS.

At first, I used instances that were not very performant, causing some
problems. I quickly moved to more powerful instances and since then there have
been no problems at all. There is a slight delay, which is normal, since the
servers are not in the same country as the users anymore, but the users
haven't noticed it at all.

I can't say how well it scales, though. I have a small user-base, so scaling
has not been a problem yet. Network performance might become a problem if you
have a huge user-base.

~~~
ant6n
How can you be sure that your users haven't noticed?

------
andrioni
I second the recommendation to use CloudFormation + packer + ELBs + auto
scaling groups for web applications whenever possible, it just makes
everything so easy and automatic. Of course, there's a learning curve and you
pay a premium for all that automation, but in my experience it has been
usually worth it so far.

~~~
ZacharyPitts
I like going one step further and making deployments and AWS resources into
reusable code using Troposphere:
[https://github.com/cloudtools/troposphere](https://github.com/cloudtools/troposphere)

------
sergiotapia
I wonder when Amazon is going to invest their ungodly earnings into UX. Their
UI is absolutely terrible and I shudder every time I have to log into it. Is
it purposely built that way to confuse people who don't belong in there?

~~~
mattbillenstein
I've heard this a few times -- AWS is a very complex product that gives you a
lot of freedom in how you build your product using it. I think the web UI
works about as well as it could given that complexity.

If you want something simpler, you have Heroku and a bunch of similar things
which make a bunch of decisions for you -- but you don't have the flexibility
there that you do with AWS of course.

~~~
sergiotapia
There has to be some middle ground between AWS and Heroku though.

~~~
mattbillenstein
There are some options, but the point is, what you're trading for simpler UX
is a simpler overall product with less flexibility.

------
buremba
Use OpsWorks if possible. It's free and provides a simple interface that
allows you to deploy/upgrade your apps automatically and monitors your
instances automatically using CloudWatch.

~~~
wsh91
+1. AWS has done an amazing job with OpsWorks. It's free Chef 12. What's not
to love? (The only drawback I've noticed so far is some standard Chef stuff--
vault in particular--not working. Other than that, very pleased.)

------
rgawdzik
Check out [http://convox.com/](http://convox.com/) (YC S15) for a way to
avoid manual infrastructure.

~~~
nzoschke
Convox co-founder here. Thanks for the shout out!

That's exactly right, avoid manual infrastructure.

Someday we will all have something like Rails for infrastructure. Strong
conventions around best practices.

If you follow these conventions you can avoid bespoke or manual configurations
and focus solely on your app logic.

We're building Convox to advance this goal.

------
cachemiss
So I'd modify these a bit. We run a very large AWS infrastructure as an
engineering team (no dedicated ops).

1\. Use CloudFormation only for infrastructure that largely doesn't change,
like VPCs, subnets, internet gateways, etc. Do not use it for your instances,
databases, etc. I can't recommend that enough; you'll get into a place where
updating them is risky. We have a regional migration (like database
migrations) that runs in each region we deploy to and sets up ASG, RDS, etc.
It gives us control over how things change, e.g. if we need to change a launch
configuration.

2\. Use auto-scaling groups in your stateless front ends that don't have
really bursty loads, it isn't responsive enough for really sharp spikes
(though not much is). Otherwise do your own cluster management if you can
(though you should probably default to autoscaling if you can't make a strong
case not to use it).

3\. Use different accounts for dev / qa / prod etc. Not just different
regions. Force yourself to put in the correct automation to bootstrap yourself
into a new account / region (we run in 5 regions in prod, and 3 in qa, and
having automation is a lifesaver).

4\. Don't use ip addresses for things if you can help it, just create a
private hosted zone in Route53 and map it that way.

5\. Use instance roles, and in dev force devs to put their credentials in a
place where they get picked up by the provider chain, don't get into a place
where you are copying creds everywhere, assume they'll get picked up from the
environment.

6\. Don't use DynamoDB (or any non-relational store) until you have to (even
though it is great), RDS is a great service and you should stick with it as
long as you can (you can make it scale a long way with the correct
architecture and bumping instance sizes is easy). IMO a relational store is
more flexible than others since you (at least with postgres) get transactional
guarantees on DDL operations, so it makes it easier to build in correct
migration logic.

6\. If you are using cloudformation, use troposphere:
[https://github.com/cloudtools/troposphere](https://github.com/cloudtools/troposphere)

7\. Understand what instances need internet access and which ones don't, so
you can either give them public ips, or put in a NAT. Sometimes security teams
get grumpy (for good reason) when you open up machines to the internet that
don't need to be, even if it's just outbound.

8\. Set up ELB logging, and pay attention to CloudTrail.

9\. We use CloudWatch Logs, it has its warts (and it's a bit expensive), but
it's better than a lot of the infrastructure you see out there (we don't
generally index our logs, we just need them to be able to be viewed in a
browser and exported for grep). It's also easy to get started with, just make
sure your date formats are correct.

10\. By default, stripe yourself across AZs if possible (and it's almost always
possible). Don't leave it for later, take the pain up front, you'll be happy
about it later.

11\. Don't try to be multi-region at first if you can avoid it; just replicate
your infrastructure into different regions (other than users / accounts etc.).
People get hung up on being able to flip back and forth between regions, and
it's usually not necessary.

edit: Track everything in cloudwatch, everything.
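
To illustrate the "stripe yourself across AZs" point, here is a toy sketch of round-robin placement. In practice an auto-scaling group spanning several subnets balances instances across zones for you; the instance names and the helper below are made up for the example.

```python
from collections import Counter
from itertools import cycle


def stripe_across_azs(instance_names, azs):
    """Assign each instance to an availability zone in round-robin order."""
    if not azs:
        raise ValueError("at least one availability zone is required")
    return dict(zip(instance_names, cycle(azs)))


placement = stripe_across_azs(
    ["web-1", "web-2", "web-3", "web-4", "web-5"],
    ["us-west-2a", "us-west-2b", "us-west-2c"],
)
per_az = Counter(placement.values())

print(placement)
print(per_az)
```

With this layout, losing any single zone takes out at most two of the five instances rather than all of them.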

~~~
chucky_z
Have you looked at Terraform? I have _everything_ defined (unless it doesn't
work very well... which still happens as it's still under heavy development),
and if it needs to be dealt with care (e.g.: core EC2 instances) I'm slowly
filtering it into a separate set of units/variables, and setting the "static"
infrastructure (VPC) as a downstream group, and slurping the statefile
upstream as to not potentially damage anything when playing nice with
deploying/redeploying EC2 instances.

------
vitoc
I work in a team that uses quite a lot of AWS for various reasons beyond cost.
We also use various other clouds, depending on the needs of a specific project
or situation. Our experience with cloud is that it is a journey. Obviously,
coupled with DevOps tooling, it has allowed us to deal with environment
requirements at the speed of software, i.e. Infrastructure as Code and all.
One thing we find ourselves doing over and over is changing the
infrastructure, either because requirements change, because new cloud services
are launched that are useful to us, or simply because we find more efficient
ways of running cloud resources (Trusted Advisor helps).

We built a tool called Liquid Sky
([https://liquidsky.singtel-labs.com](https://liquidsky.singtel-labs.com)) to
help us keep track of the cost impact of the changes we constantly make. I did
mention that we use cloud for reasons beyond cost, but we definitely still
want to know that we’re being sensible and maximising cost efficiency as well;
it’s just another (important) factor. Because we change our cloud resources so
frequently, we didn’t want a very rigid process for dealing with the
sensibilities of cloud cost. Hence, we built Liquid Sky in a way that gives
our engineers the freedom to explore better ways of running things on the
cloud while keeping cost in check and keeping the team (including cost
guardians) in the loop.

------
fibo
Nice article. I have been working in a small company (Beintoo) since September
2015 and we use AWS here. I think it is a very interesting set of products and
you can build any kind of business on it. The advice given in the article is
really useful; in fact I will apply it at my workplace.

As for comparison with other services, I previously worked at Deloitte
Analytics, managing the cloud services provided by IBM SoftLayer. You cannot
compare them: AWS offers much more, and I was not really satisfied with
SoftLayer. For example, I had a problem with a network upgrade they did in
January 2014 and I lost a lot of data, with poor support to restore it.
Also the starting price of $25 per month is really expensive. AWS is far more
mature and interesting.

For my own servers I use CloudAtCost because it is cheaper, but if I ran a
business I would certainly go with AWS. If you are making money, it is not
that expensive, and if you stick with Amazon's advice and philosophy it is
very reliable.

------
tedmiston
I'll add one that was especially common for people coming off the year of free
tier a couple years ago. I'm not sure that AWS has changed it yet.

6\. Starting a box/instance/database and forgetting it's running until you
receive the bill after your free tier expires.

~~~
molecule
I expected to see something like this generalized in the article as

6\. Not setting up billing alarms,
[http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/...](http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/free-tier-alarms.html)
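
As a rough sketch of what such an alarm looks like in code, the dictionary below holds the parameters for CloudWatch's `PutMetricAlarm` call on the `EstimatedCharges` billing metric. The alarm name, threshold and SNS topic ARN are placeholders. Billing metrics are published in us-east-1, and only once billing alerts are enabled for the account. Assuming boto3 is installed, the dict could be passed as `boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)`; that call is left out here so the snippet runs without credentials.

```python
def billing_alarm_params(threshold_usd: float, sns_topic_arn: str) -> dict:
    """Parameters for a CloudWatch alarm on the account's estimated charges."""
    return {
        "AlarmName": f"estimated-charges-over-{threshold_usd:g}-usd",
        "AlarmDescription": "Estimated AWS charges exceeded the threshold",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # seconds; billing data only updates a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # placeholder SNS topic to notify
    }


params = billing_alarm_params(10, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
print(params["AlarmName"])
```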

------
kennu
I warmly recommend the Serverless framework for building basic web
applications on AWS. It handles CloudFormation details for you, but lets you
customize them if needed. Not suitable for every possible app though.

------
overgard
This might be the wrong place to ask, but I'm curious how people feel Azure
stacks up to AWS? The services seem comparable (maybe even nicer), but I'm
unclear how it compares on cost.

------
ap22213
I don't get all the complaints about AWS prices. I just spun up a 60 node
Spark cluster with over 5TB of memory, processed 100B data records, and spent
$5.50!

------
nikolay
There's a strong push toward Terraform, which I personally dislike because of
its many opinionated features and the marketing push to cover as many services
as possible rather than do one thing (AWS) and do it great. As an alternative,
you can look at Bazaarvoice's CloudFormation Ruby DSL [0].

[0]: [https://github.com/bazaarvoice/cloudformation-ruby-
dsl](https://github.com/bazaarvoice/cloudformation-ruby-dsl)

------
matdrewin
Personally, I never quite got the appeal of EC2. You're basically replicating
what you would be doing on physical servers anyway. The real productivity
gains come from using a fully managed PaaS (Heroku, Azure Web Apps, Google App
Engine, OpenShift etc.) where there is no maintenance, and scaling and
redundancy are taken care of for you. Granted, those are even more expensive,
but they are the only ones that provide any kind of value.

------
voltagex_
I find IAM particularly difficult to use - I feel like there should be a
button to create a user/group that can do only X, Y, Z. I realise policy
templates get most of the way there but I still had to go and read the syntax
for them because DescribeRegions wasn't in the list I needed.

I'm also not sure how to make the jump from exporting AWS_ACCESS_KEY_ID to
having my instances automatically request the permissions they need - STS?

~~~
inopinatus
If your code is using the AWS SDK then you don't need to export an access key
for your running code. Just create an EC2 Service Role in IAM with the policy
you want attached to the role; then launch the instance with that role, to get
automatic temporary credentials.

Refs:
[http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use...](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-
role-ec2.html) [http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-
roles...](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-
amazon-ec2.html)
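
The two policy documents behind such a role can be sketched as plain Python dictionaries. This is a minimal sketch under assumptions: the helper names `ec2_trust_policy` and `allow_only` are hypothetical, and the action list is whatever your code actually needs (for example the `ec2:DescribeRegions` action mentioned in the question above).

```python
import json


def ec2_trust_policy():
    """Trust policy: lets the EC2 service assume the role for an instance."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def allow_only(actions, resources=("*",)):
    """Permission policy that allows only the listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),      # sorted for stable output
                "Resource": list(resources),
            }
        ],
    }


def to_json(policy):
    """Serialize a policy document the way IAM expects it (a JSON string)."""
    return json.dumps(policy, indent=2)
```

The trust policy is what you supply when creating the role, the permission policy is what you attach to it, and the instance is then launched with that role so the SDK picks up temporary credentials on its own.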

------
siddharth_mal
I'd like to add two more:

    
    
          1) Not giving out your access and secret keys in scripts/buckets.
          2) Always using IAM roles with your EC2

~~~
misiti3780
how do you avoid 1 - it seems impossible ?

~~~
mhluongo
IAM roles let you assign temporary credentials to machines running scripts.
The machine can then hit an internal AWS URL to get the temporary credentials.
Many tools know to look for these credentials by default, e.g. boto checks for
credentials in environment variables, config files, and the machine's IAM role.
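
The lookup order described above can be illustrated with a simplified credential chain. This is a sketch of the idea only, not boto's actual implementation: all function names are hypothetical, and the instance-role step is passed in as a callable instead of performing the real HTTP request against the metadata service (whose well-known address is noted in a comment).

```python
import configparser

# On EC2 the role credentials are served by the instance metadata service at
# http://169.254.169.254/latest/meta-data/iam/security-credentials/<role-name>
# This sketch does not call it; the caller supplies a function that would.


def from_environment(env):
    """First choice: explicit keys in environment variables."""
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return {"source": "environment", "access_key": key, "secret_key": secret}
    return None


def from_config_file(path, profile="default"):
    """Second choice: a credentials file such as ~/.aws/credentials."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently ignored
    if not parser.has_section(profile):
        return None
    key = parser[profile].get("aws_access_key_id")
    secret = parser[profile].get("aws_secret_access_key")
    if key and secret:
        return {"source": "config-file", "access_key": key, "secret_key": secret}
    return None


def resolve_credentials(env, config_path, fetch_role_credentials):
    """Try each provider in order and return the first credentials found."""
    providers = (
        lambda: from_environment(env),
        lambda: from_config_file(config_path),
        fetch_role_credentials,  # last resort: the machine's IAM role
    )
    for provider in providers:
        credentials = provider()
        if credentials:
            return credentials
    return None
```

On an instance with no exported keys and no credentials file, the chain falls through to the role, which is why scripts there need no keys embedded in them.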

~~~
idunno246
And there are a few tools to emulate the metadata service locally if you need
it on dev laptops, so that code there uses a role as if it were on a server.

------
ninjay
What are the current ways to make creating CloudFormation templates not so
painful?

~~~
loki77
You should check out troposphere
([https://github.com/cloudtools/troposphere/](https://github.com/cloudtools/troposphere/))
and stacker
([https://github.com/remind101/stacker/](https://github.com/remind101/stacker/)).
I'm a maintainer of both - we try to make CF easier by catching errors
earlier, and allowing you to do things like write for loops, in troposphere.
stacker tries to take your troposphere templates and tie them together as
totally separate stacks.
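
The "write for loops and catch errors earlier" idea can be shown without any library. The sketch below is plain Python that emits CloudFormation JSON; it does not use troposphere's API, and `build_template` and the bucket names are hypothetical.

```python
import json


def build_template(bucket_names):
    """Generate a CloudFormation template with one S3 bucket per name."""
    resources = {}
    for name in bucket_names:
        # Derive a logical ID such as "UserUploadsBucket" from "user-uploads".
        logical_id = name.title().replace("-", "") + "Bucket"
        # CloudFormation logical IDs must be alphanumeric; fail here, before
        # the template is ever submitted, instead of at stack creation time.
        if not logical_id.isalnum():
            raise ValueError(f"cannot derive a logical ID from {name!r}")
        if logical_id in resources:
            raise ValueError(f"duplicate logical ID {logical_id}")
        resources[logical_id] = {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": name},
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


def render(template):
    """Serialize the template to the JSON text CloudFormation accepts."""
    return json.dumps(template, indent=2, sort_keys=True)
```

Generating the template from a list keeps repeated resources consistent, and the validation step surfaces naming mistakes locally rather than in a failed stack update.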

------
benmanns
Has anyone tried the Trusted Advisor feature out? Have you found it worth the
3-10% on top of existing monthly usage?

~~~
wsh91
If you already have a support plan, it's free. So yes, I have. Nothing useful
so far, but I plan to keep checking it periodically in case I'm off my game.

------
diziet
Something major is missing:

Running On-Demand instances instead of Reserved

~~~
hrez
Or running Reserved instead of On-Demand. It's all about use cases.
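
Which side of that trade-off you are on comes down to a break-even calculation. The sketch below assumes a pricing model where a reservation costs an upfront fee plus an hourly fee charged for every hour of the term, while On-Demand is billed only for hours actually used. The function name and any prices you feed it are hypothetical, not real AWS rates.

```python
def break_even_utilization(on_demand_hourly, reserved_upfront,
                           reserved_hourly, term_hours=8760):
    """Fraction of the term an instance must run for Reserved to be cheaper.

    Assumes the reserved hourly fee is paid for every hour of the term,
    whether or not the instance is running.
    """
    reserved_total = reserved_upfront + reserved_hourly * term_hours
    break_even_hours = reserved_total / on_demand_hourly
    return break_even_hours / term_hours


def cheaper_option(expected_utilization, **prices):
    """Return "reserved" or "on-demand" for an expected utilization (0..1)."""
    if expected_utilization > break_even_utilization(**prices):
        return "reserved"
    return "on-demand"
```

With made-up prices of $0.10 per hour On-Demand versus $0.06 per hour reserved and no upfront fee, the break-even sits at about 60% utilization: an always-on server favours Reserved, a box that runs a few hours a day favours On-Demand.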

------
vacri
> _There is no reason why you should manage your infrastructure manually. It's
> unprofessional! It's a mess!_

Nonsense. CloudFormation has its issues. It takes time to learn and
implement. The templates can break, requiring the stack to be destroyed and
remade. In the sample in the article, the database is in the same template as
everything else - what fun that will be when an update breaks the template and
you have to reapply the stack (which destroys the existing database).

Cloudformation is good, but it comes with caveats, and the idea that you
should _only_ manage an AWS stack with CF is utter tripe. As with everything,
it depends on your use case.

Also weird is the article's insistence on using autoscaling groups to monitor
single instances. Why not just monitor them directly with CloudWatch?

> _There is no reason - beside manually managed infrastructure - to not
> decrease the instance size (number of machines or c3.xlarge to c3.large) if
> you realize that your EC2 instances are underutilized._

This is wrong, too. Autoscaling takes time to scale up, and it scales up in
stages. If you get sudden traffic, autoscaling can take too long. Again, it's
about knowing your use case. Unfortunately for us, we can get sudden traffic
when one of our clients does a media release and they don't tell us ahead of
time. The five or so minutes it takes for instances to trigger the warning,
start up a new set, and then attach these to a load balancer is too long for
this particular use case, so we just have to run with a certain amount of
excess capacity.

Autoscaling is awesome, but this article is _way_ too didactic in its No True
Scotsman approach.
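
The scale-up lag argument above can be made concrete with a little arithmetic. This is a deliberately simplified model, assuming autoscaling adds a fixed number of instances per stage and each stage takes a fixed number of minutes; real behaviour depends on the scaling policy, cooldowns, and boot time. The function names and all numbers are hypothetical.

```python
import math


def minutes_until_capacity(current, needed, instances_per_step, minutes_per_step):
    """Minutes of staged scaling before the fleet reaches `needed` instances."""
    if needed <= current:
        return 0
    steps = math.ceil((needed - current) / instances_per_step)
    return steps * minutes_per_step


def baseline_for_deadline(needed, instances_per_step, minutes_per_step,
                          max_wait_minutes):
    """Smallest always-on fleet that can reach `needed` within the deadline."""
    steps_allowed = max_wait_minutes // minutes_per_step
    return max(0, needed - steps_allowed * instances_per_step)
```

For example, if a surprise spike needs 10 instances, each stage adds 2, and a stage takes 5 minutes, a fleet of 4 needs 15 minutes to catch up. If you can tolerate no wait at all, the baseline has to be the full 10, which is the excess capacity described above.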

~~~
Rapzid
Cloudformation can really be a harsh mistress. I feel I'm constantly
discovering good reasons for not including certain sets of resources in the
same templates for reasons you and others touch on. And nested stacks? You
know, never say never.. Probably never again.

------
ommunist
I attended an AWS workshop several years ago, and there I realized that this
vast ecosystem of infrastructure services requires a popularisation effort of
similar scale. And not just in plain English. IAM in those days was not very
clearly understandable, search supported only ASCII and was not really
documented, and now the ecosystem is an order of magnitude larger and more
complex. Efforts like cloudonaut's should be greatly appreciated. For the
greater public good. Thank you, man!

