For example, say you have set up a bunch of auto-scaling instances to handle heavy load. In the middle of the night, your application suddenly goes viral in Indonesia or Brazil or some other large country in a different timezone from yours.
Your bill goes over its threshold - what does Amazon do? Do they terminate some of your instances (potentially losing production data!)?
The problem is that the exact times when "intelligent" bill-limiting features are likely to kick in are exactly the situations in which they could have an absolutely disastrous impact if they make a mistake.
I haven't experienced this myself, but I've heard Amazon is actually rather lenient about waiving fees for clear mistakes (accidentally provisioning an expensive instance that you didn't mean to) - their customer support is supposed to be very good.
 By "mistake", I mean that they are following your instructions correctly, but your instructions happen to be incorrect for handling this situation.
It's not terribly difficult to implement one's own bill limiting system using the API.
The right way to do it (currently) is to model the costs yourself and tie it in with your operational automation, then determine what your best course of action would be during a shitstorm. Generally, I would wager the best option is to replace your front page with a static html "overwhelming response" page, maybe with some light vector graphics.
If you have opted in for the "don't let me spend more than $x," then yes.
Create tiers of alerts that email / SMS additional people as the numbers go up, if you want to be very careful, etc.
AWS can't exactly do this for you, as explained by previous commenters, but they do give you the basic features you'd need for implementing it.
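As a rough illustration, here is a minimal sketch of what that could look like with boto3: it polls the EstimatedCharges metric that AWS publishes to CloudWatch in us-east-1 (once billing alerts are enabled) and notifies tiered SNS topics as spend crosses thresholds. The thresholds and topic ARNs are made up for the example; this is a sketch, not a turnkey system.

    # Minimal sketch: poll the estimated-charges metric and escalate alerts by tier.
    # Assumes billing metrics are enabled (they land in CloudWatch in us-east-1) and
    # that the SNS topic ARNs below exist -- all names/thresholds here are made up.
    import boto3
    from datetime import datetime, timedelta

    TIERS = [  # (monthly spend threshold in USD, SNS topic to notify)
        (500,  "arn:aws:sns:us-east-1:123456789012:billing-warn-devs"),
        (1000, "arn:aws:sns:us-east-1:123456789012:billing-warn-leads"),
        (5000, "arn:aws:sns:us-east-1:123456789012:billing-page-everyone"),
    ]

    def current_estimated_charges():
        cw = boto3.client("cloudwatch", region_name="us-east-1")
        resp = cw.get_metric_statistics(
            Namespace="AWS/Billing",
            MetricName="EstimatedCharges",
            Dimensions=[{"Name": "Currency", "Value": "USD"}],
            StartTime=datetime.utcnow() - timedelta(hours=12),
            EndTime=datetime.utcnow(),
            Period=3600,
            Statistics=["Maximum"],
        )
        points = sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
        return points[-1]["Maximum"] if points else 0.0

    def notify_tiers(charges):
        sns = boto3.client("sns", region_name="us-east-1")
        for threshold, topic in TIERS:
            if charges >= threshold:
                sns.publish(TopicArn=topic,
                            Subject=f"AWS bill estimate crossed ${threshold}",
                            Message=f"Current estimated charges: ${charges:,.2f}")

    if __name__ == "__main__":
        notify_tiers(current_estimated_charges())

From there, the top tier could call your operational automation (e.g. swap in the static "overwhelming response" page mentioned above) rather than terminating anything.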
AWS Costs & Usage Analytics
This is first-hand information from an AWS rep I once dated.
I'd assume that since it's a spike in costs, Amazon has more interest in keeping you as a customer, at your much lower typical usage, than in having you leave, even if you have a huge bill.
Obviously if this was a typical problem, you would need to rethink your architecture and infrastructure needs.
In either case, expending a few neural cycles prior to the action would be beneficial.
I'm all for being able to initiate billing inquiries online - in fact, I'm ditching Bank of America because I can't dispute charges online.
But in Amazon's case, this isn't a dispute; rather, it's asking them to forgive a (hopefully) one-time transgression, which is much easier to do by speaking to someone than by filling out a form.
I find it similar to haggling for a temporary reprieve in my fantastically high cable bill.
I think, though, that it's a failure of technologists to properly address the challenges of billing.
Amazon voided all charges without question.
Their level of customer support is off the charts. But AWS would never be my first pick to start something up. Your costs will be way, way lower going dedicated. Obviously you should be able to scale, but I think startups generally overestimate their success when it comes to that, and waste a lot of money on things they have no use for yet.
You could have your AWS deployment strategies ready for when the need arises.
(Having said that, AWS could really do a lot better with its notification targets than just email.)
And wow. Hell of a price drop.
> Unlike every other service Amazon provides, with Glacier you're not paying for your net usage throughout each month: Instead, you're paying for your peak hour — and once that peak hour has come and gone, you're still paying for that peak until the end of the month, no matter how low your usage might fall. 
Note: I still use Glacier for long-term backup, because that's the exact use case it's meant for. But it's important to be aware of that cost: if I ever need to retrieve that data, I know I need to do it very slowly (and carefully) to avoid racking up a massive bill.
A bit more detail in an older comment of mine: https://news.ycombinator.com/item?id=7147325
"Glacier is designed with the expectation that retrievals are infrequent and unusual, and data will be stored for extended periods of time. You can retrieve up to 5% of your average monthly storage (pro-rated daily) for free each month. If you choose to retrieve more than this amount of data in a month, you are charged a retrieval fee starting at $0.01 per gigabyte. Learn more. In addition, there is a pro-rated charge of $0.03 per gigabyte for items deleted prior to 90 days."
To store 1TB for a year in S3 is now $368.64
To store 1TB in Glacier for a year is $122.88
We're quibbling over ~$250/year?
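A quick sanity check of those figures (assuming the post-cut rates of $0.03/GB-month for S3 and $0.01/GB-month for Glacier):

    # Yearly cost to store 1 TB, assuming $0.03/GB-month (S3) and $0.01/GB-month (Glacier).
    gb = 1024
    s3 = 0.03 * gb * 12        # $368.64 / year
    glacier = 0.01 * gb * 12   # $122.88 / year
    print(f"S3: ${s3:.2f}  Glacier: ${glacier:.2f}  Difference: ${s3 - glacier:.2f}")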
I would quibble over 25k a year..
I can't download the data that fast anyway, so it is a total non-issue for us.
>Glacier access costs can get expensive depending on the quantity of data you need to access in a short period of time
Yeah, it is for backups. That's why we are talking about how we use it for backups.
>We're quibbling over ~$250/year?
Times the "few TB" we are talking about, that is over a$1k a year, as I said. We're not quibbling. I asked a simple question. Why would I take the time and effort to move my backups from a 1c/GB service to a 3c/GB service? There is no upside. "You can pay an extra $1000 for no reason, which is a great idea because $1000 isn't very much" is not very compelling.
Timing is everything. Sometimes the current market pricing for a commodity is too expensive to make your business viable today but in the future that will not be the case. Just depends how long that will take.
With replication to 3 data centres.
But even renting space by the 1U and leasing servers directly adds surprisingly little overhead.
400TB even at the Reduced Redundancy Storage pricing is ~$20k/month up until the price drop. With the price drop it looks like it'll be more like $9,300 with RRS, which is actually getting cheap enough that it's worth it in some cases, to avoid having anyone spend their time on managing anything. The gap will increase again once you factor in request pricing and bandwidth, though, as even my standard price-list prices for bandwidth would add up to less than the cheapest published bandwidth tier at Amazon, with just a tiny little bit of hassle (a BGP setup to multihome each site with 2-3 different transit providers with different coverage - this lowers cost by letting you pair a very cheap transit deal that doesn't give full access to the internet with a full-price, full-coverage one).
Or I could spend about $12k/month at Hetzner for the same storage + an aggregate stated "guaranteed" bandwidth of about 2Gbps per location at no extra cost. Again, without shopping around. The moment you do more than about 20-30TB transfer/month, Hetzner becomes cheaper than S3 with that setup, and probably at about 50TB+/month it'd take effort to beat them when buying my own transit too.
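A rough sketch of that break-even point, using the approximate figures from this comparison (~$9,300/month for 400 TB in S3 RRS after the cut, ~$12,000/month flat at Hetzner with the stated bandwidth included, and S3 egress assumed at roughly $0.12/GB at low volume, ignoring request pricing):

    # Rough break-even on outbound transfer between S3 RRS and a flat-rate dedicated setup.
    # All inputs are the approximate figures from the discussion above.
    s3_storage_monthly = 9300        # ~400 TB in S3 RRS after the price drop
    hetzner_monthly = 12000          # same storage, stated bandwidth included
    s3_egress_per_gb = 0.12          # assumed low-volume S3 transfer-out rate

    break_even_tb = (hetzner_monthly - s3_storage_monthly) / s3_egress_per_gb / 1024
    print(f"Dedicated becomes cheaper beyond ~{break_even_tb:.0f} TB of transfer per month")

That works out to roughly 22 TB/month, consistent with the 20-30 TB figure above.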
You can definitely beat those rates at large enough scale but for a startup you're talking about dedicating a non-trivial chunk of your skilled technical staff's time in the hope of breaking even. That's an obvious move for Backblaze or Crashplan but for other companies it might be a dubious investment versus nailing the core product.
From that, it looks like their loss over 26 months was ~$300,000. Their total AWS costs were less than $400,000 over the same period.
Downloading 1 GB = 4x the cost of the S3 storage at the reduced prices. If everything is viewed once [doubtful], that is 4x the price of the S3 storage. Given that, it would be impossible to reduce the total cost by even 50%, let alone the 75% they'd need to break approximately even.
I didn't see S3 broken out specifically anywhere which is why I'm just making an educated guess. :/
So basically at their peak, they had ~174TB S3 and ~47TB S3 RRS. That month cost them ~$16.2k, whereas after April 1 they could have put everything in S3 for ~$6.7k. That's a big difference, but they were still getting taken to the cleaners over some other services like RDS.
RRS: 2+3+6+8+9+18+19+22+27+34+42+48+7 = ~245 TB-months
Normal: 7+9+13+19+24+46+56+65+76+95+123+153+174+25 = ~885 TB-months
$0.055 * 885 * 1024 = $49,843
$0.044 * 245 * 1024 = $11,039
$49,843 + $11,039 = ~$60k saved if they had started with April 1st's pricing, which is about 20% of what they were short.
If you can reduce the AWS $400k by 40% due to the reduced pricing for EC2, you'd still fall short.
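Restating that arithmetic (the $0.055 and $0.044 figures are the approximate per-GB-month savings between the old and April 1st prices, and the ~$300k is the loss figure mentioned above):

    # Savings on S3 storage alone with April 1st's pricing, using the TB-month
    # totals and per-GB savings figures from the comment above.
    standard_tb_months = 885
    rrs_tb_months = 245
    savings_standard_per_gb = 0.055   # approx. old minus new price, standard S3
    savings_rrs_per_gb = 0.044        # approx. old minus new price, RRS

    saved = (standard_tb_months * savings_standard_per_gb +
             rrs_tb_months * savings_rrs_per_gb) * 1024
    print(f"~${saved:,.0f} saved, i.e. about {saved / 300_000:.0%} of the ~$300k loss")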
Are there any ready-to-deploy solutions for this?
Whereas software, if you're doing it right, should have much higher margins.
Sidenote: I'd be very curious to know what the "typical SaaS business" has as its gross margin, factoring in all sales against "cost of goods": servers, bandwidth, operations, etc.
Also, I wonder if EC2 is up for a price tweak soon too?
EDIT: Ah - https://news.ycombinator.com/item?id=7475284
My guess is that they are only using AWS for serving the site and handling the uploads, resizing etc. and they are using another provider for the storage of the images (or they are just directly hosting everything with Edgecast and not using them in Pull mode).
Like people have mentioned above, you can get a bunch of dedicated servers with heaps of storage for a fraction of the price of S3. You lose the flexibility and instant scaling, but the fact is these servers would be used more as a backup and would only be hit by Edgecast to upload them to the CDN.
I learned that April 1 jokes on the internet are often taken the wrong way even when there are ample reasons to realize it is a joke.
EDIT: Example: We use Akamai as our caching layer, as their bandwidth costs are much less than Cloudfront's (their analytics are better as well, and they don't charge for invalidations like Cloudfront does). We use S3 as the origin as it's a lower cost per GB than Akamai's NetStorage, and the S3 API is superior to work with compared to the interfaces exposed by NetStorage (scp/ftp/rsync).
Storage and Delivery should be considered separate, as there are benefits to it being a separate service. If you throw a CDN between your customers and S3 you'll not only get a significantly lower cost, you'll also give your customers much better performance.
If I am wrong about this (and I hope I am), where can I find more information about that?
If your cache gets deleted or expires, you'll have to pay to move the data (at S3 rates) to the CDN again when someone requests it.
I don't disagree that $0.12/GB could be considered expensive, but S3's competitors seem to be charging the same.
However, most networks have a sine-wave pattern of diurnal utilization. A common peak-to-average ratio would be 3:1. Our $0.0031 is effectively the rate for an average of 1 Mbps. To get that average rate we'll be paying about 3x, or $0.0093 per GB transferred. So we're paying $0.01 per GB for the transit alone. This is also in a major metro area in North America. Transit costs can easily be 3-10x in EU or 10-20x in AP.
On top of that there's capex and opex for that 10Gb port. Assuming you've got some decent scale, you're going to get 96 10Gb ports in a $300,000 to $500,000 router, or $3,000-5,000 per 10Gb port. You need ports both north and south, so you get 480 Gbps of throughput per chassis. But you can't run above 80% util, so call it 384 Gbps of peak throughput. Opex for rack position, data techs, power, etc. will cost $2,000 to $5,000 per month. Amortized over 36 months, let's call it $10,000 to $17,000 per month all up.
That 384 Gbps peak throughput is actually more like 128 Gbps averaged over the month. At 128 Gbps average we transfer about 41,472,000 GB per month, or $0.00024 to $0.00041 per GB. Now double that cost 'cause routers actually come in pairs for fault tolerance. So we're paying $0.0005 to $0.0008 per GB of transit connectivity.
And how many hops are there back to your application? I'd hazard around 6 by the time you go transit -> edge -> border -> core -> dc -> agg -> tor. Paying for that internal network infrastructure is going to add $0.0030 to $0.0050 to our bill. All up that gets us to $0.0128 to $0.0151 per GB transferred from our host to a third party network.
Expanding the error bars, let's round to $0.01 to $0.02 of network costs per GB that we deliver. To get these prices you're going to have a capital budget of tens of millions per year. OpEx would be at least $500K, or more likely a million plus, per month. If you can actually beat my pricing, I'm hiring and/or would like to subscribe to your newsletter.
Apologies if I missed something in the math. I had to do some mental acrobatics to go from dollars per gb/s to dollars per GB/month.
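For anyone who wants to check the unit conversion, here is the same arithmetic written out; all inputs are the assumptions stated above (roughly $1/Mbps transit, a 3:1 peak-to-average ratio, and a ~$13.5k/month amortized cost per router in the pair), not measured numbers.

    # The dollars-per-Mbps to dollars-per-GB conversion from the comment above.
    seconds_per_month = 86400 * 30

    # 1 Mbps sustained for a month moves ~324 GB, so $1/Mbps/month ~= $0.0031/GB.
    gb_per_mbps_month = 1e6 / 8 * seconds_per_month / 1e9   # ~324 GB
    transit_per_gb = 1.0 / gb_per_mbps_month                # ~$0.0031 at $1/Mbps

    # With a 3:1 peak-to-average ratio you pay for peak, so effective cost triples.
    effective_transit_per_gb = transit_per_gb * 3           # ~$0.0093

    # Router pair: 384 Gbps usable peak -> ~128 Gbps average -> GB moved per month.
    avg_gbps = 128
    gb_per_month = avg_gbps / 8 * seconds_per_month         # ~41.5 million GB
    router_cost_per_gb = 2 * 13500 / gb_per_month           # pair at ~$13.5k/mo each

    print(f"transit ~${effective_transit_per_gb:.4f}/GB, "
          f"routers ~${router_cost_per_gb:.4f}/GB")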
That is, our storage is expensive (10c per GB, per month, with the HN-new-customer discount) but our usage/transfer costs are zero.
ssh email@example.com s3cmd get s3://rsynctest/mscdex.exe
AWS users will end up with the same problem in time. I wonder why the bandwidth costs are so stubborn - does the cost price per traffic unit really never come down, or is someone (some people) in the infrastructure layer(s) screwing us?
It's the same way you pay a fixed hourly rate (in most cases) for your retail electricity, and your provider is paying wholesale spot market rates that can fluctuate in ~5 minute windows.
TL;DR You're paying for consistency and abstraction away from the underlying IP transport costs.
My understanding is that S3 is near indestructible from a serving perspective, but it may be computationally expensive for AWS to scale up/scale down for bursty/peaking outbound traffic serving needs.
Another possibility is that Amazon doesn't charge for inbound, but charges a premium for outbound to balance their traffic ratios so they can peer with providers vs having to buy their transit. The free inbound traffic to S3 offsets their outbound traffic not only from AWS, but their consumer-facing web properties.
As always, just my assumptions/observations.
I have no idea why you are picking on Hetzner in particular. They offer clearly defined bandwidth packages with their servers (20-50 TB) and charge for overages (0.2 cents per GB).
Hetzner is perhaps not what you would call a premium provider, but their prices are about in line with industry wholesale rates. For example, a very high quality provider that competes with Hetzner in their home market charges 0.4 cents per GB, and they are happy to deliver any traffic you can serve at that price point.
Then they are not "other options", that's the point.
Webhostingtalk is a good resource for the curious.
With similar pricing to Amazon. Are you not reading any of the replies you've gotten?
There are cloud providers with cheaper prices than Amazon. Please feel free to read up on the alternatives.
The whole point of this exchange has been to point out that wholesale prices are far less than what Amazon charges. Some cloud providers are more prone to follow the wholesale cost level and others are not.
If you want to discuss alternatives to Amazon then first you have to define what cloud features you want/need.
But I did ~50 TB of traffic per month at various hosts for <$20 without any issue. Just choose them well, and know exactly what kind of connection they offer.
I've seen Internet Archive and Backblaze estimates that indicate $60-100/TB-storage installation and $7/TB-year in power. If drives last 5 years, we could expect a commodity cloud-storage price around $20/TB-year? One order of magnitude to go.
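The arithmetic behind that estimate (hardware cost amortized over the drive lifetime, plus power; all inputs are the rough figures quoted above):

    # Rough commodity storage cost per TB-year: amortized hardware plus power.
    # Inputs are the rough Internet Archive / Backblaze-style figures quoted above.
    install_per_tb = (60, 100)     # one-off hardware cost, $/TB
    power_per_tb_year = 7          # $/TB-year
    drive_life_years = 5

    low = install_per_tb[0] / drive_life_years + power_per_tb_year    # ~$19/TB-year
    high = install_per_tb[1] / drive_life_years + power_per_tb_year   # ~$27/TB-year
    print(f"~${low:.0f}-{high:.0f}/TB-year, vs. ~$369/TB-year for S3 at $0.03/GB-month")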
Backblaze, for example, has two gigabit ethernet ports for their new 180TB pod. That means, even assuming no overhead, copying 1 POD to another will take ~9 days! Even were they to add additional/faster ports/interconnects, the 45 drives in a POD are still sitting on PCIe 2.0 x8, so with a theoretical throughput of 4GB/s, that's still half a day. (Rough numbers worked out below.)
Now, I do not know how many drives Amazon/Google load into their S3/GCS servers, but my (dayjob) live storage servers rarely have more than 4-8 drives each. Even if we assume 8, that's a 5-6 fold increase in theoretical throughput - and THAT is why S3/GCS will never be in the same pricing zone as Backblaze.
S3/GCS, even Backupsy, DO, etc. are all "live" storage solutions, and it seems most are converging to ~3 cents/GB-month. Backblaze is "archival" storage, and will (probably) continue to be cheaper by a factor of 5-10 (or it won't be worth the hassle).
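Working out the pod-copy times mentioned a couple of paragraphs up (180 TB pod, two 1 Gbps ports, PCIe 2.0 x8 at ~4 GB/s; pure arithmetic with no protocol overhead):

    # Time to copy one 180 TB storage pod over its network ports vs. its PCIe bus.
    # Figures are the ones quoted above; no protocol overhead assumed.
    pod_bytes = 180 * 1e12

    net_bytes_per_s = 2 * 1e9 / 8          # two 1 Gbps ports -> 0.25 GB/s
    pcie_bytes_per_s = 4e9                 # PCIe 2.0 x8 theoretical ~4 GB/s

    print(f"Over the network: ~{pod_bytes / net_bytes_per_s / 86400:.1f} days")
    print(f"Over PCIe:        ~{pod_bytes / pcie_bytes_per_s / 3600:.1f} hours")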
Currently I am only using S3 for hosting PDFs and other large files which are downloadable by the public, since the bandwidth is pay-as-you-go.
However, for regular storage, Dropbox/GDrive are still the primary choice, due to the fact that the data syncs across devices. Speaking of sync, what is holding Amazon back from a similar app for S3?
I believe I can solve the former with S3, though I still don't know of a turnkey solution. The latter is not quite there, but this update brings it a lot closer. At this point it would cost me $15/month or $180/year. That's not terrible, but for that price I can easily have two 2TB WD Red drives. The box to house them would cost just a bit more, since I need hardly any horsepower, just enough to run ZFS.
Glacier is a more attractive option, but the pricing for transferring data out of it is so complex that I'd be looking at taking months to restore everything just to avoid paying top dollar for it.
The GD client / back-end and large-file handling seems to be a long way behind the alternatives, sadly.
I don't think this is relevant for you if you have less than a few Terabytes of data.
It requires some technical chops to setup, but I doubt that would be a problem for you (or other people on HN).
Probably ... 2x the price of S3, given the cost of data transfer which, for us, is zero.
It may not be for you, but some folks find phone support, 7d+4w snapshots, straight-up, native unix interoperability and 2 free physical delivery events per year to be compelling.
the $12,000 hammer reinvented!!
"Because the AWS GovCloud (US) Region is physically and logically accessible by US persons only and also supports FIPS 140-2 compliant end points, customers can manage more heavily regulated data in AWS while remaining compliant with federal requirements."
I propose someone makes a subsection of Hacker News dedicated solely to jokes or funny stuff.
Or someone create a new site... named.... Hacker-News-Chan?