
Beware that retrieval fee!

The retrieval fee for 3TB could be as high as $22,082 based on my reading of their FAQ [1].

It's not clear to me how they calculate the hourly retrieval rate. Is it based on how fast you download the data once it's available, how much data you request divided by how long it takes them to retrieve it (3.5-4.5 hours), or the size of the archives you request for retrieval in a given hour?

This last case seems most plausible to me [6] -- that the retrieval rate is based solely on the rate of your requests.

In that case, the math would work as follows:

After uploading 3TB (3 * 2^40 bytes) as a single archive, your retrieval allowance would be 153.6 GB/mo (3TB * 5%), or 5.12 GB/day (3TB * 5% / 30). Assuming this one retrieval was the only retrieval of the day, and as it's a single archive you can't break it into smaller pieces, your billable peak hourly retrieval would be 3072 GB - 5.12 GB = 3066.88 GB.

Thus your retrieval fee would be 3066.88 * 720 * .01 = $22,081.54 (719x your monthly storage fee).
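For anyone who wants to plug in their own numbers, here is a minimal sketch of that arithmetic. It assumes the request-rate reading described above (the whole archive counts as retrieved within one hour), which the FAQ does not confirm; the function and constant names are made up for illustration.

```python
# Sketch of the "request rate" reading of the FAQ (an assumption, not confirmed billing logic).
HOURS_PER_MONTH = 720          # 30 days * 24 hours, as used in the comment
FEE_PER_GB = 0.01              # dollars per GB of billable peak hourly retrieval
FREE_MONTHLY_FRACTION = 0.05   # 5% of stored data may be retrieved free each month
DAYS_PER_MONTH = 30


def request_rate_fee(stored_gb: float, archive_gb: float) -> float:
    """Fee if the entire archive is billed as a single hour's retrieval."""
    daily_allowance_gb = stored_gb * FREE_MONTHLY_FRACTION / DAYS_PER_MONTH
    billable_peak_gb = max(0.0, archive_gb - daily_allowance_gb)
    return billable_peak_gb * HOURS_PER_MONTH * FEE_PER_GB


fee = request_rate_fee(stored_gb=3072, archive_gb=3072)
print(f"${fee:,.2f}")
```

With 3072 GB stored as one archive this reproduces the roughly $22k upper bound; a different billing interpretation would change `billable_peak_gb`, not the rest.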

That would be a wake-up call for someone just doing some testing.


[1] http://aws.amazon.com/glacier/faqs/#How_will_I_be_charged_wh...

[2] After paying that fee, you might be reminded of S4: http://www.supersimplestorageservice.com/

[3] How do you think this interacts with AWS Export? It seems that AWS Export would maximize your financial pain by making retrieval requests at an extraordinarily fast rate.

[(edit) 4] Once you make a retrieval request the data is only available for 24 hours. So even in the best case, that they charge you based on how long it takes you to download it (and you're careful to throttle accurately), the charge would be $920 ($0.2995/GB) -- that's the lower bound here. Which is better, of course, but I wouldn't rely on it until they clarify how they calculate. My calculations above represent an upper bound ("as high as"). Also note that they charge separately for bandwidth out of AWS ($368.52 in this case).

[(edit) 5] Answering an objection below, I looked at the docs and it doesn't appear that you can make a ranged retrieval request. It appears you have to grab an entire archive at once. You can make a ranged GET request, but that only helps if they charge based on the download rate and not based on the request rate.

[(edit) 6] I think charging this way is more plausible because they incur their cost during the retrieval regardless of whether or how fast you download the result during the 24 hour period it's available to you (retrieval is the dominant expense, not internal network bandwidth). As for the other alternative, charging based on how long it takes them to retrieve it would seem odd as you have no control over that.

Former S3 employee here. I was on my way out of the company just after the storage engineering work was completed, before they had finalized the API design and pricing structure, so my POV may be slightly out of date, but I will say this: they're out to replace tape. No more custom build-outs with temperature-controlled rooms of tapes and robots and costly tech support.

If you're not an Iron Mountain customer, this product probably isn't for you. It wasn't built to back up your family photos and music collection.

Regarding other questions about transfer rates - using something like AWS Import/Export will have a limited impact. While the link between your device and the service will be much fatter, the reason Glacier is so cheap is because of the custom hardware. They've optimized for low-power, low-speed, which will lead to increased cost savings due to both energy savings and increased drive life. I'm not sure how much detail I can go into, but I will say that they've contracted a major hardware manufacturer to create custom low-RPM (and therefore low-power) hard drives that can programmatically be spun down. These custom HDs are put in custom racks with custom logic boards all designed to be very low-power. The upper limit of how much I/O they can perform is surprisingly low - only so many drives can be spun up to full speed on a given rack. I'm not sure how they stripe their data, so the perceived throughput may be higher based on parallel retrievals across racks, but if they're using the same erasure coding strategy that S3 uses, and writing those fragments sequentially, it doesn't matter - you'll still have to wait for the last usable fragment to be read.

I think this will be a definite game-changer for enterprise customers. Hopefully the rest of us will benefit indirectly - as large S3 customers move archival data to Glacier, S3 costs could go down.

I wasn't holding my breath, but I was thinking there's a possibility they were using short-stroking to speed up most of their systems' hard drives by making a quarantined, barely touched Glacier zone on the inside of their drives: https://plus.google.com/113218107235105855584/posts/Lck3MX2G...

My backup "wouldn't it be cool if" is, unlike the above reasonableness, a joke: imagining 108 USB hard drives chained to a poor PandaBoard ES, running a fistful at a time: https://plus.google.com/113218107235105855584/posts/BJUJUVBh...

The Marvell ARM chipsets at least have SATA built in, but I'm not sure if you can keep chaining out port expanders ad infinitum the same way you can with USB. ;)

Thanks so much for your words. I'm nearly certain the custom logic boards you mention are done with far more vision, panache, and big-scale bottom-line foresight than these ideas; even some CPLD multiplexers hot-swapping drives would be a sizable power win over SATA port expanders and USB hubs. Check out the port expanders on OpenCompute Vault 1.0, and their burly aluminium heat sinks: https://www.facebook.com/photo.php?fbid=10151285070574606...

That would definitely be cool. Pretty unlikely, however. When it comes to hardware, they like to keep each service's resources separate. While a given box or rack may handle many internal services, they're usually centered around a particular public service. S3 has their racks, EC2 has theirs, etc. Beyond the obvious benefit of determinism - knowing S3 traffic won't impact Glacier's hardware life, being able to plan for peak for a given service, etc - I'm guessing there are also internal business reasons. Keeping each service's resources separate allows them to audit costs from both internal and external customers.

Then there are failure conditions. EBS is an S3 customer. Glacier is an S3 customer. Some amount of isolation is desirable. If a bad code checkin from an S3 engineer causes a systemic error that takes down a DC, it would be nice if only S3 were impacted.

I probably shouldn't go into the hardware design (because 1) I'm not an expert and 2) I don't think they've given any public talks on it), but it's some of the cooler stuff I've seen, especially when it came to temperature control.

It wasn't built to back up your family photos and music collection.

But at its price points, with most US families living under pretty nasty data cap or overage regimes, it sounds superb, with of course the appropriate front ends.

There's no good (reliable), easy and cheap way to store digital movies, e.g. DVD recordable media is small by today's standards and it's much worse than CD-Rs for data retention (haven't been following Blu-ray recordable media, I must confess, I bought an LTO drive instead, but I'm of course unusual). And the last time I checked very few people made a point of buying the most reliable media of any of these formats.

In case of disk failure, fire, tornado (http://www.ancell-ent.com/1715_Rex_Ave_127B_Joplin/images/ ... and rsync.net helped save the day), for this use case you don't care about quick recovery so much as knowing your data is safe (hopefully AWS has been careful enough about common mode failures) and knowing you can eventually get it all back. Plus a clever front end will allow for some prioritizing.

An important rule learned from Clayton Christensen's study of disruptive innovations (where the hardest data comes from the history of disk drives...) is that you, or rather AWS here, can't predict how your stuff will be used. So if they're pricing it according to their costs, as you imply, they're doing the right thing. Me, I've got a few thousand Taiyo Yuden CD-Rs whose data is probably going to find a second home on Glacier.

ADDED: Normal CDs can rot, and getting them replaced after a disaster is a colossal pain even if your insurance company is the best in the US (USAA ... and I'm speaking from experience, with a 400+ line item claim that could have been 10 times as bad since most of my media losses were to limited water problems), so this is also a good solution for backing them up. Will have to think about DVDs....

Not saying it couldn't be very useful for the data-heavy consumer. Just guessing this is more of an enterprise play than anything else. :)

Very possibly, but who knows; per the above on disruptive innovations, Amazon almost certainly doesn't.

I personally don't have a feel for enterprise archival requirements (vs. backups), but I do know there are a whole lot of grandparents out there with indifferently stored digital media of their grand-kids (I know two in particular :-); the right middlemen plus a perception of enough permanent losses of the irreplaceable "precious moments" and AWS might see some serious business from this in the long term.

The math doesn't come close to replacing tape - basically once you go north of 100 terabytes (just two containers - at my prior company we had 140 containers in rotation with Iron Mountain) Glacier doesn't make financial or logistical sense. Far cheaper and faster to send your LTO-5 tapes via driver.

It may not make sense today. Amazon is notorious for betting on the far future. They're also raising the bar on what archival data storage services could offer. When you ship your bits to Amazon, they're in 3+ DCs, and available programmatically.

Separate from the play for replacing tape, there's also the ecosystem strategy. When you run large portions of your business using Amazon's services, you tend to generate a lot of data that ends up needing to be purged, else your storage bill goes through the roof. S3's Lifecycle Policy feature is a hint at the direction they want you to go - keep your data, just put it somewhere cheaper.

This could also be the case where they think they're going after tape, but end up filling some other, unforeseen need. S3 itself was originally designed as an internal service for saving and retrieving software configuration files. They thought it would be a wonder if they managed to store more than a few GB of data. Now look at it. They're handling 500k+ requests per second, and you can, at your leisure, upload a 5 TB object, no prob.

But maybe you're right. The thing could fail. Too expensive. After all, 512k ought to be enough for anybody.

Thanks very much for the insight - what you are saying actually makes a lot of sense in the context of systems inside the AWS ecosystem. After all, they need to archive data as well. Also - my 140 container example w/Iron Mountain was Pre-versioning and always-online differential backups. We basically had a complex tower-of-hanoi that let us recover data from a week, a month, six months, and then every year (going back seven years) from all of our servers. (And, by Year seven, when we started rotating some of the old tapes back in - they were a generation older than any of our existing tape drives. :-)

Clearly, with on-line differential backups - you might be able to do things more intelligently.

I'm already looking forward to using Glacier, but, for the foreseeable future, it looks like the "High End" archiving will be owned by Tape. And, just as Glacier will (eventually) make sense for >100 Terabyte Archives, I suspect Tape Density will increase, and then "High End" archiving will be measured in Petabytes.

Thanks again.

Have you considered the cost of the tape loaders? Our loaders cost significantly more over their lifetime than the storage costs of the tapes themselves.

The tradeoffs will be different depending on how many tapes you write and how often you reuse them.

Until I took over backups, and instituted a rotation methodology, the guy prior to me just bought another 60 AIT-3 tapes every month and shipped them off site to Iron Mountain.

Agreed - how often you re-use tapes (and whether you do) has a dramatic effect on the "system cost" of your backup system.

This was interesting to wake up to this morning ...

Right now we sell 10TB blocks for $9500/year[1].

This works out to 7.9 cents/GB, per month, so 7.9x the glacier pricing. However, our pricing model is much simpler, as there is no charge at all for bandwidth/transfer/usage/"gets", so the 7.9 cents is it.

7.9x is a big multiplier. OTOH, users of these 10TB blocks get two free instances of physical data delivery (mailing disks, etc.) per year, as well as honest to god 24/7 hotline support. And there's no integration - it just works right out of the box on any unix-based system.

We had this same kind of morning a few years ago when S3 was first announced, and always kind of worried about the "gdrive" rumors that circulated on and off for 4 years there...

2013 will be interesting :)

[1] http://www.rsync.net/products/pricing.html

I've spent several hours reading about this and talking with colleagues, reading the (really great) HN threads on the topic and doing a bunch of math - and I've come to the conclusion that rsync.net/backblaze/tarsnap/crashplan probably don't have too much to worry about for _most_ use cases.

The wonky pricing on retrieval makes this inordinately complex to price out for the average consumer who will be doing restores of large amounts of data.

The lack of easy consumer flexibility for restores also is problematic for the use case of "Help, I've lost my 150 GB Aperture Library / 1 TB Hard Drive"

The 4 Hour retrieval time makes it a non-starter for those of us who frequently recover files (sometimes from a different machine) off the website.

The cost is too much for >50 Terabyte Archives - Those users will likely be doing multi-site Iron Mountain backups on LTO-5 Tapes. After 100 Terabytes, the cost of the drives is quickly amortized and ROI on the tapes is measured in about a month.

The new business model that Amazon may have created overnight, though, and that beats everyone on price and convenience, is "Off-Site Archiving of low-volume, low-value Documents" - Think Family Pictures. Your average shutterbug probably has on the order of 50 GBytes of photos (give or take) - is it worth $6/year for them to keep a safe offline archive of them? Every single one of those people should be signing up for the first software package that gives them a nice consumer-friendly GUI to back up their Picasa/iPhoto/Aperture/Lightroom photo library.

Let's all learn a lesson from [Edit Mat, one t] Honan.

Online backup for my photos and other data was my initial thought, but I'm afraid it would cost too much to do a restore. If I store 3 TB of photos/documents/etc for 2 years, then have a house fire (local backup destroyed), I want to be able to restore my data to my new computer as quickly as my Internet connection will let me, and I don't want to be stuck with a huge bill for retrieval on top of all the other expenses relevant to such a disaster. AWS should make the monthly retrieval allowance roll over and accumulate from month to month, so that I can do occasional large retrievals.

re: "I want to be able to restore my data to my new computer as quickly as my Internet connection will let me"

Really? Why? If you have say 10 years of home pictures/movies, and you know they are 100% safe in Amazon Glacier, why do you need them all on your new computer as fast as possible? I don't understand why it's such a rush.

If it's a rush, you pay the fee. If you can afford to wait a month or two or three to get all the data back for free, you trickle your pics/movies back to your new computer one day at a time.

It seems Amazon charges by the peak hour, so if you can throttle your retrieval so that it takes 3 or 4 days to get the data back, the fee would be a lot less.

A 5 GB per hour download would cost $36 for the month. You could download your entire 3TB files in less than a month for $36. So I don't think that's a crazy fee when your computer was destroyed by a fire...

To get your data back in a week requires 17GB per hour, which is $128. Not unreasonable either considering the urgency and the circumstances.
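A rough sketch of the throttling math above, for trying other restore windows. It assumes what the comment appears to assume: the fee is the peak hourly rate times 720 hours times $0.01/GB, the free allowance is ignored, and 3TB is taken as 3000 GB. The helper names are hypothetical.

```python
HOURS_PER_MONTH = 720   # 30 days * 24 hours
FEE_PER_GB = 0.01       # dollars per GB of peak hourly retrieval


def fee_for_rate(gb_per_hour: float) -> float:
    """Monthly retrieval fee for a given peak hourly rate (free allowance ignored)."""
    return gb_per_hour * HOURS_PER_MONTH * FEE_PER_GB


def rate_for_window(total_gb: float, days: float) -> float:
    """Constant hourly rate needed to pull total_gb within the given number of days."""
    return total_gb / (days * 24)


week_rate = rate_for_window(3000, 7)
print(f"5 GB/hour: ${fee_for_rate(5):.2f}")
print(f"one week: {week_rate:.2f} GB/hour, ${fee_for_rate(week_rate):.2f}")
```

Counting the free allowance would lower these figures slightly, so they are best read as a ceiling for a perfectly even download.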

Agreed. Let me add that a lot of us are living under severe data cap or overage regimes, for me and my parents it's 2 AT&T plain DSL lines, each with 150 GB/month free, go over that 2-3 times and you start paying $10/50 GB/month on a line.

So uploading as well as downloading would have to be throttled. But this sounds like a superb way to store all those pictures and movies of the grandchildren, especially for those who don't have a son with a LTO drive ^_^. All the other alternatives are lousy or a lot more expensive.

I do my online backups to BackBlaze. It's $3.69/month and it backs up both my Internal 256 GB SSD and my External (portable) 1TB HD that I keep all my Aperture "Referenced Masters" on.

A Full Document restore isn't done over the Internet, I have Backblaze fedex me a USB hard drive - though, unless something has gone really, really wrong (Building burned down?) - I have a within-a-week old or so SuperDuper Image of my Hard Drive.

My Use Case for Glacier is Dropping a 10-20 year archive every 5 years. 50 Gigabytes of data will cost me $120 to leave there for the next 20 years. I can make good use of that.

It's Mat. (I didn't recognize the name, but when spelled as Mat, I immediately know "oh, the guy hacked for his three letter twitter handle")

For an allegedly "simple" archival service, that's a bizarre pricing scheme that will be hard to code around. If you wrote an automated script to safely pull a full archive, a simple coding mistake, pulling all data at once, would lead you to be charged up to 720 times what you should be charged!

First, the reason the "peak hourly retrieval rate" of "1 gigabyte per hour" is there in the article is to answer this question. At a relative allowance of 5.12 GB/day and 1 GB/hour transfer rate, that gives you a "peak hourly retrieval" of .79 GB (at 5.12/24, your first .21 is free), and so we multiply:

.79 * 720 * .01

Giving me a little less than $6.
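The same calculation as a small sketch, under the assumption that the free daily allowance is pro-rated evenly over 24 hours (which is how the comment applies it). The function name and defaults are illustrative only.

```python
def peak_fee(hourly_rate_gb: float, daily_allowance_gb: float,
             hours_per_month: int = 720, fee_per_gb: float = 0.01) -> float:
    """Retrieval fee for a steady hourly rate, with the daily allowance spread over 24 hours."""
    free_hourly_gb = daily_allowance_gb / 24
    billable_gb = max(0.0, hourly_rate_gb - free_hourly_gb)
    return billable_gb * hours_per_month * fee_per_gb


# 1 GB/hour against a 5.12 GB/day allowance (3TB stored)
print(round(peak_fee(1.0, 5.12), 3))
```

Without rounding the billable rate to .79 the result comes out a few cents lower than .79 * 720 * .01, still a little under $6.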

Now, do you think Amazon is likely to think they can get away with selling a service that charges you $22k for a 3TB retrieval?

Second, you have ranged GETs and tape headers; use them to avoid transferring all of your data out of the system at once. [Edit: looks like ranged GETs are on job data, not on archival retrieval itself. My bad.]

At 1GB/hour retrieving a 3TB archive takes about 18 weeks...

So multiply it by your download speed. Let me know when you get to $22k.

At 10Gbps I can retrieve 3TB within an hour.

If you can afford 10Gbps to the internet, $22k is probably chump change.

10Gbps EC2 instances start at $0.742/hour. Welcome to the cloud. ;-)

I assume the cost is in retrieval though and counted per the Job Creation API, regardless of whether and how quickly you download the data.

But you're right that the 3TB/hour use-case is very hypothetical. Internet archival is just not suitable for that kind of volume. I think the point OP was making is that mistakes like using archives that are too large, or requesting many at once, could cost you a lot.

Well, yes and no.

If you actually USE 10Gbps your data transfer bill is going to be around $167k per month (That's for transferring 3.34PB).

Actually, a bit higher than that since I calculated all based on the cheapest tier EC2 will quote on the web, 5 cents per gigabyte.

For a one time 3TB download to an EC2 instance, priced at the first pricing tier of $0.12/gigabyte, that transfer will cost $360, and take around 40 minutes.
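Those bandwidth figures can be checked with a short sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits/s), a 31-day month for the sustained case, and the per-GB prices quoted in the comments; none of this is taken from an AWS price list.

```python
def transfer_minutes(size_tb: float, link_gbps: float) -> float:
    """Minutes to move size_tb terabytes over a fully used link_gbps link."""
    bits = size_tb * 1e12 * 8
    return bits / (link_gbps * 1e9) / 60


def sustained_gb(link_gbps: float, days: float) -> float:
    """Gigabytes moved by a link running flat out for the given number of days."""
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second * days * 86400 / 1e9


print(transfer_minutes(3, 10))              # one-time 3TB pull at 10Gbps, in minutes
print(3000 * 0.12)                          # 3TB at $0.12/GB
month_gb = sustained_gb(10, 31)
print(month_gb / 1e6, month_gb * 0.05)      # PB per month and cost at $0.05/GB
```

The monthly figure only matches the quoted 3.34PB if the month has 31 days; a 30-day month gives about 3.24PB.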

Glacier to EC2 traffic is free if your instance is in the same region as the Glacier endpoint and $0.01/GB otherwise.

Afford a 10Gbps connection? You can buy 1Gbps transit for under $1/Mbps, and much less at 10Gbps. So, with a monthly bill of, say, $5K, for the 10Gbps IP, $22k is not quite "chump change".

I would hope he's talking about direct connect (http://aws.amazon.com/directconnect/) as otherwise, you're correct.

You can afford 1Gbps for as little as $70/mo. Currently only in Kansas City, but it will probably expand to other cities soon.

That's a 1Gbps connection. Try sustaining 1Gbps and see if you don't start getting nastygrams.

There is a huge difference between a residential connection with a peak speed of X and an X speed connection.

I think you're making an incorrect assumption about which is the most plausible method for calculating the hourly retrieval rate.

The most obvious way to me would be to assume it is based on the actual amount of data transferred in an hour less the free allowance they give you. Which is actually what they say:

"we determine the hour during those days in which you retrieved the most amount of data for the month."

This also ties in with what the cost is to them, the amount of bandwidth you're using.

In your example you would need to be getting transfer rates of 3TB/hr. Given the nature of the service I don't think they are offering that amount of bandwidth to begin with. (I'm sure they get good transfer rates to other Amazon cloud services, but customers could be downloading that data to a home PC, at which point they will not be getting anything even close to those transfer rates.)

At that point a bigger issue might be how long it takes to get the data out rather than the cost.

At an overly generous download speed (residential cable) of 10GB/hr your 3TB archive would take over 12 days to download.

Given tc's edits above regarding additional charges for transferring out of AWS I'm starting to change my mind. I still can't believe Amazon would ever end up charging north of $20k for a 3TB retrieval, but it seems the intended use-case (as enforced by pricing) would be write-once read-never! Other use-cases are possible, but as others have noted you would want to be very careful how you go about setting it up to avoid getting some ugly charges.

> It's not clear to me how they calculate the hourly retrieval rate.

Probably based on the speed and the number of arms that the robot has that will grab the right tapes for you :-)

I'm not joking.

Based on the ZDNet article linked elsewhere on the comments, this system does not use any tape at all. It is all commodity hardware and hard drives, pretty much in line with the design of the rest of the services from AWS.

But why is there a retrieval delay, then?

Having a multi-hour delay in retrieval lets them move it into their off-peak hours. Since their bandwidth costs are probably calculated off their peak usage, a service that operates entirely in the shadow of that peak has little to no incremental cost to them.

If every request is delayed by the same amount, then it just moves the peak. Otherwise it just flattens/spreads it around peak_length + 4.

I'm not talking load so much as network traffic. Companies like Amazon and Google have huge peak hour outbound traffic during US waking hours, and then a huge dip during off hours. If they can push more of the traffic into those off hours they can make the marginal cost of the bandwidth basically zero.

So if you make a request at peak hours (say 12 noon ET), they just make you wait until 11 PM ET to start downloading, shifting all that bandwidth off their peak.

Even if you're just "flattening" the peak, when it comes to both CPU and bandwidth, that's a major cost reduction since their cost is driven by peak usage and not average usage in most cases.

Amazon has ridiculous internal bandwidth. The costly bit is external. The time delay is largely internal buffer time - they need to pull your data out of Glacier (a somewhat slow process) and move it to staging storage. Their staging servers can handle the load, even at peak. GETs are super easy for them, and given that you'll be pulling down a multi-TB file via the Internet, your request will likely span multiple days anyhow - through multiple peaks/non-peaks.

I was referring to the external bandwidth. Even if pulling down a request takes hours, forcing them to start off peak will significantly shift the impact of the incremental demand. I'm guessing that most download requests won't be for your entire archive - someone might have multiple months of rolling backups on Glacier, but it's unlikely they'd ever retrieve more than one set at a time. And in some cases, you might only be retrieving the data for a single use or drive at a time, so it might be 1TB or less. A corporation with fiber could download that in a matter of hours or less.

I get it - but I'm arguing that the amount of egress traffic Glacier customers (in aggregate) are likely to drive is nothing in comparison to what S3 and/or EC2 already does (in aggregate). They'll likely contribute very little to a given region's overall peakiness.

That said - the idea is certainly sound. A friend and I had talked about ways to incentivize S3 customers to do their inbound and outbound data transfers off-peak (thereby flattening it). A very small percentage of the customers drive peak, and usually by doing something they could easily time-shift.

I think the point is that they're trying to average the load across the datacenter, only part of which is Glacier. If they can offset all the Glacier requests by 12 hours, they'll help normalize load when combined with non-Glacier activity.

An uneducated guess? Maybe there is some type of spare storage pool just for staging the Glacier restore requests, and they've done the math to figure out how much space they need on average over time for this. The 24-hour storage expiration probably helps with this: they've calculated how much space they need on hand, including for spikes in restore requests, and the restore delay helps absorb those spikes so they can move storage pools around on the backend if they need additional online storage capacity within the next X hours. Plus there could be limited bandwidth between these back-end archival arrays and the restore pool hosts to save on cost, which is also part of the pricing equation and the delay time.

Why wouldn't they stage in S3? :)

I suspect it may actually be related to the energy cost of higher data rates on the device itself, rather than the network costs.

IOW, it would require more energy to spin the disks faster, and burst a higher peak rate.

Hopefully, they've also calculated the on-call response time for the tape operator making the drive in to work (or putting the console game on pause and walking over to the DC). Unless they've come a long way, robot/library drive mechanisms/belts often need adjustment. Besides, someone has to pick up the LTOs that slipped from the gripper.

According to this post[1] they charge based on how long it takes them to retrieve the data. The hourly retrieval rate would be the amount of data you requested divided by how long it takes them to retrieve it (3.5 - 4.5 hours).

If it takes them 4 hours to retrieve your 3TB, then your peak hourly retrieval rate would be 768GB / hour (3072 GB / 4 hours). Your billable hourly retrieval rate would be 768GB - 1.28GB (3072 * .05 / 30 / 4 hours).

Total retrieval fee: 766.72 * 720 * .01 = $5520.38 (~180x your monthly storage fee)
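As a sketch of that reading (archive size divided by a four-hour retrieval window, minus the pro-rated free tier), with hypothetical names and the 720-hour month and $0.01/GB rate used throughout this thread:

```python
RETRIEVAL_HOURS = 4            # assumed retrieval window (the post says 3.5 - 4.5 hours)
HOURS_PER_MONTH = 720
FEE_PER_GB = 0.01
FREE_MONTHLY_FRACTION = 0.05
DAYS_PER_MONTH = 30


def four_hour_fee(stored_gb: float, archive_gb: float) -> float:
    """Fee when the peak hourly rate is the archive size spread over the retrieval window."""
    peak_hourly_gb = archive_gb / RETRIEVAL_HOURS
    free_hourly_gb = (stored_gb * FREE_MONTHLY_FRACTION
                      / DAYS_PER_MONTH / RETRIEVAL_HOURS)
    billable_gb = max(0.0, peak_hourly_gb - free_hourly_gb)
    return billable_gb * HOURS_PER_MONTH * FEE_PER_GB


print(f"${four_hour_fee(3072, 3072):,.2f}")
```

Because the peak scales with `archive_gb`, requesting a smaller archive in a given window lowers the fee proportionally under this model.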

The pricing appears to not be optimized for retrieving all your data in one fell swoop. This particular example appears to be a worst case scenario for restoration because you haven't split up your data into multiple archives (doing so would allow you to reduce your peak hourly retrieval by spacing out your requests for each archive) and you want to restore all your data (the free 5% of your data stored doesn't help as much when you want to restore all your data).

[1] https://forums.aws.amazon.com/message.jspa?messageID=374065#...

A spokesperson for AWS confirmed this for me for an article [1] I wrote for Wired: "For a single request the billable peak rate is the size of the archive, divided by four hours, minus the pro-rated 5% free tier."

[1] http://www.wired.com/wiredenterprise/2012/08/glacier/

Good catch. In fact, it totally fits with the description of the service as store-and-forget storage for compliance, where you access only a small subset when a retrieval request comes in, for example when storing customer records.

I also must say that the way you calculate the retrieval fee really looks like black magic at first sight. I hope they will add a simple calculator to evaluate some scenarios and provide the expected bandwidth available from Glacier to an EC2 instance.

This has all of the AWS services but not Glacier quite yet: http://calculator.s3.amazonaws.com/calc5.html

From a Wired article citing this post[1]

'Update: An Amazon spokesperson says “For a single request the billable peak rate is the size of the archive, divided by four hours, minus the pro-rated 5% free tier.”'

This seems to imply the cost is closer to 4k instead of 22k. However, the spokesperson's statement seems to describe intended system performance, not prescribe the resulting price. So if it actually does take them an hour to retrieve your data, you might still owe them 22k.


It's not reasonable to draw conclusions from the parent without comparing to the cost of archiving and retrieving 3TB on other services.

Seems this is the right formula for estimating monthly cost:

0.01S+1.80R.max(0, 1-0.0017S/D)

S is number of GB stored.

R is the biggest retrieval of the month. Parallel retrievals are summed, even if overlap is only partial.

D is the amount of data retrieved on the peak day (≥R)

e.g., for 10TB storage, max. 50GB per retrieval, and max. 200GB retrieval per day: $2188.20 / year
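A direct transcription of that formula, as a sketch. The constants 0.01, 1.80 and 0.0017 are taken as given from the comment; reading "10TB" as 10,000 GB is an assumption, chosen because it reproduces the quoted yearly figure.

```python
def monthly_cost(stored_gb: float, peak_retrieval_gb: float, peak_day_gb: float) -> float:
    """Estimated monthly cost: 0.01*S + 1.80*R*max(0, 1 - 0.0017*S/D)."""
    storage = 0.01 * stored_gb
    if peak_day_gb <= 0:
        # No retrievals this month, so only storage is charged.
        return storage
    billable_share = max(0.0, 1 - 0.0017 * stored_gb / peak_day_gb)
    return storage + 1.80 * peak_retrieval_gb * billable_share


yearly = 12 * monthly_cost(stored_gb=10_000, peak_retrieval_gb=50, peak_day_gb=200)
print(f"${yearly:,.2f} / year")
```

The `max(0, ...)` term is what makes small retrievals free: once 0.0017*S reaches D, the peak day fits inside the free allowance.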


I'm not seeing this at all for my use case. Unless I've figured it wrong, if I were to use this for an offsite backup of my photos, my ISP's Acceptable Use Policy limits my rate enough that I'm seeing only about a 10% penalty beyond normal transfer costs. See http://n.exts.ch/2012/08/aws_glacier_for_photo_backups for some sample "real-world" numbers.

3TB is a huge archive. I'm also not sure about your maths, billable peak hourly chiefly. [ed: dot multiplier for formatting]

Let's run 100GB, X. Allowance limit: 100GB . 5% is 5GB/mo, or per day, 100GB/(20 . 30), 0.166GB/day; X/600.

Hourly rate necessary for a sustained 24 hour cycle of 100GB is: 100GB/24hr, or 4.166GB/hr, X/24. Peak hourly, this.

To determine the amount of data you get for free, we look at the amount of data retrieved during your peak day and calculate the percentage of data that was retrieved during your peak hour. We then multiply that percentage by your free daily allowance.

To begin, all that's stated here is: break your data retrieval out over a day. Their example:

you retrieved 24 gigabytes during the day and 1 gigabyte at the peak hour, which is 1/24 or ~4% of your data during your peak hour.

We're doing 4.166GB in the peak hour/100GB in the peak day, or ~4%.

X/24 / X = 1/24 = ~4.1666666% if you don't fuck your metering up.

We multiply 4% by your daily free allowance, which is 20.5 gigabytes each day. This equals 0.82 gigabytes [ed: free allowance hourly]. We then subtract your free allowance from your peak usage to determine your billable peak.

Free allowance hourly rate: 4.16666% . 0.166 = 0.006666, or (X/600/24), X/15000. Amazon's example is at (12 . 1024)/15000, or indeed 0.8192 free, to verify.

billable peak hourly is then: hourly peak rate - free rate, 4.1666 - 0.00666 = 4.160, or (X/24) - (X/600/24) or (X-(X/600))/24 or (599X/600)/24 or simply, billable peak hourly will always be for sufficiently non-incompetent implementations: ~0.0415972222X. Always.

Let's check: 100GB . 0.041597 = 4.15970. Cannot compare to amazon, because their hourly rate is calculating a 24GB of 12TB archive download, but, 1-0.8192 still checks out. It would be 511.14666666 if their entire set, or (12 . 1024)/24 - 0.8192, 511.1808GB/hr peak hourly (nice pipes kids).

Retrieval fee is then, 0.041597X . 720 . tier pricing, and tier pricing I really do not understand the origin of at all but all examples seem to be 0.01. So, $29.95/100GB. For 12TB, say hello to $3680.25599 transfer fee. 3TB is $920.064.

720 . (599X/600) /24 /100, so for the transfer of your entire set X GB of data, evenly done across the day, you will be charged: (599X/600).(3/10)$,

0.2995$/GB to pull data out in a day.
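To double-check the per-GB figure without rounding noise, here is a sketch using exact fractions. It assumes the scenario in this comment: the whole stored set X is retrieved evenly over one 24-hour day, the daily free allowance is X/600, and the rate is $0.01 per GB over a 720-hour month. The function name is made up.

```python
from fractions import Fraction

HOURS_PER_MONTH = 720
FEE_PER_GB = Fraction(1, 100)
FREE_DAILY_FRACTION = Fraction(5, 100) / 30   # 5% per month over 30 days = 1/600


def even_day_fee(total_gb: int) -> Fraction:
    """Exact fee for retrieving an entire stored set of total_gb evenly across one day."""
    total = Fraction(total_gb)
    peak_hourly = total / 24
    free_hourly = total * FREE_DAILY_FRACTION / 24
    return (peak_hourly - free_hourly) * HOURS_PER_MONTH * FEE_PER_GB


print(even_day_fee(1))             # dollars per GB, as an exact fraction
print(float(even_day_fee(100)))    # 100GB
print(float(even_day_fee(3072)))   # 3TB
```

The exact per-GB value is 599/2000, which is where the 0.2995 comes from.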
