
AWS Customers Rack Up Hefty Bills for Moving Data - ballmers_peak
https://www.theinformation.com/articles/aws-customers-rack-up-hefty-bills-for-moving-data?pu=hackernews7d0vdx&utm_source=hackernews&utm_medium=unlock
======
mostlystatic
I recently started running more Google Cloud VMs in the UK instead of the US.
It was costing me £2 a day to run the VMs, but somehow I also paid £12 a day
for "GCP Storage egress between NA and EU".

Turns out that by default Google's Docker container registry only stores the
Docker images in the US. So each time I launched a VM the Docker image was
downloaded from the US. I wrote more about it here:
[https://www.mattzeunert.com/2019/10/13/reducing-docker-
image...](https://www.mattzeunert.com/2019/10/13/reducing-docker-image-size-
and-cutting-gcp-cost.html)

The billing interface didn't show that the Cloud Storage cost was related to
the Docker images. I was investigating my normal Cloud Storage use, but it
didn't explain why I was being charged so much. Only after a few days did I
get the idea that it might be the Docker images that were causing it.

~~~
merb
Actually, there is an EU mirror for the GCP registry: eu.gcr.io
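
A minimal sketch of picking a registry host that matches the VM's region, so image pulls stay on the same continent. The hostnames `gcr.io`, `us.gcr.io`, `eu.gcr.io` and `asia.gcr.io` are the Container Registry hosts; the helper names, the prefix mapping and the example project/image are hypothetical, for illustration only.

```python
# Hypothetical helper: map a GCP region name to a Container Registry host.
# The prefix-to-host mapping is an assumption made for this sketch.
REGION_PREFIX_TO_HOST = {
    "europe-": "eu.gcr.io",
    "asia-": "asia.gcr.io",
    "us-": "us.gcr.io",
}


def registry_host_for_region(region: str) -> str:
    """Return the registry host on the same continent as the region."""
    for prefix, host in REGION_PREFIX_TO_HOST.items():
        if region.startswith(prefix):
            return host
    # Fall back to the plain host, which the thread says is US-hosted.
    return "gcr.io"


def image_reference(region: str, project: str, image: str, tag: str = "latest") -> str:
    """Build a HOST/PROJECT/IMAGE:TAG reference for the given region."""
    return f"{registry_host_for_region(region)}/{project}/{image}:{tag}"


# A London VM would pull from the EU mirror instead of the US default.
print(image_reference("europe-west2", "my-project", "web", "v1"))
```

The image still has to be pushed to the EU host first; nothing here copies it automatically.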

~~~
chatmasta
That seems like the kind of thing that should be configured by default on
servers in the EU.

~~~
sieabahlpark
It's in their docs

~~~
freehunter
Their docs say it’s configured by default on EU servers?

~~~
skj
You say what image to use. There is no default.

------
gruez
Is there a plausible explanation for why egress fees from cloud providers cost
around $0.1/GB? "Traditional" server providers such as Hetzner are able to
offer bandwidth at a price orders of magnitude lower (e.g. $1.1/TB). I understand
that cloud providers may have better interconnects or better uptimes, but that
doesn't justify pricing that is orders of magnitude higher.

~~~
ben509
Disclaimer: I worked on an AWS service team.

This is, oddly enough, similar to a debate people have about consumer TV or
Internet: should pricing be "unlimited" or "a la carte"?

AWS is combining all your networking charges into one lump "outgoing data
transfer" fee. So it's heavily marked up in comparison to what they're paying
for the outgoing data transfer, and you're not sure how much is profit vs.
whether it's going to cover all their other costs.

So it might be fairer if AWS broke out separate line items for internal,
incoming and outgoing data transfer, plus all the additional systems a
customer uses.

I think AWS's billing is probably already on the falling side of diminishing
marginal returns. That is, it's complex enough that more information would
tend to hinder customers from getting the best price. Right now, if I plan to
reduce my data charges, I have one variable to tinker with. If we expand this,
it would mean I'm having to balance incoming / internal and outgoing charges.
That sounds simple, but in terms of engineering it can be very complex.

The next claim is that this biases customers not to move. Of course, Azure and
GCP have the same arrangement, so while you pay to move _out_ of AWS, you
don't pay to move _in_ to Azure or GCP. So all the vendors are attempting to
lock you in to their product, and at the same time trying to extricate you
from their competitors; overall it's a wash.

So, yes, part of the motivation for egress charges is that ingress is a loss
leader. But it's also true that egress is a metric that does, for the vast
majority of their customers, directly translate into customer value. If
there's a compelling case for doing it differently, someone should do it and
see if it works.

~~~
throwaway_bad
> If there's a compelling case for doing it differently, someone should do it
> and see if it works.

Cloudflare doesn't charge for bandwidth. I always throw Cloudflare on top of
anything I do, not because I really need a CDN or anything, but because the
bandwidth cost would bankrupt me otherwise. The CEO of Cloudflare gave the
rationale for why they don't charge:

> There’s a fixed cost of setting up those peering arrangements, but, once in
> place, there’s no incremental cost. That’s why we have similar agreements to
> Backblaze in place with Google, Microsoft, IBM, Digital Ocean, etc. It’s
> pretty shameful, actually, that AWS has so far refused. When using
> Cloudflare, they don’t pay for the bandwidth, and we don’t pay for the
> Bandwidth, so why are customers paying for the bandwidth. Amazon pretends to
> be customer-focused. This is a clear example where they’re not.

[https://news.ycombinator.com/item?id=20791563](https://news.ycombinator.com/item?id=20791563)

~~~
qes
According to Cloudflare, they do not have any bandwidth pricing arrangement
with Microsoft for Azure users.

They also do charge for Enterprise plans, but instead of transparent pricing I
got high-pressure sales techniques and black box pricing offers - which then
anchored our rate so that as we grow past our current contract, we're forced
to upgrade at any point with pricing based solely on our original negotiation.

Frankly, while I save money using Cloudflare over Azure's CDN right now, it's
left a very sour taste in my mouth and I'll be jumping their ship as soon as I
have time to find a suitable alternative.

~~~
bam_boo
> high-pressure sales techniques and black box pricing offers - which then
> anchored our rate

If you have the ability to shift your entire enterprise CDN away from them,
why not first try renegotiating?

------
chickenpotpie
I've actually been working on a library to help mitigate cloud storage lock-
in. The idea is to treat cloud storage providers like disks are treated in
RAID. For example, you have 3 separate cloud providers. Cloud providers 1 and
2 have every other byte of data striped across them. Cloud provider 3 has
parity data. To pull a file you only need 2 of the 3 cloud providers. If
you don't like how a cloud storage provider is treating you or charging you
just pull from the other 2 providers and use them as a backup in case one goes
down. You can also just remove them entirely from the equation, but then you
have no redundancy if one of the others goes down. It gives you a lot of
negotiating power to lower egress costs because you can just pull them out of
the equation at any time and reinstate them once you get better pricing.
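
A minimal sketch of the scheme described above, assuming plain XOR parity over two byte-interleaved stripes (RAID 4 style). The function names are hypothetical and this is not the library's actual code; a real implementation would also handle uploads, chunking and metadata.

```python
from typing import Optional, Tuple


def split_with_parity(data: bytes) -> Tuple[bytes, bytes, bytes, int]:
    """Stripe alternating bytes across two shards and add an XOR parity shard."""
    original_len = len(data)
    if original_len % 2:
        data += b"\x00"  # pad so both stripes have the same length
    stripe_a = data[0::2]  # bytes 0, 2, 4, ... go to provider 1
    stripe_b = data[1::2]  # bytes 1, 3, 5, ... go to provider 2
    parity = bytes(x ^ y for x, y in zip(stripe_a, stripe_b))  # provider 3
    return stripe_a, stripe_b, parity, original_len


def reconstruct(
    stripe_a: Optional[bytes],
    stripe_b: Optional[bytes],
    parity: Optional[bytes],
    original_len: int,
) -> bytes:
    """Rebuild the original bytes from any two of the three shards."""
    if [stripe_a, stripe_b, parity].count(None) > 1:
        raise ValueError("need at least two of the three shards")
    if stripe_a is None:
        # a = b XOR parity, because parity = a XOR b
        stripe_a = bytes(x ^ y for x, y in zip(stripe_b, parity))
    elif stripe_b is None:
        stripe_b = bytes(x ^ y for x, y in zip(stripe_a, parity))
    # Interleave the two stripes back into their original order.
    merged = bytearray(len(stripe_a) + len(stripe_b))
    merged[0::2] = stripe_a
    merged[1::2] = stripe_b
    return bytes(merged[:original_len])  # drop any padding byte


a, b, p, n = split_with_parity(b"backup payload")
# Provider 1 is unavailable (or too expensive): read only from 2 and 3.
print(reconstruct(None, b, p, n))
```

Each provider holds half of the data volume, so any read that skips one provider still pulls the full file size in total from the other two.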

~~~
pepemon
Sounds like exactly what Tahoe-LAFS is for.

~~~
chickenpotpie
Yeah it's pretty similar. However, I'm focusing entirely on the cloud, keeping
the package lightweight, and giving the consuming application decisions on how
to store the data based off their needs.

------
tyingq
_"The charges don’t appear to be a case of cloud providers gouging their
customers"_

I disagree on this one. The margins on egress are, well...egregious.

~~~
partiallypro
If I'm not mistaken AWS has the highest egress rates of the major cloud
providers.

~~~
tyingq
All of them (AWS, GCP, Azure) are priced outrageously compared to high
quality, well peered, bandwidth.

~~~
q3k
GCP is pretty good though. Cold-potato, very fat backbone, and very good
presence at a ton of PoPs. When using GCP you basically have the same global,
high-bandwidth direct connectivity presence that Google uses for its products,
and that is very difficult to match by traditional T1 ISPs.

~~~
heleninboodler
Ok, help me out: "cold-potato"? :D

~~~
lokar
Vs "hot potato"

Hold onto the packet for as long as you can vs hand it off to your peer as
quickly as possible.

Most networks do "hot", Google does "cold" since their network is almost
always better than that of the peer.

~~~
Jarwain
The origin of which, for those who aren't familiar, is a game called "hot
potato" where you try to pass a ball around as quickly as possible, as if it
were a hot potato.

------
pmoriarty
I recently saw a talk by a couple of former Google employees who have a
business helping cloud customers save money, and they were saying that they
see a lot of money being wasted by companies in overprovisioning and
neglecting to shut down or delete unused resources like VMs or virtual disks.

Some of their advice for saving money was to keep track of who created each
resource and why, so there's less reason to doubt whether an apparently unused
resource can be deleted, and to make some limits regarding how many resources
can be automatically created (especially in dev environments). Some other
ideas were to look for signs of inactivity like low CPU or bandwidth use, and
consider deleting such little used resources.
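
A rough sketch of that advice, assuming an inventory has already been exported into simple records. The `Resource` type, the `owner`/`purpose` tag names and the 5% CPU threshold are all hypothetical choices for illustration, not anything from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Resource:
    name: str
    avg_cpu_percent: float  # e.g. average over the last 14 days
    tags: Dict[str, str] = field(default_factory=dict)


# Tags that record who created a resource and why (assumed naming).
REQUIRED_TAGS = ("owner", "purpose")


def review_candidates(
    resources: List[Resource], cpu_threshold: float = 5.0
) -> List[Tuple[str, List[str], bool]]:
    """Return (name, missing_tags, looks_idle) for resources worth a review."""
    report = []
    for res in resources:
        missing = [tag for tag in REQUIRED_TAGS if tag not in res.tags]
        looks_idle = res.avg_cpu_percent < cpu_threshold
        if missing or looks_idle:
            report.append((res.name, missing, looks_idle))
    return report


inventory = [
    Resource("api-prod", 41.0, {"owner": "team-api", "purpose": "prod traffic"}),
    Resource("test-vm-7", 0.8, {"owner": "alice"}),
    Resource("scratch-disk-host", 1.2),
]
for name, missing, idle in review_candidates(inventory):
    print(name, "missing tags:", missing, "idle:", idle)
```

Low CPU alone is only a hint; the tags are what make it safe to decide whether a flagged resource can actually go.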

There was much more to the talk, but those were some of the highlights that I
can remember without digging out my notes. It was a good talk.

~~~
spike021
Makes sense. If you're not properly tagging your resources then it can be very
hard to track down if/where they've been used, how often, or when they were used
last. You can automate/template as much orchestration as you want with stuff
like Terraform to bring up well-defined resources, but there will always be
outliers without tagging.

------
ceejayoz
> A person close to AWS said its data transfer charges reflect a range of
> technology costs customers would normally pay if they weren’t using cloud
> services, including fiber optic connections, networking hardware devices and
> software, cybersecurity services and network monitoring software.

I'd believe this more if the pricing of bandwidth on AWS hadn't stayed pretty
flat since its launch.

Plus, it's frustrating that AWS Lightsail
([https://aws.amazon.com/lightsail/pricing/](https://aws.amazon.com/lightsail/pricing/))
offers a $3.50/month plan with 1 TB of transfer. That terabyte alone will cost
you $92.07 on a normal instance, and the $3.50 includes storage and an
instance!
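
The $92.07 figure can be reproduced under a few assumptions that are not stated in the comment: an on-demand rate of $0.09/GB (the rate quoted elsewhere in this thread), the first GB each month free, and 1 TB counted as 1024 GB. A small sketch of that arithmetic:

```python
def on_demand_egress_cost(
    total_gb: float, rate_per_gb: float = 0.09, free_gb: float = 1.0
) -> float:
    """Egress cost at a flat per-GB rate after a small free allowance.

    The default rate and free allowance are assumptions for this sketch.
    """
    billable_gb = max(total_gb - free_gb, 0.0)
    return billable_gb * rate_per_gb


one_tb_in_gb = 1024  # assumed binary terabyte
cost = on_demand_egress_cost(one_tb_in_gb)
print(f"1 TB of egress at on-demand rates: ${cost:.2f}")
print(f"Multiple of the $3.50 Lightsail plan: {cost / 3.50:.1f}x")
```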

------
prolificd
It does appear like price gouging. Once you're in, you're locked in.

Any chance some of those high costs are due to movement of data due to GDPR
compliance? Maybe Apple did all its prep in 2017.

Also there are some misconceptions on inter-AZ data transfer as well:
([https://www.lastweekinaws.com/blog/aws-cross-az-data-
transfe...](https://www.lastweekinaws.com/blog/aws-cross-az-data-transfer-
costs-more-than-aws-says/))

~~~
mitchs
I assure you there is nothing sinister about the price asymmetry. Fiber and
routers have as much bandwidth coming as going. The inbound traffic is mostly
composed of little http(s) requests. The outbound is full of images and
mountains of JavaScript. Cloud providers don't charge for ingress because they
got it for free when they grew their egress capability to meet demand.

~~~
ec109685
When was the last time AWS reduced egress rates?

And they could negotiate the best rates on the planet given their scale.

~~~
riking
I get the impression cloud providers don't "negotiate rates" but rather "build
infrastructure" most of the time.

~~~
dx034
They don't own many of their data centers. Locations were leaked a while ago,
most are colocated in major data centers. They won't dig new trenches and lay
cables between those but rather negotiate rates for existing fiber.

------
mattbillenstein
A fairly large optimization, if you're smaller and a large amount of your
outbound data is cacheable, is to run a Varnish cache on one of the clouds that
give you "free" bandwidth.

E.g. a $20/mo instance on Linode gets you 4TB of transfer -- $0.005 per gig.
Scale enough of these in various DCs around the world and you have a pretty
cheap self-hosted CDN.

~~~
buboard
A $20/mo old server from Hetzner gives you unlimited 1Gbps - that's 324TB/mo
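
The numbers in this subthread check out if one assumes a 30-day month, decimal units (1 TB = 1000 GB) and a link that is saturated the whole time, which is a best case rather than a typical workload. A quick sketch:

```python
SECONDS_PER_DAY = 86_400


def monthly_transfer_tb(link_gbps: float, days: int = 30) -> float:
    """Terabytes moved by a link running flat out for the whole month."""
    gigabytes_per_second = link_gbps / 8  # bits to bytes
    return gigabytes_per_second * SECONDS_PER_DAY * days / 1000


def price_per_gb(monthly_price: float, included_tb: float) -> float:
    """Effective price per GB if the whole allowance is used."""
    return monthly_price / (included_tb * 1000)


print(monthly_transfer_tb(1.0))                        # 1 Gbps for 30 days, in TB
print(price_per_gb(20, 4))                             # $20/mo with 4 TB included
print(price_per_gb(20, monthly_transfer_tb(1.0)))      # $20/mo with a saturated 1 Gbps
```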

~~~
mattbillenstein
Good deal - I do most of my work in the US, I don't know of a similar deal
here.

------
jedberg
Maybe I missed it but where did they get this info? Was it stolen from AWS?

As far as I know, Apple would _never_ share cost data like this, nor would a
lot of others on this list. Apple doesn’t even publicly acknowledge that they
are customers.

~~~
ghaff
>"The chart, which is based on internal AWS sales figures obtained by The
Information,"

I'm not sure that "stolen" is the right word but, yes, it certainly appears
based on what The Information wrote that either someone at AWS or a third-
party with access to the info (probably not too likely) leaked the
confidential numbers. Disgruntled former employee or... Who knows.

------
js4ever
I recently talked with a friend about a very cheap way to egress from
AWS, at $2.5 per TB, using the $5 AWS Lightsail instance, which includes 2TB of
egress per instance. This friend extracted 200TB of backups for $500
instead of $17,000!

~~~
crazysim
That's against Lightsail TOS, if you care.

~~~
ValentineC
To save everyone else a lookup:

> 66.3. You may not use Amazon Lightsail in a manner intended to avoid
> incurring data fees from other Services (e.g., proxying network traffic from
> Services to the public Internet or other destinations or excessive data
> processing through load balancing Services as described in the
> Documentation), and if you do, we may throttle or suspend your data services
> or suspend your account.

Source: [https://aws.amazon.com/service-
terms/](https://aws.amazon.com/service-terms/)

------
yeldarb
Wow, does it really cost 4-5x more to download a 1 GB file from S3 or Google
Cloud Storage than it does to store it for a month? That is mind-blowing.
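
A back-of-the-envelope check, assuming $0.09/GB for internet egress (the figure quoted elsewhere in this thread) and roughly $0.023 per GB-month for standard object storage. Both rates are assumptions and vary by region, tier and volume.

```python
# Assumed list prices; real rates depend on region, tier and volume.
EGRESS_PER_GB = 0.09
STORAGE_PER_GB_MONTH = 0.023


def download_vs_store_ratio(
    egress_per_gb: float = EGRESS_PER_GB,
    storage_per_gb_month: float = STORAGE_PER_GB_MONTH,
) -> float:
    """How many months of storage one download of the same data costs."""
    return egress_per_gb / storage_per_gb_month


ratio = download_vs_store_ratio()
print(f"One download costs about {ratio:.1f} months of storage")
```

Under these assumed rates the ratio lands near 4x, at the low end of the range in the comment.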

~~~
qvrjuec
Bandwidth is expensive, storage is cheap.

~~~
throwaway2048
Cloud providers _charge_ a lot more for bandwidth, yes, but in reality
bandwidth is much, much cheaper than storage for them to provide.

------
united893
Just curious, how on earth did Kevin and Amir get such incredibly detailed data
on the AWS spending of these accounts?

Either they put it together from public resources, or (more worrisome) someone
at AWS leaked them line-item expenses of their top customers.

This has got to be incredibly sensitive data for AWS, not something they'd
want leaked. If I'm (say) Airbnb or Snap, I'd worry that this data leaks
information that would be detrimental to my ability to negotiate for cloud
computing with Google, Microsoft etc.

~~~
milesskorpen
The article specifically says it's based on internal data.

------
DoctorOetker
Does anyone know of a cheap service to order large amounts of data by mail on
physical media?

I am often interested in some dataset for which I can afford the storage in
the form of a hard drive but not in the form of a download through my home
connection.

If a service existed that simply offered the following:

* customer provides URL (and optionally hash checksum)

* customer pays, and later receives a hard disk drive / SSD containing the download

* democratic pricing for the media, or alternatively send your own media (hence at twice media shipping cost...)

* possibly eventually local brick and mortar locations / affiliate locations to drop off and pick up media, in the larger cities

* preferably without account, although an account is not a large impediment

* definitely not coupled to a financial account in a credit fashion, i.e. no qualms with topping up the account, but the service should not be able to withdraw money from my financial account. i.e. debit only (like the typical european bank cards, yes I know credit cards are available in europe as well...)
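
The optional checksum in the first bullet could be verified by the operator before the drive ships (and again by the customer on receipt). A minimal sketch, assuming SHA-256 and a hex digest supplied by the customer; the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the one supplied alongside the URL."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch would mean re-downloading before copying anything to the customer's media.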

This would seem like a profitable side business for many programmer types who
have high bandwidth connections (or have access to them and are allowed to use
the connection for this purpose).

If someone builds the software stack for a main portal such that affiliates
can advertise their physical location, and their pricing for media, for
download and for copying to media, then customers could compare and choose on
the basis of price.

~~~
chillfox
I don't think it would be a cheap service if you set a modest goal of earning
a cinema trip with popcorn and a drink for every trip to the post office.

1TB HDD: 59 AUD

International Shipping (assuming we can keep it under 1kg with packaging): 38 AUD

That is 97 AUD before considering any profit you might want to make (35 AUD
for the movie + small popcorn and drink), fixed setup costs like a NAS to
cache data sets or the bandwidth costs.

~~~
DoctorOetker
If someone hails a cab on Uber, one doesn't hail a cab from a different
continent...

There seem to be plenty of opportunities here: buy or rent an old bank
building with individually lockable drawers, put your drive in a locker
that has a USB or Ethernet cable, close the locker, use some app to set
the URL / hash, pay, and you get an ETA, a download-complete notification, and a
deadline to pick it up (or else incur a fee to unlock the drawer, proportional
to the overtime).

I guess the idea could be pitched to those operating rentable local PO boxes,
using similar lockers, but with internet connectivity.

------
balozi
The only thing worse than the data transfer costs is explaining them to your
non-cloud-savvy stakeholders. Go ahead, try to explain this sh*t to your board.

------
kresten
AWS, Azure or Google Cloud data transfer: 9 cents per GB

Digital Ocean: 1 cent per GB

The really weird thing is this should be the absolute lead on all Digital
Ocean marketing but they don't even mention it. It's their single biggest
selling point.

------
skunkworker
For a service I built on AWS Lambda, the vast majority of the cost was just
data egress. Unfortunately I haven’t found a competitor that offers the same
kind of features that I’m using Lambda for (running Go with additional
custom binaries).

~~~
cosmie
Azure is part of Cloudflare's Bandwidth Alliance[1]. If you use Azure's
serverless functions and put your service behind Cloudflare, you'll get free
egress.

[1] [https://www.cloudflare.com/bandwidth-
alliance/](https://www.cloudflare.com/bandwidth-alliance/)

~~~
skunkworker
I was aware of the Bandwidth Alliance, though according to their KB, for now
it's only discounted egress [1]. The same goes for Google: though it's not part
of the Bandwidth Alliance, data delivered through its CDN Interconnect program
starts at $0.04/GB (NA) [2].

[1] [https://support.cloudflare.com/hc/en-
us/articles/36001614391...](https://support.cloudflare.com/hc/en-
us/articles/360016143912-How-do-I-get-discounted-data-transfer-with-my-cloud-
provider-)

[2] [https://cloud.google.com/interconnect/docs/how-to/cdn-
interc...](https://cloud.google.com/interconnect/docs/how-to/cdn-interconnect)

------
eric_b
It's nickel and diming all the way. I always thought the S3 pricing was sort
of ridiculous. Pay for the amount you store (fair). Pay for the egress data
(less fair but ok...). Pay for the number of API calls! (lol!)

~~~
spullara
It wasn't true at first but people abused the hell out of it, for example, by
storing data only in the filename. The API cost prevents very inefficient use
of S3.

------
Elect2
Data transfer fees are the biggest reason I stopped using AWS/GCP/Azure.
Is anyone using Oracle Cloud? I found their bandwidth to be really cheap.

~~~
satanspastaroll
The thing about that is that it's Oracle. You might survive stuffing your hand
in a hornet's nest, but it's unlikely.

------
rhizome
Cloud providers can launch and advertise and advocate every little product
they're able to refactor out of the prevailing development and systems
engineering practices, but transit remains expensive, and basically
consistently so. Not only that, but AWS transit is like 20x the price of
boring old hosting providers, providers who don't also charge for egress.

------
Merrill
My GP's practice went to a SaaS provider for electronic health records. Never
mind transmission costs -- there is no standard way to export their patient
records in order to move to a different provider. It requires a custom
consulting services contract with their existing provider to retrieve their
patient records, at an exorbitant price tag.

~~~
cabaalis
This is quite typical. Even with MIPS PI measures and the push for
interoperability. Getting patient data in bulk out of a hosted EHR can be a
frustrating exercise. What they are doing is holding the providers' data
hostage, in the name of HIPAA and security.

------
lacker
Even the huge-looking bill for Apple is only 6.5% of their total bill. Moving
data into AWS is free, so it's not totally surprising that they pay for that
by making other stuff more expensive.

I found it more interesting just to see a list of their top ten customers. In
particular I didn't realize that Capital One had so much infrastructure.

~~~
notyourday
> Even the huge-looking bill for Apple is only 6.5% of their total bill.
> Moving data into AWS is free, so it's not totally surprising that they pay
> for that by making other stuff more expensive

Moving data into any network that is outbound heavy is free because both paid
peering and transit are settled based on peak percentile traffic (unless it
is flat rate).

That's why the "gangsta" position is to have a balanced in/out for any network,
as in that case you get to effectively double charge for the same pipes --
your eyeball heavy customers pay for incoming and your web farm customers pay
for outgoing.
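
A sketch of why inbound traffic is effectively free on an outbound-heavy network, assuming the common "95th percentile" burstable convention in which the carrier bills on the larger of the inbound and outbound percentiles. The comment only says "peak percentile", so the exact percentile and the max-of-directions rule are assumptions here.

```python
from typing import Sequence


def percentile_95(samples_mbps: Sequence[float]) -> float:
    """Highest sample left after discarding the top 5% of samples."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Integer form of ceil(0.95 * n) - 1, avoiding float rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


def billable_mbps(inbound: Sequence[float], outbound: Sequence[float]) -> float:
    """Bill on whichever direction has the higher 95th percentile."""
    return max(percentile_95(inbound), percentile_95(outbound))


# Outbound-heavy network: 100 samples per direction (e.g. 5-minute averages).
outbound = [900.0] * 95 + [5000.0] * 5   # short bursts fall in the discarded 5%
inbound = [100.0] * 100

print(billable_mbps(inbound, outbound))
# Doubling inbound changes nothing as long as it stays below outbound.
print(billable_mbps([x * 2 for x in inbound], outbound))
```

Until inbound approaches the outbound percentile, extra ingress adds nothing to the provider's transit bill, which is the point the comment makes.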

~~~
lacker
This doesn't deserve to be downvoted, it makes a good point! I had not thought
about the likely-to-be-outbound-heavy nature of cloud providers and how that
affects things.

------
notyourday
Cross AZ traffic:

[https://www.lastweekinaws.com/blog/aws-cross-az-data-
transfe...](https://www.lastweekinaws.com/blog/aws-cross-az-data-transfer-
costs-more-than-aws-says/)

No connection to the blog.

------
chriselles
It sounds very similar to physical documents/records storage service “roach
motel”-like fee structures.

Make it nearly free to move documents in.

Charge monthly rent per box of documents.

But charge like a wounded bull if customers try to permanently remove
documents.

------
minitoar
We mostly deal with this by keeping the data inside an AWS/Azure region as
much as possible. You pay to get it in there, and you pay for storage, but you
can access it for free within the same region.

------
toomuchtodo
Anyone (no longer under NDA) know if Apple intends to migrate user iCloud data
from AWS and GCP object stores to their own data centers at some point?

~~~
whoisjuan
> no longer under NDA

Do NDAs expire? AFAIK you sign an NDA and you’re bound to respect that
forever. Unless the knowledge becomes publicly available.

~~~
brightball
IANAL

I have spoken to a lot of lawyers about contract terms and what I've been told
for both NDA and Non-compete style agreements is that they have to have very
clearly defined scope to be enforceable.

For a non-compete for example, it requires distance and specific field with a
clearly defined time period (that's reasonable).

For an NDA you have to ensure that the covered information is explicitly
labeled (which is why people have confidentiality labels in email footers) and
that there is a defined time to expiration. An NDA without those two criteria
places an undue burden on a person who is not being compensated for their
compliance.

Reasonable time period is generally 2 years.

~~~
brodouevencode
This is typically true of all contract law. "In perpetuity" is not a thing and
if it is, it usually nullifies the contract.

------
3wolf
The $0.01/GB inter-availability-zone data transfer cost can be a killer for
misconfigured workloads. It's a shame AWS doesn't make this more clear on
their pricing page. I've seen people run Spark clusters across multiple AZs,
incurring huge costs whenever a shuffle happens.
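
A rough estimate of what one shuffle costs, assuming the $0.01/GB rate above and that shuffle data is spread evenly across executors, so the share crossing an AZ boundary is (N - 1) / N for N equally sized AZs. Both the even-spread model and the example volume are assumptions.

```python
def cross_az_fraction(num_azs: int) -> float:
    """Share of shuffle traffic leaving its AZ under an even spread."""
    if num_azs < 1:
        raise ValueError("need at least one AZ")
    return (num_azs - 1) / num_azs


def shuffle_transfer_cost(
    shuffle_gb: float, num_azs: int, rate_per_gb: float = 0.01
) -> float:
    """Inter-AZ transfer charge for a single shuffle of the given size."""
    return shuffle_gb * cross_az_fraction(num_azs) * rate_per_gb


# Hypothetical job: a 10 TB shuffle on a cluster spread over 3 AZs.
per_shuffle = shuffle_transfer_cost(10_000, 3)
print(f"${per_shuffle:.2f} per shuffle")
print(f"${per_shuffle * 24 * 30:.2f} per month if it runs hourly")
```

The post linked elsewhere in this thread argues that the effective rate is higher than the advertised one; passing a different `rate_per_gb` covers that case. A single-AZ cluster brings the fraction to zero.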

------
wodenokoto
What's up with Apple's spend? Did they move most of their data out of AWS?

~~~
outworlder
My take is also that they were moving out. The article does acknowledge that
they spent a fraction of that in the next year.

------
vetrom
I've worked at at least one company where we found using Direct Connect to a
data center somewhere and egressing via that data center instead of amazon
saved us significant bandwidth costs.

~~~
altmind
Amazon still charges for traffic through Direct Connect (50%), don't they?

~~~
vetrom
2 cents per GB vs 9 cents per GB.
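
To put those two rates side by side, here is a small sketch for a hypothetical 100 TB/month of egress. It deliberately ignores Direct Connect port fees and whatever the data center charges for onward transit, so the real saving would be smaller.

```python
def monthly_egress_cost(volume_tb: float, rate_per_gb: float) -> float:
    """Egress charge for a month at a flat per-GB rate (1 TB = 1000 GB)."""
    return volume_tb * 1000 * rate_per_gb


volume_tb = 100  # hypothetical monthly egress volume
via_internet = monthly_egress_cost(volume_tb, 0.09)
via_direct_connect = monthly_egress_cost(volume_tb, 0.02)

print(f"Internet egress:  ${via_internet:,.0f}")
print(f"Direct Connect:   ${via_direct_connect:,.0f}")
print(f"Difference:       ${via_internet - via_direct_connect:,.0f}")
```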

------
jrochkind1
Where is the article getting these numbers from? Did they say and I missed it?

------
lazyant
Where's Corey Quinn?

------
Polyisoprene
Cheap shovels and expensive ladders.

~~~
jrockway
Amazon has gone one step past that and made the shovels expensive too. The
fear of buying a computer has made them billions.

~~~
wuliwong
>The fear of buying a computer has made them billions.

Are you saying that AWS services aren't providing value to their customers and
that it's just fear-based marketing?

~~~
frenchyatwork
I think you read too much into that comment. AWS didn't have to market the
fear into the customers. People are naturally afraid of things. AWS does
provide value, but it comes at a steep cost.

------
lykr0n
The cloud is like the Hotel California. You can check out anytime but never leave.

