
iOS 6.0 Causes CDN Overages - woodhull
http://labs.prx.org/2012/11/14/ios-6-0-devours-data-plans-causes-cdn-overages/
======
robomartin
I have never considered Apple to be a solid software company. They are a
marketing company in the tech sector. If they made washing machines they'd be
sexy as hell. In my years of exposure to Apple tech I have always been in awe
of their industrial design, mechanical design and manufacturing prowess. Their
software, however, often leaves much to be desired. I say this both as a
developer and a user. From OSX to iOS there are amazing patches of software
manure in plain sight. And it isn't getting any better.

~~~
pretoriusB
> _I have never considered Apple to be a solid software company._

What does that even mean? That they make bad software? For one, they make the
most popular certified UNIX OS in the world, and one of the two most popular
mobile OSs. Their professional offerings are great too: Logic Pro, Aperture,
Final Cut Pro, etc. In fact, FCPX aside, they are professional staples, with
few competitors.

And a huge number of desktop users and the majority of mobile users use a
browser that they developed into the best game in town, starting from humble
open source origins (khtml -> webkit).

And that Clang thing, that FreeBSD recently adopted? Their work too, along
with other LLVM infrastructure.

> _Their software, however, often leaves much to be desired. I say this both
> as a developer and a user._

Care to mention any substantially better mobile OS than iOS?

(One might argue that Android is better. But substantially better, no way in
hell).

~~~
devcpp
Popularity means good software? Windows must be the best OS by very far then!

Regarding mobile OS, we all know this now depends on the number of "apps"
available. Many may have made a superior OS, but adoption has been too low
because of the lack of apps.

~~~
pretoriusB
> _Popularity means good software? Windows must be the best OS by very far
> then!_

Yes, Windows too has historically delivered the things its users cared about.

"Good software" is not some mystical status of code perfection, it's doing the
things its users want it to do. So, yes, popularity equals good software.

Software, as any engineering effort, is measured by its utility and use. It's
not like art where popular is not necessarily good. Unless you are some kind
of dreadful "code poet", whose programs nobody uses.

But notice how I didn't just say that Apple's software is good because it's
popular. For example, I challenged anyone to mention a "substantially better
mobile OS than iOS".

> _Regarding mobile OS, we all know this now depends on the number of "apps"
> available._

No, even without apps, the basic OS is just as important. iPhone, at its
introduction, had by far the best OS of any smartphone out there, even
without any third party apps at all.

> _Many may have made a superior OS, but adoption has been too low because of
> the lack of apps._

Really? Disregarding apps at all from the comparison, do you care to mention
one "superior mobile OS", let alone "many"?

------
untog
This is where Apple's closed nature is very annoying. Assuming the article is
correct, it was a bug introduced in 6.0, and fixed in 6.0.1. So Apple knew about
it, investigated it and published a fix.

Yet they didn't tell anyone about it, or about the crippling effect this bug
could have on users' data allowances. Why not? Are they really that terrified
of admitting to making a mistake?

~~~
rachelbythebay
This same behavior has been happening to me since before iOS 6 came out. I can
show the same "206 storm" happening as far back as August of last year with
iOS 4.3.5.

~~~
randall
I used to work at <http://castfire.com/>, and we saw a similar issue
from a LOT of iOS clients. Stats become increasingly difficult, because we saw
one unique request bounce with upwards of 50+ 206s for similar / overlapping
media... and then it bounced through a cell provider, to a WiFi network, and
back, etc.

(we identified by redirecting all requests through a complex process of
assigning an ID to them)

Hoping this is fixed. Podcast metrics are tricky, and this has been
exacerbating the issue.

------
Ensorceled
This is the reason Forstall has left the building, iOS 6 is a disaster:

    
    
        * Maps was not ready for prime time (so bad it required a public apology)
        * Podcast was not ready for prime time.
        * Passbook was not ready for prime time.
        * App store is a step backwards.
        * Battery and WIFI issues.
        * Soooo many bugs.
    

There are usually problems but this is the first time I've seen so much utter
crap shipped including tons of minor bugs and issues (well Mobile Me was
pretty bad :-)

I'm using Audible, Downcast and the built in Music app where I used to just
use the Music app. iOS 6 was a huge step backwards for me.

Rather than waiting until it's ready (think copy/paste on iOS) Forstall was
responsible for shipping a mess. Hopefully this will fix itself with new
leadership.

~~~
gurkendoktor
How do you know that Forstall was the person advocating a yearly OS upgrade
cycle? It seems just as likely to me that this rigid schedule was forced onto
him and he couldn't deliver. Things like OS X Lion (a huge change under the
hood) and the Maps places database just aren't things that can be built to a
deadline.

~~~
Ensorceled
Well, a mess was shipped and he was the guy in charge. So I'm assuming he was
the guy responsible. If a mess really was forced upon him he should have
resigned instead of sticking around to ship the mess.

I have been in that exact same situation: "This will not be associated with
me, I'm out."

~~~
gurkendoktor
Now you are talking about saving face, not about saving iOS. Maybe he _was_
the best guy for the job and he still couldn't do it.

------
dageshi
Is it just me or are apple making more dumb mistakes lately? There's this and
that ajax caching issue from a month or two ago.

Perhaps I'm just having selective memory?

~~~
danso
I dunno, how crummy the App Store has become is a perfect example of this. The
App Store is one prominent way they make loads of money and build loyalty to
the ecosystem, and yet it is a shitshow on iOS 6...the responsiveness is as bad
as I've seen on non-iOS software, but for me, the two really inexplicable
failures are:

1\. the use of modals to show individual apps on the iPad. So if you get to an
app page and accidentally touch outside the modal, the app page disappears and
you have to go back to Safari to find the link that brought you there.

2\. On my brand new iPod, the Genius View has never NOT crashed. I know Genius
(and most of Apple's search services) are poor to begin with, but it's a
prominent button on the nav. How does it just not work on a brand new device?

3\. I'm also guessing the lack of a light sensor on the new iPod was the
result of a screwup in the dev cycle...I would've never guessed that its
absence would be such a deficiency but when you can't see a single thing on
the iPod merely because you stepped from indoors to outdoors...to the point
where you have to go back inside to adjust the brightness...that's just
infuriating.

I can't remember the last time I've been infuriated by an Apple device...I
feel infuriated all the time with my expensive Sony camera and how its video
button is right where you can accidentally bump into it when shooting a
pic...which prevents you from shooting a photo. The last few Apple devices
I've had (two laptops, 2 phones, 2 iPads, countless iPods) have always been
satisfactory in terms of polish...not iOS6 nor the new iPod touch

~~~
scrumper
Your Sony: is it an NEX-7? There's a firmware update just out which fixes the
video button problem: <http://blog.sony.com/alphafirmwareupdates>

~~~
danso
You have made my day...that blasted button is so clearly in the wrong spot
that every time I bump it I secretly hope the Sony engineer responsible has
been fired

------
chrisrhoden
Author of the post here, happy to answer any questions. I'm pretty confident
in our findings here, and some of the behavior is just bizarre.

~~~
sigzero
Does 6.0.1 fix it?

~~~
binarycrusader
From the fine article: "We have been unable to reproduce the issue using iOS 5
or using iOS 6.0.1".

So the problem's already been fixed as far as they can tell.

~~~
kookster
True. The problem is that we are seeing > 20% of users still on iOS 6.0.0, and
still chewing up bandwidth. Upgrade, please upgrade.

------
riobard
Hmm, so this explained my 3G data overage last month. I have 6GB/month of
data, and usually I'd be using around only 700MB. But last month my iPhone 4S
(running iOS 6.0) consumed nearly 8GB in total, resulting in quite an overage
fee.

Should I ask Apple for a refund for this? :|

~~~
malyk
Last month I moved and didn't have an ISP at home for the first 3 weeks or so.
I set up tethering on my phone and a 5GB data plan. Yesterday I got a notice
that I was over my data limit. I haven't done a whole lot of web surfing in
the new house, but I have listened to a ton of podcasts using the Podcast app
which I have never done before.

So if I just blew through 5 GB of data in 2ish weeks and the only thing
different was using the podcasts app...

(I did watch 3 episodes of walking dead on Netflix which obviously contributed
to the data use, but a 45 minute episode can't be that big)

~~~
burriko
Netflix's 720p streams use 2.3GB/hour, which would certainly put you over 5GB.
I'm not sure about their lower quality streams though.
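
A rough sketch of the arithmetic behind that estimate, assuming all three
45-minute episodes streamed at the quoted 720p rate for their full length
(which may well not hold on a 3G connection). The function name and its
parameters are made up for illustration.

```python
# Back-of-the-envelope estimate. The 2.3 GB/hour figure is the one quoted
# in the comment above; everything else here is an assumption.
GB_PER_HOUR_720P = 2.3


def stream_usage_gb(episodes: int, minutes_each: float,
                    gb_per_hour: float = GB_PER_HOUR_720P) -> float:
    """Estimate data used, in GB, for a number of equal-length episodes."""
    hours = episodes * minutes_each / 60
    return hours * gb_per_hour


usage = stream_usage_gb(episodes=3, minutes_each=45)
# 3 * 45 min = 2.25 h, and 2.25 h * 2.3 GB/h is roughly 5.2 GB
print(f"{usage:.1f} GB")
```

At that rate the three episodes alone would land slightly above a 5GB plan,
so a lower-bitrate stream plus the podcast traffic is also a plausible mix.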

~~~
malyk
Hmmm, maybe that is it then. I'd be surprised if I was getting that data rate
on my 3G connection (iphone 4s, at&t).

I'm still shocked it happened so quickly though. I guess 5 GB isn't what it
used to be! ;)

------
noinput
Same here, noticed a huge spike in bandwidth for my startup's shows hosted on
CloudFront. 206 responses are flooding my logs and I moved everything off of
CloudFront for my users to triage the cost. This breakdown explains it all:
<http://cl.ly/image/2H3u3u2O333g>

Talk about a pain, not to mention the 10x bill from Amazon that came as a
surprise.

~~~
donavanm
"I moved everything off of CloudFront for my users to triage the cost."

Can you explain more? Was there some cloudfront behavior exacerbating the
issue? AFAIK this is just a client (mis)using range requests and CloudFront
replying as requested.
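
For anyone unfamiliar with what "replying as requested" means for a range
request, here is a simplified sketch of how a server might resolve a
single-range `Range: bytes=...` header into the byte span of a 206 response.
It is not CloudFront's code, and `resolve_range` / `body_length` are
hypothetical names; multi-range headers are deliberately not handled.

```python
import re
from typing import Optional, Tuple


def resolve_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Resolve a single 'bytes=' range against a resource of `size` bytes.

    Returns inclusive (first, last) offsets, or None when the range is
    unsatisfiable or not in the simple form handled by this sketch.
    """
    m = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not m or size <= 0:
        return None
    first_s, last_s = m.groups()
    if first_s == "" and last_s == "":
        return None
    if first_s == "":
        # Suffix form "bytes=-N": the last N bytes of the resource.
        n = int(last_s)
        if n == 0:
            return None
        return (max(size - n, 0), size - 1)
    first = int(first_s)
    if first >= size:
        return None
    # An open-ended or too-large end offset is clamped to the last byte.
    last = min(int(last_s), size - 1) if last_s else size - 1
    if last < first:
        return None
    return (first, last)


def body_length(span: Tuple[int, int]) -> int:
    """Number of body bytes a 206 response for this span would carry."""
    first, last = span
    return last - first + 1


# A two-byte probe and an open-ended request against a 7001-byte file.
print(resolve_range("bytes=0-1", 7001))
print(resolve_range("bytes=0-", 7001))
```

The point is that the server has no way to know the client already holds
those bytes; each request is valid on its own and gets billed on its own.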

~~~
noinput
You are correct and I should have written that note differently. I moved some
high traffic files to some endpoints I have where bandwidth isn't a concern to
change the formula a bit (primary) and the rest are back to using basic S3 for
now. The bandwidth is still being consumed, however now it's easier on the
wallet.

After looking more closely at the pricing differences between S3 and
CloudFront, I realize it appears cheaper to be on CloudFront. I wrongly
assumed that CloudFront data-out was billed in addition to S3 because of the
way it's notated on the usage page ("AWS Data Transfer (excluding Amazon
CloudFront)"). To prove it I ran two reports on CF (left) and S3 (right):
<http://cl.ly/image/0q0y3u2X0g1a> and you can tell where I made the change, as
well as that data is only counted on the service in question and not on both.

Can anyone else confirm the above?

~~~
donavanm
Yeah, that's a bit confusing. Using Cloudfront with an S3 bucket is double
billed, the first time. On a cache miss Cloudfront pulls the object from S3
and serves it to the client. S3 bills the regular data transfer out.
Cloudfront then bills the regular data out to client rate.

On subsequent requests, cloudfront cache hits, you're only billed by
Cloudfront. Cloudfront request + byte rates are cheaper than S3 in us-east-1,
IIRC. So on popular or high ttl objects it's cheaper to serve through
Cloudfront. On low ttl or low rps, like a few requests per day, it's cheaper
to use standalone s3.

The same origin + CDN vs CDN Hit math applies to EC2 as well. I do wish the
billing was clearer in these scenarios.
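
A minimal sketch of the origin + CDN vs. CDN-hit math described above,
counting data transfer only and ignoring per-request fees. The rates used in
the example are placeholders picked to make the comparison visible, not real
AWS prices, and the function names are invented for illustration.

```python
def origin_only_cost(requests: int, size_gb: float, origin_rate: float) -> float:
    """Every request is served, and billed, by the origin (e.g. S3)."""
    return requests * size_gb * origin_rate


def cdn_cost(requests: int, misses: int, size_gb: float,
             origin_rate: float, cdn_rate: float) -> float:
    """Each miss is billed origin -> CDN; every request is billed CDN -> client."""
    return misses * size_gb * origin_rate + requests * size_gb * cdn_rate


def break_even_miss_ratio(origin_rate: float, cdn_rate: float) -> float:
    """Miss ratio below which the CDN comes out cheaper than origin-only."""
    return (origin_rate - cdn_rate) / origin_rate


# Placeholder rates in $/GB (assumptions, not a price list).
ORIGIN_RATE, CDN_RATE = 0.12, 0.10
SIZE_GB = 0.05  # a ~50MB episode

# Popular object: 1000 requests, only 10 of them cache misses.
print(origin_only_cost(1000, SIZE_GB, ORIGIN_RATE),
      cdn_cost(1000, 10, SIZE_GB, ORIGIN_RATE, CDN_RATE))

# Rarely requested object: 3 requests, all of them misses.
print(origin_only_cost(3, SIZE_GB, ORIGIN_RATE),
      cdn_cost(3, 3, SIZE_GB, ORIGIN_RATE, CDN_RATE))
```

With these made-up rates the CDN wins for the popular object and loses for
the rarely requested one, which is the pattern described above.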

~~~
noinput
Thank you so much for breaking that down.

------
owenfi
Our very small outfit at verbmill.net is seeing similar signs. For instance,
700MB+ downloaded by a single client when an individual podcast is a maximum
of ~50MB. Checking my logs now to see if I can narrow down device/version
number. All of our files are hosted on S3.

~~~
owenfi
Ominous example.

    
    
      Reqs	Total	Avg	B/s	Client
      666	2G	3M	127K	mobile-198-xxx-xxx-088.mycingular.net.
      666	AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)
    

I hope this user is on an unlimited plan :(

~~~
fruchtose
Wow, that is truly unfortunate. How long did it take for that 2GB to be chewed
up?

------
erickhill
I believe this is directly related to phone data usage overages as well.
Having used an iPhone 4 on ATT for 2 years, I could almost guess my data usage
perfectly each month.

By the 2nd day of using my new Verizon iP5, I was receiving warning texts from
Verizon that I was about to exceed my data limits.

Thinking it was me, I increased my plan. That only stopped the bleeding for
another few days.

Oh, and I installed the "fix" that was sent out to supposedly repair the known
issue. (I was always on Wi-Fi networks but still being charged for data.)

I recently called Verizon to dispute the data. They quickly erased all of my
data overages. Snap, like that. The man I spoke with called a technician on my
behalf.

The people who can erase overages do not officially have this issue in their
database as something to be aware of. But the technician he spoke with said
there was absolutely a problem, and both Verizon and Apple knew about it and
were working on it.

Then, I was given a "diamond ticket" to absolve me of any future overages
until a fix is found.

------
fernandezpablo
Makes you think twice about those graphics where iOS devices take a big piece
of the pie of the "internet usage"

~~~
eridius
No it doesn't. Those aren't tracking bytes streaming over the internet, but
rather web browsers visiting pages.

~~~
smackfu
Even ignoring the bytes, it looks like one client may show up in the web logs
as downloading the file a dozen times. If you aren't careful about cleaning up
the stats, that may overrepresent iOS.
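
One possible way to do that cleanup, sketched under the assumption that a
"download" can be approximated as a burst of requests from the same IP and
user agent for the same path. `Hit`, `count_sessions` and the 30-minute
window are all invented for this example; shared IPs will be undercounted
and clients that hop between networks will be overcounted.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    ip: str
    user_agent: str
    path: str
    ts: float  # request time, seconds since the epoch


def count_sessions(hits: Iterable[Hit], window: float = 1800.0) -> Dict[str, int]:
    """Count one download per (ip, user_agent, path) burst of requests.

    A new download is counted only when the gap since the previous request
    with the same key exceeds `window` seconds.
    """
    last_seen: Dict[Tuple[str, str, str], float] = {}
    sessions: Dict[str, int] = {}
    for hit in sorted(hits, key=lambda h: h.ts):
        key = (hit.ip, hit.user_agent, hit.path)
        prev = last_seen.get(key)
        if prev is None or hit.ts - prev > window:
            sessions[hit.path] = sessions.get(hit.path, 0) + 1
        last_seen[key] = hit.ts
    return sessions


# Twelve rapid requests from one client plus one from another client.
storm = [Hit("10.0.0.1", "AppleCoreMedia", "/ep1.mp3", float(i)) for i in range(12)]
other = [Hit("10.0.0.2", "OtherPlayer", "/ep1.mp3", 5.0)]
print(count_sessions(storm + other))  # {'/ep1.mp3': 2}
```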

------
andrewcastmate
I don't have anything wildly new to add, but I'm part of a podcast hosting
site (<http://Castmate.fm>) and we've seen this bigtime, it's crazy.

We host some high-traffic podcasts with a huge range of listeners, and I've
gone through the logs and regularly see stuff like NUMEROUS listeners making
4000+ http requests in 30 minutes, all the 206 calls everyone here is
reporting, it's just nuts, so bizarre.

I feel terrible for all the people getting hit with overages, it's so gross. I
did some guerrilla math the other day and feel pretty confident that this has
boosted our network usage by ~25%, I should have narrowed it down to just the
affected iOS 6.0 devices, must be astronomical.

------
rachelbythebay
This just seems to be how they handle audio media. It's not even anything
specific to podcasts. My own scanner stuff is just a web page, and it gets hit
by this from time to time. Each call is a single file, it's static, and
there's nothing funny going on with it. It looks like the playback stuff is
jumping around in some attempt to figure out exactly what it is first. It
seems like it would be far better if they always downloaded at least some
reasonable quantity and then decided to bail out later if it seemed overly
huge, but this is where we are.

To me, it feels like someone who's trying to be fancy and do things like
seek() in a stream. Just because HTTP has progressed to the point where you
can effectively do that with Range: requests doesn't mean it's a good idea.
Offering seek() type behavior for an HTTP resource seems like a great way to
get people to use it and not even realize what sort of messes they may be
creating. Leaky abstractions, eh?

Here's just one example from last week. Note: the file is 7001 bytes. There's
no reason for this insanity, and yet here it is.

    
    
        x.x.x.x - - [04/Nov/2012:16:02:47 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 2 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:47 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 7001 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:47 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 2 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:47 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 2 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:47 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 2 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:48 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 7001 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:48 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 7001 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:48 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 2 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:48 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 7001 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
        x.x.x.x - - [04/Nov/2012:16:02:48 -0800] "GET /audio/16-1352070236.mp3 HTTP/1.1" 206 7001 "-" "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)" "scanner.rachelbythebay.com" "-"
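
For anyone who wants to check their own logs for the same pattern, here is a
rough sketch that tallies 206 responses per client and file. It assumes log
lines shaped like the excerpt above (combined log format with extra trailing
fields); `LINE_RE` and `tally_partial_content` are names made up for this
example, and other log formats will need a different pattern.

```python
import re
from collections import defaultdict

# Matches the leading fields of lines shaped like the excerpt above.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally_partial_content(lines):
    """Return {(ip, agent, path): [request_count, total_bytes]} for 206s."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("status") != "206":
            continue
        sent = 0 if m.group("bytes") == "-" else int(m.group("bytes"))
        entry = totals[(m.group("ip"), m.group("agent"), m.group("path"))]
        entry[0] += 1
        entry[1] += sent
    return dict(totals)


# Response sizes of the ten requests in the excerpt, in order.
SIZES = [2, 7001, 2, 2, 2, 7001, 7001, 2, 7001, 7001]
AGENT = "AppleCoreMedia/1.0.0.10A403 (iPhone; U; CPU OS 6_0 like Mac OS X; en_us)"
sample = [
    f'x.x.x.x - - [04/Nov/2012:16:02:47 -0800] '
    f'"GET /audio/16-1352070236.mp3 HTTP/1.1" 206 {n} "-" "{AGENT}" '
    f'"scanner.rachelbythebay.com" "-"'
    for n in SIZES
]
for key, (count, total) in tally_partial_content(sample).items():
    # Ten requests and 35015 bytes for a 7001-byte file: about 5x the file.
    print(key[2], count, total)
```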

~~~
chrisrhoden
Yep, as I mentioned in the article this is the AV Foundation framework -
presumably used in all Apple apps since it's the official API they recommend.
Doesn't surprise me that the browser would be affected.

The weird thing is that, when we first started doing the tests, it would
happen immediately on pressing play. After some period of time running the
tests, it started happening later and later in the process, indicating - what?
No clue.

------
kemiller
This is exactly what I saw with the Podcast app last week. Used up 2GB in
about 6 hours and put me into overage. I bet the class action lawyers are
already circling.

------
kokey
I am anxiously awaiting an update to the Podcast app which fixes the issues
that I am having, but perhaps I'm too optimistic. I listen to some podcasts
from the BBC and the app doesn't recognise that a new daily show with the same
name is new, the only way around it I have found is to delete my subscription
and manually find the podcast every morning. Apart from that the thing crashes
and runs very slowly at times.

------
kannan
Now that explains the sudden spurt in my 3G data usage in the last 3 weeks.
This forced me to switch off my 3G :(. Hope Apple releases a fix soon

~~~
farski
We (PRX) haven't seen any evidence to show that it is still an issue in iOS
6.0.1, so you should be all set if you've upgraded, or an upgrade should fix
it. If you have already upgraded and are still seeing overuse, please comment.

~~~
kannan
I was (and still am) using iOS 6.0.1 when I started seeing the surge in data
usage

------
meej
Hmmm, I had to turn off the LTE on my iPad two days ago because I hit my limit
10 days prior to my monthly renewal. I upgraded to 6.0.1 last week, but I
wonder if this bug explains how I managed to chew through my data plan faster
than ever before.

------
kevingadd
The iOS 6.0 podcast bug apparently caused huge overages for GiantBomb as well.
They're discussing it right now on one of their live streams; it apparently
was so bad that their corporate/finance people are upset with them.

------
bquarant
NoAgendaShow.com has been discussing this since iOS 6 launched. Don't upgrade.

~~~
berberich
As pointed out above, 6.0.1 fixes this, so upgrading shouldn't be an issue.

~~~
bquarant
Good point. But then you have Apple Maps.

------
Mizza
Ouch, elsewhere on the internet it seems like people are actually getting hit
by this. Hopefully damage won't be too bad. Good research, PRX!

------
blinkingled
I wonder whether the CDNs have any safeguards in place to stop a single client
from downloading the same resource multiple times?

~~~
chc
That doesn't seem practical. How would they know if it's the same client? Just
watching for multiple requests from the same IP will result in a lot of false
positives (e.g. most college students will be SOL). And what would they do if
they could detect it but the client didn't realize it already had the file?
Just drop the request?

~~~
blinkingled
Load balancers can insert cookies to identify clients uniquely. (That solves
the proxy/single-ip-multiple-clients problem - multiple clients from one proxy
IP can still be uniquely tracked this way.) The newer load balancers
(F5/iRules for e.g.) are also capable of doing per resource, per client
throttling using the inserted cookies or session table.
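
To make the idea concrete, here is a toy sketch of per-client, per-resource
throttling keyed on an identifier such as an inserted cookie. It is plain
Python, not an F5/iRules API; `ResourceBudget`, `allow` and the byte-budget
policy are all hypothetical, and the in-memory table is exactly the kind of
per-client state that gets expensive at CDN scale.

```python
from collections import defaultdict


class ResourceBudget:
    """Cap the bytes served to one client for one resource.

    A client may receive at most `max_multiple` times the resource's size
    before further requests for that resource are refused.
    """

    def __init__(self, max_multiple: float = 3.0):
        self.max_multiple = max_multiple
        self.served = defaultdict(int)  # (client_id, resource) -> bytes served

    def allow(self, client_id: str, resource: str,
              resource_size: int, response_bytes: int) -> bool:
        key = (client_id, resource)
        if self.served[key] + response_bytes > self.max_multiple * resource_size:
            return False  # over budget: the caller decides how to respond
        self.served[key] += response_bytes
        return True


budget = ResourceBudget(max_multiple=2)
for attempt in range(3):
    # The third full-size fetch of a 7001-byte file exceeds the 2x budget.
    print(attempt, budget.allow("cookie-abc", "/audio/a.mp3", 7001, 7001))
```

What to do once `allow` returns False (throttle, error, or serve anyway) is
a policy question this sketch does not answer.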

~~~
donavanm
Server side state per client is insane. Pushing it client side would be what,
a KB? With an average object size of 50 KB that's 2%. I doubt CDN customers
would be willing to take a 2% cost hit for this feature.

Now, assume we solve all that. The state is pushed on the client. The
performance hit is so small as to be free. And the CDN eats the dev and opex.
So what does the CDN do when "bad" range requests are detected? Throw a 400?
What customer is willing to break all iOS 6 clients?

Also, iRules. Serious LOLs.
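
The 2% figure above is just the ratio of the state carried on each exchange
to the size of the response. A tiny sketch of that arithmetic, using the
guessed numbers from this comment (1 KB of state, 50 KB average response)
and a hypothetical helper name:

```python
KB = 1000  # decimal kilobytes, to keep the arithmetic simple


def overhead_fraction(state_bytes: int, response_bytes: int) -> float:
    """Fraction of extra bytes added by carrying state on each response."""
    return state_bytes / response_bytes


# 1 KB of client-side state against a 50 KB average response.
print(f"{overhead_fraction(1 * KB, 50 * KB):.1%}")  # 2.0%
```

The fraction shrinks as the response grows, so the result depends heavily
on whether the state rides on every small ranged response or on one large
transfer.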

~~~
blinkingled
For starters, the average podcast download is 40MB or so. More than 50KB in
any case. Next, maintaining state per client isn't anything new - it might not be
ideal for CDNs but many LBs keep state for things such as maintaining session
persistence etc.

If there is a potential for clients intentionally or unintentionally jacking
up your CDN bill - I am sure somebody has a solution to prevent it -
especially since money is at stake.

You laughed at iRules - I got that part, but what about iRules doing HTTP
Request throttling? Far as I know it can do that based on source ip and port,
uri etc. which should work for this kind of scenario with some modifications.

~~~
donavanm
Your average podcast source material may be 40MB. Clients will most assuredly
not be fetching that all in a single request. Your state transfer overhead is
per request. The 50 KB estimate is an educated guess at the average size of an
HTTP response from a CDN. You have to keep state on every request, otherwise
how can you detect your "bad request" patterns. Now, you could only stick your
state on to Range'd requests. But a lot (most?) of the customers affected by
this are doing a significant portion of requests as ranges already.

iRules, specifically, don't work for any reasonable packet rate. Every request
must now come off the NIC, across the bus, hit CPU, hit memory a few times,
and back. Tracking state, in general, kills packet rates. There's no guarantee
that your flow is going to be hitting the same interface, interrupt,
processor, or even host. At every one of those levels shared state
dramatically increases complexity and reduces your max possible packet rate.
Silicon really can't flip bits that quickly.

Which gets us back to the business question. So you've found a "bad" client.
What acceptable action can you take? Throttle all iOS 6 users? Throw 400s?

To be honest here this excess cost is going to be absorbed in three places. 1)
end users will suck it up in data charges because they have no alternative 2)
sites will eat the bandwidth charges. They can't pass it on if they have no
directly associated revenue. Or they don't want to lose customers. 3)
CDN/providers will take a relatively small hit issuing credits to keep their
customers happy.

Notice who won't lose a cent here? Apple and other broken client providers.

