-1 for Google for creating what may be the biggest threat to content providers yet: enabling easy-to-use DDoS attacks across the entire interwebs.
Seriously, is this what we have to look forward to when Google Spreadsheets, and God knows what else, become ever-more popular?
Think about all the additional onerous costs that would be incurred by content providers as more and more Google Spreadsheet users hotlink images, mp3s, videos...
This has to be a bad design decision by Google; there's no need to re-download assets by the hour, on the hour, regardless of whether the user's spreadsheet is open or not.
Is it time to go back to the days of putting your web assets behind an $HTTP_REFERER check?
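For what it's worth, a minimal sketch of that kind of check, assuming a small Flask app fronting the assets (the allowed domains are made up for illustration):

    from flask import Flask, abort, request, send_from_directory

    app = Flask(__name__)

    # Hypothetical whitelist of pages allowed to embed/hotlink our assets.
    ALLOWED_REFERERS = ("https://example.com/", "https://www.example.com/")

    @app.route("/assets/<path:filename>")
    def assets(filename):
        referer = request.headers.get("Referer", "")
        # Reject requests whose Referer doesn't start with one of our own pages
        # (note: an empty Referer is also rejected here, which breaks some clients).
        if not referer.startswith(ALLOWED_REFERERS):
            abort(403)
        return send_from_directory("assets", filename)

Of course, the Referer header is trivially spoofed and some clients omit it entirely, so this only deters the casual hotlinker.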
Or maybe this is a new type of flood attack, cost-of-service (CoS), applicable against those who use ‘invincible’ cloud infrastructure such as Amazon's.
Cost-of-service attack: consuming bandwidth or other resources in cloud-based solutions to drive up the cost of running the service.
Very easy when the cost is so tightly coupled with the resource use...
This is why most cloud services have a customisable limit on how many virtual servers can be booted automatically; a solution for protecting S3 has yet to be found.
"[snip]..one might envision that instead of worrying about a lack of resources, the elasticity of the cloud could actually provide a surplus of compute, network and storage utility that could be just as bad as a deficit"
I dunno though, it's all technically a bandwidth flood in the end. The consequences (or goals) of a flood may include immediate or eventual denial of service, high cost of service, various security attacks, or probably all at once; further classification surely gets complicated.
Meanwhile, both terms, DoS and CoS, are a bit vague too. Say, turning off a server or forcing providers to alter their DNS records more or less qualifies as a denial-of-service attack (which, by Wikipedia's definition, is “an attempt to make a computer or network resource unavailable to its intended users”).
Anyway... you're right, it boils down to a bandwidth flood (or maybe CPU) in the end.
Maybe CoS is not really a technical term, just made more relevant by the technological changes. OP would have been charged much less had he not been using "the cloud".
Let's say I have a smallish competitor hosting their app on EC2. Using classical DoS techniques, how much can I ramp up their bill before they notice? Or before they can actually stop it? What if I run it on a smaller scale, "just" doubling their bandwidth bill... will they notice? Maybe they'll believe it's extra users ;)
If it requires a lot of effort to productize, Amazon could potentially offer it as a service at $0.50/hr to ensure you are not making major goof-ups.
In any case, it may not be a denial-of-service, but if you can find someone with an S3 bucket with large files you could maliciously cause them to rack up a huge bandwidth bill using this mechanism. I guess you could say it's a DDOM (Distributed Denial of Money).
As for the by-the-hour downloads, well, there again is a trade-off between serving data quickly and doing lazy evaluation.
I think, as the author notes, it was just an unfortunate event: good design decisions that went bad circumstantially and wreaked havoc for the author.
It was certainly nice of Amazon to have made that refund. The resources (read: bandwidth) were used, after all.
I don't see why Google doesn't re-download them using If-Modified-Since, reusing the originally downloaded photos whenever the origin server says they haven't changed.
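For reference, a sketch of what a better-behaved fetcher could do, using Python's requests library (the URL is a placeholder):

    import requests

    url = "https://s3.amazonaws.com/some-bucket/photo.jpg"  # placeholder

    # First fetch: remember the validators the server hands back.
    first = requests.get(url)
    etag = first.headers.get("ETag")
    last_modified = first.headers.get("Last-Modified")
    cached_body = first.content

    # Hourly refresh: send the validators back as a conditional GET.
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified

    refresh = requests.get(url, headers=headers)
    if refresh.status_code == 304:
        body = cached_body        # unchanged: reuse the copy, no re-download
    else:
        body = refresh.content    # changed: take the new bytes

S3 serves ETag and Last-Modified on objects, so a 304 costs a few hundred bytes instead of the whole file.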
Can't you also view a URL as a password? (If only I know the URL, then only I can download the file.)
I can give out a URL to someone else so they can access the file; likewise, I can give out my file server's username and password, and whoever has it can also access my files.
There have been similar complaints about attempts to aggregate the contents of Hacker News posts, for example.
Where would we be without Google taking such a noble and strongly principled stance on user privacy, applied equally and consistently, striving to avoid legal responsibility, even when it costs unrelated parties thousands of dollars.
Wanna help make the web better? Stop using Google and use a competitor like DuckDuckGo.com. Google has too much power, too much control.
The result was an email account that got an email message every second or so in a variety of languages.
We quickly realized that you could have some fun by forwarding that address to someone else's email account.
Fast forward a few months, and Gmail smartly requires a confirmation before allowing you to forward.
I think I was hoping to produce an "email cannon" similar to yours - an email spammer to be directed at whoever I liked.
Amazon did a nice thing, but the services were used. Asking the user to pay for those services (even if it was a mistake) would not have been a "jerk move".
And it makes good business sense besides. They can let the guy off, eat the probably-less-than-a-hundred dollars the bandwidth actually cost them via their upstream providers, get a good write-up, and look better as a result...
...or kill his account, take him to collections, etc., etc. (which would probably cost them more than $1K anyway), lose a customer, and get a PR black eye while they're at it.
So yeah, it would be a jerk move, and a pretty dumb one at that.
Which would give Amazon more trouble? 1,000 TB spread out over a month, or 1,000 TB spread out over a day? Their current pricing model assumes both of these are equal, which they are plainly not.
Aside: caps make the same mistake. If you have a 250GB cap, the, ahem, ISP charges the same whether you burn through that cap in a month or in a day.
Ever had a cell phone? Minutes during the day count against an allotment; after a certain time at night, calls are free. For a home internet connection it would probably be reversed, as peak usage is right after work lets out.
Some power companies also do the same thing.
But in any case, even if in theory nothing useful was done with that bandwidth, are you sure that Amazon doesn't have to pay for it too? Because then it's irrelevant whether it was "really" used or not.
Maybe it wouldn't be $1000, but it'd still be significantly more than $0.
It would probably even be easier to completely drop a bill than to modify it to charge a lesser amount.
So this traffic must be almost as cheap as Amazon's own intra-territory exchange. That's probably why they were being so nice. They may even be somewhat happy to generate some more traffic directly toward Google, since the traffic ratio is usually leverage for the network guys.
But, surely, it's still very cool of them to drop the customer's bill.
I wonder, though, how one was ever to know that Google stripped their Feedfetcher of even the most basic "intellect" and allowed a single spreadsheet to pull in 250 gigs hourly while it wasn't even open.
So, in the end it's Amazon's and Google's poor designs that hammered the guy, and it was common business sense for Amazon to let the customer off and just study the case together with Google.
Offering the refund was really nice. Asking him to pay for the services would not have been a jerk move, because Amazon wasn't responsible for the extra costs either. I usually root for the little guy, but in this case, if anyone else should be expected to foot the bill, it's Google, for building a system that needlessly slurps 250GB/h without any apparent capping or rate-limiting, all triggered by a single spreadsheet. Just because Amazon S3 is huge and probably doesn't feel a thing doesn't mean it's a good idea; what if it were 500 spreadsheets doing this?
What if that Feedfetcher bot were to gorge itself on one of Google's own CDNs? Sure, it's probably just a drop in the ocean for Google as well, but I bet they'd rather not experience this "minor inefficiency".
Does merely having the spreadsheet passively open in a browser trigger that, or was some other process re-loading the spreadsheet every hour? (If the former, I wouldn't be as forgiving of Google. I understand the desire not to cache possibly-private data, but proper URL design and conditional GETs should be able to prevent the entire download on an automatic hourly schedule. And even if the latter – the author had chosen to reload the spreadsheet each hour – I'd want Google's design to allow browser-side caching to work for such embedded and/or generated images.)
In either case, this seems odd, unless the URL is especially noted as 'volatile', and/or there are other parts of the spreadsheet that might trigger conditional calculations/notifications based on that URL's contents. (And don't S3 resources have last-modified-dates or etags for conditional GETs?)
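They do; S3 objects carry both an ETag and a Last-Modified date, and you can even attach a Cache-Control header at upload time so browser-side caching has something to work with. A minimal boto3 sketch, with an invented bucket and key:

    import boto3

    s3 = boto3.client("s3")

    # CacheControl lets browsers and intermediate caches reuse the object
    # instead of re-fetching the full body every time.
    with open("photo.jpg", "rb") as f:
        s3.put_object(
            Bucket="my-example-bucket",
            Key="images/photo.jpg",
            Body=f,
            ContentType="image/jpeg",
            CacheControl="public, max-age=3600",
        )

    # The stored object carries validators a client can use for conditional GETs.
    head = s3.head_object(Bucket="my-example-bucket", Key="images/photo.jpg")
    print(head["ETag"], head["LastModified"])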
This is the main reason I'd never use AWS to host anything public.
First, the problem isn't inherent to virtual hosting services, you could just as easily get hit by this on a bare-metal site, though the interplay of S3 and Google Docs is an added dimension.
Cutting off all services opens the door for a class of DoS attacks. Simply direct enough traffic at a single account's assets, and you'll knock them offline for a given billing cycle. If the attack is cheap to launch (botnet, URL referral network, etc.) it's a cheap attack. Different entities would have different cut-off and degradation policies.
Better would be to identify the parameters of a specific anomalous traffic pattern, but this can be hard.
A more general solution is to set asset (server side) and client (remote side) caps in tiers. You'd want generous (but not unlimited) rates for legitimate crawlers, your own infrastructure, and major clients. The rest of the Net generally gets a lower service level. Such rules are not trivial to set up, and assistance through AWS or other cloud hosting providers would be very useful.
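A very rough sketch of what such tiered caps could look like, with in-memory counters and made-up limits (a real deployment would need shared state and smarter client classification):

    import time
    from collections import defaultdict

    # Bytes allowed per client per hour, by tier (numbers purely illustrative).
    TIER_LIMITS = {
        "own_infrastructure": None,        # effectively unlimited
        "known_crawler": 50 * 2**30,       # 50 GiB/hour
        "major_client": 20 * 2**30,        # 20 GiB/hour
        "default": 1 * 2**30,              # 1 GiB/hour for the rest of the net
    }

    usage = defaultdict(lambda: [0, time.time()])  # client_id -> [bytes_used, window_start]

    def allow(client_id, tier, response_bytes):
        """Return True if this response still fits the client's hourly byte budget."""
        limit = TIER_LIMITS.get(tier, TIER_LIMITS["default"])
        used, window_start = usage[client_id]
        if time.time() - window_start > 3600:      # roll over to a new hourly window
            used, window_start = 0, time.time()
        if limit is not None and used + response_bytes > limit:
            usage[client_id] = [used, window_start]
            return False                           # over cap: degrade or refuse service
        usage[client_id] = [used + response_bytes, window_start]
        return True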
I don't see this as a real issue (the potential DoS is there, but a bandwidth-metric kill switch isn't really the problem). I'd assume this would be a configurable setting that has to be enabled by choice. In the author's case I'm sure he'd prefer to have his service cut off before running up a $1k bill. If one of my test instances started bleeding bandwidth, I'd prefer it just get killed rather than rack up an ungainly bill. If it's your production service, then don't configure a cutoff.
Er... Because caps can never be raised mid-cycle?
Depending on the organization, though, you'd have to get purchase approval on the overage, etc.; the specifics vary. It makes for sticky questions to answer there as well, which is another argument for putting better management tools at the cloud level.
As for the "purchase approval" nonsense -- if the knob isn't right for your company, then don't use it, but there are a huge number of companies where the guy turning the knob is the one and only person with any say over money being spent at all.
Those of us who can't afford an extra £1000/month (or more if it's a dedicated attack) would love to be able to just turn it off when a threshold is hit.
Users may lose access, but at least you don't go bankrupt in the process.
Only if they are sure enough about their finances and their web service to be confident that the spike will be profitable.
For example, in Linode you can configure trigger conditions for CPU (it triggers at x% CPU for y time) and network usage (there are other conditions as well).
Very useful for flagging suspicious activity (or just plain accidents).
I killed my EVDO wireless service based on a similar experience with my cell provider, after requesting multiple times that they provide me capped service. It simply wasn't worth the downside cost risk.
(Before mentioning Amazon's scalability, consider that Hacker News is run on a single dedicated server, and the moral of the story seems to be how not to scale, especially when you don't want to.)
 - http://www.hetzner.de/en/hosting/produkte_rootserver/x3
I'm signing a deal (if all goes well) tomorrow for $0.65 per megabit. A 5 gigabit commit on a 10 gigabit port ($1.15/megabit overage charge, billed on the 90th percentile). From Cogent, probably the cheapest provider that claims 'tier 1' status (there's a lot of argument about Cogent's "tier 1" status... which is kind of funny; if you only have one provider, a good tier 2 is going to be more reliable than a tier 1 anyhow), but it is a real uplink and I really can use 5 gigabits of that.
If you ran one megabit full out for a month, assuming 2,629,743 seconds in a month and assuming Seagate gigabytes, divide by 8 and you get 328,717 megabytes. So, uh, yeah, two dollars a terabyte? Assuming they are larger than I am, and their BGP mix is the low-end stuff (Cogent and HE.net or the like), they aren't losing money.
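The arithmetic checks out, roughly (a quick sanity check on the figures above):

    seconds_per_month = 2_629_743        # ~30.44 days
    megabits = seconds_per_month * 1     # one megabit/sec, flat out, all month
    megabytes = megabits / 8             # ~328,717 MB, i.e. ~0.33 "seagate" TB
    commit_price = 0.65                  # $ per megabit per month, from the deal above
    print(commit_price / (megabytes / 1_000_000))   # ~ $1.98 per TB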
Guess the more you commit to, the better pricing you get!
On top of that, bandwidth costs fall... dramatically in competitive markets, and I'm in Silicon Valley, probably one of the most competitive transit markets. According to the graphs I've seen I'm paying below average, but it won't be very long before what I'm paying is average. I was explaining this to the real-estate guy who owns my data center and wanted to get in on the bandwidth business (he wanted to get people in for cheap and crank up the price on renewal, like you do with real estate and data-center space). His response? "But then why would anyone want to be in the bandwidth business?"
I mean, the fiber in the ground? That's like real estate. The prices go up, the prices go down, eh, whatever. But the amount of traffic you can push over two strands of fiber? That goes up all the time; I have some old Cisco 15540 ESPx DWDM units that can do 16 10Gbit/sec waves over one pair of fiber. They were awesome back in the day. Modern DWDM equipment? You can get 80 100Gbit/sec waves on one pair of fiber.
It's irritating, though, as like everything in this industry, you have to negotiate for months to get the "real price" - I asked Cogent for a single, capped 1 gigabit port for $1000/month north of 6 months ago. "Call me back when you can get me a buck a meg." They kept calling me back: "How about $3 a meg? How about $1.75 a meg?" I mean, even now, they beat the buck-a-meg price point, but I had to buy a lot more than I needed (I'm splitting it half and half with another company, at cost, so while the transaction cost was huge, once you factor in the discounted setup fees, well, I am still paying more than a grand a month, but it's still a pretty comfortable fee for me). I imagine Cogent has spent several thousand dollars of salesperson time, and I /know/ I have spent several thousand dollars of my time on this, and they are charging me rather less than if I had wasted almost none of their time. They even dropped the setup charge down to almost nothing.
And now I've gotta do the same thing all over again with a second provider (most likely he.net). What a waste of time and effort all around. I mean, a little bit? It's kinda fun; sales people are always ridiculously overdressed extroverts, and in this industry most of them can pick up on my personality and act in a way that is tolerable or even fun for short periods of time, but I really am an introvert. I mean, it can be fun for a while? But man, I have had like 5 meetings the last two days, between dealing with Cogent and dealing with the people that are buying half the pipe from me. It's exhausting, and I guess I have a hard time seeing how this is the most efficient way to sell bandwidth. I mean, I guess some people throw up their hands and pay the $3/meg asking price, and if they can break even on me, god damn, you wouldn't need many of the $3/meg customers to get really rich.
But then that raises the question: why bother with me? I mean, I'm going to turn around and sell transit very near this cost in a very public way, which will mean that more of those $3/meg customers are going to turn around and ask Cogent for a discount.
I guess they can rely on the fact that, well, I'm a scruffy introvert, and no large corporation will do business with me. Still, I mean, Cogent will let me drop 1G (capped, sadly) ports into datacenters where I don't have equipment for $650/month, and I've been running ads saying I would be willing to sell those at cost (and make my profit off of the difference between the list setup fee, $2500, and what they are actually charging me.) - This was mostly a way to get the higher commit pricing without actually paying for it all.
$4 a meg sounds really expensive; at that point it's almost worth going up a tier to have room for future expansion and pay less per megabit, or doing what you did and splitting the bandwidth and the cost.
Also, the majority of people won't need more than a single EC2 instance, which costs ~$20 a month (less for Linux). Amazon reviews and lowers prices periodically.
This is pretty clearly a different league of attack: rather than attacking systems or spending money, you'd be exploiting a Google feature to use Google's resources for free to incur a giant amount of data transfer.
But you could do a denial of service by making the service too expensive, and all of your work would be hidden behind the anonymity of the Google bot.
I can't help but think that if benign decisions lead to disasters like this in the cloud, how much destruction could robots wreak in the future due to similar benign choices?
Pretty interesting, though, and if this becomes a big enough story you can bet Google will change something; the last thing they need is someone using Google Docs to DoS websites.
As long as they don't do WMFs, with their code-injection-by-design functionality...
Also, why would the spreadsheet be fetching these images every hour? Did you have the spreadsheet open? Does Google make this call even when no one is viewing the spreadsheet?
Not on S3, no.
Plus, you cannot put a robots.txt at s3.amazonaws.com, so if the URL is accessed through the https://s3.amazonaws.com/.... form, the robots.txt will not work.
I do think they should change their process (making it lazy-load instead), but that's a different issue to robots.txt.
It's manually triggered to start downloading resources every hour regardless of whether someone needs them.
In that sense, any web spider is "manually triggered" as well ;-)
This is mentioned specifically in the article, in fact.
AWS could actually offer a setting for a max amount of bandwidth per hour or so, and alert early if there is suspicious activity.
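There's no hard bandwidth kill switch that I know of, but you can at least get the early alert part today. A minimal boto3 sketch of a CloudWatch billing alarm, assuming billing-metric delivery is enabled on the account and the SNS topic ARN (a placeholder here) already exists:

    import boto3

    # Billing metrics are published in us-east-1 regardless of where you run.
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    cloudwatch.put_metric_alarm(
        AlarmName="estimated-charges-over-100-usd",
        Namespace="AWS/Billing",
        MetricName="EstimatedCharges",
        Dimensions=[{"Name": "Currency", "Value": "USD"}],
        Statistic="Maximum",
        Period=21600,                     # evaluate every 6 hours
        EvaluationPeriods=1,
        Threshold=100.0,                  # fire once the month-to-date estimate passes $100
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # placeholder ARN
    )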
All this automation is very well, but this illustrates the dangers of running stuff on others' machines of unknown complexity and outside your own control, while having to pay for whatever may happen. Not for me, thanks.
I call into question whether these are "perfectly legitimate design decisions". Basically, if Google thinks the data is private or too sensitive to cache, then it shouldn't be this easy to have it automatically keep hitting a site like this. Google should have realized the potential for abuse here. I'm guessing it truly is an oversight on their part, as I can't imagine they'd want to waste all this bandwidth either; still, it's something they should figure out a solution for.
Clearly the client isn't presenting If-Modified-Since headers, as I believe S3 honors those.
You could rate limit the IPs.
The question is: how many IPs is Feedfetcher using?
"But how come did" indeed.