* No, using a colo saves you way more than just 20%. In one of our facilities, our org has maybe ~15k servers under management and dc costs are ~ 1 million/mo. Build out of the cages and racks wasn't incredibly expensive, I think ~2 million. Power isn't insane either.
* I have no idea what he's talking about re: the fiber statement. We have a blend of different backbone providers that give us ~100 gigabit for ~$0.60 per gig. Compared to bandwidth costs for AWS, it's dirt cheap.
* You don't need ultra fast hot swappable robotic arms if you're not FAANG. With a handful of dc techs and a simple monitoring system, you can swap out disks just as easily, at the expense of a little more opex.
* PCI requires no consultants, its terms are pretty black and white - you will need an ISA though. HIPAA can be more of a bear, but your org will need this consultation regardless of whether you own a dc or put it in the cloud. AWS does not give you automatic compliance.
* Yes I can totally believe that working at Google, you'd give TCOs that make Google look good.
Literally none of us except Lyft's operations knows what's good for Lyft. 100 million a year is a ton of cash, but gives them presence around the world. My company has datacenters around the world and it is very operationally complex to maintain them. It's definitely not for everyone, but this guy is making it sound worse than it actually is.
There are many steps you can take before buying concrete to build a DC. Even before you have to buy your own rack.
But, uh, wasn't this about their current expenses? Or their expenses at some point after their AWS bill got bigger than 1-2 employees?
And even with current expenses, building out a core competency in infrastructure that has nothing to do with your domain of expertise can be a distraction.
If you can't find anyone to even handle the hiring, let me stay far away from this company that teeters on the verge of collapse when someone gets sick.
In fact, I was offered a job as a “digital transformation consultant”/“enterprise architect” to be the face of a company that did just that - take over a complete project for funded startups or other companies who had an idea but no in house expertise. Of course I would be the face of the company where most of the work would be “rural sourced” and outsourced.
The alternative being that they should have built an expertise in using AWS instead?
I'm not particularly trying to argue either way; I'm just saying that claiming that using AWS doesn't need expertise is a false premise. AWS isn't special here, either.
If instead you use some sort of abstraction and/or tooling, now you need expertise in that tooling.
Uber's doing that and they're a comparable company.
Even Amazon uses its own separate, dedicated systems for a lot of stuff - not public AWS, I heard.
Microsoft is the most valuable company in the world. When it comes to stock, Facebook probably wouldn't be in FANG anymore because of its drastic stock drop over the last year.
There are dedicated REITs that hold only high-tech/data center real estate.
But yes, it was disturbing when Tesla was using Ubers for people who had exposed bone.
Providing transportation for patients is so that __people can get healthcare__! A hospital/HCSP not making a few hundred/thousand is completely insignificant when compared to the real problem: People can't get healthcare or can't get to it.
I just take issue with this viewpoint where the concern for the dollar is placed before people's needs.
On top of that you have to hire people to manage it.
I think they'd be much better off just renting some space in a datacenter and running the show themselves!
The way it's phrased, it's like each ride directly translates into 14 more cents for Amazon. But really, the figure comes from averaging Lyft's Amazon costs over Lyft's total rides.
Now, in some cases, that would be valid accounting, if all (or nearly all) of your costs (with respect to that service) are variable. For example, most of the cost of sending a letter is the truck/plane/employee, which are variable, so it makes sense to divide total truck etc. costs over total letters.
But would Lyft pay twice as much to Amazon if they served twice as many rides? Would they have to double every AWS service they're buying? I don't think so. They might have to add more of one or two kinds of server/services, but most of that 14 cents figure comes from amortizing fixed costs, which will go down if they scale up further.
Am I off base here?
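To make the fixed-versus-variable point concrete, here is a rough sketch. Every number in it is a made-up assumption, picked only so that the baseline average lands on 14 cents; none of it comes from Lyft's actual bill. `cost_per_ride` is a hypothetical helper.

```python
def cost_per_ride(fixed_monthly, variable_per_ride, rides):
    """Average monthly cloud cost per ride: amortized fixed part plus variable part."""
    return fixed_monthly / rides + variable_per_ride

# Hypothetical split, chosen so the average is 14 cents at the baseline volume.
FIXED_MONTHLY = 5_000_000.0   # assumed fixed spend, dollars/month
VARIABLE_PER_RIDE = 0.04      # assumed marginal cost, dollars/ride
BASELINE_RIDES = 50_000_000   # assumed rides/month

baseline = cost_per_ride(FIXED_MONTHLY, VARIABLE_PER_RIDE, BASELINE_RIDES)
doubled = cost_per_ride(FIXED_MONTHLY, VARIABLE_PER_RIDE, 2 * BASELINE_RIDES)
print(f"baseline: ${baseline:.2f}/ride, doubled volume: ${doubled:.2f}/ride")
```

Under these assumptions the average drops when volume doubles, because only the variable part scales with rides. If the bill were almost entirely variable, the two figures would be nearly identical.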
IT's version of the Laws of Thermodynamics is that nearly every interesting action you can do programmatically is going to take at least O(log n) time, where n is the amount of data you have (at rest, in flight, or both).
If I have 16x as much traffic as I used to deal with, my load isn't 16x higher, it's 64x higher - if I'm fortunate and we've engineered for scale. But it could easily be three orders of magnitude higher (16 × 16 × 4) if I'm not.
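One way to read the arithmetic above, as a sketch: treat the traffic multiplier as k and the load as k^e times log2(k), where e is 1 for the well-engineered case and 2 for the bad one. That formula is my interpretation of the comment, not a general law, and `load_factor` is a hypothetical helper.

```python
import math

def load_factor(traffic_factor, per_request_exponent=1):
    """Load multiplier k**e * log2(k) for a traffic multiplier k (an assumed model)."""
    return traffic_factor ** per_request_exponent * math.log2(traffic_factor)

engineered = load_factor(16)        # 16 * log2(16) = 16 * 4
unengineered = load_factor(16, 2)   # 16 * 16 * log2(16)
print(engineered, unengineered)
```

With k = 16 this gives 64 and 1024, which lines up with the "64x" and "three orders of magnitude" figures.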
For the 64x scenario I tell my boss to buy more hardware and I put medium to high priority tickets in the backlog to improve the worst cases. In the 3 orders situation, those tickets bump all of our other priorities, and there are more of them. Performance is now a feature. It sucks up a significant and very visible fraction of our engineering budget, which comes with its own sort of opportunity costs.
If you try to scale vertically, we all know that except at the low end, buying a server that's twice as beefy costs far more than twice as much. If you go horizontal, there are plateaus where your network topology has to get more complex (hardware cost, maintenance cost, speed, pick two). Personally, I think the biggest lie in cloud computing is that 10G ethernet fixes all of your network topology problems (ie, it's treated as magic that you don't have to worry about). As disks and PCIe get faster over the next couple years I think that'll be back on people's radars.
This despite it being fundamentally meaningless because of course it needs to be multiplied by some time-dimensioned value anyway.
Logs grow slowly.
But it could very likely be true that doubling the number of rides would have a potentially material impact on additional AWS spend for Lyft.
Edit: And, for that matter, they may be using AWS for things other than serving rides, e.g. dev tools and internal service that scale with the number of office employees, not rides.
Managing 5 shards is not the same amount of work as managing 50. For one, the surface area for cross-shard communication goes up. What happens if London or Manhattan are too big for one shard? Multi-tenant providers have been wrestling with the Big Customer problem forever, and sharding is only a bandaid.
However, in the case of AWS and other cloud service providers, virtually all products and services are charged on a per-usage basis. That means if the average Lyft ride requires 60 database API calls (say a 10-minute ride with location updates every 10 seconds = 60 location updates/db writes), then AWS charges you for 60 API calls. If the average ride takes 11 minutes, then it would take 66 API calls, and so forth. In this case, the cost is a variable cost and maps directly to a specific ride. If that ride had not existed, then that cost would not have been incurred.
Take a look at the AWS pricing pages and you will see that all prices are broken down into very specific per-usage costs. Some services break this down very finely: ingress usage, egress usage, storage usage, API usage, etc. all add up to the total cost of the service.
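The per-ride call count above is easy to sketch. The 10-second update interval comes from the example in the comment; the price per million writes below is a placeholder assumption, not a quoted AWS rate, and both function names are hypothetical.

```python
def writes_per_ride(ride_minutes, update_interval_seconds=10):
    """Number of location updates (one db write each) during a ride."""
    return (ride_minutes * 60) // update_interval_seconds

def write_cost_per_ride(ride_minutes, price_per_million_writes):
    """Dollar cost of those writes at an assumed per-million-writes price."""
    return writes_per_ride(ride_minutes) * price_per_million_writes / 1_000_000

ASSUMED_PRICE_PER_MILLION = 1.00  # placeholder, dollars per million writes

for minutes in (10, 11):
    calls = writes_per_ride(minutes)
    cost = write_cost_per_ride(minutes, ASSUMED_PRICE_PER_MILLION)
    print(f"{minutes} min ride: {calls} writes, ${cost:.6f}")
```

The point is only that cost tracks ride length linearly under pure per-call pricing; the absolute dollar figure depends entirely on the assumed price.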
Looking at the Lyft Case Study that AWS published (you can view here: https://aws.amazon.com/solutions/case-studies/lyft/), it appears they primarily use Redshift, Kinesis, DynamoDB, and EC2/ECR services. All of these are heavily usage based. The least usage-based service they use would be their EC2 servers. Since a server is rarely running at max capacity, one extra Lyft ride doesn't necessarily mean another server. So the EC2 costs for that ride would be the same whether or not it had existed. But it looks like they are heavily reliant on using ECR to auto-scale their servers up and down based on usage. So they are probably running many clusters of smaller servers that move up and down as rides increase. This means that even their EC2 servers are fairly closely mapped to actual usage of their platform. During a slow period they will be running far fewer EC2 servers than during a peak period. So even in this case their costs are relatively variable.
All in all, I would say that you actually could extrapolate the total cost of AWS on a per-ride basis, because that is the variable that affects their costs. If they had one fewer ride this month, their bill would ACTUALLY be 14 cents lower (or somewhat close to it). They do incur a direct cost with each individual ride. If there are fixed AWS costs in there, they probably account for less than 1 penny of that 14 cents. For example, there are a few fixed costs if they are using AWS Direct Connect or AWS PrivateLink to connect to AWS datacenters. But this probably accounts for a few thousand dollars a month at most and in total would have minimal effect on their per-ride cost at Lyft's current scale.
My suspicion is that the bulk of those costs aren't things that scale with more rides coming in.
With that said, I'm definitely confused by (the evidence introduced in) the GP's response (linking to the Lyft/AWS video). I mean, both AWS and Lyft are trumpeting the setup as an efficient use of AWS, and it still comes to $0.14/ride? That doesn't add up.
In contrast, EC2 alone would have saved us a lot of pain.
You may say that my previous company was not technically great. Maybe. If that's the case, how many companies can be really better? I think the fundamental problem is that companies tend to underestimate the effort to build a world-class infrastructure from scratch as well as the opportunity cost due to loss of productivity. Netflix's leaders were truly wise by claiming that Netflix didn't want to work on undifferentiated heavy lifting early on.
Of course building one datacenter and owning oceanic fiber to host your own services is a bad option for almost every business - not every business can fill a full DC, and 1 DC is not enough to guarantee global operations.
> E.g. if you are +30% of the internet traffic (nflx) it doesn't make sense to pay rent to telcos any more and feed their margins. You have the volume and stable demand to justify ownership. For the rest, cloud is where they'll live and die.
That's 50 cents per hour to colocate dozens of servers, and there's 324TB of data thrown in for free.
Obviously you have the tin to buy and maintain which will cost you. Leasing costs for a typical HP server are say $100/month.
10G transit on its own is around $2k/month with H.E., or 1 cent for 16GB if it's fully saturated.
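The "1 cent for 16GB" figure can be checked with a quick sketch. It assumes a 30-day month, decimal gigabytes, and a link saturated the whole time; the function names are hypothetical.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month

def gb_per_month(link_gbps):
    """Decimal gigabytes moved in a month by a fully saturated link."""
    return link_gbps / 8 * SECONDS_PER_MONTH

def gb_per_cent(link_gbps, monthly_price_dollars):
    """Gigabytes bought per cent of monthly transit spend."""
    return gb_per_month(link_gbps) / (monthly_price_dollars * 100)

print(gb_per_month(10))        # GB/month on a saturated 10G link
print(gb_per_cent(10, 2000))   # GB per cent at $2k/month
print(gb_per_month(1) / 1000)  # TB/month on a saturated 1G link
```

A saturated 10G link at $2k/month comes to about 16.2 GB per cent. As a side note, a saturated 1G link works out to 324 TB/month under the same assumptions, which matches the 324TB mentioned earlier, though the thread doesn't say that's how the figure was derived.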
That deal actually only includes 15A of power, which isn't enough to fill the cabinet if your machines aren't idle.
I couldn’t believe it was a 15A circuit for the whole cabinet when I went there.
This was several years ago but the other issue I had was DDoS against someone else there collaterally taking down our network. Ended up moving to 55 S Market in San Jose and have been very happy. Those racks are at least 40 amps.
And if mods feel it's appropriate, then I don't mind if the link is swapped.
Otherwise, maybe consider voicing opinions like these about things you actually know something about.
ADDED: Meant to respond to a different comment that didn't suggest any inside knowledge. (Though I'd still point out that even insiders often can't understand how their own company has so many employees in X department, is so inefficient, etc. And this is more or less the case everywhere.)
Given that they work for "one of few >10b unicorns" per their comment history, I have a suspicion you may be telling a Lyft employee they don't know anything about Lyft.
In terms of data usage and storage, matching is by geography, meaning their data is easy to shard.
So, how's this $8M per month?
edit: fixed a typo
One month, part of my team stopped all development and focused solely on reducing AWS costs, and we cut our monthly bill by 50% (probably around 4 devs' yearly salaries).
It's not easy, but optimizing AWS resources is an absolute must.
Analytics, authentication, support, etc.: you don't know exactly what Lyft uses from AWS here. For example, if they use Auth0 instead of Cognito, they pay that part to Auth0. AWS prices are very competitive most of the time.
AWS is the Walmart of enterprise IT, they sell you everything else too.
I love when indirect costs are naively calculated for marginal cost estimates.
There are middlegrounds, like colocation & dedicated servers. If you get dedicated servers 4x cheaper than cloud-shared-vps with remote-hdd, then you can overprovision.
Especially now that hardware is getting bigger you need even less space (assuming your software scales vertically).
And they ALWAYS make it like the next hour you will have 10x requests and your database will autoscale that quickly.
- only parenthetically mention colocation, as if it isn't absolutely normal to rent a rack/cage/suite as needed from an existing DC operator (which in addition would make datacenter rent an OpEx and actual server equipment be amortized over 2-3 years, not the 10 for real property)
- somehow present "intercontinental traffic" as something both necessary and tremendously expensive, as if all the major public clouds aren't charging 10x what the market rates for bandwidth are
- imply that "the cloud" is immune to outages, as if GCP didn't have multiple major global- or region-wide outages over the past few years
I mean, I understand why most companies don't go on-prem despite all that, but this series of tweets is borderline FUD.
If you don't blow this up to stupid proportions, that should be able to run on modern hardware in a few racks at a colo for millions of users per day, especially since you can neatly shard the load geographically, thus distributing load (and rented racks) over multiple DCs, ideally with failover in place for emergencies. The only thing they really need to merge is the billing data at the end, but handling billing data of a global userbase of tens or hundreds of millions of users in a single system is a solved problem nowadays and does not even constitute a case worthy of the overused "big data" moniker.
You can never hand them enough dollars to get their SLAs/latency on par with the cloud. There will be outages. There will be delays. There will be scale problems. And you'll have to address these somewhere else, either in your tech stack, your operating model, or your PR department.
Paying a 2-5% TCO premium to have a throat to choke, 5 9s redundancy, GDPR compliance, and the law of large numbers on your side of the court is a pretty fair trade.
Good luck with the throat choking.
Also, what does GDPR have to do with renting a VPS in the cloud?
Maybe they could save a bunch by colo or something else. But would 14 cents per ride really matter at all for their competitiveness? I’m not going to notice a 14 cent difference even if I do bother to price compare Uber with Lyft.
This is a VC fueled market. It isn’t really about small margins of this size.
Lyft lost a billion dollars on $2B of revenue in 2018.
Moving from AWS, or doubling AWS spending, will make no difference to the company's viability; it's not worth the time in meetings to discuss it.
dotcom v3.0 companies are all about the potential and cornering the market. Amazon was exactly the same - it was founded in 1994 but didn't make a profit until 2001.
There's no reason they need to be spending such crazy amounts on servers - ostensibly to allow faster iteration. A new version of their app just isn't going to move the needle. Signing up new drivers will though.
I fail to see how the benefits AWS provides are so important they need to spend such crazy amounts.
Every extra driver costs them money, every lift costs them money. If they can hold out the illusion of "we know we can save money here when we've got time and have won the market", maybe it keeps the money rolling in.
Not sure how you can possibly pay $0.14 in computing for a single ride (if that's accurate). That's more than 3 hours of a t3.medium instance for example...
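A back-of-the-envelope check of the "more than 3 hours" claim, as a sketch. The hourly price below is an assumption (roughly the on-demand rate I remember for a t3.medium; it varies by region and changes over time), so plug in the current price before relying on it. `instance_hours` is a hypothetical helper.

```python
def instance_hours(budget_dollars, hourly_price_dollars):
    """How many instance-hours a given budget buys at a flat hourly price."""
    return budget_dollars / hourly_price_dollars

ASSUMED_T3_MEDIUM_HOURLY = 0.0416  # assumed on-demand price, dollars/hour

hours = instance_hours(0.14, ASSUMED_T3_MEDIUM_HOURLY)
print(f"{hours:.2f} instance-hours per ride")
```

At that assumed price, 14 cents buys a bit over three instance-hours, consistent with the comment.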
AWS costs that wouldn't really fit into a per-ride accounting:
* Redundancy of instances (regions/AZs)
* Non-ride related AWS costs (hosting/processing of analytics, test & automation infrastructure, etc.)
I'm sure there are others. I'm not saying that there's no way they could get their AWS costs down, btw.
AAPL and VZ each get 5c/ride - just for showing up.
I started this comment as a joke. But now I am not so sure.
Suggesting Lyft needs five 9s for 1000s of servers is ridiculous. Its entire event stream comes from a mobile network, which is probably less reliable.
But, all in all, there aren't all that many practical use cases that really benefit from stuff like that. Most projects are quite happy with some storage, compute, network, and a few managed services. Most companies don't do rocket science with their software and infrastructure.
And there are things that you simply can only afford at scale, like security management for your supply chain, dedicated automation for updating firmware of every tiny controller that's in your hardware, and so on.
Does anyone know anything more about these robotic hard drive replacers?
My understanding is that the biggies - Google, Amazon, Facebook, etc. - don't bother with individual drive replacements (and certainly not with an expensive robotic arm system). They wait for an entire rack to fault over a certain percentage and then just swap out the whole thing.
However, the robotic part is not to swap parts in case of failure - instead, it is to let ~64 tape drives access ~100k individual tapes within the library and ~inf tapes in off-site storage (as the data stored is not, and _shouldn't_ be, directly accessible).
Another interesting question is whether that profit margin includes the original capex from the early years, when Amazon was not reporting AWS revenue/profit. There are also all sorts of creative accounting methods to hide capex, so I wouldn't be surprised if Amazon is selling at cost to buy market share, somewhat analogous to how Uber subsidizes rides to gain early market share.
Our results of operations vary and are unpredictable from period-to-period, which could cause the trading price of our Class A common stock to decline.
Our results of operations have historically varied from period-to-period and we expect that our results of operations will continue to do so for a variety of reasons, many of which are outside of our control and difficult to predict.
I find the term "period-to-period" rather vague. It could mean quarter-to-quarter or even year-to-year.
Still what they are effectively saying is that Lyft cannot be trusted to provide any growth assessment.
I still find it weird that Lyft needs so much processing per ride; it just doesn’t sound efficient.
And a lot of people seem to find this normal. It's not; this could be money that pays the driver more.
They get paid that ~14 cents but they don't 'make' that much.
What they make is what is left after costs.