
The Uber Engineering Tech Stack, Part I: The Foundation - kfish
https://eng.uber.com/tech-stack-part-one/
======
Animats
It's interesting that they don't break the problem apart geographically. It's
inherent in Uber that you're local. But their infrastructure isn't organized
that way. Facebook originally tried to do that, then discovered that, as they
grew, friends weren't local. Uber doesn't need to have one giant worldwide
system.

Most of their load is presumably positional updates. Uber wants both customers
and drivers to keep their app open, reporting position to Master Control.
There have to be a lot more of those pings than transactions. Of course, they
don't have to do much with the data, although they presumably log it and
analyze it to death.

The complicated part of the system has to be matching of drivers and rides.
Not much on that yet. Yet that's what has to work well to beat the
competition, which is taxi dispatchers with paper maps, phones, and radios.

~~~
matthewrudy
I work on an uber-like system, but with ~3 backend devs rather than 100s.

We made the opposite decision, cloning our full stack for each new market.

That's great for scalability, but is a nightmare for devops.

If anything we want to find a way to move to one global system, and then slice
down the bits that can be local:

E.g. create a local order matching service, but keep orders, payments, and user
accounts global.

~~~
danpalmer
As far as I'm aware, Hailo (>3 backend devs, not quite 100s) did exactly this
as well, and the ex-Hailo devs I've spoken to considered it a pretty bad move.
It took them ages to refactor into a global system if I remember rightly.

~~~
matthewrudy
Yep, I had a long chat with one of their engineers (Matt Heath) and it seems
like they had the exact same problems we have.

They solved it with their restructure and move to microservices.

------
e1g
I'd love to know how many people are responsible for devops/operations/app at
various stages of any company's journey. Wikipedia says Uber employs 6,500
people so if even 15% of that is on the tech side of the business that's still
1,000+ people allocated to tech. I think this metric would be a useful reality
check for a "modern" SaaS project with 3-10 people that's trying to emulate a
backend structure similar to the big league.

There are 20+ complex tools listed in the stack, and running a high-visibility
production system would require a high level of expertise with most of them.
Docker, Cassandra, React, ELK, WebGL are not related in required
skills/knowledge at all (as, for example, Go and C are). Is it 5 bright guys
and girls managing everything, like the React team within Facebook? Or a team
dedicated just to log analytics?

~~~
dkarapetyan
That's all bloat. Pure and simple. At the end of the day Uber just does
routing and basic allocation. It's a simple operations problem that has been
solved since the 70s and no one back then needed ELK, Docker, Cassandra, etc.

I've seen this bloat everywhere. It is usually a result of internal politics
and posturing by management types. The kinds of people Steve Jobs would have
called B and C players. Now the actual people operations is another matter
entirely but the tech stack definitely doesn't need to be that complicated.

~~~
Animats
I have to admit, I could see this running for a city the size of SF on a
desktop machine under the table at the taxi depot. Uber has 11,000 drivers in
SF, but probably only a few thousand are on at any one time. A ride takes a
few minutes, so if you figure 3,000 active drivers and 4 rides per hour,
that's only about 3 ride transactions per second. You have a transaction at
ordering, one at ride start, and one at ride end. Plus you have tracking of
where all the active drivers are, pinging maybe once a minute. That adds up to
only about 10-20 TPS. You can offload the routine web and app stuff to some front
end machines. And you want to do some analysis every minute or two to see
where there are "surges".
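
For reference, the arithmetic above can be worked through as a short script. This is only a sketch: the constants are the figures quoted in the comment, and the function name `estimate_load` is made up for illustration. It gives roughly 3.3 rides per second and 10 ride transactions per second, while one ping per minute from 3,000 drivers comes to 50 position writes per second on its own, so the 10-20 TPS figure seems to hold only if pings are counted separately or batched.

```python
# Back-of-envelope load estimate using the figures quoted in the comment.
ACTIVE_DRIVERS = 3_000       # drivers on the road at once
RIDES_PER_DRIVER_HOUR = 4    # rides each active driver completes per hour
TXNS_PER_RIDE = 3            # ordering, ride start, ride end
PING_INTERVAL_S = 60         # one position ping per driver per minute


def estimate_load(drivers, rides_per_hour, txns_per_ride, ping_interval_s):
    """Return (rides/s, ride transactions/s, position pings/s)."""
    rides_per_s = drivers * rides_per_hour / 3600
    txns_per_s = rides_per_s * txns_per_ride
    pings_per_s = drivers / ping_interval_s
    return rides_per_s, txns_per_s, pings_per_s


rides, txns, pings = estimate_load(
    ACTIVE_DRIVERS, RIDES_PER_DRIVER_HOUR, TXNS_PER_RIDE, PING_INTERVAL_S
)
print(f"rides/s: {rides:.1f}, ride txns/s: {txns:.1f}, pings/s: {pings:.1f}")
```

Either way, the totals stay well within what a single database server can typically absorb, which is the point being made.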

The only non-trivial part of this is assigning drivers to rides.

~~~
ricardobeat
It's easy to imagine the simplest stack that can serve the core features of
any service, and that is well served by a single box. What's missing from the
picture is the infrastructure to replicate this 500 times by separate teams,
monitoring all of it, backup, auditing, aggregating customer and business
metrics, back-office systems, and more. Plus the fact that these things always
grow organically and embed a host of imperfect decisions - the imaginary
system will always be better designed.

~~~
tlrobinson
This. Armchair software architecture is so easy when you can gloss over the
details that make a product great.

Also routing, ETAs, geocoding/search, etc.

WebGL visualizations and such are probably overkill, but if it makes the
company more fun to work for then it probably breaks even, at worst.

~~~
danpalmer
Couldn't agree more. It's all the invisible details that cause the load.

It's a much more trivial example, but highlights the point well I think - we
have pages in the app I work on that would respond in ~100ms, but might have a
single sentence on them that takes another 100ms to generate because of the
complex data relationships involved in figuring out what that sentence needs
to say. The 'request handler' might be 20 lines of code, with a 50 line util
function to generate that line of text. No armchair architect will ever take
into account things like that, but the end result is a page that is just a bit
more personalised to the user and therefore improves their experience.

In an app of any real size, I imagine there are anywhere from hundreds to many
thousands of tiny little details like this that all together drastically
increase the amount of power needed to run a service.

~~~
orf
> No armchair architect will ever take into account things like that

An armchair architect would say it's not needed. They would question whether
spending 50% of your response time generating a single sentence is in any way
worth it, and wonder what kind of architectural mistakes led to that.

~~~
jdavis703
The problem with this line of reasoning is that it implies the business exists
to serve the software. Unless you work at a tech-focused non-profit, the
software actually exists to serve the business.

~~~
orf
> the software actually exists to serve the business.

Sure it does, but the business also wouldn't exist without the tech in Uber's
case (and a lot of other cases). And it's going to be your head on the line
when you keep adding these 100ms sentences because the business wants it for
no good reason and your page takes 3 seconds to load, and nobody buys anything
from the site.

~~~
robbles
You're making the assumption that the additional features slowing down the
service aren't adding value.

More common is a "death by 1000 cuts" scenario where the various causes of
slowness are apparent to the developers, but quite difficult to remove because
they've become necessary to the continued success of the business.

~~~
orf
> You're making the assumption that the additional features slowing down the
> service aren't adding value.

No, I'm questioning whether the value added is greater than the value lost,
and in this hypothetical example clearly not. So it's your job to point that
out to whoever and not silently obey.

------
NotQuantum
Uber is really strapped for engineering talent, especially when it comes to
SRE. Many friends and I, working SRE at various Bay Area companies, get
consistently hit up for free lunches and interviews. It's really weird
considering that their stack doesn't NEED to be this complex....

~~~
joeblau
It probably could be simpler. It seems like with enough engineers
every company I've ever worked at eventually ends up using every technology
they can because of the _one_ thing it does well.

~~~
r2dnb
>because of the _one_ thing it does well.

This "one thing it does well" business is then presented as : "using the right
tool for the right job" and it's difficult to argue against that because the
counterpart can easily deride you as a fanatic of some technology, someone not
objective enough, etc...

It is however interesting that we used relational databases for virtually
everything for decades even though SQL is suboptimal at most things if we take
them in isolation. Some will argue that people are now realizing their
mistake, but the truth is these companies were successful and we were all
getting our paychecks. (PS: I choose to use NoSQL for virtually all my
projects)

The real driver shouldn't be _the one thing it does well_. Many times - if not
most of the time - it's preferable to use a tool optimal for the most
important parts and suboptimal for the rest. I personally prefer to provision
two more instances, than to add two more technology stacks.

~~~
rantanplan
> It is however interesting that we used relational databases for virtually
> everything for decades even though SQL is suboptimal at most things

You have no clue what SQL or ACIDity is. For 99% of the cases SQL/RDBMS is the
right choice. You probably think you belong in that 1%, but from your comment,
I suspect you do not.

> I choose to use NoSQL for virtually all my projects

That's because you have no important data to store.

When you get to store data that are important to your customers you're gonna
have a big revelation.

~~~
r2dnb
>You have no clue what SQL or ACIDity is.

That's quite an attack. I trained for an Expert SQL certification from
Microsoft back then, when I was writing 3000K+ long stored procedures to
migrate an Access application at a Fortune 40 company. So I know what it is
and I know quite a good deal about RDBMS. I'm not among those who criticize
what they don't know.

Regarding the gist of your comment on NoSQL, I haven't been able to convince
people coming from where you are with two days of meetings in a row, so I'm
fairly confident I'm not going to change your mind on HN.

~~~
rantanplan
If you have a clue, as you say, and still believe that it's a good idea to
store critical data with NoSQL then I don't know what to say.

Obviously you don't care enough that _almost every_ NoSQL solution out there
has been found to make false claims about its guarantees. The billions that
have been sunk in the black hole that's called NoSQL in the last decade are
unprecedented.

You don't have to change my mind. I have used (and still use) both. And I still
maintain that people who use NoSQL for 99% of their projects are making the
wrong choice.

~~~
brianwawok
So I think a lot of people agree with you on

> Many people choose the wrong tool for the job

I think that is far past database choice. I think many people build SPAs that
end up hurting the product over a traditional setup. I think many people use
microservices where a monolith would have much better performance and
reliability.

As for the basic premise

> that people who use NoSQL for 99% of their projects are making the wrong
> choice.

Is perhaps kinda right? Some people may really only touch giant data sets. So
for them always using NoSQL is smart. The people that write webapps with 12
users? More questionable.

In most cases you can decide if you need to leave an RDBMS with something like:

1) Do you need to store in the next year > 100GB of data that you need to
access in realtime?

2) Do you need in the next year to store > 1TB of data that you need to access
in semi-realtime?

3) Do you need in the next year to handle > 1000 writes per second?

4) Do you need in the next year to handle > 1000 reads per second?

Not a perfect guide, and I am sure you can think of edge cases that can still
be dealt with in a RDBMS.. but it is a decent starting place. One tricky part
is that if you are optimistic, almost any app can check off #3 or #4 (Like
Uber but for Baby Strollers). Knowing how to realistically estimate demand for
a possibly viral startup is hard.
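
The four questions above can be written down as a small checklist function. This is a sketch only: the `Forecast` type, its field names, and `reasons_to_leave_rdbms` are hypothetical, and the thresholds are simply the ones from the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Forecast:
    """Expected needs over the next year (hypothetical field names)."""
    realtime_data_gb: float       # data that must be readable in realtime
    semi_realtime_data_tb: float  # data that must be readable in semi-realtime
    writes_per_s: float
    reads_per_s: float


def reasons_to_leave_rdbms(f: Forecast) -> List[str]:
    """Return which of the four rule-of-thumb thresholds are exceeded."""
    reasons = []
    if f.realtime_data_gb > 100:
        reasons.append("more than 100 GB accessed in realtime")
    if f.semi_realtime_data_tb > 1:
        reasons.append("more than 1 TB accessed in semi-realtime")
    if f.writes_per_s > 1000:
        reasons.append("more than 1000 writes per second")
    if f.reads_per_s > 1000:
        reasons.append("more than 1000 reads per second")
    return reasons


# A small webapp: nothing triggers, so a plain RDBMS is the default.
print(reasons_to_leave_rdbms(Forecast(5, 0.005, 20, 200)))
# An optimistic "viral" forecast: several thresholds are crossed.
print(reasons_to_leave_rdbms(Forecast(500, 2, 5000, 800)))
```

An empty list means none of the thresholds apply; as noted, the hard part is feeding it a realistic forecast rather than an optimistic one.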

~~~
rantanplan
The above is good as a rule of thumb indeed.

Another one that I'd add is: "Are the records in _each_ table in the
hundreds of millions? Then most probably you'll do fine with an RDBMS".

If you go above that, or you have operations that will push that number
into the billions, then you can offload them into whatever non-RDBMS storage you
want and do your thing. But that's the thing with RDBMS, you can always
move (or offload part of) your data to a non-RDBMS solution _afterwards_.

But doing the inverse? I wouldn't want to be in that person's shoes ;)

~~~
brianwawok
Does row count matter that much compared to data size? I.e. if I have a
billion rows but they are 2 32-bit ints, that isn't a lot of data (8 GB +
index). I guess the index starts to get pretty big.. but I always just think
of raw data size vs # of rows.
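
A quick check of the raw size in this example: two 32-bit integers are 8 bytes per row, so a billion rows is about 8 GB of payload. The sketch below counts column data only; per-row overhead and index size depend on the database engine and are not modelled here.

```python
# Raw payload size for a table of two 32-bit integer columns.
ROWS = 1_000_000_000
COLUMNS = 2
INT32_BYTES = 4

raw_bytes = ROWS * COLUMNS * INT32_BYTES
raw_gb = raw_bytes / 10**9  # decimal gigabytes

print(f"{raw_bytes} bytes of column data, about {raw_gb:.0f} GB")
```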

~~~
rantanplan
Remember, it's just a rule of thumb. Now... tables with 2 32-bit ints as
columns are not exactly typical RDBMS data.

Also, data in RDBMS are... well relational :) Meaning, the rows of just one
table are not that important. The data are going to be queried and combined
with data from other tables. And I know that typical relational data that
consist of hundreds of millions of entries in each table is something that
most DBs can handle.

Again, rule of thumb :D

~~~
brianwawok
Fair enough!

------
sandGorgon
What I'm really wondering about is their app. The UI of the app can be
impacted without an app update. For example the UI during the pride parade. Or
minute of silence (
http://gizmodo.com/uber-makes-riders-take-a-moment-of-silence-for-gun-viol-1783391098 )

I wonder what's the architecture of the app and the API for this.

~~~
possibleNoob
If it's updating without your consent to upgrade in the app store then it's a
webview you're looking at and they are just updating the webview.

~~~
tlrobinson
I don't think it's that simple. The distinction between "code" and "data" is
somewhat arbitrary. I'm sure Uber could get away with a rules engine that
supports the cases the parent comment is talking about.
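
As a rough illustration of the "rules as data" idea, here is a minimal sketch in which a native client fetches a list of rules and applies the matching ones to its default UI settings. Everything in it is an assumption made for the example: the rule format, the setting names, the cities and dates, and the `resolve_ui` function. Nothing here is known about how Uber actually does it.

```python
import json

# Hypothetical rules payload, as a server might send it to the app.
RULES_JSON = """
[
  {"when": {"city": "sf", "date": "2016-06-26"},
   "set": {"car_icon": "rainbow_car"}},
  {"when": {"date": "2016-07-01"},
   "set": {"banner": "moment_of_silence"}}
]
"""

# Settings compiled into the app (hypothetical names).
DEFAULT_UI = {"car_icon": "black_car", "banner": ""}


def resolve_ui(rules, context):
    """Apply every rule whose 'when' fields all match the context."""
    ui = dict(DEFAULT_UI)
    for rule in rules:
        if all(context.get(key) == value for key, value in rule["when"].items()):
            ui.update(rule["set"])
    return ui


rules = json.loads(RULES_JSON)
print(resolve_ui(rules, {"city": "sf", "date": "2016-06-26"}))
print(resolve_ui(rules, {"city": "nyc", "date": "2016-07-01"}))
```

The app binary only contains the interpreter (`resolve_ui`) and the assets; changing the rules payload changes what users see without an app store update or a webview.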

------
marcoperaza
Quite an intricate architecture. I can't help but wonder if all of the
complexity and different moving parts are worth it. Does it really make more
sense than throwing more resources at a monolithic web service? Clearly the
folks at Uber think it does, and they've obviously thought about the problem
more than me, but I'd love to understand the reasoning.

~~~
UK-AL
There's only so far throwing more resources at a monolithic app can take you.

At a certain scale you have to go distributed. Uber is at a large scale.

------
sixo
This is just about all the tech there is, right?

------
mickyd54
'wildly complex' wow. and they now have 'eaters'

------
haosdent
"We use Docker containers on Mesos to run our microservices with consistent
configurations scalably, with help from Aurora for long-running services and
cron jobs."

------
legulere
> Screenshots show Uber’s rider app in [...] China

Interesting to see Google maps being used, isn't that blocked in mainland
China?

~~~
xorgar831
US phones roaming in China aren't blocked from using Google services, they may
make an exception for Uber too.

Here's what it looked like for me last May:
https://www.dropbox.com/s/9rkor22hn3q6z5t/IMG_2183.jpg?dl=0

------
ashitlerferad
Anyone know if Uber supports the projects they use with human and financial
resources?

------
50CNT
So much technology, yet I still had to load the site 3 times and fiddle with
uMatrix to get the page to scroll. Now, lots of people do silly things with
javascript, but on a blog article on your tech stack it doesn't speak well of
things.

------
tinganho
This sounds like a blog post emphasizing that the more buzzwords you use, the
better.

------
creatine_lizard
If it is easy, it'd be nice to edit the title so it is not in all caps.

------
marcoperaza


~~~
minimaxir
In fairness, the title of the original article is in ALL CAPS (due to text-
transform: uppercase), so don't assume malice on the OP's part.

~~~
scrollaway
In Firefox, copypasting text-transform:uppercased text retains its pre-
transform case. I'm a little sad that feature isn't in Chrome.

------
mikecke
For those of you complaining about the title being all caps, it was done so
for aesthetic purposes. Which means somehow the submitter went through the
time to uppercase each character of the HN title before submitting.

    
    
        text-transform: uppercase;

~~~
cynicalkane
What are you talking about? If you copy-paste the title you get all caps.

~~~
mikecke
Oops, I didn't know the behavior persists.

~~~
adamnemecek
I think that this is new though. I don't think it used to be like this (?).
I'm on Chrome.

------
joering2
Sounds like a very solid foundation! I'm glad to see they have sufficient
system in place to continue spamming the heck out of people who never opted
into their advertisement in the first place.

/sarcasm

I only wish LE would treat CAN-SPAM seriously and put more resources into
criminal enforcement.

------
ryanlm
I just got rejected from them. I applied for an SE position, but they didn't
like me I guess. They send you this really condescending rejection letter. I
showed them my programming language that I built in C from scratch, and also
my data structure library where I implement all the common data structures
found in high level languages that I built from scratch in C, among the many
projects I have. It must have been my state school that turned them off. I know
I could keep up there, but maybe they also turned me down because I'm 5 states
away and they thought I wasn't worth the recruiter's time.

edit: downvoter, if you could provide your rationale that would be great.

~~~
minimaxir
You are likely getting downvoted because it is off-topic, at best.

