
Uber's technology is reportedly 'hanging by a thread' - tangled
http://www.businessinsider.com/ubers-technology-is-reportedly-hanging-by-a-thread-but-the-company-has-a-new-cto-to-get-it-together-2015-9
======
cjslep
> The engineers in charge of these systems have been "at odds," which has
> created friction, according to The Information.

> Uber’s chief technology officer, Thuan Pham, later wrote to his staff that
> the mistake “reflects an amateurism with our overall engineering
> organization, its culture, its processes, and its operation.”

This makes it sound as if the two engineers in charge of the Node.js and
Python systems are bickering over which technology stack is better and refuse
to compromise. I get there is worry about career progression if your backend
option loses, the glory of being _the_ engineer in charge of the entire
backend of Uber, and the different specs of different technologies. But to
hold the entire company hostage seems like it would be a career-ending move to
me, regardless of sides. Then again, I am not in management.

~~~
boothead
This sounds very much like two little pigs arguing about the merits of straw
vs twigs when they should be building in brick to me.

~~~
snissn
By brick, which language would you want to use?

~~~
bliti
At that scale my choice would be Java. Proven, reliable, plenty of devs,
vendor support, good type system, and plenty of open source code to leverage.
Plus it's a mature ecosystem.

~~~
PopeOfNope
Java the language or any language that compiles to the JVM and can leverage
existing Java libraries?

~~~
bliti
I'd go with Java itself. It's the most mature and stable. Uber is not a
startup. It needs an enterprise solution.

~~~
unethical_ban
Scala, Go, Python, Haskell, Erlang.

------
inthewoods
Some serious link bait in that article title - sounds like they've got issues
but they are working through it. The article ends by pointing out that New
Years Eve went off without a hitch.

~~~
probablyfiction
A more accurate title would have been "Uber's new CTO is doing pretty okay,"
but that doesn't entice people to click nearly as much.

------
usefulcat
"The company's engineering staff has grown to 1,200 — a quarter of Uber's
workforce — from just 400 people."

 _1200?_ I'd be very interested to know what they're all doing. Not saying
that what the company is doing is easy on that scale, but it's hard to see how
throwing _a thousand people_ at the problem can be an effective solution.
Unless many of them are working on new products? But Uber seems pretty young
to be investing that much in R&D.

~~~
PopeOfNope
Driverless Cars. They've basically hired the entire engineering department at
Carnegie Mellon in Pittsburgh, where their R&D offices are located.

------
dantillberg
What benefits would they see/get from using their own co-located servers
instead of using VMs on a cloud provider?

For a fast-growing business, it would seem a huge win to not have to worry
about physically scaling your infrastructure. And I can't imagine that their
infrastructure size is so large (they're not, for example, indexing the
internet) that they would get a huge cost savings from using their own
hardware.

But certainly, I must be missing something?

~~~
latch
What benefit would they get from using cloud? A lot more money for much worse
performance. It's like buying a monthly bus pass when you could buy a Honda
Civic for less.

Despite what Cloud providers want you to think, scaling (for most of us) is
largely an architecture, design and software issue and it tends to be rather
specific to your system (until you get to the point where you have to build
your own). "Auto-scaling" doesn't help you make sure you aren't opening too
many forking connections to your database, it doesn't eliminate coarse locks,
un-optimized system calls, making sure you take advantage of cache locality,
sharding, denormalization or really anything that requires effort.

The game changes a bit with the SaaS stuff that PaaS vendors are selling
(dynamodb, for example), but then you pay a lock-in price.

------
sschueller
"reflected an amateurism with our overall engineering organization, its
culture, its processes, and its operation."

That is not going to get the engineers on your side to solve the problem.
Instead it will create even more friction between managers and engineers.

~~~
late2part
One nice thing about friction is that if it's done correctly, it can rub away
the problems. Forcing people to deal with the reality of the situation often
forces positive change.

~~~
zipwitch
Indeed.

"Uber Raided By Dutch Authorities, Seen As 'Criminal Organization'"
[http://yro.slashdot.org/story/15/09/29/2328232/uber-raided-b...](http://yro.slashdot.org/story/15/09/29/2328232/uber-raided-by-dutch-authorities-seen-as-criminal-organization)

------
arethuza
"setting up servers in a new data center on Halloween"

That kind of problem seems remarkably common - the end of months and years can
be peak times for businesses but it's also the kind of date that people pick
for their project milestones ("we'll have the servers in by the end of the
year").

~~~
amatix
or "let's have new capacity in place a month before busy Halloween" ...
progress slips... the project manager pushes and everyone forgets the reason
for the original date.

------
OliverJones
Yeah, this article is in "Business Insider" rather than "Tech Insider." Ya can
tell by reading it. The real WTF is having that many developers without a
solid ops team to match. At least they can charge extra for peak loads, which
many of us cannot do.

~~~
9872
Uber can't legally do it either. They just don't give a shit and do so anyway.

~~~
icebraining
What laws ban surge pricing?

~~~
ryandvm
The anti-capitalist ones.

------
agarcia-deniz
I wonder why the original article wasn't posted rather than the
businessinsider article.

edit:

Paywall. I get it.

------
nikropht
They need EEs like Twitter has.

------
imaginenore
From my perspective, Uber backend can be sharded trivially. No driver in LA is
going to get matched with a passenger in NYC.

~~~
charliesome
I generally find it wise to avoid calling engineering problems faced by other
organisations trivial. There are often a lot of complicating factors that
aren't obvious from an outside perspective.

~~~
latch
Sure, but it's fun to talk about how you'd build it given some assumptions. I
agree with your parent that it seems like a shardable dataset. To boot, I
imagine that you could _easily_ store all cars for a city/region in memory
(type, x, y, driver id, status, ....).

Next I'd grid the area (maybe 250m2?) into blocks. When a user does a search,
figure out his or her block, and start looking for cars in the current block,
expanding outwards. You read-lock the blocks as you examine them. You only
write lock 2 blocks as a car moves from one block to another (the blocks
themselves would have a list of cars, maybe an array, maybe an rtree, maybe
another grid).

It all falls apart when a single server can't handle all the load for a
region. But then you could sub-divide the region, connect the servers with a
queue (this car is now your responsibility), and let clients (not necessarily
devices, but the api servers consuming these data servers) join the data.
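
A minimal sketch of the block-grid lookup described above, assuming planar x/y coordinates in metres and a 250 m block edge. The `CarGrid` class and its method names are hypothetical, and the read/write locks and the cross-server queue are left out to keep it short, so this is single-threaded only:

```python
import math
from collections import defaultdict


class CarGrid:
    """In-memory grid index of cars for one region (single-threaded sketch)."""

    def __init__(self, cell_m=250.0):  # block edge in metres; assumed value
        self.cell_m = cell_m
        self.cells = defaultdict(set)  # (col, row) -> ids of cars in that block
        self.pos = {}                  # car id -> (x, y) in metres

    def _cell(self, x, y):
        return (int(x // self.cell_m), int(y // self.cell_m))

    def update(self, car_id, x, y):
        """Record a position; block membership changes only on a crossing."""
        new = self._cell(x, y)
        old = self._cell(*self.pos[car_id]) if car_id in self.pos else None
        if old != new:  # the only case that would need write locks on 2 blocks
            if old is not None:
                self.cells[old].discard(car_id)
            self.cells[new].add(car_id)
        self.pos[car_id] = (x, y)

    def nearest(self, x, y, max_rings=8):
        """Search the rider's block, then expanding rings of blocks."""
        cx, cy = self._cell(x, y)
        best = None  # (car_id, distance in metres)
        for r in range(max_rings + 1):
            for col in range(cx - r, cx + r + 1):
                for row in range(cy - r, cy + r + 1):
                    if max(abs(col - cx), abs(row - cy)) != r:
                        continue  # skip blocks already seen in inner rings
                    for car in self.cells.get((col, row), ()):
                        px, py = self.pos[car]
                        d = math.hypot(px - x, py - y)
                        if best is None or d < best[1]:
                            best = (car, d)
            # Cars in rings beyond r are more than r * cell_m away, so stop
            # once the best candidate is at least that close.
            if best is not None and best[1] <= r * self.cell_m:
                return best
        return best
```

Calling `update` on every location ping and `nearest` on every rider search would be the whole API here. A car that stays inside its block only touches the `pos` dict, which matches the point that most movement needs no block-level write.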

~~~
msandford
If you can have one server per city then I agree. Especially if those cities
are small and you never travel between them. But places like LA totally blow
that theory as you're going to have to have multiple servers or shards to
cover everywhere.

~~~
latch
Why LA: # of drivers, size of the city, or # of users? Quick googling says LA
is 500 square miles, with Beijing being 6500.

The only case where I see 1 server failing is # of requests. All their drivers
in the world probably fit in a few GB of memory (if that). 99% of a car's
movement requires no locking (they stay in the same block), so there's very
little write locking...I dunno...give it a 24 core server (or more)...
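
A back-of-envelope version of that memory estimate, as a sketch only. The record layout is an assumption based on the fields listed earlier in the thread (type, x, y, driver id, status), and the car count is a made-up round number, not a figure from the article:

```python
import struct

# Hypothetical record layout following the fields listed in the thread:
# type (1 byte), x and y (two doubles), driver id (8 bytes), status (1 byte).
RECORD = struct.Struct("<BddQB")


def footprint_mb(n_cars):
    """Raw size of n_cars packed records, in megabytes (10**6 bytes)."""
    return n_cars * RECORD.size / 1e6


# 1,000,000 is an assumed round number, not a figure from the article.
print(RECORD.size, footprint_mb(1_000_000))
```

Under those assumptions a record is 26 bytes, so a million cars come to about 26 MB of raw data. Per-object overhead in a dynamic language would inflate that considerably, but it stays far below the multi-GB range either way.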

Maybe the problem is node and python. I don't know python runtime well enough,
but this kind of setup is a nightmare for node. Sharing data across processes
just isn't what it was meant to do (it would rather fork and have a copy, but then
your memory doubles per fork, and you have to keep it in sync). That's true
for a lot of dynamic languages.

~~~
msandford
Just using the google answer for "how big is LA" doesn't really cut it as
there are millions of people that live outside of LA proper but still in the
region that most people would call "LA". If you go to Wikipedia for Greater
Los Angeles it's 34,000 square miles.
[https://en.wikipedia.org/wiki/Greater_Los_Angeles_Area](https://en.wikipedia.org/wiki/Greater_Los_Angeles_Area)

sqrt(34000) is nearly 200 miles on a side. And that doesn't really even cover
all the areas that one might Uber from or to. So you'll have a lot of people
crossing shard boundaries.

Greater LA has some 18mm residents, and again that doesn't count some of the
places very close but not really IN the area. Places you might drive to in 15
minutes from the edge, over a pass.

