
TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data - svx
http://highscalability.com/blog/2011/6/27/tripadvisor-architecture-40m-visitors-200m-dynamic-page-view.html
======
scorpioxy
"You own your code and its effects - you design, you test, you code, you
monitor. If you break something, you fix it."

I couldn't agree more. The more I work with code and people, the more I
realize the importance of this statement. Having separate roles for
"Architect", "Programmer", "Tester" and "Support" is excellent if you want to
introduce bureaucracy and process into the work place and consider programmers
as just "cogs in a machine". It's also excellent if you want to see how work
can grind to a halt.

Let people own their work and they will only produce things that they are
proud of.

~~~
kenjackson
The other thing it introduces though is "not my code". There used to be a guy
we used to call the "Teflon Don". You couldn't get a bug to stick to him. He
could always find a way to move a bug to someone else -- even if it came back,
he would spend more time trying to figure out how to pass it on to someone
else, rather than just fix it.

At the end of the day hire good ppl. They'll do the right thing most of the
time. Hire the wrong people and you'll find that your incentive structure will
always lead to deviant behavior.

~~~
websiteguy
(disclosure - I am the author of this article)

"At the end of the day hire good ppl. They'll do the right thing most of the
time." - absolutely agree.

No system is perfect, and ours is not. You need to decide what is important to
your organization and emphasize it - the way we do things here at TA has its
downsides, but overall, it works very well for us and what we want to do. The
benefits for us significantly outweigh the drawbacks.

It is critical to have a boss who understands how software development works,
and that mistakes get made. If anything, my boss wants my team to make more
mistakes than we do :-)

------
kenjackson
Really good article.

One point I do disagree with is this:

 _It is better to deliver 20 projects with 10 bugs and miss 5 projects by two
days than to deliver 10 projects that are all perfect and on time._

You never actually have such a choice, but if one did exist I'd pick 10
perfect on-time projects. The problem is I don't know what the 10 bugs are.
While I'd like them to all be typos in the footer of the contacts page, they
could also be data corruption or a password leak.

If I had ppl that delivered on time with no bugs, but at half the speed, it
then becomes my job to do a good job prioritizing the projects such that the
half we do are the right half. I think the constraint of being half as fast
might even force us to make better choices.

~~~
kwis
For any project, you're always balancing quality, schedule, budget, scope and
risk to try to get the best business outcome. Because of this, I don't believe
in any of the hard and fast rules about the "right" way to code. It's far too
situational.

That said, you make an excellent point about the importance of project
prioritization. A management team who is good at identifying high ROI projects
will stomp all over one who sprays pointless change requests at their dev
team.

------
jokull
I've helped my grandmother manage her business listing on there. It has
brought many happy customers and for at least 4 months a year we have a full
booking of all spaces. However, the product is poorly done. It is hard to
navigate and there are no tools to share accounts or have multiple listings
within one account. I've also encountered numerous technical errors and
limitations that make me think the product might not be secure. It feels like
the more recent features are sitting on an outdated and overly complex
codebase (just a guess).

So if any TripAdvisor employees are listening - thank you for a product that
has worked well for us. But if you can improve it I am sure you could bring in
more customers on both ends (managers and travelers).

~~~
websiteguy
Hi jokull,

I would like to understand more. Is there a way to contact you?

Andy

~~~
jokull
jokull@solberg.is

------
peterwwillis
Some reflections:

Culture

    
    
\- The "all engineering is organized by business function" approach is a
business trick I've seen old dot-coms implement. When you have a business
function that goes from making you money to costing you, just fire everyone in
that business function and sell it off. You quickly get rid of what's dragging
you down and continue your profitable business functions. However, this also
creates duplication of some jobs, waste, and an insidious bureaucracy where
teams are fighting each other. (I could have read that section wrong, but
that's what it reminded me of)

\- Engineering Swaps: Really? You just uproot people so they have to re-learn
how something else works and take time away from getting stuff done? Couldn't
you just do a couple hour-long knowledge sharing sessions between developers?
    

Random thoughts

\- Don't design too far ahead - doesn't fighting fires and coming up on
sudden, unexpected deadline changes due to lack of foresight kind of drag on
your employees? You can't keep overworked code monkeys forever. It's one thing
to code "quick and dirty" to get a job done. It's another thing not to plan
for the future.

\- Put end to end responsibility on a single engineer - so when that guy's on
vacation nobody knows how to fix the thing he owns that broke?

\- Process - Doesn't say if you require change control, but some sort of
change alert/control system is really useful to tell people when you're
changing something which may affect others, so they can immediately see what
could have affected the broken thing and call the person who could have broken
it.

~~~
websiteguy
(disclosure - I am the author of this article) I think this comes down to
judgment.

Design appropriately. Short sighted designs are as bad as ivory tower designs.
Every situation is different.

If a person does not own something, no one does - everything comes down to the
individual. This does not mean that there is no overlap in knowledge.

Source control is a must. So are scripts that detect changes in your area. So
are tests.

------
reedlaw
"It is far better to do two queries (get the set of reviews with their member
ids, then get all of the member from this set of ids and merge it at the app
level) than do a join."

Is this way really faster? We've been moving the opposite direction in a Rails
app (from iterating over data in ruby to joins). We have a fast RDS instance
that seems to far outperform our app running on Heroku for complex data
manipulation.
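
For concreteness, the two-query pattern from the quote can be sketched as
below. This is only an illustration under assumptions, not the article's
actual code: the table names, columns, sample rows and the
`reviews_with_members` helper are all made up, and two in-memory SQLite
databases stand in for separate content and member stores.

```python
import sqlite3

# Assumption: two in-memory SQLite databases stand in for a content store
# and a member store that live in separate databases.
content_db = sqlite3.connect(":memory:")
member_db = sqlite3.connect(":memory:")

content_db.execute(
    "CREATE TABLE reviews (id INTEGER PRIMARY KEY, member_id INTEGER, body TEXT)"
)
content_db.executemany(
    "INSERT INTO reviews (id, member_id, body) VALUES (?, ?, ?)",
    [(1, 10, "Great view"), (2, 11, "Noisy room"), (3, 10, "Would return")],
)
member_db.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
member_db.executemany(
    "INSERT INTO members (id, name) VALUES (?, ?)",
    [(10, "alice"), (11, "bob"), (12, "carol")],
)


def reviews_with_members(review_ids):
    """Fetch reviews, then their members, and merge at the app level."""
    if not review_ids:
        return []
    marks = ",".join("?" * len(review_ids))
    # Query 1: the set of reviews with their member ids.
    reviews = content_db.execute(
        f"SELECT id, member_id, body FROM reviews WHERE id IN ({marks}) ORDER BY id",
        list(review_ids),
    ).fetchall()
    # Query 2: every member referenced by that set, in one round trip.
    member_ids = sorted({member_id for _, member_id, _ in reviews})
    if not member_ids:
        return []
    marks = ",".join("?" * len(member_ids))
    members = dict(
        member_db.execute(
            f"SELECT id, name FROM members WHERE id IN ({marks})", member_ids
        ).fetchall()
    )
    # Merge in application code instead of a SQL join.
    return [
        {"review_id": rid, "body": body, "member": members.get(mid)}
        for rid, mid, body in reviews
    ]
```

Note that this is two queries in total, not one query per review. Iterating
row by row in the app (the N+1 pattern) is a different thing and is usually
what makes app-side processing slow.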

~~~
hn_decay
If they are joining across partitions, then perhaps. If they are forced to
localize data to make it joinable, whereas doing the "join" at the app level
makes it more horizontally scalable, then that might be an impact.

However against a single instance if you can join in your app faster than on
the database for a trivial join, something is seriously wrong with your
implementation. I have never, ever seen such a case where it wasn't a scenario
where they should have analyzed their plan, to discover a monstrous issue they
need to resolve.

In many high performance database systems the IPC to the database level is
actually the most expensive operation. Doing two calls instead of one is
always a net negative unless you're doing something wrong or fit isolated
horizontal scaling scenarios.

~~~
websiteguy
(disclosure, I am the author of this post)

Hi hn_decay,

Our use case goes like this: a member database of over 100M records, and a
number of content databases each with tens or hundreds of millions of records
(reviews, video, lists, wiki, this, that, the other thing), one content table
with over a billion records, where all the content records have a member id.
Our primary usage pattern is to grab a set of content records (say 10-200 at a
time) with their member information.

Putting everything in one database and doing the join there does not scale for
us, and severely reduces flexibility. We would need to continue to scale up
our hardware to handle the sum of the content sets, and new content sets are
being created on a regular basis. By putting these all into different
databases you then have the choice (not the necessity) of keeping them on one
or more machines. You can put on one machine a bunch of content sets that are
relatively small, and put the big ones on their own machine. You can also
scale the hardware to individual content sets - infrequently accessed content
sets do not have to be on powerful machines, very frequently accessed sets can
be scaled on bigger machines.
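
A rough sketch of what that placement choice could look like in code. The
host names, the `PLACEMENT` map and the helper functions are hypothetical and
only illustrate the idea that each content set can be moved independently:

```python
# Hypothetical placement map: content set -> database host.
# Host names are invented for illustration only.
PLACEMENT = {
    "reviews": "db-big-1",   # large, frequently accessed: its own machine
    "videos": "db-small-1",  # smaller sets can share a modest machine
    "lists": "db-small-1",
    "wiki": "db-small-1",
}


def host_for(content_set):
    """Return the host that currently serves a content set."""
    return PLACEMENT[content_set]


def move(content_set, new_host):
    """Relocate one content set without touching the others."""
    PLACEMENT[content_set] = new_host


def sets_by_host():
    """Group content sets by host, e.g. to see which machines are shared."""
    grouped = {}
    for content_set, host in PLACEMENT.items():
        grouped.setdefault(host, []).append(content_set)
    return {host: sorted(names) for host, names in grouped.items()}
```

Because the member lookup happens in the app, moving a content set to another
host changes one entry in the map and no queries.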

There are downsides, the two-query hit being the least significant: the extra
query on a tuned database is on the order of 1ms. Even if the hit was larger, I
would still live with it. scalability != performance

Andy

~~~
hn_decay
I was replying to the context of the post, and specifically spoke to
horizontal scalability so I am confused that you felt it appropriate to
"correct" that.

Having said that, hundreds of millions of records equals a small dataset. I
still don't understand when that's held as some sort of edge case when it's
easily accommodated on commodity low-end hardware.

------
scrrr
I find myself surfing to tripadvisor often when travelling. They would have
the potential to build an airbnb right in there..

~~~
martinshen
I think they have.. AirBnB like sites have existed for a while:
[http://www.flipkey.com/?utm_source=ta&utm_medium=foot...](http://www.flipkey.com/?utm_source=ta&utm_medium=foot&utm_campaign=tamg)

------
sanj
Full Disclosure: I work at TripAdvisor.

Andy's assessment is accurate and is part of the reason I really, really like
it here. The other part is the folks I get to work with.

If you're interested in joining us, drop me a line. Info in my HN profile.

~~~
dhugiaskmak
Why does your site use behind-the-window popups even though I have popup
blocking explicitly enabled in my browser? If I were hired at TripAdvisor
could my first project be to get rid of such scumbag behavior?

