

How Digg is Built - atularora
http://about.digg.com/blog/how-digg-is-built

======
Todd
Great overview of the technology choices and how they interact. I really
appreciate when important companies reveal some of the inner workings of their
sites. It gives something for the rest of us to go on. It can help inform our
judgement when we make similar choices. This happens all too rarely. Many
thanks!

~~~
dajobe
Thanks, we hope to go deeper in areas people are interested in. Let us know.

~~~
pkaler
I'd love to see you go deeper into devops. How are changes pushed? How often
do you push changes? How are changes tested? How do you performance test at
scale? How do you recover if something goes wrong?

~~~
cduruk
Hi! We posted something that somewhat answers that question.

http://about.digg.com/blog/continuous-deployment-code-review-and-pre-tested-commits-digg4

Some parts of it are a bit outdated (we don't use Selenium tests for
pre-testing all commits), but overall it should give some idea of how
deployment works at Digg.

------
gumbo
Read this again and I find it great. So helpful for a young start-up to choose
the right tools and architecture from the start.

Thanks again.

------
skimbrel
Cassandra, Redis, MySQL, and Mogile as primary data stores? Good lord. And I
thought my company had a complicated architecture.

~~~
dajobe
You think this diagram is complex? You should see the one that describes the
full details. Things I missed out include the tiny Java bit. No Ruby though.

~~~
gumbo
Why are you using so many data stores? Is it due to some legacy? I understand
why you need MySQL (because some key algorithms are based on joins), but why
have both Redis and Cassandra at the same time? You also mention that you'll
maybe replace MySQL with HBase. Why? You also mention that the MapReduce job
inputs are flat logs (I suppose those logs are stored in HDFS), so are you
using any API to write to HDFS from RabbitMQ?

~~~
dajobe
Boy, a lot of questions. I already mentioned that the different stores have
different features. I wouldn't see Redis or Cassandra replacing each other for
anything. HBase is a potential replacement for bigger MySQL-like things such
as big joins across all user data and actions. M-R inputs are logs in HDFS,
copied from Scribe. RabbitMQ isn't involved in logging those.

~~~
gumbo
Thanks for the answers. I know, a lot of questions: this is because we are
architecting our platform, and detailed help from a successful website is
very helpful.

Looking forward to seeing the next article.

------
sbhat7
Nice article. It'd be great to have some numbers that indicate the size/scale
of the system and its components.

~~~
dajobe
I'll see if we can include some figures in tech posts that follow.

------
rhizome
Very interesting. If that's world-class (which it is), I'm probably charging
too little for my time.

------
u48998
So basically users are happily swimming in a sea of advertisements with
content they don't own or create themselves. Nice business!

~~~
dajobe
I'm the author of this article which was about the technology we use. Your
comment doesn't seem to be about tech at all.

~~~
u48998
Agreed.

