
Flickr Engineers Do It Offline - joshwa
http://code.flickr.com/blog/2008/09/26/flickr-engineers-do-it-offline/
======
timtrueman
The point of having a backend queue is that on very large-scale systems, an
easy way to boost the "speed" of your site is to never do synchronous writes
(writes being loosely defined as anything that needs to be changed or
executed, for example inserts, updates, email notifications). If a user updates
a title or description, return the result page right away and queue the actual
operation to update the value. Locking a row or table for update is relatively
slow in large databases (ones measured in TBs, for instance). The extra tens or
hundreds of milliseconds you gain from using a queue like this help your
users perceive your website as "fast" even though technically the write takes
longer than you've led them to believe.
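
A minimal sketch of that idea in Python, not Flickr's actual setup: the request
handler only enqueues the write and returns, and a background worker applies it
later. The names `titles`, `pending`, `handle_update_request` and `worker` are
made up for illustration, and the dict stands in for a real database.

```python
import queue
import threading

# Hypothetical stand-in for the real database: photo id -> title.
titles = {1: "old title"}
pending = queue.Queue()

def handle_update_request(photo_id, new_title):
    """Enqueue the write and return immediately, before it is applied."""
    pending.put((photo_id, new_title))
    return "Saved!"  # the user sees this before the write has happened

def worker():
    """Drain the queue and apply each write, outside the request path."""
    while True:
        job = pending.get()
        if job is None:  # sentinel value: stop the worker
            pending.task_done()
            break
        photo_id, new_title = job
        titles[photo_id] = new_title  # the slow, locking write would go here
        pending.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

response = handle_update_request(1, "new title")
pending.join()     # demo only: wait until the queued write has landed
pending.put(None)  # ask the worker to exit
t.join()
```

In a real deployment the queue would need to be durable and live outside the
web process, otherwise a crash loses every write that was acknowledged but not
yet applied.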

------
schtono
I wonder how they use PHP as a queuing system? It must be a CLI application,
what do you think?

~~~
bigbang
You can use PHP pretty much like any other scripting language (without
involving the web), so yeah, it could be a command line tool running in the
background, started by cron, etc.

~~~
LogicHoleFlaw
We use a lot of PHP CLI cron jobs at my work. It works fairly well, but isn't
as stable as I'd like. Any bug which could cause a fatal interpreter error
can't be caught by your code so it's difficult to diagnose problems when they
crop up in production.

~~~
schtono
Agreed, I have the same problem with my backend apps. Maybe PHP5's error
handling would do the trick - but I haven't migrated to PHP5 yet anyhow.

~~~
LogicHoleFlaw
We're using PHP5 and the exception handling is better, but interpreter errors
just kill the process, full stop. We're writing watchdog processes to monitor
the nightly batches so that we have better information if one of them goes
haywire.
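
For what it's worth, a rough sketch of that kind of watchdog, written in Python
rather than PHP and not the setup described above: run the batch as a child
process, then record the exit code and the tail of stderr, since the job itself
can't report a fatal interpreter error. `run_batch` and `crashing_job` are
hypothetical names; a real command would be something like
`["php", "nightly_batch.php"]` (script name invented).

```python
import subprocess
import sys

def run_batch(cmd, timeout=60):
    """Run one batch job as a child process and report how it ended."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "exit_code": None, "stderr_tail": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "exit_code": proc.returncode,
        # Keep the last few stderr lines; a dying interpreter usually
        # leaves its final message there.
        "stderr_tail": "\n".join(proc.stderr.splitlines()[-5:]),
    }

# Stand-in for a job that dies abruptly with a non-zero exit status.
crashing_job = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('fatal: boom\\n'); sys.exit(255)",
]
report = run_batch(crashing_job)
print(report)
```

The watchdog lives in a separate process, so it survives whatever kills the
job and can log or alert on the report afterwards.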

------
joshu
You can just use MySQL as the backing store for the queue.

I think pretty much all systems of a sufficient size end up reinventing
this...
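
A minimal sketch of a table-backed queue, using Python's built-in `sqlite3` as a
stand-in for MySQL so it runs anywhere. The `jobs` table, its columns, and the
`enqueue` / `claim_next` helpers are assumptions for illustration, not anything
from the Flickr post.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'queued')"
)

def enqueue(payload):
    """Append a job to the end of the queue."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))

def claim_next():
    """Claim the oldest queued job, or return None if nothing is waiting."""
    with db:
        row = db.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause guards against another
        # worker having claimed the same row in the meantime.
        cur = db.execute(
            "UPDATE jobs SET status = 'running'"
            " WHERE id = ? AND status = 'queued'",
            (row[0],),
        )
        return row if cur.rowcount == 1 else None

enqueue("resize photo 42")
enqueue("send email to bob")
first = claim_next()
second = claim_next()
third = claim_next()
```

On MySQL with several workers you would probably want row locking (for example
`SELECT ... FOR UPDATE` inside a transaction) instead of relying on the guarded
UPDATE alone, plus some way to requeue jobs whose worker died mid-run.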

~~~
aschobel
We are using BDB, dead simple and insanely fast.

Relational database seems a bit overkill.

~~~
joshu
It may be overkill, but there is typically a great deal of expertise around
setting up and running them.

Also, MySQL etc. have a great deal of network connectivity and concurrency
support that is not provided by BDB. (In the mentioned example, they say they
used PHP. Can you imagine doing concurrency in PHP?)

So it's more a matter of expediency than aesthetics. At scale, everything is
painful and you'd really rather not write anything you don't absolutely have
to.

~~~
jrockway
Concurrency is one of BDB's strongest points. (As for network connectivity,
BDB has an RPC server which works pretty well, although I'd personally
probably roll something higher-level and stick that in front of the actual
database.)

~~~
aschobel
Yep. We looked at BDB's RPC system but will probably end up going with Thrift
for RPC. Other than the lack of documentation, Thrift seems pretty killer.

------
iamwil
Sounds like what Erlyweb can do natively in Erlang.

~~~
njharman
Having small systems that do one or a few related things well is much better
for scaling than a "does everything" system.

It's also generally an architecture that is easier to
write/debug/extend/grok.

It is nice though that Erlang gets you into programming for scale from day 1
without making it more work.

