

Digg: 40x Performance Increase by Sorting in PHP Rather than MySQL
http://highscalability.com/blog/2010/3/23/digg-4000-performance-increase-by-sorting-in-php-rather-than.html

======
petercooper
On Feed Digest (an RSS feed processor I sold a few years back, but which was
doing ~300m "digests" per month) I had a similar experience, so I think I know
why and what they were doing here..

The problem was that even though I denormalized my "posts" (from RSS feeds)
table into a "post contents" and a "post metadata" table, MySQL wants to
create a "temporary table" to do sorting and pulls every relevant row into it
(I found this was even true with indexed columns at the time - I hope they
improved it). This can lead to heavy disk access if the number of rows needing
to be sorted is too high (consider 100,000 rows of 150 bytes each.. 15
megabytes, not good for a server under heavy load that needs to return a query
in < 0.2 seconds).

My solution was to pull out ONLY the post IDs and date (or whatever column I
wanted to sort by) from MySQL, do the sort in memory, then pull out the posts
from the database by already ordered IDs (e.g. SELECT * FROM x WHERE id
IN(9,4,22,38,4,..)). MySQL is crazy fast at giving you back rows X, Y, Z, and
so on, if you specify them directly.

I ended up with many more _fast_ queries rather than fewer deathly slow
queries, and that was a great tradeoff in the end. I believe the system still
runs that way under its new owner.
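
A minimal sketch of the approach described above, in Python with sqlite3 so
it's self-contained (the original system was PHP/MySQL; the `posts` schema and
feed ids here are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, feed_id INTEGER,"
             " published INTEGER, body TEXT)")
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)",
                 [(1, 7, 300, "c"), (2, 7, 100, "a"),
                  (3, 7, 200, "b"), (4, 8, 400, "d")])

# Step 1: pull ONLY the narrow columns needed for ordering (id + sort key),
# instead of letting the database drag full rows into a temporary sort table.
pairs = conn.execute(
    "SELECT id, published FROM posts WHERE feed_id = 7").fetchall()

# Step 2: sort in application memory (newest first) and keep the top N.
ordered_ids = [pid for pid, _ in
               sorted(pairs, key=lambda p: p[1], reverse=True)][:2]

# Step 3: fetch the full rows by primary key -- fast point lookups.
# Note: the database returns these in ITS preferred order, not ours.
marks = ",".join("?" for _ in ordered_ids)
full = conn.execute(
    f"SELECT id, body FROM posts WHERE id IN ({marks})",
    ordered_ids).fetchall()

print(ordered_ids)  # [1, 3]
```

As the replies point out, the IN() query itself does not preserve the
requested order, so a final in-memory reordering step is still needed.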

~~~
ars
But that doesn't guarantee the results will be in that order. Just because you
put your IDs in a certain order in the IN() doesn't mean you'll get them back
in that order.

~~~
petercooper
Interesting. It's been 3 years since I sold the business and touched the code.
The basic mechanic worked as I described, but you're right (just tested), we
must have had extra workarounds.

Update: I just realized I have the code in an archive here so I loaded it up.
I can't believe this wasn't obvious to me! Since we already had the ordered
IDs in memory (to build the SELECT query!) we loaded the posts with a SELECT
.. IN into a hash (with the id as the key) then just went through the ordered
ID list and pulled out the posts from the hash in the right order! :-)
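
That reordering trick is tiny. A sketch in Python (a dict standing in for the
PHP hash; the ids and titles are made up):

```python
# IDs already in the desired (sorted) order -- we had these in memory
# anyway, since they were used to build the IN (...) query.
ordered_ids = [9, 4, 22]

# Result set from "SELECT id, title ... WHERE id IN (9, 4, 22)",
# which the database may return in any order it likes.
result_set = [(4, "second"), (9, "first"), (22, "third")]

# Index the rows by id, then walk the ordered id list.
by_id = {row[0]: row for row in result_set}
posts_in_order = [by_id[i] for i in ordered_ids]

print(posts_in_order)  # [(9, 'first'), (4, 'second'), (22, 'third')]
```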

~~~
ars
So then why did you bother with the first query and then the IN() ?

Just get your data directly with a search, then array sort the hash using the
hash key.

~~~
petercooper
_Just get your data directly with a search, then array sort the hash using the
hash key._

If I have a table with 10m rows (it was about that) and I want 10 posts from
feeds X, Y and Z in reverse date order, to get them in just one query with
local sorting I'd have to retrieve _every_ row for feed X, Y and Z down the
wire before doing the local date sort.

If I did the sorting with ORDER BY to avoid _that_ issue, the complete JOIN
(meta<->full) is used in the temporary sort table, so a bigass temporary table
would result for MySQL to do its sorting.. bringing us back to square one :-)

Hopefully I'm not misunderstanding your point though.. it's been known to
happen.

~~~
ars
Using it with LIMIT makes sense, and I forgot you were doing that.

BTW, you were doing this to avoid using too much memory at the database, but
by doing this (assuming it's PHP; you didn't say) you are loading the result
set twice into webserver memory.

Obviously it worked for you, maybe the webserver had more free memory.

~~~
petercooper
It did (though less total memory). Significantly, though, the memory use was
ephemeral at the application server end. Do the collation, push to a memcache
server, and die :-)

------
smackfu
1) Sometimes MySQL is bad at things that databases should be good at.

2) Moving stuff from your database server to your webserver is actually
changing the hardware environment it runs on. Especially relevant if your
database server isn't keeping up or is RAM constrained or whatever else is
different.

~~~
wanderr
It's also worth considering that most big sites have more web servers than
MySQL servers, and I'm sure Digg is in that boat as well. Doing your sorts on
20 front end nodes instead of 2-4 MySQL servers is obviously going to help
improve performance and scaling of your MySQL servers.

------
brutimus
This quote rubs me the wrong way a little...

"The relational database tool chain is not evolving. It has failed for large
scale, real-time environments."

I've been a big user of Postgres for going on five years now and it has made
giant leaps in both features and performance. You can't blindly say relational
tools are done evolving. I know the other players have made a lot of progress
lately too (Oracle, MySQL, etc).

~~~
rmaccloy
PostgreSQL has gained many nice features (windowing queries, parallel restore,
moving towards more standardized master/slave replication, etc), but none of
them have substantially affected its suitability as a high-throughput/large-
data/(relaxed-consistency)/low-latency datastore. (On the other hand, it has
become significantly nicer for BI/OLAP workloads.)

Same for MySQL, modulo (possibly) the Drizzle project.

Oracle has a lot of different irons in the fire, but "cheap scaling" isn't one
of them. (If you've got the money to burn... TimesTen?)

That's not a knock on any of the above, it's just a different use case.

~~~
cturner
Vertica is designed for scaling. It's based on PostgreSQL and is under active
development to that end.

~~~
rmaccloy
So is Greenplum (more so, actually.)

Neither are useful for the Digg usecase. Edit: That is to say, they're
optimized for large datasets, low concurrency, medium latency, medium-to-high
consistency and medium-to-low availability.

------
barredo
Old story. Sorry if it's offtopic:

I was running an internal webapp with 25M rows in a MySQL table. The query
for search was something like (pseudo-SQL ahead):

    select * from table where match title_field against search_term order by id desc limit 20

It took around 10 to 12 seconds (it was a slow machine). When I removed the
sorting statement from the SQL query (and with it, the LIMIT)

    select * from table where match title_field against search_term

and instead sorted the returned results with PHP and took the first 20, it
took only 1 second.
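
The workaround above amounts to this, sketched in Python rather than PHP
(`results` stands in for whatever rows the full-text search returned):

```python
# Rows matched by the search, in whatever order the engine produced them.
results = [
    {"id": 3, "title": "foo one"},
    {"id": 9, "title": "foo two"},
    {"id": 5, "title": "foo three"},
]

# The "order by id desc limit 20" part, done locally instead of in SQL.
top_20 = sorted(results, key=lambda r: r["id"], reverse=True)[:20]

print([r["id"] for r in top_20])  # [9, 5, 3]
```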

~~~
ars
You probably needed to run analyze table.

What it probably did was sort by id (in memory), then run the search (i.e.
check each row for a match, stopping after 20 rows), not realizing that it
should do the search first and only then sort.

EXPLAIN .... is your friend here.

Without the order by, the only possible index was the search, so it used that.

And Digg probably had a similar error, since it makes no sense for PHP to be
any faster at sorting than MySQL.

~~~
petercooper
_since it makes no sense for PHP to be any faster at sorting than MySQL._

Perhaps MySQL has improved lately, but going on my working knowledge from
about 3 years ago, it _can_ make sense to do sorting externally... if you can
work with _less data_ by doing the sorting externally ;-)

MySQL does/used to pull entire rows into the temporary table created for
sorting. So if you have a 10,000 row, 100MB table and you're doing a SELECT
that grabs, say, 10% (1000) of the rows and sorts them.. MySQL's temporary
sort table will be 10MB (I found this to be true regardless of indexes - which
only sped up the creation of the temporary table, but did not reduce its size
- but as I was doing complex multi column sorts, it didn't matter anyway).

If you do it in PHP/wherever, you can pull out ONLY the "id" and the column
you want to sort by (say, a date), and use only about 1000 * 10 bytesish ==
10KB of data to do the id sort. Then you pull the full rows out by id (which
MySQL is quick at).

 _Disclaimer: I last did all this about 3 years ago on an early version MySQL
5 deployment doing about 200 queries a second. So I ain't no RDBMS expert and
what I say should probably be taken with a pinch of salt._

~~~
ars
This could work, but then you'd need to generate 1000 queries one at a time,
meaning lots and lots of disk seeks.

As long as your sort table stays in memory you are fine even if it's a lot of
data, but if you hit the disk then you could be right. (But even if mysql says
it hits the disk, doesn't mean it really does since the OS could cache it.)

But there are better ways of handling this. Number one being an index on that
column.

You could also split the table in two, one part for sort data, the other for
the large data. But I would only recommend that in very special cases, and
only after careful benchmarking.

Most of the time just increase the maximum size of in-memory temporary tables.
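
The first suggestion (an index on the sort column) can be seen changing the
query plan directly. A small demo using Python's sqlite3, since it's
self-contained; MySQL's EXPLAIN output looks different, but the principle is
the same: the index replaces the sort step. The schema is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY,"
             " published INTEGER, body TEXT)")

def plan(sql):
    # Flatten EXPLAIN QUERY PLAN output into one string.
    return " ".join(r[-1] for r in
                    conn.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT id FROM posts ORDER BY published DESC LIMIT 20"
before = plan(q)                      # needs a temp B-tree to sort
conn.execute("CREATE INDEX idx_published ON posts (published)")
after = plan(q)                       # walks idx_published, no sort step

print(before)
print(after)
```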

~~~
petercooper
_This could work, but then you'd need to generate 1000 queries one at a time,
meaning lots and lots of disk seeks._

Not with a SELECT .. WHERE id IN (x,x,x,...) type query. But, yeah, sort of. I
actually _did_ trial with separate SELECTs for every ID at first and it
performed surprisingly well.

 _As long as your sort table stays in memory you are fine even if it's a lot
of data_

Trueish. At the time, though, it was a 10-12GB database on a 4GB machine and
in 2006 I didn't have the cashflow to get anything better. So the most popular
parts of the DB were cached OK, but there was little left for temporary
tables. I say "trueish", though, because you're still using more memory
bandwidth with temporary tables in this situation and that's an issue
sometimes.

 _Number one being an index on that column._

The problem was that there were about 8 metadata columns and sorts were done
with varying combinations of these (though usually one). Indexes became a big
problem at a certain point because of how difficult they made it to clear
stale data from the database (at one point we were doing 6 hour maintenance
periods each month - with caches picking up the slack - to delete old
records).

If I were doing it again, I'd do a lot differently (and not use MySQL), of
course.. so your suggestions are certainly useful to anyone tackling a similar
problem with 2010 eyes.

~~~
jseifer
What would you use instead of MySQL?

~~~
petercooper
MongoDB is a running joke with you but, seriously, probably MongoDB along with
Redis. It wouldn't be using the same algorithms as it did on MySQL, though.
There are better ways to structure that sort of data nowadays.

~~~
jseifer
It's funny that it comes off as a joke since I really do love it. I mean, it
powers <http://nyhtp.com>. I've had nothing but great experience with Redis,
mixed experience with tokyo cabinet and tyrant, and not enough real work
experience with Mongo.

~~~
petercooper
MongoDB + Redis are proving a winning combination for me so far. Or, rather,
the combination of having a fast key/value store with ephemeral key support
_and_ a document database is a winner. It gives you two levels, and more
immediate optimization and separation-of-concerns opportunities.

------
akamaka
Whenever I read about Digg's infrastructure, I think of Markus Frind's
comments from back in the day. He's the guy who built plentyoffish.com single-
handedly. A lot has changed since 2006, but at the time he said that Digg had
the worst infrastructure of any major site. A fascinating read, because his
basic point is still very relevant today:

"Digg is doomed unless they fire their tech staff." -
[http://plentyoffish.wordpress.com/2006/10/08/digg-is-
doomed-...](http://plentyoffish.wordpress.com/2006/10/08/digg-is-doomed-
unless-they-fire-their-tech-staff/)

------
figgs
"Scaling practices turn a relational database into a non-relational database."

When will people learn that the 'relation' in relational database comes from
the correspondence between database tables and the mathematical notion of
relations, not from the ability to describe relationships between tables?
Unless scaling practices make a relational database stop using tables, the
database is still being used as a relational database.

~~~
gnaritas
While you're technically correct, language does evolve, and if 90% of
developers think relational means foreign keys and normal forms then that's
what it means. Words are used to communicate and sometimes get misused and
take on a new meaning; that's just the way it goes. I've met very few
developers or database guys who know that "relation" comes from the relational
algebra and has nothing to do with foreign keys or normal forms.

~~~
gizmomagico
> Words are used to communicate and sometimes get misused and take on a new
> meaning; that's just the way it goes.

I sure hope that's not _always_ the way it goes. Because then "then" would
soon mean "than" and "than" would mean "then".

------
nessence
If the query cache, key buffer, sort buffer, and various engine configs are
too small for the dataset then MySQL will take a less optimal approach. MySQL
is an RDBMS for persistent data, not a spreadsheet...

"right tool for the job", etc.

------
urmane
At my first SA job someone was using the database for some cargo-cult
manipulation and sorting - it took eight hours. I rewrote it in a few lines of
perl, and it took 17 seconds - 12 after removing the debugging print
statements :-)

~~~
mdg
1337

------
fauigerzigerk
The most popular "solution" to all kinds of scalability problems seems to be
to simply not support use cases that don't scale well.

You can't have consistency. You can't query and view your data the way you
want to. Only prearranged queries are allowed.

It's all very well to do that, but it's not a better solution to the original
problem. It's acknowledging that the original problem is unsolvable or, put
differently, it's giving up.

Scalability, for me, is making the things scale that I want to do, not doing
something that scales.

Sometimes it seems to me that we're heading straight back to the mainframe era
where everything that wasn't prearranged, didn't fit the nightly batch window
or was otherwise unexpected simply couldn't be done.

The answer to static precomputed reports and highly specialised, highly
inflexible data structures was relational database systems and Excel.

~~~
easp
And yet we still haven't completely replaced those old, inflexible systems,
even as we are inventing new ones.

So, were relational databases and spreadsheets really the answer, or just an
alternative?

~~~
fauigerzigerk
Nothing is ever completely replaced, but you can answer your own question by
asking _why_ the old systems were inflexible and _how_ RDBMS and spreadsheets
improved that situation. It's not just some historical freak event like
Betamax v VHS. There's logic.

Hard coded access paths and data structures combined with explicit and
selective consistency management is always faster and more scalable provided
the number of different use cases is small and requirements don't change much.
Doing a small number of things on a huge scale justifies the approach that
people like twitter are taking.

Unfortunately what scales physically doesn't scale in terms of logical
complexity. As soon as a greater variety of tasks, views and analyses have to
be supported, productivity plummets and things come to a screeching halt.
That's exactly what happened to corporate IT departments back then. Everything
took ages to implement because you couldn't just join over or group by some
unexpected set of attributes.

As a reaction to that, two things happened. For one, people tried to free
their data from the iron grip of centralised IT departments and put it on PCs
and into spreadsheets that used very generic data structures to support a
very large variety of views and analyses. Nothing had to be predetermined. Of
course that also led to utter chaos in terms of consistency and it didn't
scale to large amounts of data.

The other idea to cope with hard coded information was to separate the logical
representation of information from physical representations and access paths.
Relational normalisation doesn't just guarantee consistency; much more
importantly it guarantees the productivity of creating a wide variety of views
on data. A separate and independent layer contains all the special casing and
optimisations for frequent use cases.

I think the idea of modelling information according to a general, formally
well defined logic and separating that model from implementation constraints
is a timeless one. It's not just "an alternative". It's a fundamental
principle of dealing with information and it's not specific to the relational
model. I've been doing this for long enough to know that this kind of purity
is hard to achieve and has to be compromised now and then.

Throwing out good design principles is sometimes necessary. I can see why it
is necessary for Google, Amazon or Facebook. But it is going to make these
behemoths slower and less flexible. It's a price they pay for their size. It
makes no sense for small companies to pay that price whilst not benefiting
from the same economies of scale.

------
bluesmoon
4000% sounds so much cooler than 40x

~~~
nazgulnarsil
41x. One of the reasons I don't like using percentages is that people always
make mistakes with them. A 100% increase is doubling, 200% is tripling, etc.

~~~
gamache
A 4000% increase isn't a 41x increase. It's a gain factor of 41, or a 40x
increase. (Not that it even matters three percent either way here.)
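
For anyone keeping score, the arithmetic behind the quibble (plain Python):

```python
old = 1.0
new = old * (1 + 4000 / 100)  # a 4000% increase

print(new / old)      # 41.0 -> the new value is 41x the original...
print(new / old - 1)  # 40.0 -> ...i.e. it gained 40 "originals"
```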

------
dejb
I wish I was running a site big enough for these scaling issues with MySQL to
be a problem. I'm sure an SSD drive or a battery backed RAM drive would
sustain some pretty serious growth for all but the really mega sites like
digg. This makes me wonder if these super scalable systems will ever be made
into products suitable for small and medium sites without a team of uber geeks
to support it.

~~~
megaman821
I wonder how many sites there are that couldn't be served by MySQL or Postgres
with SSDs in a RAID array and 64 gigs of RAM. Of course it is easier to spin
up 10 EC2 instances with Cassandra on them, but for the vast majority of sites
I don't think relational databases have reached their limits yet.

------
known
What’s different/better about an in-memory database versus STL or Boost
collections, or even just creating my own memory-mapped file(s)?

 _Any database system goes far beyond giving you a set of interfaces to manage
collections, lists, etc. This typically includes support for ACID (atomic,
consistent, isolated and durable) transactions, multi-user access, a high
level data definition language, one or more programming interfaces (including
industry-standard SQL), triggers/event notifications, and more._

<http://www.mcobject.com/in_memory_database>

------
ccarpenterg
MySQL is very slow. I've tried MongoDB with the C++ API but I had a lot of
"vague" problems and bugs so I gave it up and I think MongoDB is not ready for
a production env. However the JSON representation of data is very convenient
and very powerful in my case.

I also was playing with Redis and redis-py, but Python is too slow for me. So
now at this very moment I'm playing with Memcached and its C API.

Redis and MongoDB look very promising but not reliable yet.

~~~
antirez
But reliable enough for Craigslist?

No critical bug has been found in 1.0 and 1.2 since their release, apart from
a replication bug happening in corner-case conditions.

Please share with us how Redis is not reliable and I'll try to fix this ASAP.

~~~
metachor
"I couldn't figure out how to make it work, so it must be broken."

------
memoryfault
I remember reading a while back how Digg had tables with thousands of columns.
I know MySQL isn't the fastest, but a speedup for Digg really isn't that
impressive.

------
jhaile
If sorting in PHP is 4000% faster than sorting in the DB, then it's a failure
of the DB. That shouldn't be the case.

~~~
jhaile
I guarantee that sorting in PostgreSQL is faster than doing it in PHP. MySQL
is the problem, not databases in general.

------
audidude
I thought Slashdot figured this out like a decade ago.

------
leff_f
Old news... We all know that a real PHP coder must write their own DBMS...
mostly because sorting is just awful in MySQL.

