RethinkDB: Why Start a New Database Company in 2010? (MySQLConf)

rythie · on April 17, 2010

A really interesting point at about 37 minutes in, where they say that drives could end up executing code just like graphics cards do now.

That could really speed things up. You wouldn't have all the latency of doing multiple reads on the device to find a file. Normally the CPU would have an interrupt raised each time a read returns, but with this it could just request a file and get the result back in one go.
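To make the round-trip argument concrete, here is a toy sketch in Python. Everything in it is hypothetical (`ToyDrive`, `read_block`, `read_file`, and the dict-based "filesystem" are made up for illustration, not any real drive interface); it only counts how many host requests each approach needs, assuming one interrupt per request.

```python
# Toy model: count host<->drive round trips needed to reach a file.
# All names here are hypothetical; this is not a real drive API.

class ToyDrive:
    def __init__(self, blocks):
        self.blocks = blocks      # block number -> directory dict or file bytes
        self.round_trips = 0      # one per host request (assumed: one interrupt each)

    def read_block(self, n):
        """Conventional interface: one request returns one block."""
        self.round_trips += 1
        return self.blocks[n]

    def read_file(self, path, root=0):
        """Imagined 'code on the drive' interface: the drive walks the
        directory tree internally and returns only the file contents."""
        self.round_trips += 1
        block = self.blocks[root]
        for name in path.strip("/").split("/"):
            block = self.blocks[block[name]]
        return block


def host_side_lookup(drive, path, root=0):
    """The host resolves the path itself, one block read per component."""
    block = drive.read_block(root)
    for name in path.strip("/").split("/"):
        block = drive.read_block(block[name])
    return block


# A tiny made-up layout: directories map names to block numbers.
blocks = {
    0: {"home": 1},
    1: {"rythie": 2},
    2: {"notes.txt": 3},
    3: b"hello",
}

conventional = ToyDrive(blocks)
host_side_lookup(conventional, "/home/rythie/notes.txt")

smart = ToyDrive(blocks)
smart.read_file("/home/rythie/notes.txt")

print(conventional.round_trips, smart.round_trips)
```

In this toy layout the host-side walk costs one request per directory level plus one for the file, while the on-drive walk costs a single request regardless of depth.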

gdickie · on April 17, 2010

The Netezza database appliance essentially works like this -- each disk is paired with an FPGA and a CPU core. The disk-local CPU takes care of caching directory information for the disk, decides what order to process disk pages in, and manages a disk cache with full awareness of the application. After a disk read the FPGA cuts away unwanted data before the CPU has to look at it. This has been shipping for several years.

To the extent that your problem is embarrassingly parallel and can be executed near the disk, you speed up linearly with the number of disks.

(I am currently employed by Netezza)
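A rough Python sketch of the general "filter near the disk, then merge on the host" pattern described above. This is only an illustration under made-up assumptions, not Netezza's actual design: the `scan_near_disk` and `query` functions and the sample rows are hypothetical, and Python threads here only stand in for the per-disk hardware (they do not give real CPU parallelism).

```python
from concurrent.futures import ThreadPoolExecutor

def scan_near_disk(rows, predicate, columns):
    """Stand-in for the per-disk filter stage: drop unwanted rows and
    columns before anything is handed to the host."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]

def query(disks, predicate, columns):
    """Run the same scan on every disk independently, then concatenate.
    Each partition is processed in isolation, which is what makes the
    problem embarrassingly parallel."""
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        partials = list(
            pool.map(lambda rows: scan_near_disk(rows, predicate, columns), disks)
        )
    return [row for part in partials for row in part]

# Hypothetical data: three "disks" holding eight rows each.
disks = [
    [
        {"id": i, "region": "eu" if i % 4 == 0 else "us", "amount": i * 10}
        for i in range(start, start + 8)
    ]
    for start in (0, 8, 16)
]

result = query(disks, lambda r: r["region"] == "eu", ("id", "amount"))
total_rows = sum(len(d) for d in disks)
print(f"{len(result)} of {total_rows} rows reached the host")
```

Because no partition needs data from another, adding a disk adds another independent scan, which is where the linear speedup would come from in a real system.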

anamax · on April 17, 2010

> A really interesting point at about 37 minutes in, where they say that drives could end up executing code

And the wheel of reincarnation starts to turn again.

Some/most IBM "channel controllers" (read: disk controllers) were fairly programmable. At some points in time and with some configurations, it was faster to run things on them than in the mainframe. In other cases, it was faster to run "everything" on the mainframe.

Some folks made a fairly good living moving parts of applications to and from channel controllers.

hga · on April 17, 2010

Indeed; we're told they're flaky in part because they have as much firmware as a modern OS has kernel code, so there's no reason why they can't eventually allow for this sort of thing.

Even given how social and political doing that right would be, this could happen before the day SSDs displace magnetic media (a day we cannot as yet really foresee).

helwr · on April 17, 2010

I was just reading about Active Disks yesterday:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.15....

http://www.pdl.cmu.edu/Active/index.shtml

rythie · on April 17, 2010

They're both pretty old I think, around 2001? Neither seems particularly relevant to today's SSDs.

jules · on April 17, 2010

I would have loved a little more detail on what kind of performance they already achieved and what kind of performance they think they can achieve.

hristov · on April 17, 2010

This was pretty interesting, I must say. It is nice that they got into the nitty gritty technical details.

modsearch · on April 17, 2010

This video works on iPad :-) I was kind of amazed actually... Oh, and good stuff Slava and Michael!

cpach · on April 17, 2010

Yeah, blip.tv is a really nice service. There are also download links for offline consumption. That's great for those of us who don't like Flash but are using a browser that lacks H.264 support.

RethinkDB's blog only links to the YouTube version of this presentation, so thanks mace for providing this alternative link.