We used hashes with BDB. I didn't dig deep to see what was going on, since we weren't seriously considering BDB because of its licensing: it is GPL-style copyleft, and applications link directly against it (unlike, say, MySQL). We currently only offer access as a web service, but we'd like to keep open the option of licensing the recommendations engine in other ways down the line. So mostly we were trying it out as another data point, to see how our implementation stacked up.

Interesting. Looks like Berkeley DB will be good enough for me to prove the concept, then, and if (or, seemingly, when) it becomes the bottleneck, I'll know it's possible to improve on it. Thanks again; it's great having first-hand access to those who have done it instead of just theorising about it, like me. :-)

