Show HN: A distributed Recommendation Engine based on Redis + Ruby (and C) (github.com)
97 points by paulasmuth on Feb 19, 2012 | 11 comments

I've been working on something similar - https://github.com/jamii/springer-recommendations

All the open-source engines I've tried scale poorly. Since our input data already contains >200m interactions, I suspect recommendify would struggle (from my quick reading, it looks like the co-occurrence matrix is stored in Redis, i.e. in memory).

The approach I'm leaning towards at the moment is collating the article->IPs and IP->articles tables in leveldb and then distributing read-only copies to each worker. Everything else can easily be partitioned by article id.
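
To make the layout above concrete, here is a minimal sketch of the two tables and of partitioning the work by article id. It uses plain in-memory dicts as a stand-in for LevelDB, and the names `build_tables`, `partition_for`, and `worker_results` are hypothetical, as is the modulo partitioning rule; none of this is taken from the linked repository.

```python
from collections import defaultdict


def build_tables(download_log):
    """Collate (ip, article_id) pairs into article->IPs and IP->articles tables."""
    article_to_ips = defaultdict(set)
    ip_to_articles = defaultdict(set)
    for ip, article in download_log:
        article_to_ips[article].add(ip)
        ip_to_articles[ip].add(article)
    return dict(article_to_ips), dict(ip_to_articles)


def partition_for(article_id, num_workers):
    # Assumed partitioning rule: integer article ids, assigned by modulo.
    return article_id % num_workers


def cooccurrence_counts(article, article_to_ips, ip_to_articles):
    """Count how many IPs downloaded both `article` and each other article."""
    counts = defaultdict(int)
    for ip in article_to_ips.get(article, ()):
        for other in ip_to_articles.get(ip, ()):
            if other != article:
                counts[other] += 1
    return dict(counts)


def worker_results(worker, num_workers, article_to_ips, ip_to_articles):
    """Each worker reads both tables but only writes results for its own articles."""
    return {
        article: cooccurrence_counts(article, article_to_ips, ip_to_articles)
        for article in article_to_ips
        if partition_for(article, num_workers) == worker
    }
```

Because both tables are only read, every worker can hold its own copy and the per-article results never need to be merged across workers.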

I can't tell from the README - is your data fairly wide? I'm playing with using Postgres's new K-Nearest-Neighbor support to calculate similarity on 20D cubes, but I suspect my approach won't work well for an arbitrary number of columns (i.e. users x products) unless you first do some sort of PCA or SVD to narrow it down, and it isn't optimized for binary ratings at all. I started writing it up here: http://parapoetica.wordpress.com/2012/02/15/feature-space-si...

> ... is your data fairly wide?

Around 200m download logs, 2m articles, some million IP addresses. I suspect that interest in research papers is inherently high dimensional and dimensionality reduction would probably damage the results.

I don't have much hardware to throw at it either. I just started looking at randomized algorithms - trying to produce a random walk on the download graph that links articles with probability proportional to some measure of similarity (probably cosine distance or Jaccard index).
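
As a rough illustration of the random-walk idea, the sketch below takes a two-hop step on the download graph: from an article to a random IP that downloaded it, then to a random article that IP downloaded. This is an assumed stand-in, not the commenter's algorithm: the walk favours articles that share many downloaders, but its hit frequencies are not exactly proportional to the Jaccard index or cosine similarity. The function names and the `seed` parameter are hypothetical.

```python
import random
from collections import Counter


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def walk_step(article, article_to_ips, ip_to_articles, rng):
    """One two-hop step: article -> random downloader IP -> random article."""
    # Sorting makes the choice reproducible for a fixed seed.
    ip = rng.choice(sorted(article_to_ips[article]))
    return rng.choice(sorted(ip_to_articles[ip]))


def sample_neighbours(article, article_to_ips, ip_to_articles, steps, seed=0):
    """Estimate related articles by counting where repeated steps land."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(steps):
        nxt = walk_step(article, article_to_ips, ip_to_articles, rng)
        if nxt != article:
            hits[nxt] += 1
    return hits
```

The appeal on limited hardware is that each step touches only two table rows, so the cost depends on the number of samples rather than on the size of the full co-occurrence matrix.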

Terrific readme

+1 more for that awesome ReadMe. The project looks clean and simple.

+1 for (albeit simple) time analysis

How does this compare with other recommendation engines, such as the open-source EasyRec [0]?

[0] http://easyrec.org/

Paul, how does C fit into the picture? I don't see anything in the source for compiling native extensions, and the implementations that src/recommendify.c calls look incomplete?

I am guessing you are still working on the gem?

Oh man, this is just awesome. Thanks so much for making this open source - it's great to be able to pick apart the internals of something like this.

Does any part of this involve taking the square root of something? Amazon's original similarities did that and I can't remember exactly how.
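
One common similarity measure that involves a square root is cosine similarity on binary data, where the number of shared users is divided by the square root of the product of the two items' user counts. The sketch below shows that formula only; the thread does not confirm that this is what Amazon's original method or recommendify uses, and the function name is hypothetical.

```python
import math


def cosine_similarity(users_a, users_b):
    """Cosine similarity of two items, each given as the set of users who touched it.

    For binary interactions this reduces to |A & B| / sqrt(|A| * |B|).
    """
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))
```

The square root in the denominator keeps very popular items from dominating: an item seen by nearly everyone shares users with everything, but its large user count pulls the score back down.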

Oof. This is beautiful on so many levels.
