

Ask News.YC: Any studies on whether social filtering works? - aston

Web 2.0's clearly about people, and the more people you get, and the more you know about them, the more your company's worth, right? The social graph, while not yet clearly monetizable, is universally considered valuable for the predictive ability we think it has or will soon have.

Which leads to my question: Does data about my friends actually help *at all* when it comes to guessing things for me? I know Amazon, Netflix and others have demonstrated the predictive power of aggregated data, but my gut says restricting that data only to people I know will make it worse, not better. And if indeed the friend graph decreases predictive ability, what's the value of the social graph beyond the network effect/lock-in to a service?

Pure speculation is cool, though if anyone has hard data on this, that'd be even awesomer.
======
JacobAldridge
I would assume it better to compare and make predictions about me based on
somebody unknown to me but with strikingly similar purchasing habits, than to
compare me just to known associates.

Part of the problem with the social graph, particularly as it exists on broad
networking sites like Facebook et al, is the lack of data about the nature of
those relationships (i.e., "the more you know about them"). I've got however
many dozen friends, but the social graph there doesn't distinguish between the
guys I went to primary school with and the guys I see every weekend.

If I could better characterize my relationships, more accurate data could be
created. E.g., if I declare Dave is my 'drinking buddy' and Dan is my 'film
friend', the system could make reasonable predictions about what I might like
to drink based on Dave and what films I might watch based on Dan, while
ignoring Dave's films and Dan's drinks.
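
A rough sketch of that idea, in Python. Everything here is hypothetical: the
`friend_tags` and `likes` dictionaries, the item names, and the `recommend`
helper are made up for illustration, and a real system would need far more
than a tally. It only shows how a relationship label could gate which
friend's tastes count for which domain.

```python
from collections import Counter

# Hypothetical data: each friendship is labeled with the domains it applies to.
friend_tags = {"Dave": {"drinks"}, "Dan": {"films"}}

# Hypothetical data: what each friend likes, per domain.
likes = {
    "Dave": {"drinks": ["stout", "porter"], "films": ["Alien"]},
    "Dan": {"drinks": ["cider"], "films": ["Brazil", "Alien"]},
}


def recommend(domain, friend_tags, likes):
    """Tally items in `domain`, counting only friends labeled for that domain."""
    counts = Counter()
    for friend, tags in friend_tags.items():
        if domain in tags:  # ignore friends whose label doesn't cover this domain
            counts.update(likes.get(friend, {}).get(domain, []))
    # Most frequently liked items first.
    return [item for item, _ in counts.most_common()]


drink_picks = recommend("drinks", friend_tags, likes)
film_picks = recommend("films", friend_tags, likes)
```

Dan's cider never shows up in the drink picks, and Dave's film taste is left
out of the film picks, which is the "ignoring" part.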

In other words, the value of the social graph requires more meaningful
information about relationships, not just more people. Without that,
aggregated data wins out.

You're right - pure speculation is cool!

------
einarvollset
Turns out that "it depends"; a recent paper by Jon Kleinberg, Sid Suri and
others suggests that for Wikipedia edits, social networks are more important
predictors, whereas for LiveJournal, "similarity networks" are better.

For hard data, google the paper (it's a preprint so I'm not gonna link to it):
Sid Suri site:cs.cornell.edu, and check out the Papers link.

~~~
aston
Made me take the long way to the paper, but from the abstract it's exactly
what I was looking for. Thanks.

------
morbidkk
It all depends on how the data is aggregated and how a specialized service
like yours caters to that data, i.e., how it uses it.

For example, LinkedIn collects data from the end user whenever there is a
communication event:

1) adding a profile: name/school/college/specialization/job industry/company

2) adding a new job- job industry/company

3) adding a contact - type of relationship

4) asking a question - mark that to industry vertical/set of people

So the granular details you want (the ones that would help your service serve
users better) define what data you need to collect from the start, and then
you can put that data to use.

Pure speculation is cooler with the above set of data.

------
skmurphy
Social filtering absolutely works in the small: consider how often your
friends and co-workers can recommend things that are of interest and of use to
you. How you construct an application that facilitates or automates this is a
different question.

------
wheels
From a theoretical perspective, what you're assuming above is that you apply a
social filter first and then run collaborative filtering (i.e., the method
used by Amazon, Netflix and similar) on what's left. In fact the two
supplement each other: run collaborative filtering over the entire customer
database, then do a weighted merge of that with an algorithm that ranks things
based on social graph traversal.
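
A minimal sketch of that weighted merge, in Python. All of it is assumed for
illustration: the collaborative filtering scores are taken as given, the
social ranking is a plain breadth-first traversal with a per-hop `decay`, and
`alpha`, `max_depth`, the names and the sample data are made up. It is one
way to combine the two signals, not how any particular site does it.

```python
from collections import deque


def social_scores(graph, likes, user, max_depth=2, decay=0.5):
    """Breadth-first walk of the friend graph; closer friends' likes weigh more."""
    scores = {}
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth > 0:  # skip the user's own likes
            for item in likes.get(person, ()):
                # Direct friends add 1.0, friends of friends add `decay`, etc.
                scores[item] = scores.get(item, 0.0) + decay ** (depth - 1)
        if depth < max_depth:
            for friend in graph.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, depth + 1))
    return scores


def normalize(scores):
    """Scale scores so the top item is 1.0, making the two sources comparable."""
    top = max(scores.values(), default=0.0)
    return {k: v / top for k, v in scores.items()} if top > 0 else dict(scores)


def merge(cf, social, alpha=0.7):
    """Rank items by alpha * CF score + (1 - alpha) * social score."""
    cf, social = normalize(cf), normalize(social)
    merged = {
        item: alpha * cf.get(item, 0.0) + (1 - alpha) * social.get(item, 0.0)
        for item in set(cf) | set(social)
    }
    # Highest merged score first; ties broken by item name for stable output.
    return sorted(merged, key=lambda item: (-merged[item], item))


# Hypothetical sample data.
graph = {"me": ["ann", "bob"], "ann": ["cat"]}
likes = {"ann": ["A"], "bob": ["A", "B"], "cat": ["C"]}
cf = {"B": 0.9, "C": 0.6, "D": 0.3}  # assumed output of a CF model over all users

social = social_scores(graph, likes, "me")
ranking = merge(cf, social)
```

Item A has no collaborative filtering score at all, yet it still lands in the
merged ranking because two direct friends like it; D gets in on the aggregate
data alone. Tuning `alpha` is where the original question (does the friend
graph help or hurt?) would get answered empirically.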

