
The Learning Behind Gmail Priority Inbox [pdf] - DanielRibeiro
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36955.pdf
======
gregschlom
It's interesting to see that the main reason for the complexity of this task
seems to lie, according to the authors, in the fact that the processing has to
be done on Google's servers and thus needs to scale to 100+ million users.

If Gmail were a desktop client instead of webmail, the problem would be
much easier. But of course it would probably never get millions of users in
the first place.

Yet, I'm pointing this out because while everyone can see the benefits of
moving into the cloud, it's quite easy to forget that in doing so we also
give up the processing power of the clients, and that their cumulative
total is far from insignificant.

disclaimer: I'm working on a desktop client for gmail.

~~~
alextp
Conversely, it'd also be an impossible problem for a lot of users: for
users who don't have much activity, a good global model is crucial to the
proper functioning of the system, and you can't have that with completely
local processing.

~~~
dododo
right, a key point of this method is that they are doing transfer learning
from the general gmail population to each individual user. the reason for this
is that the data from each user is small and sparse, and has relatively little
statistical power.

you simply cannot do this on a desktop system unless you pull in data from
other users. you might be able to do that, but there are bandwidth and
privacy concerns, and you might find you have to do just as much work to get
the same performance.
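
to make the global-to-user transfer idea concrete, here is a minimal sketch. it assumes a logistic model whose score is the sum of a shared global part and a small per-user correction. all names (`global_w`, `user_w`, `from_contact`, `update_user`) and the plain gradient step are illustrative assumptions, not necessarily what the paper does.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(weights, features):
    # Sparse dot product: features missing from the weights count as 0.
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def predict(global_w, user_w, features):
    # Probability that a mail is important: global score plus user correction.
    return sigmoid(dot(global_w, features) + dot(user_w, features))

def update_user(global_w, user_w, features, label, lr=0.5):
    # One gradient step on the user's weights only (assumed update rule).
    # The global weights stay frozen, so sparse per-user data only has to
    # learn a small correction on top of the population-wide model.
    error = label - predict(global_w, user_w, features)
    for name, value in features.items():
        user_w[name] = user_w.get(name, 0.0) + lr * error * value

# Hypothetical weights learned from the whole user population.
global_w = {"from_contact": 2.0, "bulk_sender": -1.5}
user_w = {}  # a brand-new user has no personal data yet

mail = {"from_contact": 1.0}
p_before = predict(global_w, user_w, mail)  # driven purely by the global model

# This user keeps ignoring mail from contacts (label 0 = not important).
for _ in range(20):
    update_user(global_w, user_w, mail, label=0)
p_after = predict(global_w, user_w, mail)
```

a new user gets sensible predictions from the global weights alone, and the per-user weights drift away from them only as that user's own actions accumulate. a purely local client would have only `user_w` to work with.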

------
dododo
this work was presented as a mini-talk at the NIPS conference workshop on
"Learning on Cores, Clusters and Clouds":

<http://lccc.eecs.berkeley.edu/papers.html#mini_talks_i>

