
1. It counts sub-string frequencies, not word frequencies. Moreover, if a sub-string appears more than once in a title, it is counted more than once. I drew this conclusion from submitting "a,b,c" and from their paper [1]:

   our algorithm strips out dashes and catches any 
   occurrence of the query in the title, for example, 
   'blow' catches 'blowing', 'blowjobs'
This explains the results of these queries: "ada,erlang", "tea,beer". As an alternative they could have used a stemmer [2].

2. The "slow,fast" and "love,hardcore" queries show an interesting trend, perhaps a shift towards women or mainstream viewers.
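
To make point 1 concrete, here is a rough sketch of the difference between sub-string counting and whole-word counting. This is not the site's actual code: the function names and the sample titles are made up for illustration, and the dash stripping is only my reading of the quoted passage.

```python
import re

def count_substring(titles, query):
    """Count every occurrence of query inside each title.

    Assumption: dashes are stripped first, as the quoted passage describes.
    A title that contains the query twice contributes two counts.
    """
    q = query.lower()
    return sum(t.lower().replace("-", "").count(q) for t in titles)

def count_words(titles, query):
    """Count only whole-word matches of query."""
    q = query.lower()
    return sum(re.findall(r"[a-z]+", t.lower()).count(q) for t in titles)

# Hypothetical titles, chosen to show the effect.
titles = ["Canada travel tips", "Learning Ada", "Steam and tea", "Beer-tea blend"]

for query in ("ada", "tea"):
    print(query, count_substring(titles, query), count_words(titles, query))
```

With sub-string counting, "ada" also matches "Canada" and "tea" also matches "Steam", which is the kind of inflation I suspect behind the "ada,erlang" and "tea,beer" results. A stemmer [2] would sit somewhere in between: it would still group "blow" with "blowing" without matching unrelated words.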

[1] http://sexualitics.org/wp-content/uploads/2014/01/PORNSTUDIE...

[2] http://nlp.stanford.edu/IR-book/html/htmledition/stemming-an...




> 2. The "slow,fast" and "love,hardcore" queries show an interesting trend, perhaps a shift towards women or mainstream viewers.

I don't think so [1]

[1] https://www.google.nl/search?q=teen+loves+to&oq=teen+loves+t...



