
This is the problem with using associative reasoning to detect bad actors. Bad actors are attempting to look like good actors, but good actors aren't trying to avoid looking like bad actors. Bad actors will target a stereotyped set of good actor attributes, and your filters will mostly catch unique or idiosyncratic good actors, who aren't targeting anything.

Unless your good actors are willing to conform to a published set of (nerfed) behaviors that don't have the possibility of being bad, or are willing to register and be vetted by you individually, you can't help but be overwhelmed by false positives. It's the same reason why the most intricate, pervasive, and technologically flexible surveillance system in history can't find a terrorist.
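To put rough numbers on the false-positive point, here is a minimal sketch using Bayes' rule. The base rate and the filter's hit and false-alarm rates are made-up figures for illustration, not measurements of any real system, and `precision` is just a hypothetical helper name.

```python
def precision(base_rate, true_positive_rate, false_positive_rate):
    """Fraction of flagged actors that are actually bad (Bayes' rule)."""
    flagged_bad = base_rate * true_positive_rate
    flagged_good = (1.0 - base_rate) * false_positive_rate
    return flagged_bad / (flagged_bad + flagged_good)


# Hypothetical inputs: one bad actor per million, and a filter that
# catches 99% of bad actors while misflagging 1% of good ones.
p = precision(base_rate=1e-6, true_positive_rate=0.99, false_positive_rate=0.01)
print(f"share of flagged actors that are actually bad: {p:.6f}")
```

Under those assumed rates, roughly one flag in ten thousand points at a real bad actor, and that is before bad actors start imitating the "good" profile to push the hit rate down.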

edit: I think the entire endeavor is doomed. The bubble associating your current searches with your past searches, and the attempts to eliminate spam and eHow through algorithms, have just resulted in eliminating most sites from the searchable internet. You don't realize how bad it's gotten across all of the search engines until you spend an hour on something like millionshort (which looks like it's down now).




Reminds me of (forum) mafia: there's a set of heuristics for how "town" players behave. But real town players will break those heuristics occasionally, because they're trying to catch the mafia - so the players who conform exactly to the "town" heuristics all the time are actually more likely to be mafia players.


What it's resulting in is a 'news-stand' effect in which articles from the mainstream press will rank higher than anyone else on a given query if the match is close enough. Although I'm not a huge fan of the press as it is today, it could be worse.

You can actually identify opportunities by finding valuable queries that are being squatted on by the content mills. Lots of nice things in stuff like home improvement, insurance, and finance.


The thing is, in the context of content, if a bad actor acts sufficiently like a good actor, it is effectively a good actor. You just have to do a sufficiently good job of determining good or bad directly, rather than some proxy like "looks good" or "looks bad".



