

Ask YC: Bayesian filter for NSFW content? - ptm

I've just launched No-NSFW (an NSFW content warning system), which relies on user feedback to determine site ratings.

I'm now thinking of introducing a Bayesian filter to determine site content. Does this make sense?

Also, where do I hunt for seed data? I'm using nsfw.reddit for NSFW data (thanks kirubakaran); what do I use for SFW data?
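
For context, here is a rough sketch of what a Bayesian filter for this kind of rating could look like: a two-class naive Bayes classifier over page text with add-one smoothing. It is only an illustration under assumptions, not how No-NSFW works. The class name `NaiveBayesFilter`, the labels "nsfw" and "sfw", the tokenizer, and the toy training strings are all hypothetical.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Hypothetical two-class naive Bayes text filter (labels: 'nsfw', 'sfw')."""

    LABELS = ("nsfw", "sfw")

    def __init__(self):
        # Per-label word frequencies and number of training documents.
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase alphanumeric tokens are good enough features.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def prob_nsfw(self, text):
        """Return the estimated probability that the text is NSFW."""
        vocab = set(self.word_counts["nsfw"]) | set(self.word_counts["sfw"])
        total_docs = sum(self.doc_counts.values())
        tokens = self.tokenize(text)
        log_scores = {}
        for label in self.LABELS:
            total_words = sum(self.word_counts[label].values())
            # Smoothed log prior for the label.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokens:
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalise in log space to avoid underflow on long pages.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["nsfw"] / (weights["nsfw"] + weights["sfw"])


# Toy usage with placeholder seed data (real seed data would be page text).
nb = NaiveBayesFilter()
nb.train("explicit adult gallery", "nsfw")
nb.train("quarterly report meeting notes", "sfw")
score = nb.prob_nsfw("explicit gallery")
```

A filter like this is only as good as its seed data, so the score would probably work best as one signal alongside the user feedback rather than a replacement for it.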
======
ra
Also have a look at DansGuardian <http://dansguardian.org/>. Blacklist files
are available here: <http://urlblacklist.com/>

I'm not sure what you are looking for in terms of safe-for-work data; maybe
Technorati tags?

------
rms
Google SafeSearch might help...
<http://www.google.com/support/bin/static.py?page=searchguides.html&ctx=preferences&hl=en>

------
xenoterracide
NSFW? I'm not familiar with the term (yes, I could Google it, but perhaps you
could enlighten those of us who aren't, so we don't all have to).

~~~
jfarmer
Not Safe For Work, i.e., not suitable for looking at while at work.

~~~
xenoterracide
Ah, thank you. Once again I'm familiar with the long version but not the
acronym (I find this happening a lot lately).

Personally, I'd just like a Bayesian filter on my RSS.

