

Fighting Spam for a New Startup - palish

Heya! For my site launch I'm pretty sure I need to be ready to fight spam, but I'm not sure how ready. At the very least I'll set up a basic spam filter, but I'm unsure whether I should train it myself beforehand (by copying and pasting the 1000 Gmail spam emails I have in my inbox) or wait until site launch.

People will write text that other people see, so it's probable that some of it will be spam.

What do you all do? Is there a big database of spam I can use to preconfigure spam filters?
======
ajju
For email spam - <http://plg.uwaterloo.ca/~gvcormac/treccorpus/>

For web spam - <http://www-static.cc.gatech.edu/projects/doi/WebbSpamCorpus.html>
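
If it helps, here is a rough sketch of what pretraining a filter before launch could look like. It assumes a plain naive Bayes classifier with add-one smoothing written against the Python standard library. The class name, the tokenizer, and the four sample messages are all made up for illustration; in practice the training text would come from a corpus like the ones above or from your own Gmail spam folder.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: lowercase words, digits, and apostrophes only.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return TOKEN_RE.findall(text.lower())


class NaiveBayesFilter:
    """Tiny multinomial naive Bayes classifier (illustrative, not production)."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Add one labeled message ('spam' or 'ham') to the model."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that the text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        log_scores = {}
        for label in self.LABELS:
            # Smoothed class prior, so an untrained class never gives log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            # Add-one smoothing; the extra +1 reserves mass for unseen tokens.
            denom = total_words + len(vocab) + 1
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


# Pretrain before launch on whatever labeled text is available (made-up samples).
spam_filter = NaiveBayesFilter()
spam_filter.train("Buy cheap pills now!!!", "spam")
spam_filter.train("Cheap watches, buy now", "spam")
spam_filter.train("Does anyone have advice on launching a startup?", "ham")
spam_filter.train("I trained my filter on old email", "ham")
```

After launch you would keep calling `train` on posts that users flag, so the old corpus only has to be good enough to get through the first few weeks. Email spam and web spam use different vocabulary, so a filter trained only on email may need retraining on real site content fairly quickly.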

------
daniel-cussen
I figure old spam should work.

On a side note, how many people are making spam-filtering startups?

~~~
ajju
There can't be that many (edit: that many email spam startups, at least; web
spam, blog spam, etc. are another issue). It's a crowded and relatively mature
market that just went through a round of acquisitions of mature private
anti-spam companies (CipherTrust, IronPort, and Postini).

