It is basically the same method used for email spam filters (though those are not always Bayes filters anymore). Users mark comments as spam or ham, and when a new comment arrives you look at the words in it, calculate their spamminess rating using the Bayes algorithm, and can then predict whether the comment is spam or not.
For blogs that works great. Ursprung combines it with a honeypot: a field in the comment form, hidden with CSS, that causes the comment to be discarded if it is filled out.
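To make the idea concrete, here is a minimal sketch of such a filter in Python. It is not the code of any particular gem or blog engine; the class name `NaiveBayesFilter`, its methods and the add-one smoothing are all my own assumptions, and a real filter would use a better tokenizer and far more training data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


class NaiveBayesFilter:
    """Hypothetical, simplified naive Bayes spam filter (illustration only)."""

    def __init__(self):
        # Number of trained comments per label that contain each word.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        # Number of trained comments per label.
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record a comment that a user marked as 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Estimate the probability that a new comment is spam."""
        n_spam = self.doc_counts["spam"]
        n_ham = self.doc_counts["ham"]
        # Start from the smoothed prior odds, then add the evidence of each word.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for word in tokenize(text):
            # Add-one smoothing so unseen words never give a zero probability.
            p_word_spam = (self.word_counts["spam"][word] + 1) / (n_spam + 2)
            p_word_ham = (self.word_counts["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_word_spam / p_word_ham)
        # Clamp to avoid overflow in exp() for very long comments.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))


spam_filter = NaiveBayesFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("cheap pills online buy", "spam")
spam_filter.train("great post, thanks for sharing", "ham")
spam_filter.train("I disagree with the second point", "ham")
print(spam_filter.spam_probability("cheap pills"))
print(spam_filter.spam_probability("thanks for the post"))
```

The first printed value should be well above 0.5 and the second well below it. Where you put the cut-off is up to you.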
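The honeypot check itself is tiny. As a rough sketch in Python (not Ursprung's actual code; the field name `website_url` and the function name are made up for the example), the server only has to look at whether the hidden field came back non-empty:

```python
def is_honeypot_triggered(form, honeypot_field="website_url"):
    """Return True if the hidden field was filled in.

    `form` is a dict of submitted form values. `honeypot_field` is a
    hypothetical name for an input hidden with CSS; humans never see it,
    so only bots that fill in every field submit a value for it.
    """
    value = form.get(honeypot_field) or ""
    return bool(value.strip())


bot_submission = {"body": "Nice site!", "website_url": "http://spam.example"}
human_submission = {"body": "Thanks for the article.", "website_url": ""}
print(is_honeypot_triggered(bot_submission))
print(is_honeypot_triggered(human_submission))
```

Comments that trigger the honeypot can be dropped before the Bayes filter ever sees them.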
After reading about Gmail's spam protection I thought it was impossible to do anything against spam, because what Gmail does is not really feasible for ordinary people.
So I would like to ask you, since you are much more experienced than I am: what are my options?
Also, isn't there a hosted service, SaaS or something, that offers filters and allows for feedback (users marking messages as spam), like a pay-as-you-go trained Bayes filter?
But you can run a filter yourself, and because your personal filter will be better adapted to the specific spam you get, the results can be surprisingly good. https://spamassassin.apache.org/ is one of the older ways of doing that; it should work fine for your scenario.
> Also, isn't there a hosted service, SaaS or something, that offers filters and allows for feedback (users marking messages as spam), like a pay-as-you-go trained Bayes filter?
Yes, even one tailored for blog comments. It is called Akismet and is run by Automattic, the company behind WordPress, see https://akismet.com/. It is very effective and a good solution at no cost if you are willing to give the comment data to a US entity, which is very problematic in Europe, where I live. To my knowledge, though, you can't train it yourself from outside of WordPress.
For email there are several spam filter services, like http://www.mailroute.net/, but I do not have any experience with them.
It is designed for real-time testing of blog comments as spam/ham. There are a lot of options, and it has been in use for a couple of years.
How do you protect yourself against spammers falsely classifying their comments as ham?
I'm not sure I want to integrate that into Ursprung. So far, the local methods have been enough (and the Bayes filter is easy to provide via a gem); if that changes for me or others I will reconsider. But Serendipity (http://www.s9y.org/) should get a plugin for that. It is certainly used in places where more options would help. I'll look into it.
Spammers would need to train their comments as ham for every domain, since if Bayesian testing is done it is done on a per-site basis. I think that's probably the biggest reason why they've not bothered.
Oh, I did not realize that. That is totally cool and will help to tailor the filter, but at the same time it misses some potential by not having the global spam characteristics.
I added the plugin to my todo list. Not having to force the user to bother with an API key is alone worth the effort.
It provides an API you can use to test whether a blog comment is spam or not.
Looks like a nice alternative to Ghost.
I think I started before Ghost existed, but it went in a similar direction later on, so it is possible that it influenced me. The big differences, apart from the programming, would be that there is no backend, that it is a much smaller project, and that it has more features of a real blog (comments, trackbacks and pingbacks) while not having the cool editor.