Do you really expect to be able to detect and filter anything that's conceivably stupid?
No, of course not. You'd need real AI for that, and beyond a certain point it's simply subjective; after all, a sufficiently advanced AI would probably filter out the whole of human discourse, which isn't the idea.
I have to admit that I laughed pretty hard when I read that...
"this is the most shittest game online ever full of little nooby kids" (taken from a YouTube comment) is rated "not likely stupid". Add a period at the end, though, and the filter flags it.
Maybe the parsing system isn't designed to handle incomplete sentences or sentence fragments? Handling them could be difficult, but it's key; many of the bottom-of-the-barrel posts on the internet have no punctuation at all.
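If the missing period really is what trips up the parser, one workaround would be to normalize the input before classifying it. Here's a minimal sketch of that idea; the function name is made up for illustration and has nothing to do with how the actual filter is implemented.

```python
def ensure_terminal_punctuation(text: str, default: str = ".") -> str:
    """Hypothetical pre-processing step: append `default` when the text
    does not already end in '.', '!' or '?'."""
    stripped = text.rstrip()
    if not stripped:
        return stripped
    # Look past trailing quotes or brackets, e.g. 'he said "hi."'
    core = stripped.rstrip("\"')]")
    if core and core[-1] in ".!?":
        return stripped
    return stripped + default


comment = "this is the most shittest game online ever full of little nooby kids"
print(ensure_terminal_punctuation(comment))
```

This only papers over the last sentence of a post; a comment with several run-together sentences would still need real sentence splitting.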
Update: Starting to really take an interest in this. While there are false negatives, I can't seem to find any false positives.
Unrelatedly, one of the challenges in analyzing modern text with traditional NLP tools is that the tools usually expect standard English, whereas the text is rather colloquial, and the punctuation is for timing purposes, when present at all.
I can't get the thing to detect any stupidity whatsoever. I tried:
-dude that was pretty funny lol
-dude thats a funny one! lol
-ur a ghey lord!
All passed as not being likely to be stupid.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
If something like this were actually used, I predict we would just see a shift from "I'm in ur base, killin ur doods" to "I am in your base, and I am in the process of killing your dudes."
I feel like the bigger challenge is finding the smallest piece of text that isn't considered stupid. I hypothesized that this would be difficult, but it turns out 'a' isn't likely to be stupid.
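For anyone who wants to automate that hunt, a brute-force search in order of increasing length would do it. This is only a sketch: `is_stupid` is a stand-in for whatever call actually queries the filter (I'm assuming it takes a string and returns True or False), and the lowercase-only alphabet is my own simplification.

```python
from itertools import product
from string import ascii_lowercase
from typing import Callable, Optional


def shortest_passing_text(
    is_stupid: Callable[[str], bool],  # hypothetical wrapper around the filter
    alphabet: str = ascii_lowercase,
    max_len: int = 3,
) -> Optional[str]:
    """Return the first candidate, shortest first, that the classifier
    does not flag, or None if nothing up to max_len passes."""
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if not is_stupid(candidate):
                return candidate
    return None


# Stub classifier for demonstration only: flags anything shorter than 2 characters.
print(shortest_passing_text(lambda text: len(text) < 2))
```

The number of candidates grows as 26^n, so this is only practical for very short strings, which is all the search needs if single letters already pass.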