It cost, IIRC, a tenth of a cent per image URL. Rather than being based on skin tone, it was created based on algos to specifically identify labia, anuses, penises, etc. REST API: send a URL, get back a yes/no/maybe. You decided what to do with the maybes.
- Before launch, I tested it with 4chan's /b/ as a feed, and was able to produce a mostly clean version of /b/, with the exception of cartoon imagery.
- It could catch most of the stuff people tried to post to the site. Small-breasted women (breasts being considered 'adult' in the US) were the only thing that would get through, and that wasn't a huge concern. Completely unmaintained pubic hair (about as revealing as a black bikini) would also get through.
- Since people didn't know what I was testing with they didn't work around it (so nobody tried posting drawings or cartoons), but I imagine eg a photo of a prolapse might not trigger the anus detection as the shape would be too different.
- pifilter erred on the side of false negatives, but there was one notable false positive: a pastrami sandwich.
Imagine being able to fire off a dozen /b/tard bots at the website or target of your choice.
But that could also be developing breasts on a youth, and that would mean the image is something you very much want to block and report.
Notionally the oscilloscope was there to show that the luminance and chroma in the signal were okay (i.e. it could be broadcast over the airwaves and look as intended at the other end - PAL/NTSC). However, porn, and anything likely to be porn, had a distinctive pattern on the oscilloscope screen. If porn was suspected, the source material would obviously be patched through to a monitor 'just in case'.
Note that the oscilloscope was analog and that the image would be changing 25/30 times a second. Also, back then there were not so many false positives on broadcast TV, e.g. pop videos etc. where today's audience deems them artful rather than porn.
If I had to solve the problem programmatically I would find a retired broadcast engineer and start from there, with what can be learned from a 'scope.
I found out that no single technique works great. If you want an efficient algorithm, you probably have to blend different ideas and compute a "nudity score" for each image. That's at least what I do.
I'd be happy to discuss how it works. Here are a few techniques used:
- color recognition (as discussed in other comments)
- haar-wavelets to detect specific shapes (that's what Facebook and others use to detect faces for example)
- texture recognition (skin and wood may have the same colors but not the same texture)
- shape/contour recognition (machine learning of course)
- matching with a growing database of NSFW images
The algorithm is open for test here:
It works OK right now but once version 2 is out it should really be great.
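As a rough illustration of the color-recognition idea in the list above, here's a minimal skin-tone scorer. The RGB thresholds are the classic Peer/Kovac heuristic, an assumption for illustration only, not this particular service's actual rules, and on its own this would flag faces, beaches, and pastrami sandwiches alike; it's just one signal to blend into a "nudity score".

```python
# Crude "nudity score" from the skin-tone pixel ratio.
# Thresholds follow the well-known Peer/Kovac RGB skin heuristic.

def is_skin(r, g, b):
    """Classify a single RGB pixel as skin-colored (heuristic)."""
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_ratio(pixels):
    """Fraction of pixels classified as skin; pixels is a list of (r, g, b)."""
    if not pixels:
        return 0.0
    return sum(is_skin(*p) for p in pixels) / len(pixels)

# Example: 80 skin-like pixels plus 20 blue pixels.
skin_patch = [(220, 170, 140)] * 80 + [(30, 60, 200)] * 20
print(skin_ratio(skin_patch))  # 0.8
```

In practice you'd run this over a downscaled image and feed the ratio into the combined score rather than thresholding it directly.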
Source: I helped implement a MT job to filter adult content for a large hosting company.
I recall reading an article about the human workers who had this job at Google... they had crap benefits, crap pay, and no mental healthcare. As a result a lot of the people in the field had depression and other mental issues.
There was absolutely no reward in showing up to work, and some of the things they saw have likely scarred their memories forever
The only interesting tasks were the psychology ones, but they got boring too (think Daniel Kahneman's experiments repeated ad nauseam).
How people last on MT longer than a month is something I can't fathom.
I used the so-called Bag of Visual Words approach, at that time the state of the art in image recognition (now it's neural networks). You can read about it on Wikipedia. The only main change from the standard approach (SIFT + k-means + histograms + SVM + chi2 kernel) was that I used a version of SIFT that uses color features. In addition to this I used a second machine learning classifier based on the context of the picture: Who posted it? Is it a new user? What are the words in the title? How many views does the picture have...
In combination the two classifiers worked nearly flawlessly.
Shortly after that, Chatroulette was having its porn problem and it was in the media that the founder was working on a porn filter. I sent an email to offer my help, but didn't get a reaction.
You can find other implementations of varying quality if you Google for Bag of Visual Words. For the final classification, I would recommend scikit-learn.
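A minimal sketch of the Bag of Visual Words representation described above, with toy 2-D vectors standing in for SIFT descriptors and a hand-picked codebook standing in for the k-means centroids. In a real pipeline you'd extract SIFT descriptors per image, cluster them to learn the codebook, and feed the resulting histograms to an SVM with a chi-squared kernel (e.g. scikit-learn's SVC with a callable kernel):

```python
import math

def nearest(codebook, desc):
    """Index of the codeword closest to a descriptor (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], desc)))

def bovw_histogram(codebook, descriptors):
    """Normalized codeword-frequency histogram for one image."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def chi2_similarity(h1, h2, gamma=1.0):
    """exp(-gamma * chi^2 distance): the usual SVM kernel for histograms."""
    chi2 = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * chi2)

codebook = [(0.0, 0.0), (1.0, 1.0)]           # stand-in for k-means centroids
img_a = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9)]  # descriptors mostly near codeword 1
img_b = [(0.0, 0.1), (0.2, 0.1), (1.1, 1.0)]  # descriptors mostly near codeword 0
h_a, h_b = bovw_histogram(codebook, img_a), bovw_histogram(codebook, img_b)
print(h_a, h_b)  # mirrored histograms, roughly [0.33, 0.67] vs [0.67, 0.33]
```

Identical histograms give a kernel value of 1.0, and the value decays toward 0 as histograms diverge, which is what the SVM uses to separate classes.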
puritanweirdos.example.com with no skin showing between toes and top of turtleneck (edited to add no pokies either)
normalpeople.example.com with 99% of the human race
The best solution to a problem involving computers is sometimes computer-related, but sometimes it's social. The puritans are never going to get along with the normal people anyway, so it's not like sharding them is going to hurt.
Another way to hack the system is not to hire or accept holier than thou puritans. Personality doesn't mesh with the team, doesn't fit culture, etc. You have to draw the line somewhere, and weirdos on either end should get cut, so no CP or animals at one extreme, and no holy rollers on the other extreme.
The final social hack is that it's kind of like dealing with bullies via appeasement. They're blocking reasonable stuff today; tomorrow they want to block all women not wearing burqas, or depictions of women damaging their ovaries by driving. Appeasing bullies never really works in the long run, so why bother starting? "If you claim not to like it, or at least enjoy telling everyone else repeatedly how you claim not to like it, stop looking at it so much, case closed."
For example if you've got children, given your stance on the matter, you may not necessarily agree that a filter is necessary, but how about being alerted when your children are viewing obscene content? How about being alerted when your children are engaging in sexting?
When it comes to children, maintaining their purity is only one side of the coin, a necessity with which not all people necessarily agree. But the other side of the coin, which is a pretty objective fact, is that children do get the wrong ideas from what they see, and sometimes it happens with adults too - porn being the main reason why men think they need big penises to satisfy women. And there's a lot of weird porn out there. With improper exposure, a child can end up growing up with certain ideas about sex, with certain complexes and so on.
And I'm not necessarily for censoring that content, as children can find ways around the censorship should they want to, plus these filters aren't perfect anyway. But I would find useful a system that alerts me when my child gets exposed to porn, such that I can take appropriate measures, like having fatherly talks about sex, explaining to him that what he just saw is a really bad idea in case he looked at something weird.
Plus, exposure to porn can happen 100% by accident, and that's my personal problem with it. I indulge my stockings fetish monthly, but you know, I like to be in control of when that happens. Going to a website and clicking on something can trigger a popup with ads for either poker games or porn. Sometimes they've got sound too. Imagine hearing in the workplace the sound of a woman's moan. It's totally disrespectful to your colleagues, as it disrupts their workflow. I was searching for something on ThePirateBay once and it happened to me.
Is it porn that gives them the wrong idea, or is it because porn is the only concept of sex that they get exposed to?
But I would find useful a system that alerts me when my child gets exposed to porn, such that I can take appropriate measures, like having fatherly talks about sex, explaining to him that what he just saw is a really bad idea in case he looked at something weird.
Why wouldn't you have fatherly talks about sex regardless of exposure to porn?
Imagine hearing in the workplace the sound of a woman's moan. It's totally disrespectful to your colleagues, as it disrupts their workflow.
Playing any sound is disrespectful to your colleagues, by breaking their concentration. The pornographic part is irrelevant.
My wife works at a kindergarten and she has first hand experience with 4-year olds being exposed to porn and the effects really aren't nice.
What do you propose actually? Live demos or experiments under adult supervision?
> Why wouldn't you have fatherly talks about sex regardless of exposure to porn?
Because there is no proper way to bring up people having sex with horses and dogs into a conversation with a child, let alone that the act of making love doesn't necessarily involve slapping a woman and forcefully shoving your dick inside her mouth, as not all women like that.
You either missed my point or you don't have children of your own, but really, I'd love to get your idea of a conversation with a six year old.
> Playing any sound is disrespectful to your colleagues, by breaking their concentration. The pornographic part is irrelevant
Are you seriously implying that all sounds are created equal, especially given that some people end up behaving like children when hearing sounds related to sex?
Do you also fart in elevators? When it's hot outside, do you also come butt-naked at your office? When you get horny, do you make announcements?
I was thinking more about 8-10 year olds, to whom there's plenty of ways to introduce the concept of sex, not only through conversation but also vetted media (e.g. books, films).
I don't have kids, no, but was involved in the education of my (much) younger brother, and so in a privileged position to discuss this with him (compared to his parents), and he has grown up knowing about sex way before discovering porn.
I'm implying that a porn filter wouldn't solve the real problem. As for people behaving like children when hearing sounds related to sex, I have to say I don't know any above 20 years old.
No. I also try not to make silly comparisons on online discussions.
The conversation with the 6-year-old is easy. It's the conversation with the 13-year-old that's the problem. Liberals have "the talk" early. Right-wingers have "the talk" late. Very few people have the talk regularly as the child grows up.
I'm not sure what your last sentence has to do with the argument.
I think this is made up, or at most the kids are reacting to the teacher freaking out, or trying to tease the teacher knowing it's a great way to freak her out.
It is comical to think back at what I thought of sex-ploitive-ish TV when I was a kid, pre-puberty. Now yes I know this wasn't "XXX chicks with happy horses" or whatever, merely broadcast TV, but I don't think that would change the reaction very much.
The women on Charlie's Angels? Eh, whatever; the A-Team was a better action show, although even for a kid, a little formulaic.
The actresses on Laverne and Shirley don't wear bras? Who cares, they look like goofballs wearing '50s clothes; this is 1980, let's watch Battlestar Galactica (the original).
The Dukes of Hazzard was all about the car chases; I didn't even notice Daisy's attire (or lack thereof) until my hormones kicked in and suddenly she was very interesting indeed, going from zero interest to 1000+ in what seemed like weeks. When I was young I thought it was a stupid show, like a lame version of Knight Rider, which I much preferred.
So the happy couple is reunited and their relationship rekindled on The Love Boat. Well, good for them, but I have no interest in watching them smooch and grope each other for thirty seconds on camera. Heck, I'd switch to Mr. Belvedere if it was on.
Baywatch, seriously, is the dumbest show ever if you're under 13 or so. Why is the intro this woman running along the beach in slow motion? If she's rescuing people she should be swimming into the ocean, not running along it; even a dumb kid knows that.
Once the hormones kick in, all a filter does is get in the way of what's suddenly tremendously interesting.
I could see it being awkward for a 7th grade teacher where suddenly some of the boys instantly find Miley Cyrus (or whoever) completely fascinating and the rest are all "eh, girl music, who cares". But at grade-K I'm thinking the reaction is going to be pretty minor most likely "eww gross" but mostly "so what".
I would worry more about gore and gore-shock sites, which is getting pretty far off topic.
So I'm a geezer, so what. We'll try something more contemporary. How many kids actually required therapy after the famous Janet Jackson wardrobe malfunction during the Super Bowl? If they needed therapy, it was probably for the agonizing, awkward experience of watching what 60-year-old bald white guys in NYC think suburban teen youth of America think urban rap stars think is cool, and more or less screwing it up hilariously.
These days, it comes down to exposure and access. Back when I was a teen (early '90s) we didn't have the internet as it is now. Sure, there were BBSs, but that was about as convenient as trying to get a dirty magazine. While I may have had idle curiosity as a pre-teen, I had zero access to it (my father didn't have any magazines, etc.), so it would've taken way too much effort to see it, and it was easily dismissed. In context, it was equally difficult to even see an R-rated movie.
These days, my 8 or 10 year old self could be exposed to literally anything within a couple seconds. It's free and prevalent, and can show up in unexpected places. I don't have kids (will soon) but I don't look forward to dealing with those situations.
As for Janet Jackson, I agree that was overblown. If anyone needed therapy for that it was because their parents were overbearing. However, we aren't talking about a bare breast here, we are talking about simulated rape, bestiality, and some other things that just aren't my thing. But, hey, if they are someone's thing that's fine - let's just try to keep that stuff out of the hands of the kids until they are old enough to reason for themselves about what's really going on.
More than likely you're not going to find out they don't mesh with the team till they quit or are fired. If you're lucky, it's not followed by a lawsuit.
See, it's not their job to not be offended; it is your job to offer a harassment-free work environment. Whom you are catering to depends on what is PC or not at the time; fortunately it's pretty easy to determine whose whims to cater to or not.
In an old job I had the task of cleaning spam off a forum, this meant looking at rather a lot of porn but it would never have occurred to me to sue on that basis, yet that is still a job that needs to be done.
Sexual harassment is unwelcome sexual advances, requests for sexual favors, and other verbal or physical harassment of a sexual nature, no matter how many people are sexually harassed or the gender of the parties involved.
There have been class action sexual harassment lawsuits. Jenson v. Eveleth Taconite Co. was the first which represented fifteen women.
It's fuzzy ground with, say, prominent displays of pornographic material in an office environment. One might argue that it can make the workplace a hostile or overtly sexually charged environment. It is best to err on the side of caution.
That is NOT a case about accidentally clicking on a web page or debating if huffpo headline stories are sometimes a little too racy, that was a case about a madhouse of stalking and intimidation and tire slashing and... I don't think a mere web pr0n image filter would have saved the mine 3.5 million dollars. Actually having capable supervisors and managers, yeah, that might have worked?
I used the first class action sexual harassment lawsuit as an example to illustrate the point of "just because it's company culture/I do it to everyone/he's just like that" doesn't make it not harassment.
The GP was giving false information: that if you sexually harass everyone, then it magically becomes not sexual harassment and somehow gets thrown out in court - but without citing the relevant court case. Just because you ass-grab everyone that walks by doesn't make ass-grabbing not sexual harassment under the law.
That's mostly my point.
As far as a p0rn filter, that's a different story. You can look at it a different way, legally.
I'm just playing devil's advocate here as to why it would be a good idea for someone to try to err on the side of caution in the workplace filter rules.
They may want to avoid the situation where person A is sitting in their cube while person B is watching hardcore gay porn next door specifically to harass person A. Person A might feel that the employee didn't do a good enough job to prevent this type of harassment by blocking access to that content.
If the employer doesn't try to filter out porn, or at least create a policy against it, it might look like it is condoning or supporting it where their peers filter (say 85% of their peers make an attempt to filter - I just made that number up), especially if it is well known that managers tolerate risque images/video if not outright porn, or if company culture tolerates it.
A less grey example: putting a picture of two women who are mostly clothed but kissing in the office breakroom. Everyone sees it, but it creates a sexually charged environment that can be hostile to some. That's a big no-no in HR land.
The problem with some jerk harassing female coworkers with... full stop... harassing coworkers, comma, is that he's not doing his job, he's preventing others from doing their job, and the only mystery is why that problem is not being taken care of at that level. It's not pass-the-buck time; why should some dude in another department cook up some harebrained technical scheme to temporarily stop the nutcase from bothering people... he's got a boss who's already responsible for that task.
Is this one of those weird coastie vs heartland type issues where we have real bosses so coastie stories sound weird, or ...
That's the crazy part of the story. If the boss simply doesn't care, then blocking certain pix on port 80 is merely going to result in your example of hanging up pictures, or carrying in zip drives, or whatever; obviously the boss won't care about those either. So the net result is a huge waste of time and money for... nothing?
I would think there's some CYA going on if you know there's about to be another 3.5-million-dollar "taconite"-type lawsuit filed, and it's going to be a slam dunk; whoever is ordering harebrained technical solutions doomed to failure probably won't be punished as hard as the guy doing the harassment, or his boss who doesn't care, or his boss's boss. But a legal hit like that means tune up the resume anyway; it's not looking good.
I would bet a case of Moet that most of the hyperbolic HR warnings over this stuff actually have no reasonable chance of ever causing a loss.
I will say it again: in my experience I have yet to see a credible case, written up in reputable legal publications, where a porn filter or the lack of one made any difference to the outcome.
Anyone care to post a link to prove me wrong rather than modding me down?
Also, your claim that all people who want to block pornographic images are really bullies who will not stop until all women are in burqas is stupid in itself.
Develop a bot to trawl NSFW sites and hash each image (combined with the 'skin detecting' algorithms detailed previously). Then compare the user uploaded image hash with those in the NSFW database.
This technique relies on the assumption that NSFW images that are spammed onto social media sites will use images that already exist on NSFW sites (or are very similar to). Then it simply becomes a case of pattern recognition, much like SoundHound for audio, or Google Image search.
It wouldn't reliably detect 'original' NSFW material, but given enough cock shots as source material, it could probably find a common pattern over time.
edit: I've just noticed rfusca in the OP suggests a similar method
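The hash-and-compare idea above can be sketched with a perceptual "average hash". A real system would use pHash or similar on downscaled grayscale images; to keep the example self-contained, an "image" here is just an 8x8 grid of grayscale values:

```python
# Perceptual average hash: near-duplicate images hash to nearby bit strings,
# so a small Hamming distance against a known-NSFW database is a match.

def average_hash(grid):
    """64-bit hash: each bit is 1 if the pixel is above the mean brightness."""
    flat = [px for row in grid for px in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for px in flat:
        bits = (bits << 1) | (1 if px > mean else 0)
    return bits

def hamming(h1, h2):
    """Number of differing bits; small distance suggests a near-duplicate."""
    return bin(h1 ^ h2).count("1")

# Two near-identical 8x8 "images" (one pixel brightened) hash close together.
img = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
tweaked = [row[:] for row in img]
tweaked[3][7] += 4  # nudge one mid-brightness pixel across the mean
print(hamming(average_hash(img), average_hash(tweaked)))  # 1
```

Exact cryptographic hashes (MD5/SHA) would only catch byte-identical reposts; a perceptual hash like this survives recompression and small edits, which is why it's the usual choice for the database-matching approach.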
Do you have to tell it what shapes/colors to look for? Or do a combination of overall image similarity, localized image similarity, and portion-by-portion image comparison?
FWIW the biggest problem with this is false positives, though admittedly I may just not be clever enough to do it with enough finesse.
Detecting smurf-porn(1) (yes that's a thing...) is even harder since all the actors are blue.
http://pinporngifs.blogspot.dk/2012/09/smurfs-porn.html?zx=7... - obviously very NSFW, but quite funny.
Then you can convince me that a "sufficiently large training set" exists and is smaller than "all the images on the internet".
I would argue that this is less difficult than your average image classification problem. Just have a look at what kinds of challenges image classification can tackle, picking the correct class out of 1000s of classes. Porn is normally well lit, the subject is at the center of the image, etc.
The main difficulty is to define what is porn and what is not... It's easy to see the difference between porn and pictures of bicycles. But how about porn and artistic nudity? You see, it's actually a scale, but you are trying to make a binary decision.
Another problem (at least with the method I explained below) is that portraits sometimes get misclassified. Maybe it could help to integrate face detection. I'd suspect that more recent models would not have this problem (e.g. ones that take more than just local features into account). Other times it makes mistakes where you think "why on earth would it think this is porn?". Again, combining different methods should help eliminate those outliers.
Outliers are also a problem, e.g. black-and-white pictures. Again, an ensemble of different models (e.g. a color-independent one) might help. Niches are not really a big problem; BDSM porn is, as far as I have seen, the only niche that's really different visually.
man 1 smurf-porn ?!?!
Edit: No shortage of stock image reviewer jobs https://google.com/search?hl=en&q=%22image%20reviewer%22
I'm trying to find an interview of one of these people describing what it's like on the other end. It wasn't a pleasant story. These folks are employed by the likes of Facebook, Photobucket etc... Most are outsourced, obviously, and they all have very high turnover.
Edit: I think it was this one: http://www.buzzfeed.com/reyhan/tech-confessional-the-googler...
A pretty unpleasant job no matter what angle you look at it.
I remember this bit the most :
"Google covered one session with a government-appointed therapist —
and encouraged me to go out and get my own therapy after I left."
This is the downside of 'abstracting away' the dirty end of filtering. I'm looking forward to a day when this can be properly automated, but, considering the ever-changing nature of erotica, I don't see that happening any time soon.
If you're trying for "must not offend any human being on the planet" then you've got an AI problem that exceeds even human intelligence. Especially when it extends past pr0n and into stuff like satire: is that just some dude's weird self-portrait, or a satire of the prophet, and are you qualified to figure it out?
The classic problem of trying to filter pornography is trying to separate it from information about human bodies. I suspect that doing this with images will be even harder than doing it with text.
That said, not all sites are like Facebook and we aren't talking about filtering all the images on the internet, just ones on specific sites. One example I can think of is that a forum for a sports team might not want NSFW pictures posted as it would be irrelevant.
We as humans can readily classify images into three vague categories: clean, questionable, and pornographic. The problem of classification is not only one of determining which bucket an image falls into but also one of determining where the boundaries between buckets are. Is a topless woman pornographic? A topless man? A painting of a topless woman created centuries ago by a well-recognized artist? A painting of a topless woman done yesterday by a relatively unknown artist? An infant being bathed? A woman breastfeeding her baby? Reasonable people may disagree on which bucket these examples fall in.
So what if I create three filter sets: restrictive, moderate, and permissive, and then categorize 1,000 sample images as one of those three categories for each filter set (restrictive could be equal to moderate but filter questionable images as well as pornographic ones).
Assuming that the learning algorithm was programmed to look at a sufficiently large number of image attributes, this approach should easily be capable of creating the most robust (and learning!) filter to date.
Has anyone done this?
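The three-tier idea above can be sketched as one classifier plus a per-level policy: the model labels an image clean / questionable / pornographic, and each filter level decides which labels to block. The label names and policy table here are assumptions for illustration; a real system would take the label from the trained model's prediction:

```python
# Map one classifier's label to a block/allow decision per filter level.
BLOCKED = {
    "restrictive": {"questionable", "pornographic"},  # filter both buckets
    "moderate": {"pornographic"},                     # filter only clear porn
    "permissive": set(),                              # block nothing
}

def decide(label, level):
    """True if this filter level should block an image with this label."""
    return label in BLOCKED[level]

print(decide("questionable", "restrictive"))  # True
print(decide("questionable", "moderate"))     # False
```

This keeps the hard (and subjective) boundary decisions in one labeled training set, while the restrictive/moderate/permissive distinction stays a trivial policy lookup that sites can configure per audience.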
>There are already a few image based search engines as well as face recognition stuff available so I am assuming it wouldn't be rocket science and it could be done.
Just do a reverse image search for the image, see if it comes up on any porn sites or is associated with porn words.
Basically, it's impossible to completely accurately identify pornography without a human actor in the mix, due to the subjectivity... and especially considering that not all nudity is pornographic.
Take a look at the scores for classifying dogs vs. cats with 97% accuracy: http://www.kaggle.com/c/dogs-vs-cats/leaderboard. You could digitize the image pixels and feed them to a learning algorithm, similar to http://www.primaryobjects.com/CMS/Article154.aspx.
 Shih, J. L., Lee, C. H., & Yang, C. S. (2007). An adult image identification system employing image retrieval technique. Pattern Recognition Letters, 28(16), 2367-2374.
Nudity != porn and certainly half-nudity != porn.
I'd rather go for pattern recognition. There's lot of image recognition software these days that can distinguish the Eiffel Tower from the Statue of Liberty and it might be useful to detect certain body parts and certain body configurations (for these shots that don't contain any private body part but there are two bodies in an unambiguous configuration).
When I was a kid, we had a firewall at school that tried to filter pornography by doing something similar with text. Doing research on breast cancer turned out to be rather tricky.
So let's say you try to detect certain body parts. Now you have someone who wants to know more about their body, but you are classifying images from medical / health articles as pornography.
"certain body configurations"
So now instead of having trouble reading about my own body, I will have trouble looking at certain martial arts photos:
I am not saying these are unsolvable problems, but they are certainly hard problems. Even using humans to filter images tends to result in outrageous false positives sometimes:
At the end of the day I doubt there could be a fully bulletproof and always correct solution using current state of tech. But you need to factor much more than just skin color if you try to build an automated solution to this problem.
If you assume that porn tends to cluster, rather than exist in isolation, then a crawl of other images on the source pages , applying computer vision techniques, should allow you to block pages that score above a threshold number of positive results (thus accounting for inaccuracy and false positives).
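The clustering heuristic above reduces to a simple aggregate: score a page by the fraction of its images that some (assumed, per-image) classifier flags, and block the page only past a threshold, which absorbs individual false positives. The 0.4 threshold here is an arbitrary assumption for illustration:

```python
# Page-level decision from per-image classifier flags (booleans).

def page_score(image_flags):
    """Fraction of images on the page flagged as porn."""
    return sum(image_flags) / len(image_flags) if image_flags else 0.0

def block_page(image_flags, threshold=0.4):
    """Block only when enough of the page's images are flagged."""
    return page_score(image_flags) >= threshold

# One false positive among ten images doesn't block the page...
print(block_page([True] + [False] * 9))      # False
# ...but a page where most images are flagged does.
print(block_page([True] * 7 + [False] * 3))  # True
```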
How about social scoring? A normal (or even a weirdo) teenage boy would spend less than a second examining my ugly old profile pix, but after ten or so of your known teen male users are detected to spend 5 minutes at a time, a couple times a day, closely studying a suspected profile pix, I think you can safely conclude that pix is not a pix of me and then flag / censor / whatever it for the next 10K users.
If you wanted to spot shock imagery it's easier - look for quick navigations away or rapid scrolling.
Depending on the site, I'd go to a trust-based solution. New users get their images approved by a human censor (pr0n == spambot in most cases). Established users can add images without approval.
If you're going to try software, try something that errs on the side of caution, and send everything to a human for final decision-making, just like spam filters.
Maybe a good approach is an image lookup, trying to find the image on the web and seeing if it appears on a porn site, or a pornographic context.
Um, so to speak.