

An exploration of Yelp's own filtered reviews - jhdavids8
http://jamiehdavidson.blogspot.com/2011/09/exploration-of-yelps-own-filtered.html

======
progolferyo
Interesting article. I don't understand the filtered review system at all.
Beyond the 'he said / she said' complaints that occasionally come out, there
are things about their system that simply don't make any sense unless Yelp is
incompetent or slimy. For example:

\- When you post a review, you as a reviewer think it's unfiltered forever.
When you revisit the page as a logged-in user and read a place that has your
review, your review is visible. When you log out or log in as another user,
the review is filtered and hidden. At the very least, it should tell you your
review is filtered; I see no reason to pretend the review is not filtered when
the review is legitimate.

\- When you view unfiltered results, the per page number mysteriously changes
to 10 per page. I don't see any reason why this should change. Plus the
results are pretty slow to load, quite slower than the results for filtered
reviews.

\- Why do you need to enter in a captcha to view the unfiltered reviews? Why
would they care if you were a bot only for the unfiltered reviews and not the
normal reviews? I don't see the difference, unless they want to prevent people
from writing scripts to pull in unfiltered review data. Plus the captcha is
fucking horrible; literally half the captchas I get are not readable and I
need to refresh.

\- The filter algorithm seems to be clearly flawed and simply catches way too
many reviews that should not be filtered. For example, take this user:
<http://www.yelp.com/user_details?userid=tZlbsUVo-8wtnR7oMa-3xg>. The guy has
11 reviews, 1 1-star review, 1 2-star review and nothing out of
the ordinary and yet his review about Yelp was filtered. Why? His points in
the review seemed legitimate. He seems to be a normal user, not a new user and
posts reviews across the board (more good reviews than bad in fact). They
should either fix the algorithm or be more transparent about why reviews are
filtered because I can't understand why a review like that is filtered.

~~~
newhouseb
> When you view unfiltered results, the per page number mysteriously changes
> to 10 per page. [...] Plus the results are pretty slow to load, quite slower
> than the results for filtered reviews.

Caching. I'm sure most unfiltered reviews are cached whereas filtered reviews
are not, and reaching out past the cache can be expensive. One way to mitigate
this is to reduce the number of results you pull.
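
A minimal sketch of that trade-off, assuming a hypothetical in-memory cache in front of a slow store. None of these names come from Yelp's code; the page sizes are just the 40 and 10 mentioned in this thread, and `slow_fetch` stands in for whatever expensive query sits behind the cache.

```python
# Hypothetical sketch: commonly viewed reviews go through a cache at full page
# size, while rarely viewed (filtered) reviews hit the slow store directly and
# pull fewer rows per request.
CACHED_PAGE_SIZE = 40    # assumed: normal reviews per page
UNCACHED_PAGE_SIZE = 10  # assumed: filtered reviews per page

cache = {}  # (business_id, page) -> list of reviews


def slow_fetch(store, business_id, offset, limit):
    """Stand-in for an expensive database query."""
    return store.get(business_id, [])[offset:offset + limit]


def get_visible_reviews(store, business_id, page):
    """Cache-aside lookup: only go to the slow store on a cache miss."""
    key = (business_id, page)
    if key not in cache:
        offset = page * CACHED_PAGE_SIZE
        cache[key] = slow_fetch(store, business_id, offset, CACHED_PAGE_SIZE)
    return cache[key]


def get_filtered_reviews(store, business_id, page):
    """Not cached, so keep each query cheap by asking for a smaller page."""
    offset = page * UNCACHED_PAGE_SIZE
    return slow_fetch(store, business_id, offset, UNCACHED_PAGE_SIZE)
```

Whether Yelp actually works this way is a guess; the point is only that a smaller page is a cheap way to bound the cost of the uncached path.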

> Why do you need to enter in a captcha to view the unfiltered reviews? Why
> would they care if you were a bot only for the unfiltered reviews and not
> the normal reviews?

If you can write a script to deduce the filtering algorithm then you can by
definition write reviews that thwart it. With less data, it is harder to
deduce the filtering algorithm. In other words, a captcha thwarts high-volume
review fraud.

> The filter algorithm seems to be clearly flawed and simply catches way too
> many reviews that should not be filtered.

I think most people seem to underestimate the difficulty of the problem.
Unlike e-mail spam, which is easy for a human to spot, fake reviews are very
hard for a human to spot. How can you tell if a consumer was provoked into
writing a positive review so that they could get a few bucks off their order
just from their writing? You can't. You can look at other statistical trends
behind such reviews (such as a sudden wave of positive reviews), but you're
only looking for side effects of the primary problem and thus you will never
achieve perfect performance from a method like this.
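
To make the "sudden wave of positive reviews" idea concrete, here is a rough sketch that flags days whose positive-review count is far above the typical day. The function name, the thresholds, and the use of the median as the baseline are all assumptions for illustration; this is not Yelp's filter, and as noted above it only catches a side effect.

```python
from collections import Counter
from datetime import date
from statistics import median


def positive_burst_days(reviews, min_stars=4, factor=3.0, min_count=3):
    """Return days whose positive-review count is far above the median day.

    `reviews` is an iterable of (date, stars) pairs. The thresholds are
    made-up defaults, not tuned values.
    """
    per_day = Counter(day for day, stars in reviews if stars >= min_stars)
    if not per_day:
        return []
    typical = median(per_day.values())
    return sorted(
        day for day, count in per_day.items()
        if count >= min_count and count > factor * typical
    )


# Usage: one positive review a day for nine days, then twelve in one day.
sample = [(date(2011, 9, d), 5) for d in range(1, 10)]
sample += [(date(2011, 9, 10), 5)] * 12
print(positive_burst_days(sample))
```

A legitimate business that just got press coverage would trip the same check, which is exactly the false-positive problem being discussed.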

Yelp takes the (somewhat philosophical) viewpoint that customers who are
coerced into writing a review are less genuine than they would be otherwise. I
believe that this view drives a lot of their algorithm and possibly threatens
its accuracy in a way that is ultimately not worth it. I think there are a
number of things that Yelp could do to increase users' trust in reviews
that don't involve filtering - one simple thing would be for a user's review
of an Indian restaurant to show me that user's breakdown of reviews of other
Indian restaurants.
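
As a sketch of that last idea, assuming each review is a dict with hypothetical "category" and "stars" keys (not Yelp's actual data model), the per-category breakdown is just a small histogram:

```python
from collections import Counter


def star_breakdown(user_reviews, category):
    """Count one user's reviews per star rating within a single category."""
    counts = Counter(
        r["stars"] for r in user_reviews if r["category"] == category
    )
    # Always report all five buckets so empty ratings show up as zero.
    return {stars: counts.get(stars, 0) for stars in range(1, 6)}


# Usage: what does this reviewer usually give Indian restaurants?
reviews = [
    {"category": "indian", "stars": 5},
    {"category": "indian", "stars": 2},
    {"category": "thai", "stars": 1},
    {"category": "indian", "stars": 5},
]
print(star_breakdown(reviews, "indian"))
```

Shown next to a review, a breakdown like this lets the reader judge whether a 2-star rating is harsh for this reviewer or just their norm.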

TL;DR: This is a much harder problem than it seems at first glance, partly
because of the nature of the problem and partly how Yelp has framed it for
themselves.

Disclaimer: I used to work at Yelp, but no longer do. The people I worked with
were all stand-up guys.

~~~
progolferyo
I guess most of your points make sense. I just feel like Yelp does very little
to be open and transparent. I get that it's very difficult; nobody ever said
it's easy to algorithmically guess review spam.

But they clearly don't want users to see unfiltered reviews. A tiny gray link
below all 40 reviews, then a captcha (or two or three) and then a slow user
experience before you can see the filtered reviews is lame.

I agree with you about showing other reviews of the same subject; that
would be neat. I guess if I were Yelp, I would try harder at standing up for
their algorithms and show more data about why they work and why we are better
off having their amazing algorithms.

I had an experience a year or so ago with a friend who started a moving
service in SF. A couple of months after he started the business, he noticed he
received a review on Yelp from some dude that said during a moving job, the
guy took a smoke break and peed all over the sofa he was moving. Not only was
the story ridiculously false but my buddy had no idea who the reviewer was.
The review did however NOT get filtered, even after he responded to the review
and contacted Yelp. And he was stuck with this crazy review at the top of his
profile. This went on for months and it really damaged his credibility,
meanwhile he would have positive reviews from legitimate customers who would
naturally have a newer profile or whatever and the reviews would get filtered.
It just seems like Yelp should be more sophisticated. (And yes, they are
10000% better than TripAdvisor)

------
jrockway
"I don't like Yelp, so here are some random unsubstantiated complaints about
the reviews about Yelp itself."

The author seems to think people hate Yelp, but I'm not sure that's the case.
Everyone I know in real life uses it regularly when trying to find some place
to go, and the results are largely acceptable.

Review sites are always going to trend negative because people who have had
average or good experiences aren't going to be driven by rage to write a nasty
review. Everyone has their own star scale and expectations of service ("I had
to sit in economy class on my $10 ticket! I'm never flying United again!").
This leads to useless star ratings, but this is no fault of Yelp itself. It's
the fault of relying on non-professionals to do professional-quality work.
But, if you read for content, you can usually figure out whether a place is
good or not. For example, a review like "I went during the dinner rush on
Saturday night and it took 5 minutes to get a table! 1 star! Oh yeah, the food
was good." is a positive review, even though the reviewer only gave one star.

So anyway, don't hate on Yelp, hate on the clueless people writing
reviews.

~~~
blantonl
_I don't like Yelp, so here are some random unsubstantiated complaints about
the reviews about Yelp itself._

I'm inclined to agree with you about the original poster's perspective but it
is hard for me to get past the serious accusations and evidence presented by
numerous business owners: that Yelp basically extorted them when they received
negative reviews and were a new listing on Yelp.

Based on multiple reports it appears as if Yelp's business intelligence group
actively targeted (or targets) businesses that met certain criteria (new, very few
reviews), sent those results to an inside sales team, and then if the customer
didn't purchase from the inside sales rep Yelp effectively changed their
filter process for that individual business to "penalize" them for not paying
to play.

Zagat, Urbanspoon, and TripAdvisor have never, to my knowledge,
implemented these types of tactics. I might be wrong, but barring a set of
insurmountable complete coincidences for each and every one of the businesses
that reported these algorithm changes in the way reviews were displayed, I've
got to conclude that Yelp is using their inside sales teams to pressure
(extort?) businesses that their business intel teams identify, and make them
pay dearly if they don't pay.

~~~
derwiki
I think this is a vocal minority. Sure, you hear all the time about this
random biz owner who thinks Yelp is out to get him, but that doesn't account
for the hundreds of thousands of biz owners who have claimed a free account
and not cried foul.

A more likely explanation of "extortion" is that the sales org hires 21 year
olds fresh out of school who routinely don't make it past 90 days. I'd imagine
some of these reps might resort to shady practices to fill a quota, but that's
also why they don't last past 90 days. I think this is just a problem of
needing a sales organization to make money.

Disclaimer: I used to work on the engineering team at Yelp, and I have the
utmost respect for the intelligence and morality of the engineers on the
review filtering team.

~~~
nieve
I'm willing to postulate that the shady tactics are originated by the sales
reps, but not that Yelp isn't responsible in that case. Tolerance of a culture
of unethical sales, creating an incentive & hiring structure that encourages
it, and being obstructionist when someone who encounters it tries to get their
issue dealt with speaks rather more to complicity than innocence. As further
anecdotal indication (since you appear to believe that all complaints are
isolated and don't appear to have any threshold at which you'll accept
otherwise), my girlfriend's cell phone was listed as her restaurant's phone
number and she was told repeatedly by support staff, sales, and supervisors
that Yelp would not under any circumstance correct this unless she paid for an
account. Since her place was open from 6pm to 9am and she had to have the phone
on in case of emergencies there, she got calls all night long seven days a
week. The only thing that fixed this when she was at the point of involving
her lawyer was a random Yelp employee who frequented the restaurant hearing
about it. If a single internal comment is enough to fix it, but a legally
actionable case (non-public number acquired from a review, refusal to address,
telemarketing calls to it) gets "sue and be damned" I'm inclined to think that
the company has a major issue at the interface with businesses. She literally
did not care what the reviews were because word of mouth in a couple of
communities was far more important, but they kept trying to sweeten the deal
by offering to "fix" the negative ones.

------
derwiki
Yelp has more information on their web site about the review filter,
soliciting reviews, false positives, and more:

<https://biz.yelp.com/support/common_questions>

It's easy to jump on the bandwagon and say "Yelp is evil," but having been a
near-constant point of controversy, it's something they do their best to
address.

------
tlb
Yelp's main defense is "It's done by an algorithm". That's not a valid defense
for a bad business practice.

They are accused of hiding positive reviews for businesses that don't pay
them. You can perfectly well write this logic in Python for a filtering
algorithm:

    
    
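    from random import random

    def filter(review):
        ...
        # Hypothetical rule: quietly hide most positive reviews of
        # businesses that sales contacted but that are not paying.
        if (review.business.contacted_by_sales_count >= 3 and
                not review.business.paying_yelp and
                review.stars >= 4 and
                random() < 0.8):  # 80% chance
            return True  # yes, filter it
        ...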
    

Even if it's not written so explicitly they have to take responsibility for
their algorithms.

~~~
derwiki
The sales org only has access to a Salesforce copy of Yelp's internal data, and
numbers like "contacted_by_sales" never end up back in the main Yelp
database. It's not something the algorithm could take into account because it
doesn't know about the sales org at all.

Disclaimer: I used to work on the Yelp engineering team, at one point on the
Salesforce data refresh project.

------
ZipCordManiac
I've posted about 6 reviews in total on yelp, 2 of which were immediately
filtered for no reason. Funny how my positive reviews went through just fine
and the rest got binned. If you're going to run a review site, at least let
people give their honest opinions; don't filter anything other than profanity.
People should be able to share an experience be it good or bad.

~~~
wallawe
Although I wish it were that simple, it's more complicated than that. The
reason for this simply is, if I own a business, I could create multiple fake
accounts and have friends and associates do the same. We would then be
able to give multiple (fake) positive reviews for our own business, while
giving one star feedback to our competition.

------
michaelcampbell
The captcha looked quite clearly to me to be "inctory". These captchas are
not necessarily real words (I believe to thwart dictionary based crackers?),
and I also am pretty sure the service they are using is not tied to Yelp
itself.

~~~
jhdavids8
The first word wasn't the difficult one, it was the last word (it's even
tougher to decipher when zoomed in for normal view. Zoomed out, it appears
more like 'Law'). I agree though, I know the service probably isn't tied to
Yelp, I just found it weird that I had to decipher a blob of ink on page 52.

~~~
rottencupcakes
If one word is completely illegible, it often isn't being used as a captcha.
The captcha company may be trying to have you manually recognize text their
OCR software could not.

You can often make it through after painfully screwing up the hard to read
word.
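
A simplified model of the two-word scheme described here: one word has a known answer and decides pass/fail, while guesses for the other word are only recorded as votes. This is a sketch built from the description in this thread, not reCAPTCHA's actual implementation; the function names and the three-vote threshold are made up.

```python
from collections import Counter, defaultdict

# Votes collected for words the OCR software could not read.
votes = defaultdict(Counter)


def check_two_word_captcha(control_answer, control_guess,
                           unknown_word_id, unknown_guess):
    """Pass or fail on the control word only; record the other guess as a vote."""
    passed = control_guess.strip().lower() == control_answer.lower()
    if passed:
        # Only trust transcriptions from users who got the known word right.
        votes[unknown_word_id][unknown_guess.strip().lower()] += 1
    return passed


def consensus(unknown_word_id, min_votes=3):
    """Return the most common transcription once enough users agree, else None."""
    if not votes[unknown_word_id]:
        return None
    word, count = votes[unknown_word_id].most_common(1)[0]
    return word if count >= min_votes else None
```

Under this model, a wrong guess on the unreadable word never causes a failure, which matches the experience of getting through after botching it.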

~~~
jrockway
This is how I do reCaptcha. I guess the easy-looking word correctly, and then
type garbage for the hard-looking word. This results in approximately 99%
correctness and hopefully screws up their "make jrockway do work for us for
free" scam.

~~~
icebraining
What scam? Google offers the reCaptcha service for free in exchange for having
words OCRed; seems fair to me. To you as the user, it's not only the same work
as any other captcha (which you would have to fill in anyway), but you get
free OCRed books and magazines from Google Books.

If you don't like the reCaptcha ($deity knows why, but it's your right),
complain to the website, don't fuck it up for everyone else.

------
pauljonas
I stopped using Yelp after I discovered that reviews I logged to the site were
not showing if I was not "logged in" to Yelp. What is the point of using a
social review repository if the reviews are only viewable by me, when I log in
to the site? I'd be better served by putting it in a text file/directory.

~~~
bkbleikamp
It's not a democracy - Yelp pushes the best reviews to the top. Most people
read a couple of reviews, so it serves Yelp's interest to make sure they read the
best ones.

~~~
pauljonas
So, in other words, I, as a user contributing the reviews that make Yelp
valuable, am treated as a de facto ruffian, my time and effort expended to
"share" considered folly. So Yelp "filters" these and, beyond wasting my time
(as I should just collect my reviews in my own file or "share" on my own site),
is there not a great conflict of interest going on here?

------
mathattack
Organizations like Yelp live and die on their accuracy and predictive ability.
If Netflix recommends movies you don't like, you will use it less. If Zagat
allows too much personal preference, it will no longer be the gold (Maroon?)
standard of crowd sourced restaurant reviews.

If half the things suggested about Yelp are true, they risk their integrity at
great peril.

~~~
cHalgan
Actually, Yelp can play a game like this and be profitable.

If a business has a good rating on Yelp, then that pretty much means the
business is ok. However, a bad rating might be because the business was not nice
to Yelp's sales people.

That is ok because they target small, unestablished businesses, which you will
never discover are actually excellent because they have
only a one-star review.

In other words, advertising at Yelp does not guarantee you a good review.
However, not being nice with their sales people might get you a bad review if
you are a small business.

And this is absolutely nothing new.

------
eurohacker
so are you saying that yelp is basically an online extortion business,

and judicial system and all the yelp users accept it ..?

------
jprobert
I think Yelp has always been a good service but obviously there is a
disconnect between the business end of Yelp and the service end. The Chicago
Tribune article is not the first to complain about the business practices of
Yelp. Additionally, advertising on Yelp is some of the most expensive real
estate on the web with CPM rates of $100-$300. Ultimately I think that
directory sites such as Yelp will become obsolete as social search becomes
more refined and relevant. My company is actually working on this and we think
that we have a good product so far (in alpha). <http://www.cliqsearch.com>

