
Ratings Now Cut Both Ways, So Don’t Sass Your Uber Driver - luu
http://www.nytimes.com/2015/01/31/technology/companies-are-rating-customers.html
======
danso
> _Part of the confusion stems from the fact that the rental economy — taking
> its cue from the Internet in general — sees everything as either horrible or
> great, with little room for nuance. Lyft nods to this when it tells
> passengers reviewing drivers that “anything lower than 5 indicates that you
> were somehow unhappy with the ride.” Drivers can be dropped from their
> services when they fall below 4.5, but it is unclear what it takes to get
> banned as a passenger._

I've never used Lyft, but as an Uber customer, it seems like someone at Uber
graduated from college and, either as a result of or in a sly commentary on
grade inflation, created an ambiguous rating system that encourages rating
inflation while penalizing drivers based on the whims of what riders consider
"4-star" to be.

4 stars on every other rating service is pretty solid...yet if a driver
averages 50/50 on both 4s and 5s, they are sanctioned (from what I've heard
from drivers, the rating-cutoff can shift depending on the area and excess of
drivers...in Silicon Valley, it might be higher than in other locales, for
instance)...I've never given a driver less than 5 stars because I've never
received _awful_ service (though, certainly, I've had drivers who were late,
or cars that were messy)...it's bad enough that they're scraping by, I don't
need to pile on by giving them a slight demerit that could end their driving.
But since the star-rating is basically a binary system, why not just make the
question, "Were you satisfied with your ride?" and solicit the user for
comments if the answer is the rare "No"...such a system would garner real
feedback and improvement, at a very small price in the sensory satisfaction of
clicking on a star (this might also prevent the problem of fat-fingered/drunk
users accidentally giving a low rating).
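
To put numbers on the 4.5 cutoff quoted above, here is a minimal sketch of the arithmetic for a driver who only ever receives 4s and 5s. The function names are made up for illustration, and the strict "below 4.5" comparison is an assumption taken from the article's wording; the real cutoff reportedly varies by area.

```python
from fractions import Fraction

# Assumption: the 4.5 threshold quoted from the article, applied strictly
# ("fall below 4.5"). Exact fractions avoid floating-point rounding.
CUTOFF = Fraction(9, 2)

def average_rating(fives: int, fours: int) -> Fraction:
    """Exact mean rating for a driver who only receives 4s and 5s."""
    return Fraction(5 * fives + 4 * fours, fives + fours)

def is_below_cutoff(fives: int, fours: int) -> bool:
    """True when the mean rating is strictly under the cutoff."""
    return average_rating(fives, fours) < CUTOFF

# A 50/50 split lands exactly on the cutoff; a single extra 4 drops below it.
print(float(average_rating(50, 50)), is_below_cutoff(50, 50))
print(float(average_rating(49, 51)), is_below_cutoff(49, 51))
```

Under these assumptions, a driver needs at least half of all ratings to be 5s just to stay on the line, even when every other rating is a "solid" 4.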

~~~
Chevalier
I agree completely. A binary yes/no would be a lot more useful in gauging
customer satisfaction with drivers. When 5 stars is "yes" and 1-4 stars are
"no," Uber forsakes the gradient system that, as far as I can tell, is the
only merit of star rankings.

That said... Amazon has likewise mutated star rankings. What was the
statistic, that 80% of Amazon's products have a 4-5 star ranking? For any
given listing, 5 stars indicates satisfaction, 4 stars indicates likely
problems, and 3 stars indicates serious defects. I've never seen 1 or 2 star
reviews except on long-tail products with fewer than a handful of reviews.
Despite turning star rankings into blunt instruments, Amazon still features
probably the most valuable, extensive reviews on the internet.

~~~
ThrustVectoring
I definitely agree with the binary yes/no

Also, instead of a percentage, list the good/bad ratio. 99% doesn't seem much
different than 98%, but the first is 99 good experiences per bad one while the
second is 49.
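
The conversion from a percentage to a good/bad ratio can be sketched as follows. The helper name `good_per_bad` is hypothetical, and whole-number percentages are assumed to keep the arithmetic exact.

```python
from fractions import Fraction

def good_per_bad(percent_good: int) -> Fraction:
    """Good experiences per bad one, given a whole-number approval percentage."""
    if not 0 <= percent_good < 100:
        # At 100% there are no bad experiences, so the ratio is undefined.
        raise ValueError("percent_good must be in the range 0..99")
    return Fraction(percent_good, 100 - percent_good)

print(good_per_bad(99))  # 99
print(good_per_bad(98))  # 49
```

A one-point drop in the percentage roughly halves the ratio near the top of the scale, which is the difference the percentage display hides.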

------
qeorge
My Uber driver on Wednesday flat out told me he won't pick up certain racial
groups, because they tend to leave lower-than-five-star ratings. It was
offensive, but I understand that his > 4.5 star rating is essential to his
well-being.

(This is the modern day equivalent of not being able to hail a cab if you are
a person of color.)

Uber & friends will need to fix that, or their drivers are going to keep
cherry-picking riders to maintain their ratings and giving a lot of customers
bad experiences.

Perhaps they could simply throw out the grumps' ratings (e.g., this rider
rated the driver a 2, but that's normal for this rider, so just ignore it).
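
One way the "ignore the grumps" idea might look, as a rough sketch only: nothing here reflects how Uber actually computes scores, and the function name, data shapes, and `tolerance` parameter are all assumptions made for illustration.

```python
from statistics import mean

def adjusted_driver_score(ratings, rider_history, tolerance=0.5):
    """Average a driver's ratings, skipping low ones that are normal for the rider.

    ratings: list of (rider_id, stars) pairs received by one driver.
    rider_history: dict mapping rider_id to the stars that rider has given before.
    tolerance: hypothetical slack; a sub-5 rating within this distance of the
        rider's own average is treated as "normal for this rider" and ignored.
    """
    kept = []
    for rider_id, stars in ratings:
        history = rider_history.get(rider_id, [])
        if history and stars < 5 and abs(stars - mean(history)) <= tolerance:
            continue  # habitual low rater: this rating carries little signal
        kept.append(stars)
    return mean(kept) if kept else None

ratings = [("a", 5), ("b", 2), ("c", 5)]
history = {"b": [2, 2, 3, 2]}  # rider "b" rates everyone around 2
print(adjusted_driver_score(ratings, history))  # 5
print(adjusted_driver_score(ratings, {}))       # 4
```

A low rating from a rider who normally gives 5s is still counted, since it falls well outside that rider's usual range.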

~~~
omonra
What was the ethnic group he said he wouldn't pick up?

------
skuhn
I had routinely given Uber drivers ratings between 3-5 stars (good, better,
best), because service was always good enough that I didn't feel a need to dip
lower. Then a driver mentioned that if a passenger gave him less than 5 stars,
he would have been better off not taking the fare at all. That was pretty
eye-opening: that a $30+ fare wasn't worth being rated a 4, because of the future
lost business that results from having a 4.8 average instead of a 4.9.

This was a few years ago, before Uber X launched. So even then, Uber had
limited their scale (which provides 50 ratings points) such that only the top
5 ratings were acceptable. I think it has only tightened since then; I rarely
see a driver below 4.8 at all.

Now I always give drivers a 5 unless something truly unacceptable happens, in
which case I file a complaint with Uber. The ratings feedback loop has been
completely broken for me.

It's ridiculous, and I don't see how they get the results they want when
people have no idea what each rating number is supposed to mean. Some people
start at 5 and remove points for issues. Some people start at 3 and only go up
if you've gone beyond the call of duty. The only thing that makes sense is an
up / down vote, when the scale can't be agreed upon by all participants.

I suppose that the driver -> passenger ratings might actually work as
intended, since the drivers are presumably well aware of the reality of the
ratings scale.

I don't think this is unique to Uber or car services by any stretch either. I
have noticed that any service with too many ratings choices seems to suffer
from poor ratings quality. Yelp is my go-to example here. There _should_ be no
shame in a restaurant rated 3 stars, but in actuality it's seen as the kiss of
death to a lot of places. Here again, some people are rating as if 3 stars is
the baseline, some people as if anything more than 1 is earned at great cost.
When you're pulling your responses from everyone in the world, I don't think
you can really get consensus on a scale that goes beyond up / down.

------
Yizahi
The vast majority of people can't comprehend how to rate anything on more than
a good/bad scale. Even fewer understand how you could rate something on a 1-5
scale where the whole range is in the "good" category, so 1 is a somewhat good
mark. One of the most obvious examples is schools. In my country, before the
reform, we had 1-5 marks. 1 and 2 were not considered valid marks but rather an
absence of any score. So instead of a five-point scale we had a three-point
scale. And since 3 was a placeholder for "passable," the actual marks showing
the quality of the work were 4 and 5. And of course, over time, unofficial
corrections were invented: + and -, then ++ and --, etc.

So when adults with 20-30 years of supposed experience grading the quality of
work can't use even a simple 1-5 system, it is obvious that the general
population will fail at grading even more spectacularly. And thus we have 4.5
out of 5 as an "awful" score :) .

