
How Not To Sort By Average Rating - llambda
http://evanmiller.org/how-not-to-sort-by-average-rating.html
======
NathanRice
While I agree with the spirit of the article, this is one of those cases where
a Bayesian treatment is conceptually much clearer.

Assume that ratings are generated by a stable stochastic process whose
underlying distribution is multinomial (ignoring the ordinal character of
ratings, for the time being) and use a Dirichlet conjugate prior. This gives
you a posterior distribution over new ratings for an item. The benefit of a
posterior here is that it lets you rank items in terms of the probability that
a random viewer would rate one item higher than another. By adjusting the
magnitude of the alpha parameter of the Dirichlet prior, you adjust your
sensitivity to small numbers of observations. A small initial alpha leads to
rapid changes in the posterior upon observing ratings, whereas a large alpha
requires a significant body of evidence.

The best part of the multinomial model with a conjugate Dirichlet prior is
that the math is REALLY simple. The normalizing constant of the Dirichlet
distribution looks scary when stated in terms of the gamma function, but since
we are in the discrete case, just pretend that everywhere you see gamma(x) it
is replaced with (x - 1)!, and you will be OK.

Let me know if you would like to learn more, I would be happy to help.

~~~
_delirium
Here's a paper proposing a solution in that space, and which also compares
itself to the article linked here (kind of nice to see... papers sometimes
fail to cite stuff that's "only" posted online rather than properly published,
even if the authors know about it and it's quite relevant):
[http://www.dcs.bbk.ac.uk/~dell/publications/dellzhang_ictir2...](http://www.dcs.bbk.ac.uk/~dell/publications/dellzhang_ictir2011.pdf)

I emailed Miller a while ago to see what he thought of this reply, and he
thought it also seemed like a reasonable approach. But, in his view, the
criticisms of his method within their framework include things that in
practice he sees as features. In particular, they view the bias caused by
using the _lower_ bound as a bug, but he prefers rankings to be "risk-
averse" in recommending, avoiding false positives more than false negatives.
Of course, that biased preference could also be encoded explicitly in a more
complex Bayesian setup, which would also be a bit more principled, since you
could directly choose the degree of bias, instead of indirectly choosing it
via your choice of confidence level on the Wilson score interval.

~~~
NathanRice
I don't think you have to resort to any overly complex machinery to achieve
similar behavior. The simplest approach is just to use a non-uniform prior.
His pessimistic bound could be emulated by having an initial alpha that places
more weight on low star ratings, the intuitive interpretation being, roughly,
"things are probably bad until proven good." Another option would be
to generate the prior based on the posterior distributions of other items.
Just take the distribution of ratings observations for all products of a given
type (perhaps only items produced by that company?) to get a sensible prior on
a new item in that category.

The strength of priors here is that it is very easy to take intuitions and
encode them statistically, in an understandable way. Taking the lower bound of
a test statistic doesn't admit much in the way of intuition.
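For instance (the specific prior weights below are invented purely for illustration), a pessimistic prior in the Dirichlet-multinomial setup might look like:

```python
import numpy as np

# A pessimistic Dirichlet prior over 1-5 stars: most of the prior weight sits
# on low ratings, roughly encoding "probably bad until proven good."
pessimistic_alpha = np.array([4.0, 3.0, 2.0, 1.0, 1.0])

def posterior_mean_stars(counts, alpha):
    """Posterior expected star rating under a Dirichlet-multinomial model."""
    post = alpha + np.asarray(counts, dtype=float)
    return float((post / post.sum()) @ np.arange(1, 6))

# Two 5-star ratings barely move a pessimistic prior...
print(posterior_mean_stars([0, 0, 0, 0, 2], pessimistic_alpha))   # ~2.7
# ...but a large body of evidence overwhelms it.
print(posterior_mean_stars([0, 0, 0, 1, 400], pessimistic_alpha)) # ~4.9
```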

~~~
moultano
There's a distinct difference in the asymptotic behavior though between the
lower bound and the prior. The lower bound goes to the mean as 1/sqrt(n), the
prior goes to the mean as 1/n.

That makes for a pretty significant difference in practice, and I'm not sure
which is preferable.
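A rough numeric illustration of the two rates (80% upvotes, a 95% z-value, and an arbitrary prior weight of q = 10 pseudo-votes, all chosen just for the example):

```python
import math

p, z, q = 0.8, 1.96, 10  # upvote rate, confidence multiplier, prior pseudo-votes

for n in (100, 10_000):
    # A Wilson-style lower bound sits below the mean by roughly z*sqrt(p(1-p)/n).
    lower_gap = z * math.sqrt(p * (1 - p) / n)
    # A smoothed estimate like p*n/(n+q) sits below the mean by roughly p*q/n.
    prior_gap = p - (p * n) / (n + q)
    print(n, round(lower_gap, 4), round(prior_gap, 4))
```

With 100x more data, the lower-bound gap shrinks by a factor of 10 while the prior gap shrinks by a factor of roughly 100, which is exactly the 1/sqrt(n) vs. 1/n distinction.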

~~~
NathanRice
You are absolutely correct that they are not mathematically identical. I
struggled to word it in a way that would not mislead people; the distinction
is important to emphasize.

~~~
moultano
It has a really big effect I think on the tone of what gets selected at the
top. The lower bound prefers things that are preferred by a majority and very
popular. The prior method prefers things that are completely un-objectionable
and liked by just enough people to be sure of that. My hunch is that with the
lower bound you get more interesting things bubbling to the top because it
puts a stronger emphasis on popularity.

In all of these models, the giant variable that is completely ignored is the
actual choice to rate something at all, versus skipping over it and reading
the next one. That's a very significant decision that the user makes. The
behavior of each of these systems w.r.t that effect will be the dominant thing
differentiating them.

------
edw519
I love it and I hate it.

Why I love it: It's precise. It's elegant. It's rigorous. It's based upon
solid, proven science & theory. It's a perfect application for a computer. And
most of all, it does what's intended: it works.

Why I hate it: What human can understand it?

I used to implement the first manufacturing and distribution systems that used
thinking like this. They figured, "We finally have the horsepower to apply
complex logic to everyday problems." Things like safety stock, economic order
quantities, reorder points, make/buy decisions, etc.

But the designers of these systems overlooked one critical issue: these
systems included humans. And as soon as humans saw that decision making
formulas were too complex to understand, they relieved themselves of
responsibility for those decisions. "Why didn't we place an order?" "Because
the computer decided not to and I have no idea why."

I suppose the optimal solution is somewhere in between: a formula
sophisticated enough to solve 95% of the problem but simple enough for any
human's reptile brain to "get it". This isn't it.

~~~
raldi
I'm not sure that's a "critical issue". 99% of Google users don't care that
PageRank is complicated; they just marvel at how good the results are. Just
like most redditors simply talk about how great the comments are, and not the
math that makes them so.

~~~
bri3d
Amusingly the proposed solution in this blog post is almost line-for-line
exactly the method Reddit uses to rank comments - proof, I think, that the
method isn't "too complicated."

When you're working with a system managing orders, where there's a direct
human interaction with the algorithm (like the top-level poster is),
simplicity is somewhat important. When you're working with a system that's
_designed_ to look like magic to the end user, making the system more magical
is often better, because it makes it less intuitive to game.

------
EvanMiller
Original author here. For the academically inclined, there is a critique of
this approach in this paper:

[http://www.dcs.bbk.ac.uk/~dell/publications/dellzhang_ictir2...](http://www.dcs.bbk.ac.uk/~dell/publications/dellzhang_ictir2011.pdf)

Of course, I think the authors miss the point of the algorithm, since I
basically wanted a system that is one-sided (i.e. false negatives are OK but
false positives are bad).

Also, if you deal with more than two outcomes you might be interested in
multinomial confidence intervals, described here:

[http://www.math.wsu.edu/faculty/genz/papers/mvnsing/node8.ht...](http://www.math.wsu.edu/faculty/genz/papers/mvnsing/node8.html)

The application to 5-star systems is not straightforward, since it's not clear
to me how stars relate to each other. Is it a linear scale? Are they discrete
buckets? Or maybe we want to use Tukey's froots and flogs? I'm not sure.

By the way, I'm coming out with a stats app for Mac soon that implements this
algorithm and much more. Drop me your email address if interested:

<http://wizard.evanmiller.org/>

~~~
NathanRice
I appreciate people who take the time to apply math to things in the real
world, and share it with non academic crowds. Thanks for that.

5 star rating systems are obnoxious. From a mathematical perspective, if you
treat them in an ordinal fashion they are poorly behaved, and if you treat
them categorically, you lose the relationship between stars. There seems to be
some popular movement towards binary rating systems, and I think that is
great. Not only do people tend towards binary rating behavior in the real
world (only rating a movie they thought was very good or very bad), but binary
ratings also admit a much cleaner mathematical treatment.

~~~
tripzilch
> 5 star rating systems are obnoxious. From a mathematical perspective, if you
> treat them in an ordinal fashion they are poorly behaved, and if you treat
> them categorically, you lose the relationship between stars.

Helping out a friend with a statistics test, I recently read up on the
Wilcoxon Signed Rank Test[1]. It's intended to get a p-value for experiments
with "before" and "after" measurements, but the idea, as I understood it, is
to take the ranks of a not-very-normally-behaved random variable and turn them
into a sum of many terms, so that thanks to the central limit theorem you can
treat the result as normally distributed again.

Though thinking about it, in this case it's the rank we're after, so maybe
it's not useful at all. But it gives an interesting idea about the tricks you
can pull if your input data isn't quite the sort of type that you can analyse
very well.

[1] <http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test>

------
a1k0n
There are a lot of comments complaining about how complicated the math is.
This shouldn't be all that hard to understand.

The assumption is that there's some constant underlying probability _p_ that a
random person will rate a given thing positively. If we observe, for instance,
4 positive and 5 negative reviews or votes, there's a probability distribution
(known as a Beta distribution) which tells us what the possible values of _p_
are given the votes we observe: p^4 (1-p)^5. graph:
[https://www.google.com/search?q=x%5E4+(1-x)%5E5%20from%200%2...](https://www.google.com/search?q=x%5E4+\(1-x\)%5E5%20from%200%20to%201)

Now if we observe 40 and 50, respectively, the curve looks like this:
[https://www.google.com/search?q=exp(20+%2B+40+log(x)+%2B+50+...](https://www.google.com/search?q=exp\(20+%2B+40+log\(x\)+%2B+50+log\(1-x\)\)%20from%200%20to%201)

(I had to do it in the log domain because Google's grapher underflows
otherwise -- the 20 is just to make the numbers big enough to graph. The more
correct thing involves gamma functions and that just gets in the way right
now)

The more you observe, the more sharply peaked the likelihood function is. The
funky equation in the article is an approximation to the confidence interval
of that graph -- 95% of the probability mass is said to be within those
bounds.

It's not a great approximation, for one thing because the graph is skewed (try
it with 10/50), and it assumes that the mean is exactly in the middle of the
confidence interval. The correct computation involves inverting a messy
integral called the incomplete beta function. Scipy has a function,
betaincinv, which solves this more exactly:

>>> import scipy.special

>>> scipy.special.betaincinv(5,6, [0.025, 0.975])

array([ 0.18708603, 0.73762192])

would be the 95% confidence interval for 4 positive and 5 negative votes;

>>> scipy.special.betaincinv(41,51, [0.025, 0.975])

array([ 0.34599562, 0.54754792])

for 40 and 50, respectively.

[edit: apologies, I had to run and get ready for work -- I didn't really have
time to make this very comprehensible; but i just now fixed a bug in my
confidence interval stuff above]

~~~
metaxyy
Very interesting. So would you say developers should probably use the
incomplete beta function, rather than Ev's method? Or is it too
computationally expensive?

~~~
a1k0n
I haven't investigated it in depth -- quadrature over a single variable like
this can be pretty quick to compute. Not sure how scipy does it.

Anyway, I personally think 95% confidence intervals are a crutch. The
_correct_ Bayesian approach is to consider two items, each with their own up
and down votes, and integrate over all possible values for p1 and p2 (being
the underlying probabilities of upvotes for item 1 and 2, respectively) over
the observed data, and compute the likelihood of superiority of p1 over p2.

How to turn that into an actual ranking function? No idea. I doubt it would
work, but you could compute against a benchmark distribution (i.e. the uniform
0-1 distribution).

If you do that, it probably turns out that your ranking function is the mean
of the Beta distribution, which is simple: (U+1)/(U+D+2) where U and D are the
upvote/downvote counts [note: we started with the prior assumption that p
could be anywhere between 0 and 1, uniformly]. Basically, the counts shrink
towards 1/2 by 1. This is a hell of a lot less complicated, and it achieves
the goal of ranking different items by votes pretty well with more votes being
better.
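A Monte Carlo sketch of that pairwise comparison, under the same uniform Beta(1, 1) prior (the vote counts and sample size are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(42)

def prob_superior(up1, down1, up2, down2, samples=100_000):
    """Estimate P(p1 > p2): each item's posterior is Beta(up + 1, down + 1)."""
    p1 = rng.beta(up1 + 1, down1 + 1, samples)
    p2 = rng.beta(up2 + 1, down2 + 1, samples)
    return (p1 > p2).mean()

# An item at +100/-1 is very likely better than one at +1/-0:
print(prob_superior(100, 1, 1, 0))

# The shrunken means (U+1)/(U+D+2) for the same two items agree on the order:
print(101 / 103, 2 / 3)
```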

------
yariang
While it is good to look at these sorts of mathematically rigorous algorithms,
I think I would be frustrated if they were used everywhere. Or, well, maybe
not me, but a non-technical user would be.

The beauty of the second algorithm for rating products is that it is
straightforward. Having never seen it before I can deduce that 5 stars come
before 4 stars and more reviews come before fewer. If I want to skip ahead to
the 4 stars I know what to do. I can internalize the sorting algorithm easily.
And as a user, understanding the order items are presented to me is important.

If Amazon were to use the last algorithm and present items in that order
(assuming we accounted for the 5-star vs. positive/negative issue), it would
look like a random order to most users and would be frustrating.

So I guess what I am saying is that this algorithm is very clever, but in some
cases it may be too clever. Sometimes you just want to Keep It Simple, Stupid.

~~~
raldi
Instead of displaying stars, Amazon could display a percentage, which under
the hood represents the Wilson confidence number. It would be totally
intuitive to browse: first come all the 100% items, then the 99's, and so on.

~~~
imajes
er... and the problem with bucket categorizing?

80-100% = * * * * *

60%-79% = * * * *

etc..

~~~
raldi
How is it a problem if the five-star reviews display first, then the four-
star, and so on?

~~~
dsrguru
The point is we have to determine how to define a five-star item, a four-star
one, etc. Currently, an Amazon item's star value is the average of the star
values of every review. The author is saying that that's a bad way to compute
the item's star value. The author would argue an item with only two reviews
that are both fives should have a lower star value than an item with 400 fives
and 1 four. We typically associate stars with the averaging algorithm (i.e. we
define an item's star value as the average of the star values of its reviews),
so it might help to do away with the notion that each _item_ has a star value,
and just think of this as saying that an item with 400 reviews of 5 stars and
1 review of 4 stars should be shown before an item with only 2 reviews of 5
stars.

Currently, when we see an item's star value, we think of it as an indicator of
the quality of the item. But if it's just the average of the star values of
every review, the author would argue that we're not going to get an accurate
indicator of quality. The author argues that whether the quality indicator of
an item is expressed in stars or percentages, that value should be determined
by the third algorithm, not the second, and that the order the items are shown
in should be the result of sorting those quality indicators.

------
peq
I always assume that initially there are q voters who gave the average rating.
This yields the following formula:

(p * n) / (n + q)

This is simpler and gives similar results:

[http://www.wolframalpha.com/input/?i=plot+%28p+%2B+z^2%2F2n+...](http://www.wolframalpha.com/input/?i=plot+%28p+%2B+z^2%2F2n+-+z*sqrt%28+%28p*%281-p%29+%2B+z^2%2F4n%29+%2F+n+%29%29+%2F+%281+%2B+z^2%2Fn%29+where+z+%3D+1.96%2C+p+%3D+0.8%2C+n%3D0+to+50)

[http://www.wolframalpha.com/input/?i=plot+%280.8+*+n%29+%2F+...](http://www.wolframalpha.com/input/?i=plot+%280.8+*+n%29+%2F+%28n+%2B+10%29+from+n%3D0+to+50)

------
jwr
I implemented this in a rating system once. Got multiple bug reports, people
complained that the system calculates averages wrong, because there are two
ratings and the average is _obviously_ not the number they are seeing.

~~~
edash
Why show the second figure to the user? They don't need to see the calculation
you made to determine the sort.

Just sort the objects in the order determined by this formula and only show
the ratings given by users in the interface.

~~~
jwr
This was a system where I was supposed to display a star-rating (1-5 stars,
with fractional stars as well) for each item.

The people reporting the problems were the users, and yes, this was bias, as
they expected averages. That was exactly my point — while the statistics
behind this method are sound, it is not what people expect. Building systems
that don't do what people expect is difficult.

------
hadronzoo
Here's a superior Bayesian solution: [http://masanjin.net/blog/how-to-rank-
products-based-on-user-...](http://masanjin.net/blog/how-to-rank-products-
based-on-user-input)

------
ajross
It's well explained, informally. The giant equation sitting there without
clearly defined parameters is mostly just showing off though. The final "QED"
solution that you put at the end of a paper is _not_ the proper form to
introduce a concept.

But... so what? Amazon and Urban Dictionary are hardly failing in the market
due to their "incorrect" score sorting. The whole problem is a heuristic, it's
not amenable to rigorous treatment no matter how many giant equations you club
your audience with.

~~~
esrauch
It's been fine in those contexts, but imagine if HN ranked +1/-0 above +100/-1

~~~
ajross
It would be instantaneously annoying to someone, who would downvote it and
push it off the front page. So the net effect is that the frequency with which
you saw "low vote garbage" would increase, but probably not by much. And IMHO
a reasonable argument can be made that this is a _good_ thing, because it
increases visibility for new posts.

------
joshuahedlund
For those who do not understand the Wilson algorithm, see this post which was
on HN recently, explaining how it works in a little more detail:
<http://amix.dk/blog/post/19588>

(I agree with other commenters that it is complicated and lacks common sense
to average users, but I feel like I have a general understanding of the
concept thanks to the above link)

------
aw3c2
Every time I see that page, I see the equation, I read statistical terms and I
get overwhelmed. I use PHP so I have no pnormaldist. Would love to use it for
some random page I run.

~~~
gipsyking
"if you don't have a statistics package handy or if performance is an issue
you can always hard-code a value here for z. (Use 1.96 for a confidence level
of 0.95.)"

~~~
aw3c2
Ouch, I managed not to see that at all. Thanks!

------
mumrah
There was an article about this a few years back:
[http://blog.linkibol.com/2010/05/07/how-to-build-a-
popularit...](http://blog.linkibol.com/2010/05/07/how-to-build-a-popularity-
algorithm-you-can-be-proud-of/)

I've found, in practice, a Bayesian weighted average is easy to implement and
pretty effective. It's also a good candidate for "stream" processing (i.e.,
calculating in a single pass)

------
ketralnis
See also [http://blog.reddit.com/2009/10/reddits-new-comment-
sorting-s...](http://blog.reddit.com/2009/10/reddits-new-comment-sorting-
system.html)

Or maybe more notably,
[https://github.com/reddit/reddit/blob/master/r2/r2/lib/db/_s...](https://github.com/reddit/reddit/blob/master/r2/r2/lib/db/_sorts.pyx#L40)

------
gtsc
Here's an even simpler way to think about it: it's the left point of the
standard 95% confidence interval from the Central Limit Theorem plus a hack
for small sample sizes. The Wikipedia page says the hack is almost equivalent
to estimating p = (X+2)/(n+4) i.e. assuming each item starts with two upvotes
and two downvotes.
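That near-equivalence is easy to check numerically (a quick sketch; z = 1.96 for 95%, and the vote counts are arbitrary examples):

```python
import math

def wilson_lower(x, n, z=1.96):
    """Lower bound of the Wilson score interval, as in the article."""
    phat = x / n
    denom = 1 + z * z / n
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / denom

def plus_four_lower(x, n, z=1.96):
    """Plain CLT lower bound evaluated at the shifted estimate (x+2)/(n+4)."""
    p = (x + 2) / (n + 4)
    return p - z * math.sqrt(p * (1 - p) / (n + 4))

for x, n in ((8, 10), (80, 100), (40, 90)):
    print(x, n, round(wilson_lower(x, n), 3), round(plus_four_lower(x, n), 3))
```

The two bounds track each other closely even at small n.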

------
dredmorbius
I've put some thought into metrics as well. A few other alternatives suggest
themselves:

\- Considering the standard deviation of ratings. On a 5-point scale, an item
that averages 3 because ratings are split between 1s and 5s differs from one
that mostly gets 3s. The latter is a middlin' fit for anyone; the former has
an enthusiastic but niche audience. If you're looking at sales, the former can
be a valuable product if properly marketed.

\- An item that gathers few votes regardless of favorability ratings can
exhibit multiple problems. One is that it isn't well marketed / publicised, or
known. Another (particularly on content sites) is that there's very likely a
sampling bias (mutual admiration society / negging attack / vote stuffing).
I've tended to favor systems which take into account the total volume of
voting, generally on a ln(n) basis, though not out of any particular
statistical rigor. As an implementation, you'd start with a 5 point Likert
score, then multiply by, say, ln(n+1) (avoiding a zero multiplier on a single
vote).

\- The pattern of ratings over time and space (IP or geographical) may reveal
both opportunities for marketing and/or issues with your ratings system. Since
any effective quality proxy _will_ be abused, you've got to be sensitive to
the latter.
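The ln(n) weighting from the second point above could be sketched like this (illustrative only; as noted, no particular statistical rigor is claimed):

```python
import math

def weighted_likert(ratings):
    """Mean 5-point Likert score, scaled by ln(n + 1) to reward vote volume."""
    if not ratings:
        return 0.0
    return (sum(ratings) / len(ratings)) * math.log(len(ratings) + 1)

print(weighted_likert([5, 5]))    # perfect mean, almost no evidence
print(weighted_likert([4] * 50))  # lower mean, lots of votes -- ranks higher
```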

The Wilson score is an improvement over multiple other methods. It still
assumes a relatively unbiased estimator and rating behavior. My feeling and
experience is that excess reliance on any one metric is likely to cause
problems -- reality is multidimensional, and metrics for assessing reality
should be as well.

There's also the question of whether or not you want to make specific
recommendations for an individual, or general recommendations for a
population. In the former case, correlating other rankings or behavior may
give a better fit (and the Wilson score may still be useful).

Though for a suitably specific goal (marketing, suitability, revenue
potential) a single encompassing metric may work.

------
omarqureshi
The one example I've found of a site that does average ratings really well is
Steepster: it picks teas that you have previously rated and indicates the
rating you gave them. This way the user's ratings are much better and will
give you a much more meaningful mean.

------
ColinWright
Discussions from earlier submissions are also interesting:

<http://news.ycombinator.com/item?id=1218951> <- 31 comments

<http://news.ycombinator.com/item?id=478632> <- 56 comments

Further, I hope JoshTriplett
(<http://news.ycombinator.com/user?id=JoshTriplett>) isn't too disappointed
that when he submitted this exact same item 2 days ago it got one upvote and
no discussion. In submitting to HN, as with comedy, timing is everything.
<http://news.ycombinator.com/item?id=3784912>

------
sold
Urban Dictionary no longer sorts by positive - negative, see e.g.
<http://www.urbandictionary.com/define.php?term=usa>. I don't know what they
use now.

~~~
gammarator
Doesn't seem to have improved the relevance of the definitions:

 _"USA: The only country keeping penguins from coquering [sic] the Earth"_

------
moe
So, anyone have the formula for 5-star ratings?

------
PakG1
This is awesome. This is perfect for what we need for our startup. We are
going to use this. We won't need to worry about the negative aspects listed in
these comments due to our use case. Wow. Thanks, HN. :)

------
PenZenMaster
But shouldn't the solution (formula) be "simply elegant"? eBay seems to be on
to something with its positive / (positive + negative) rating system. The user
knows how many data points are in the pool, which overcomes the problem of a
single positive rating earning five stars. Much in the same way
<http://demanddriventech.com/home/solutions/replenishment/> has come up with a
"simply elegant" formula for supply chains that is human-understandable and
effectively solves the problem.

------
derwiki
My favorite part was looking at the code and seeing a variable named `phat',
and then looking up at the equation to find `p^' (p-hat).

------
ComputerGuru
What I don't get is how in 2012 sites like Amazon are still making this
mistake. Amazon is a company that, much like Google, spends millions analyzing
user behavior and trying to optimize the workflow (checkout, in their case).

This has been the number one complaint I have against Amazon for the past 10
years. And they haven't done a thing about it?

~~~
nickm12
Well, you'll be glad to know that it changed over a year ago. See for example:
[http://www.amazon.com/Stand-Mixers-Small-Appliances-
Kitchen/...](http://www.amazon.com/Stand-Mixers-Small-Appliances-
Kitchen/b/?node=289932&sort=reviewrank_authority)

------
padobson
It seems to me there's still a problem at the point of data collection. Not
all +1's are equal.

This algorithm needs to be paired with another algorithm that weights each
plus one according to each user's ability to plus one something that gets a
lot of plus ones.

I couldn't hope to do the math for something like that, but I'd sure like to
talk to someone that could.

------
nickm12
Lots of people seem to be missing the fact that Amazon changed their algorithm
years ago to account for the number of reviews. For example see:

[http://www.amazon.com/Stand-Mixers-Small-Appliances-
Kitchen/...](http://www.amazon.com/Stand-Mixers-Small-Appliances-
Kitchen/b/?node=289932&sort=reviewrank_authority)

~~~
tvorryn
Huh. You're right. That seems pretty recent, or I didn't notice the switch-
over. Thank goodness.

------
uggedal
I implemented this algorithm using likes/views instead of positive/negative
votes on <http://mediaqueri.es/popular/> and have been quite happy with the
results.

------
dsears
When I have the whole body of reviews readily available, I like to just do a
Bayesian average. Mix in the average number of reviews at the average review
score to keep small data sets from skewing results.

------
jarin
Haha, I don't know why, but I laughed when I got to the 3rd formula. It was
like the punchline to a joke.

------
moofins
You had me at "Lower bound of Wilson score confidence interval for a Bernoulli
parameter"

------
jader201
I've always thought about this, and to me, a very simple (though _slightly_
inaccurate) solution would be to sort using this formula:

(TotalScore - 1) / MaxPossibleScore

Such that (using the Amazon examples from the article):

((2 * 5) - 1) / 10 = 9/10 = 90%

((100 * 5) + (1 * 1) - 1) / 505 = 500/505 = 99%

------
jakejake
we use this algorithm for our office ping pong game tracking system. it's
great because the person who just plays one game and wins doesn't get bragging
rights.

------
jwblackwell
This must be the third time this has been posted.

~~~
klapinat0r
The last time was about a month ago IIRC. Can anyone explain why this keeps
happening (and why people keep giving karma to those who just repost month old
HN links)?

~~~
cbg0
Probably because some haven't seen it yet and/or because people love
discussing this topic.

------
excuse-me
That's why a friend of mine joined the army engineers.

As a civil engineer working for a local city he might be involved in a 10-year
process of approvals to add a freeway on-ramp, where most of his job would be
checking that an army of subcontractors were all doing things to code - not
that they were doing things well, just to the written requirements.

In Afghanistan, if they want a road or a barrier, he basically finds somebody
of lower rank, points at a bulldozer, and tells them to do it.

An interesting point he made was building a simple village clinic with a clean
water supply that would save lives for a few days' work and a few $1000. At
home he would be involved in a multi-$100M, 20-year project for a new hospital
where most of the money would go into pretty decoration and parking
structures, and which would probably end up costing lives compared to the
existing old hospital that was working perfectly well.

~~~
tripzilch
I read this twice and I can't figure out how this is relevant to the article,
please explain?

------
its_so_on
Date: 9:12 AM Wednesday, April 4, 2012

From: the boss

To: dev3

Subject: URGENT - front page showcase selection broken!!

Body: Hey bro, I was looking into it, and our ratings average equation is
totally busted and products with just a few ratings are hogging space from
proven winners when it's just a sample bias. This is costing us money and
needs to be fixed NOW.

I'd like this up before our morning meeting so I can boast about it and you'll
get credit too, as this should massively increase our conversions right away
by putting BETTER products right on the front page.

this should get you started: <http://evanmiller.org/rating-equation.png>

I'm sure you'll figure it out. If you could do an A/B test for bragging rights
too that would MASSIVELY rock. Thanks!!!

Rock on,

Boss

------
ashishb4u
how bout ((positive-negative)/total)

~~~
jakejake
That's the same thing as the average rating - works great if everything has
about the same number of ratings.

------
epo
The problem with star ratings is that they have nothing to do with measuring
approval. They are a form of social inclusion mechanism to give the rubes the
erroneous sense that someone cares about their opinions. It is done to attract
users, not to guide them.

