All rating systems are relative to other ratings on the platform. So it doesn’t matter if you dumb things down or not.
The trick is collecting enough ratings to average out the underlying issues, and keeping context: you want rankings relative to the area, but also on some kind of absolute scale, and also relative to the price point, etc.
The fewer choices you offer, the more random noise you get from rounding.
A reviewer might round a 7/10 up to a 3 because it’s better than average, while someone else might round an 8/10 down because it’s not at that top tier. Both systems are equally useful with 1 or 10,000 reviews, but I’m not convinced they’re equivalent with, say, 10 reviews.
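A minimal simulation of the rounding point, assuming each rater holds an internal 0–10 score and maps it onto a 1–3 scale with slightly different personal cutoffs (every number here is made up for illustration):

```python
import random

def coarse_rating(internal, cutoffs):
    """Map an internal 0-10 score onto a 1-3 scale using personal cutoffs."""
    low, high = cutoffs
    if internal < low:
        return 1
    if internal < high:
        return 2
    return 3

def average_rating(true_score, n_raters, seed=0):
    """Average n coarse ratings of one item; each rater rounds a bit differently."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_raters):
        # Each rater's cutoffs jitter around 4 and 7 (hypothetical values).
        cutoffs = sorted((rng.gauss(4, 1), rng.gauss(7, 1)))
        total += coarse_rating(true_score, cutoffs)
    return total / n_raters

# A "7/10" item: with 10 raters the average jumps around from sample to
# sample, while with 10,000 raters it settles near the same value every time.
print(average_rating(7.0, 10))
print(average_rating(7.0, 10_000))
```

Re-running with different seeds shows the small-sample averages scattering far more widely than the large-sample ones, which is the arbitrariness being described.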
Also, most restaurants that stick around are pretty good but you get some amazingly bad restaurants that soon fail. It’s worth separating overpriced from stay the fuck away.
The fewer the choices, and the clearer their meaning, the less noise you get from the very well-documented cultural differences in how wide-range numeric rating systems are used. This isn't important if you are running a platform with a very narrow cultural audience, but it is (despite being widely ignored in design) on platforms with wide and diverse audiences, since your ratings literally mean different things depending on the subcultural mix rating each product.
A lot of mechanisms are involved. Culture doesn’t just impact the scores people rate something on, but also how people interpret them, which mitigates that effect.
However, the rounding issue is a big deal both in how people rate stuff and in how they interpret the scores, to the point where averages over small numbers of responses become very arbitrary.
> A lot of mechanisms are involved. Culture doesn’t just impact the scores people rate something on, but also how people interpret them, which mitigates that effect.
It doesn't mitigate the effect; the combination of the effect on rating and on interpretation is the source of the issue, which exists whenever the review reader isn't at the cultural midpoint of the raters.
2 and 4 are irrelevant, a wild guess, or user-defined/specific.
Most of the time our rating systems devolve into roughly this state anyways.
E.g.
5 is excellent, 4.x is fine, <4 is problematic
And then there's a subdomain in the area between 4 and 5, where a 4.1 is questionable, 4.5 is fine, and 4.7+ is excellent.
In the end, it's just 3 parts nested within 3 parts nested within 3 parts nested within....
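That nested reading can be sketched as a simple lookup, using the eyeballed cutoffs from the example above (the function name and exact thresholds are illustrative, not any platform's actual scheme):

```python
def interpret_average(avg):
    """Read a 5-point average using the nested bands described above.

    Cutoffs are the comment's eyeballed values, not a standard:
    the 4-to-5 band gets subdivided into its own three parts.
    """
    if avg >= 4.7:
        return "excellent"
    if avg >= 4.5:
        return "fine"
    if avg >= 4.1:
        return "questionable"
    return "problematic"

print(interpret_average(4.8))  # excellent
print(interpret_average(4.2))  # questionable
```

In other words, the "effective" scale readers actually use is already three-valued, just shifted up and compressed into the top fifth of the nominal range.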
Let's just do 3 stars (no decimal) and call it a day