I don't think this can really be answered. The mathematical probability arguments are ignoring important implications of the result.
If 999 out of 1000 said it was blue, you could dismiss the last guy as crazy or blind or something. But 10% providing a different answer means something strange is going on. Maybe the color is some borderline shade, or the lighting is weird, or people are being coerced, or.... Without more information, we can't really tell what's going on, so the answer pretty much has to be "who knows?"
I think this point would be more relevant to something like the modified question proposed by jbob2000:
> It's an interesting idea executed poorly. What if we changed it to: "1000 people were asked how many lights were lit up in a row of 4 lit up lights. 900 people said there were 4 lights."
But in the case of color identification, I feel like 10% disagreement is par for the course -- it doesn't imply anything weird is going on.
The other 100 people didn't necessarily provide a different answer. It just says they were shown a car and they didn't say it was blue. Maybe they just didn't feel like commenting on the colour of the car. In fact, we don't even know if they were asked the colour.
A lot will depend on how you perform the survey, too. If you presented it as a serious psychological study and paid people to participate, having 10% disagreement would be huge. If it's a survey put up before YouTube videos, it would actually be weird for so many people to have agreed. The probability depends on too many unknown factors.
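To make that concrete, here's a minimal sketch (the accuracy figures are invented assumptions, not from any real study): if the car really is blue and each respondent independently answers "blue" with probability p, the same 900-out-of-1000 tally is either astronomically surprising or completely ordinary depending on the p your survey setup justifies.

```python
# A minimal sketch of the point above. The accuracy figures (0.99, 0.90)
# are invented assumptions for illustration, not taken from any real study.
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, k = 1000, 900  # 1000 respondents, 900 said "blue"

# If the car really is blue and each respondent independently says "blue"
# with probability p, how plausible is getting only 900 "blue" answers?
for label, p in [("serious paid study (assume 99% accuracy)", 0.99),
                 ("casual online poll (assume 90% accuracy)", 0.90)]:
    print(f"{label}: P(at most 900 'blue') = {prob_at_most(k, n, p):.3g}")
```

Under the 0.99 model, getting 900 or fewer "blue" answers is vanishingly unlikely, so 100 dissenters really would signal something strange; under the 0.90 model it comes out around 0.5, meaning 900 is exactly what you'd expect.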
The (well, a) problem lies in treating "is it really blue" as some definite, knowable thing. But that's never directly observable. We only use that question as an "intermediate step" in answering other, observable questions:
1) "Will [x% of] people emit 'that's blue' when asked about its color?"
2) "Does it reflect light within [specified spectrum] under [specified condition]?
3) "Will Scanner model X emit True or False when it scans this?"
Depending on what question you're asking, it may or may not be blue in that sense. As in my other comment [1], 10% saying "green" may be enough for you to consider it green for the purposes of "does the guy at Lost and Found who's asking for a green object possibly own this item?" But that same 10% "green" response can mean the answer is no to "is this blue enough to meet this UX standard?"
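As a toy illustration of that purpose-dependence (the 5% and 98% thresholds below are made up, not from any real Lost and Found policy or UX standard), the same tally yields opposite answers:

```python
# Toy decision rules over the same survey tally. The 5% and 98%
# thresholds are invented assumptions for illustration.
def plausibly_the_green_item(tally):
    # Lost and Found: even a minority "green" vote makes the claim worth checking.
    return tally.get("green", 0) / sum(tally.values()) >= 0.05

def meets_blue_ux_standard(tally):
    # Hypothetical UX rule: nearly everyone must read the color as blue.
    return tally.get("blue", 0) / sum(tally.values()) >= 0.98

tally = {"blue": 900, "green": 100}
print(plausibly_the_green_item(tally))  # True  -- "green enough" for this purpose
print(meets_blue_ux_standard(tally))    # False -- not "blue enough" for this one
```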
If you were to ask me, there's a 10% chance you'd catch me in the wrong mood and my answer to any question would be "f___ you". That percentage might double if you were to ask me a stupid question about the color of a car when the answer is obvious.
Human color perception is a minefield. Your eyes play lots of tricks on you without you ever realizing it. A 10% divergence is entirely believable when asking color questions.