How to implement search-by-color when all you have is a good coffee
177 points by helloiloveyou 5 months ago | 41 comments

Using the distance of the colors in a Euclidean space makes sense, but RGB isn't the best space to use. You probably want something like CIELAB, which is (quoting from Wikipedia) "designed so that the same amount of numerical change... corresponds to roughly the same amount of visually perceived change."
You can apply a weight[0][1] to each of the R, G and B components before computing the distance, to get results that approximate doing the distance in Lab color space. The main benefit is that this is a lot cheaper computationally than actually converting to Lab, if you are already working in RGB.
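One well-known weighting of this kind (possibly not the one the footnotes refer to) is the "redmean" approximation; a minimal sketch, assuming 8-bit sRGB inputs:

```python
import math

def redmean_distance(c1, c2):
    """Approximate perceptual distance between two 8-bit sRGB colors using
    the 'redmean' weighted Euclidean formula: the weights for R and B shift
    with the mean red level, so no conversion to Lab is needed."""
    r1, g1, b1 = c1
    r2, g2, b2 = c2
    rmean = (r1 + r2) / 2
    dr, dg, db = r1 - r2, g1 - g2, b1 - b2
    return math.sqrt(
        (2 + rmean / 256) * dr * dr
        + 4 * dg * dg
        + (3 + (255 - rmean) / 256) * db * db
    )
```

Identical colors get distance 0, and a green difference counts for more than the same numerical difference in red, which roughly tracks how the eye weights the channels.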
Interesting. Perhaps this information is not available with the images? Or is it? The conversion between RGB and CIELAB is almost meaningless (according to Wikipedia), or perhaps there is still a way to do that which is helpful?

1. https://en.wikipedia.org/wiki/Color_space#Absolute_color_spa... : However, in general, converting between two non-absolute color spaces (for example, RGB to CMYK) or between absolute and non-absolute color spaces (for example, RGB to L*a*b*) is almost a meaningless concept.
Image formats like JPEG or PNG typically have a color profile embedded in the metadata; if they don't, you can assume sRGB.
> The conversion between RGB and CIELAB is almost meaningless (according to wikipedia) or perhaps there is still a way to do that and helpful?

As I understand it, that line is really talking about coming up with formal conversions. That is, some color spaces are effectively not well-enough defined (neither in relation to reality nor to each other) to be scientifically useful.

It's like how you might say converting from length in pixels to centimeters is "meaningless" because there isn't truly a consistent physical relationship between them, but people still get plenty of use out of making such a conversion every day.

> Interesting. Perhaps this information is not available with the images? or are they.

It's rare for images to use these colorspaces or gamuts. They are complex and cumbersome, and RGB makes for a damn good "pipe wrench" of bluntness and utility. Plus very few developers are aware of how deep color science is, and assume RGB is (pretty close to) the ground truth.
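For anyone curious what that conversion actually looks like in practice: a sketch assuming plain sRGB with a D65 white point (the usual assumption when no profile is embedded), using the standard sRGB matrix and CIELAB formulas:

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB, assuming the sRGB gamut
    and a D65 white point."""
    # 1. Undo the sRGB transfer curve -> linear RGB in [0, 1]
    def linearize(u):
        u /= 255
        return u / 12.92 if u <= 0.04045 else ((u + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)

    # 2. Linear RGB -> CIE XYZ (sRGB/D65 matrix)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    # 3. XYZ -> Lab, relative to the D65 white point
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)
```

Black maps to L = 0 and white to roughly L = 100 with a and b near zero, which is a quick sanity check that the constants are wired up correctly.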
I'll check it out! I didn't try further because the manual tests I was doing seemed sufficient. But I love this kind of theory!
This seems like a really fun problem, and I think you found the solution with the best cost/effort ratio. The first thing that sprang to mind on seeing the problem was to use the fact that products often come in several colour variations but use very similar images for each. If you mask out the parts of the image that are similar, you can get the single 'actual' colour as it might be labeled by a human. This would also allow you to search for neutral tones like grey and black.

And seeing as you've gone to all the trouble of figuring out how to grab dominant colours from images, why not allow users to upload an image of their logo and grab the dominant colour from that? This could help avoid the problem of having noticeably different shades, without making the user do anything technical like searching with an RGB value and a threshold.
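The masking idea can be sketched roughly like this; the function name, threshold, and flat-list image representation are purely illustrative:

```python
def variant_color(img_a, img_b, threshold=30):
    """Given two same-sized variant images (flat lists of RGB tuples),
    average only the pixels that differ noticeably between them. Those
    pixels are presumed to belong to the product's 'actual' colour rather
    than the shared background. Returns None if nothing differs."""
    changed = [
        pa for pa, pb in zip(img_a, img_b)
        if sum(abs(x - y) for x, y in zip(pa, pb)) > threshold
    ]
    if not changed:
        return None
    n = len(changed)
    return tuple(sum(p[i] for p in changed) // n for i in range(3))
```

With two variant photos sharing a white background, only the product pixels differ, so the average lands on the product colour and ignores the background entirely.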
You'll get better results using a perceptually uniform color space [1]. CIE76 is not ideal, but it would let you continue to use CUBE. You could consider doing a first pass with CIE76 and then sorting by something more accurate in a second pass.
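The two-pass shape might look like this; here a plain RGB distance stands in for the cheap CUBE-backed first pass and a fixed 2/4/3 weighting stands in for a more accurate second metric (both are illustrative stand-ins, not actual CIE76/CIEDE2000 implementations):

```python
import math

def cheap_dist(c1, c2):
    # Pass 1 metric: plain Euclidean distance (what an index can serve fast)
    return math.dist(c1, c2)

def finer_dist(c1, c2):
    # Pass 2 metric: crude fixed 2/4/3 channel weighting, standing in for
    # a real perceptual metric computed in application code
    dr, dg, db = (a - b for a, b in zip(c1, c2))
    return math.sqrt(2 * dr * dr + 4 * dg * dg + 3 * db * db)

def search(target, colors, candidates=50, limit=10):
    # Pass 1: shortlist the nearest N by the cheap metric
    shortlist = sorted(colors, key=lambda c: cheap_dist(target, c))[:candidates]
    # Pass 2: re-rank only the shortlist with the finer metric
    return sorted(shortlist, key=lambda c: finer_dist(target, c))[:limit]
```

The point of the structure is that the expensive metric only ever sees the shortlist, so its cost stops depending on the size of the catalogue.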
> I just took the euclidean distance between the given color and (0,255,0)

If you used HSL, you possibly could match on the hue while being more lax on the brightness and saturation, so a pastel-pink color could still match a red image. Not sure if the database can use different weights for positions in a vector (though with pre-processing, S and L could be collapsed to narrower ranges so they match more freely afterwards). Just need to make sure that white and black aren't matched to random hue values.

Though I guess if you use only the distance to fixed chosen high-saturation values, then it's about the same thing.

Except it seems that with just RGB distance, a semi-saturated cyan or yellow color might be counted as closer to 'pure green' than dark green. Something like (0,255,180) vs (0,70,0).

Also:

> values('(\${ color })'

2020 and still no placeholders.
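That closing claim about the two colors is easy to check with plain Euclidean distances:

```python
import math

green = (0, 255, 0)
semi_cyan = (0, 255, 180)   # semi-saturated cyan
dark_green = (0, 70, 0)

d_cyan = math.dist(green, semi_cyan)    # 180.0
d_dark = math.dist(green, dark_green)   # 185.0

# Plain RGB distance indeed ranks the cyan as 'more green' than dark green
assert d_cyan < d_dark
```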
I'm not sure HSL would be better in that case. For unsaturated colors (black to grey to white), HSL has many points that are perceptually the same color but have different HSL values. If S is zero, you have the same color for any value of H, so you might get a large distance between two points that are actually both white, for example. At least when using Cartesian coordinates as he is here.
Yes, as I noted:

> Just need to make sure that white and black aren't matched to random hue values.

Colors with too small a saturation can just be excluded from the search, since gray/white/black go with anything.
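A sketch of hue-only matching with that saturation cutoff, using Python's stdlib colorsys; the function name and thresholds are illustrative:

```python
import colorsys

def hue_matches(rgb, target_hue_deg, hue_tol=30, min_sat=0.15):
    """Match on hue only, ignoring lightness and saturation, but exclude
    near-neutral colors whose hue is essentially noise."""
    r, g, b = (c / 255 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # note the H-L-S return order
    if s < min_sat:
        return False  # gray/white/black: hue is meaningless, exclude it
    diff = abs(h * 360 - target_hue_deg) % 360
    return min(diff, 360 - diff) <= hue_tol  # circular hue distance

# Pastel pink still matches 'red' (hue 0), while mid-gray never does
assert hue_matches((255, 182, 193), 0)
assert not hue_matches((128, 128, 128), 0)
```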
I'm only marginally technical, but I would like to congratulate the author on a great example of how to explain the issues, the sequence of problem solving, and the solutions so clearly and entertainingly. Very good writing.
Thank you very much!! I rewrote the article a couple of times.
 This might be a dumb question, but what does coffee have to do with this task?
It's trying to say that you don't need a lot of time to do this. I literally skimmed the article while drinking a coffee.
The palette approach is a little reductive. We've built a solution in this space; a more robust approach would be to get the dominant colour and, from that colour's brightness and tone, extract the accent colour and the contrast colour by considering both distance from the dominant and extent on the image. That solves the problem of a weighted palette ignoring, say, red stitching because it's too small.
Yours is the form of solution I was expecting early in the article (though it sounds like you went further), and I wonder if the author would have ended up with something similar if that library didn't exist.

Regarding being reductive, though, I think you have to be in this situation. It's a part of a product (not the whole thing), and it seems like manual curation could fill in the gaps.
> extract the accent colour and the contrast colour

That wouldn't work for products that have a poorly chosen colour (which aren't rare) but have the right non-dominant colour of the customer's brand.
The opposite, really: measuring tone contrast and luminosity contrast from the background average separately and extracting them is more robust for goods that have patterns, e.g. https://www.seven.eu/it_it/seven/zaini.html

It also works both ways, so white elements with a colour accent and a dark contrast element still have them recognized for what they are.
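A rough sketch of the dominant/accent idea as described in this sub-thread (not the commenter's actual product code); palette entries are hypothetical (colour, coverage) pairs from whatever palette extractor is in use:

```python
import math

def pick_accent(palette, min_coverage=0.02):
    """palette: list of (rgb, coverage) pairs, coverage summing to ~1.
    Dominant = largest coverage; accent = the colour that best combines
    distance from the dominant with enough extent on the image, so that
    tiny details like red stitching don't win on distance alone."""
    dominant, _ = max(palette, key=lambda p: p[1])
    candidates = [
        (rgb, cov) for rgb, cov in palette
        if rgb != dominant and cov >= min_coverage
    ]
    if not candidates:
        return None
    # Score: distance from the dominant, scaled by image coverage
    accent, _ = max(candidates, key=lambda p: math.dist(p[0], dominant) * p[1])
    return accent
```

Here a large blue panel beats a few red stitches for the accent slot, because the stitches fall under the coverage floor even though they are further from the dominant colour.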
I wrote a tutorial a while back on how to do this in Postgres using the Google Cloud Vision API, in case it's useful for anyone.
I like the color palette option you provided showing the primary colors. Adding the colors as facet options would be interesting, since you can align each of the colors with the items; then maybe add a numeric metric you can use to sort the results by the percent coverage of that color.

Another item to consider is how the user might search, e.g. "red shoes", and then classify the query as "color: red" plus "shoes".

The next fun challenge is determining which image to display if you have a variant product (a color/size combination for clothing, for example). Figuring out which product image to show in the result set requires identifying the primary color in the search.
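The query-classification step might be sketched like this; the colour vocabulary is a toy stand-in for a real facet list:

```python
# Illustrative colour vocabulary; a real one would be much larger
COLOR_NAMES = {"red", "green", "blue", "black", "white", "yellow", "pink"}

def classify_query(query):
    """Split a free-text query into (colour facets, remaining search terms),
    e.g. 'red shoes' -> ({'red'}, ['shoes'])."""
    colors, terms = set(), []
    for token in query.lower().split():
        if token in COLOR_NAMES:
            colors.add(token)
        else:
            terms.append(token)
    return colors, terms
```

The colour set then drives the colour-distance filter while the remaining terms go to ordinary text search.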
 Always happy to learn about more good ways to use PostgreSQL!
 Mike Bostock (the d3 creator) has a nice little demo illustrating the use of linear RGB to assign an average color to an image. [1]
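The point of that demo is that averaging should happen on linear-light values rather than gamma-encoded ones; a sketch assuming sRGB input:

```python
def average_color(pixels):
    """Average a list of 8-bit sRGB pixels in linear-light space, where
    averaging is physically meaningful, then re-encode back to sRGB."""
    def to_linear(u):
        u /= 255
        return u / 12.92 if u <= 0.04045 else ((u + 0.055) / 1.055) ** 2.4

    def to_srgb(u):
        u = 12.92 * u if u <= 0.0031308 else 1.055 * u ** (1 / 2.4) - 0.055
        return round(u * 255)

    n = len(pixels)
    linear_means = [sum(to_linear(p[i]) for p in pixels) / n for i in range(3)]
    return tuple(to_srgb(m) for m in linear_means)

# Averaging black and white in linear light gives a much lighter gray
# than the naive gamma-space answer of 127
```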
Vibrant.js [1] also does this. It is nice to look into the source code to see how it is done. But in my experience it works only around 80% of the time, because there are so many exceptions.
There's a classic injection attack when interpolating in the SQL query.

Also, as others have mentioned, RGB space and human-perceptible color space are different enough that distance in RGB doesn't equate to difference/similarity between colors.
No, it isn't. What is interpolated into the string comes from an object containing a predefined set of strings. I don't allow the user to type anything they want. I didn't want to clutter the tutorial.
 Can you be more specific on the SQL query attack?I can't see any client data in the query, so how would anything be injected?
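Whether or not this particular query is exploitable, placeholders make the question moot; a minimal demonstration with Python's stdlib sqlite3 (the same mechanism as $1, $2 parameters in Postgres drivers):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT)")

# The driver sends the value separately from the SQL text, so quotes,
# parentheses, etc. in user input cannot change the query's structure
user_input = "'); DROP TABLE products; --"
conn.execute("INSERT INTO products (name, color) VALUES (?, ?)",
             ("mug", user_input))

row = conn.execute("SELECT color FROM products WHERE name = ?",
                   ("mug",)).fetchone()
assert row[0] == user_input  # stored verbatim, table still intact
```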
Impressive. I wasn't expecting that approach! The whole thing implemented as a stored procedure in PostgreSQL, with a data type (cube) I didn't even know existed. That was a wild ride for me.
Great article. However, in my experience, using Euclidean distance in RGB space can produce some really weird visual 'matches'. The best color space I've found so far for this type of task is the Lab color space.

There are also other tricks of the trade, but for starters this simple solution might be enough.

This solution also assumes the colors in the images are 'calibrated' and consistent. This is almost never the case, but it may be overlooked for the given use case.
Good article, and I like the implementation.

I am sure it has some drawbacks, like distance in RGB space not being the best option, or that it does not ignore the background color if it is not transparent. Still, I like the way of thinking.
 What does the title refer to?
 I really like this title: "Can we implement something easier than a Convolutional Neural Network?"Nice to see people still remember problems can be solved with anything else than, ehm, "AI".
 I thought this was going to be about how coffee is a consistent color so you can white balance your customers uploads by asking them to photograph coffee in their ambient lighting.
 OP here, this is amazing, i love it!
 Coffee is definitely not consistent in color. You can absolutely tell the difference between a light and a dark roast, or a fresh and an hour old coffee.
 I love the happy, optimistic outcome of running free association on the title. It reminds me of some of the really good questions I used to come up with as a child the first time I heard of a fact or concept.
xkcd did a perceptual color survey a few years ago: https://blog.xkcd.com/2010/05/03/color-survey-results/

It's population-specific to the type of people who read xkcd, but it's an interesting starting point if you want higher-resolution info on, e.g., exactly where people draw the line between "green" and "blue".

Raw data is here: http://xkcd.com/color/colorsurvey.tar.gz
Nice write-up, thank you.

Nitpick: what's the point of writing a good guide with rich text and images, but showing the end result only in a video? I have embedded YouTube videos disabled by default, so it's a little embarrassing.
 Why is this embarrassing? That is incomprehensible to me. You’ve disabled embedded YouTube videos and should not be surprised when content is missing from webpages. Video is the perfect choice for showing off the results, video distribution is hard, and YouTube embedding makes this fairly easy.
