
Show HN: Colors Used by Popular Sites - paulhebert
http://paulhebertdesigns.com/web_colors/
======
thekevan
I think the idea is extremely interesting but after seeing 2 of the caveats at
the end, I didn't have nearly as much confidence in the data.

"3\. If a color is found inside of a website's text it counts as being used.
For example if the phrase "tan leather" is in a website's text, this scraper
would say that the site uses tan. Additionally if the phrase "I understand" is
in the text, it would still count as tan.

4\. Colors added by external javascript are not included."
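
To make the "I understand" case concrete, here is a minimal sketch of the difference between substring matching and whole-word matching. The `COLOR_NAMES` list and both function names are made up for illustration; this is not the scraper's actual code.

```python
import re

# Hypothetical, shortened list of color names; a real scraper would use many more.
COLOR_NAMES = ["tan", "red", "blue"]

def naive_matches(text):
    """Substring search: reports a color if its name appears anywhere in the text."""
    lowered = text.lower()
    return [name for name in COLOR_NAMES if name in lowered]

# Whole-word pattern: \b only matches at the edge of a word.
WORD_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, COLOR_NAMES)) + r")\b",
    re.IGNORECASE,
)

def word_matches(text):
    """Whole-word search: 'understand' no longer counts as 'tan'."""
    return sorted({m.group(1).lower() for m in WORD_PATTERN.finditer(text)})
```

Word boundaries only remove the "I understand" kind of false positive. "tan leather" would still be counted as tan, since that needs knowledge of where the text sits in the page, not just better matching.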

~~~
paulhebert
Hey thekevan,

Thanks for the feedback. I agree that these are pretty major caveats.
Hopefully you still found it interesting or useful.

This is my first public release of this visualization and I plan to keep
improving it over time. If it were a traditional software project I would
probably categorize it as 'beta.'

I've been spending a lot of time working on this in my free time and was
curious whether it was something people would even be interested in.

Seeing the positive feedback on HN and other sites has inspired me to put a
larger focus on solving these problems as well as adding new features.
(Analytics are showing about 1500 people have viewed it since I began sharing
it a couple hours ago. Thanks everyone!)

EDIT: up to around 4,500 hits as of 12:50 PST

~~~
iLoch
A different approach, if you wanted to do something more accurate (this is
also a reasonable choice in my opinion, given that you have a fixed number of
sites you're looking at) would be to use a headless browser like PhantomJS and
traverse the rendered DOM checking the computed style with
`window.getComputedStyle(el)`.

~~~
paulhebert
That's a good idea. That would be a much more accurate way to see which colors
were actually _used_. A few other people in the comments also mentioned
Selenium. I'll have to do some research.

Stay tuned for version 2.0!

------
dynode
Paul, I made something similar -
[http://bardagjy.com/?p=1639](http://bardagjy.com/?p=1639)

I had the same problem with JavaScript, so I used Selenium to drive
Chrome to take a screencap of the page. Then I used K-Means clustering with EM
to convert the pages to their constituent colors.

I scraped 100 and 1000 of the Alexa top 1M. Cool to see another approach,
great work!
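
For anyone curious what the clustering step looks like, here is a rough sketch of plain k-means (Lloyd's algorithm) over a list of RGB pixels. It is not the EM variant from my article and not my original code; the function name and parameters are made up for this example, and it assumes the pixels have already been read out of the screenshot as `(r, g, b)` tuples.

```python
import random

def kmeans_colors(pixels, k, iterations=10, seed=0):
    """Group (r, g, b) pixels into k clusters and return the cluster centers."""
    rng = random.Random(seed)
    # Start from k distinct pixel values so no two centers begin identical.
    centers = rng.sample(sorted(set(pixels)), k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            # Assign each pixel to the nearest center (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centers[i])),
            )
            clusters[nearest].append(pixel)
        for i, members in enumerate(clusters):
            if members:
                # Move the center to the per-channel mean of its members.
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers
```

With a mostly white page and a small patch of one color, `kmeans_colors(pixels, 2)` should come back with roughly those two colors. `k` must not exceed the number of distinct pixel values.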

~~~
dynode
Replying to myself with a quote from my article:

"It’s easy to notice a bug when examining the colors for Google (note, this is
normal google.com not a doodle). Notice how the three colors are light gray,
dark gray, and white – not the typical red, green, blue, yellow color scheme.
Why? Well, when the image screenshot is resized to 320 x 240 pixels for
processing, the colors are dithered. The number of pixels in the new image
that lie _between_ red, green, blue, yellow and white – the dominant
background color – is much larger than the number of pixels that are colored.
Because of dithering, those between pixels are closer to shades of gray, than
colors, and thus the k-means clustering (with EM) finds shades of gray and
white to be the “color of Google”. I’m not sure if this is a bug.. what do you
think?"

~~~
paulhebert
Hey Andy,

That's awesome! I figured someone else must have had the same idea before me.
:)

I think your screenshot scraping technique is probably more accurate than my
text parsing. I also like that you used a larger sample size. I plan to
experiment with groups of 100 and 1000.

Thanks for sharing! It's always interesting to see how different people
achieve similar goals.

I'd like to begin scraping the images on the sites soon too. When I've got a
good chunk of time I'll look through your source code for inspiration. Mind if
I reach out with questions when I do?

EDIT: I also really enjoy those woodblock prints! Now I want to somehow print
my data for the top ten sites onto canvas.

~~~
dynode
Sure - I think the git repo is dead, I'll resurrect it if you're interested.

~~~
paulhebert
Yeah, that would be great. Thanks!

------
paulhebert
Hey Hacker News,

Hopefully this is the right place for this.

I made an interactive visualization showing the colors used by the ten most
popular sites. The page also explains different color models and how to
convert between them using math and javascript.

In the future I plan to use the tools I built here to do more interesting
comparisons of color trends over time and between different cultures.

Any feedback is welcome. Thanks for looking!

EDIT: Here's the GitHub link, by the way:
[https://github.com/Paul-Hebert/web_colors](https://github.com/Paul-Hebert/web_colors)

------
nprescott
I had a similar idea recently. Originally I intended to try to identify
"fashionable" color palettes, which worked pretty well:
[https://idle.nprescott.com/2016/scraping-color-palettes.html](https://idle.nprescott.com/2016/scraping-color-palettes.html)

and only later tried branching out to the 100 most popular sites:
[https://idle.nprescott.com/2016/image-capture-crawler.html](https://idle.nprescott.com/2016/image-capture-crawler.html)

I tried ImageMagick's histogram functionality and a K-Means but wasn't happy
with the results. I didn't finish doing a meaningful color extraction from the
most popular sites because of difficulty getting (in my mind) representative
samples from more complex images (like the linked GitHub homepage screenshot).
I still intend on circling back at some point and trying a color quantization.

I hadn't thought to scrape the CSS colors directly, that's an interesting
approach. I'd like to see the colors sorted in some way, to get a better sense
of the range for each site.

~~~
paulhebert
Hey Nolan,

Those are really cool. I think your palettes are a more useful representation
than my 'data dump' strategy.

Another commenter used Selenium to get screenshots. I'll have to look into
that as I update this.

Thanks for sharing!

------
arjie
I think the pictures would be prettier if similar colours were together. Even
straight Euclidean in RGB and then sorting will probably be good enough for
this purpose, though you could probably get something fancy by using an actual
colour difference metric.
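
One way to read "Euclidean in RGB and then sorting" is a greedy ordering: start from the first colour and keep appending whichever remaining colour is closest. A quick sketch of that, assuming six-digit hex codes; the function names are invented for this example.

```python
def hex_to_rgb(code):
    """Convert a six-digit hex code such as '#3b5998' to an (r, g, b) tuple."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def distance_sq(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def order_by_similarity(hex_codes):
    """Greedy ordering: each colour is followed by its nearest unused neighbour."""
    remaining = list(hex_codes)
    if not remaining:
        return []
    ordered = [remaining.pop(0)]
    while remaining:
        last = hex_to_rgb(ordered[-1])
        nearest = min(remaining, key=lambda c: distance_sq(hex_to_rgb(c), last))
        remaining.remove(nearest)
        ordered.append(nearest)
    return ordered
```

The result depends on which colour comes first, and RGB distance does not track perceived difference very well, so this is only the "good enough" version.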

The Google logo is missing, though. The green, certainly.

~~~
paulhebert
Hey arjie,

Further down there are a few visualizations where colors are sorted by hue and
saturation.
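
Roughly, sorting by hue and then saturation can be sketched like this with Python's standard `colorsys` module. This is an illustration rather than the code behind the page, and it assumes six-digit hex codes; the function names are made up.

```python
import colorsys

def hue_saturation_key(hex_code):
    """Return (hue, saturation) for a six-digit hex code, each in the range 0..1."""
    code = hex_code.lstrip("#")
    r, g, b = (int(code[i:i + 2], 16) / 255 for i in (0, 2, 4))
    # colorsys returns hue, lightness, saturation, in that order.
    hue, _lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    return (hue, saturation)

def sort_by_hue(hex_codes):
    """Sort colors by hue first, breaking ties by saturation."""
    return sorted(hex_codes, key=hue_saturation_key)
```

Grays have no meaningful hue (they come out as hue 0), so they end up grouped with the reds unless they are filtered out first.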

I'm debating whether or not to sort the initial visualizations in the same
way.

Unfortunately my script doesn't currently pull colors from images, only CSS
color codes. A couple people have been confused by this so I'll add more info
about that at the top.

Thanks for the feedback!

------
dbg31415
* Spectrum - Chrome Web Store || [https://chrome.google.com/webstore/detail/spectrum/ofclemegk...](https://chrome.google.com/webstore/detail/spectrum/ofclemegkcmilinpcimpjkfhjfgmhieb)

> Instantly test your web page with different types of color vision
> deficiency. Color Vision Deficiency (CVD) affects people’s ability to
> distinguish certain colors. Estimates indicate that approximately 200
> million people worldwide are affected by some kind of CVD. Individuals of
> Northern European ancestry, as many as 8 percent of men and 0.5 percent of
> women experience the common form of red-green color blindness.

> This extension helps you to test web pages for people with different types
> of CVD. It's particularly useful for websites with data visualisations,
> because some colors may not be distinguishable from other colors in the
> charts.

> Flash enabled websites are also supported!

~~~
paulhebert
This is really interesting, thanks!

It's interesting to see how the visualizations would render to the colorblind.

I also had no idea that 8% of Northern European men had red-green
colorblindness. I assumed it was a lot lower than that.

Thanks for sharing! I'll definitely be incorporating this extension into my
design process in the future.

------
niftich
This reminds me of the article 'Why Every Movie Looks Sort of Orange and Blue'
[1]. Even though the reasons behind movies and websites looking similar are
different, it's interesting that both are heavy on blue and reddish-orange.

[1] [https://priceonomics.com/why-every-movie-looks-sort-of-orang...](https://priceonomics.com/why-every-movie-looks-sort-of-orange-and-blue/)

~~~
paulhebert
Thanks for sharing! That was really interesting.

It kind of makes sense, at least for movies. Complementary color schemes with
contrasting colors connect with humans on an intrinsic level, and the sky is
already blue and humans are slightly orange (at least Caucasians, who tend to
be over-represented in Hollywood).

If blue and orange are already present in most scenes I understand the impulse
to up that contrast.

------
Elrac
Site doesn't work for me. There's a blank rectangle below the "Sites" title
and to the right of the site URLs.

Mousing over that area raises some white tool tips/pop-ups displaying color
names/codes.

I found this an ironic coincidence with another recent post on HN:

"Ask HN: Is web programming a series of hacks on hacks?"

[https://news.ycombinator.com/item?id=12477190](https://news.ycombinator.com/item?id=12477190)

~~~
paulhebert
Hey Elrac,

Sorry to hear that. If you try reloading the page are you still having that
issue?

Mind letting me know what browser/OS you're using?

Do you have javascript deactivated or any browser extensions running?

I read that article as well and enjoyed it.

In the future I'd like to move more of the visualization code server-side and
rely on the end user's browser less.

~~~
Elrac
Hi Paul,

Sorry 'bout the late answer; I don't get emails on responses.

My browser is Firefox 45.3.0 running on Windows 7. I use Privacy Badger - that
blocks sites that do too much cookie tracking, but it didn't object to your
page.

What may be more significant is that I browse through a corporate firewall -
it sometimes fails to connect to the same host I connected through earlier,
and sites which rely on consistent conversations sometimes fall victim to
this.

All that said, frustratingly, today I see colorful splotches. Whatever the
problem was, it seems to have gone away.

I think that your idea to do more server-side rendering is praiseworthy. I'm
very tired of the many Web sites whose authors seem to think they own my CPU.

------
adontz
Looks like Internet is mostly blue.

~~~
paulhebert
Yeah, it's interesting to see the color groupings that take place.

Next I want to do cultural comparisons. E.G. the ten most popular sites in
China vs. the ten most popular in the U.S.A.

It would be interesting to see if it would still be all blue.

------
sb8244
Not sure if I'm missing something, but the main Google logo colors aren't
reflected in the Google section. Is this just HTML color codes rather than
images? I guess the challenge with an image is that most aren't solid blocked
text.

~~~
paulhebert
Hey thanks for checking this out!

Right now my scraping script only pulls the HTML color codes and doesn't
attempt to pull colors from images. Additionally, a color code could be
present in the stylesheet but not actually used on the site.
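
As a rough illustration of that limitation, here is a simplified sketch of pulling hex codes out of stylesheet text with a regular expression. It is not my actual scraping script, and the names are made up for the example. It counts every code it finds, whether or not any element on the page ends up using that rule.

```python
import re
from collections import Counter

# Matches #rgb and #rrggbb; longer forms such as #rrggbbaa are skipped.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")

def normalize(code):
    """Lowercase a hex code and expand the #rgb shorthand to #rrggbb."""
    digits = code[1:].lower()
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits

def count_hex_colors(css_text):
    """Count how often each normalized hex color appears in the CSS text."""
    return Counter(normalize(match) for match in HEX_COLOR.findall(css_text))
```

A pattern like this can also pick up ID selectors that happen to look like hex codes (for example `#add`), which is one more source of false positives.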

I have some information about this and other caveats at the bottom, but I'll
put an explanation closer to the top to avoid confusion.

Thanks!

------
_RPM
Kind of off topic, but did anyone else notice that FastMail pretty much copied
Facebook's colors? I too remember using Facebook's colors when I first started
learning how to do HTML, but didn't go forward with it.

------
eekfuh
I would like to see the colors sorted by hue

~~~
paulhebert
Hey eekfuh,

If you scroll down a little further you'll see there are a few sections where
they're organized by hue; once in a bar chart and once around a circle.

Let me know if you still can't find it and I'll try to explain more thoroughly
or send you a screenshot.

Someone else mentioned they'd like to see the colors sorted by hue in the area
where they're broken out by site. Maybe I'll update that section to be less
random.

------
SimeVidas
But how does Alexa determine the most popular sites to begin with? I’ve never
asked myself that.

