

Interactive Word Clouds in D3.js - jasondavies
http://www.jasondavies.com/wordcloud/

======
mbostock
This is a truly impressive implementation, based on previous work by Jonathan
Feinberg [1]. The display uses SVG, but the character outlines are computed
from canvas bitmaps. Then, Jason implemented hierarchical bounding boxes to
accelerate the intersection checks. Be sure to play with the draggable "0" and
"1" letters beneath the demo!

Try clicking on words to navigate between word clouds. The cross-fade
transition is beautiful, and words that appear in both clouds transform into
their new position.

Non-English word clouds look amazing, too:

[http://www.jasondavies.com/wordcloud/#http%3A%2F%2Fsearch.tw...](http://www.jasondavies.com/wordcloud/#http%3A%2F%2Fsearch.twitter.com%2Fsearch.json%3Frpp%3D100%26q%3D%7Bword%7D@%D8%B3%D9%88%D8%B1%D9%8A%D8%A9%E2%80%8E)

[1] <http://static.mrfeinberg.com/bv_ch03.pdf>

~~~
jasondavies
Thanks!

Minor correction: I implemented hierarchical bounding boxes separately, as per
Jonathan Feinberg's paper, but they turned out to be slower in all the cases I
tried (even large words and areas). I was also using a quadtree to cut down
the number of comparisons with previously-placed words.

In my version, once a word is placed, I copy it to the relevant position in a
large sprite representing all the words placed so far. So placing a new word
means it only needs to be compared with the candidate area of the large
sprite, rather than multiple comparisons with all previously-placed words.
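
A minimal sketch of the single-sprite idea described above, written in Python rather than the JavaScript of d3-cloud. The names `make_board`, `collides`, and `place` are hypothetical, and sprites here are plain lists of 0/1 rows, not the packed format the real code uses. Treating out-of-bounds pixels as collisions is also an assumption.

```python
from typing import List

Sprite = List[List[int]]  # rows of 0/1 pixels; 1 means "ink"


def make_board(width: int, height: int) -> Sprite:
    """Create an empty board sprite that accumulates every placed word."""
    return [[0] * width for _ in range(height)]


def collides(board: Sprite, word: Sprite, x: int, y: int) -> bool:
    """Check the word only against the candidate area of the board."""
    for dy, row in enumerate(word):
        for dx, pixel in enumerate(row):
            if not pixel:
                continue
            by, bx = y + dy, x + dx
            # Assumption: anything outside the board counts as a collision.
            if not (0 <= by < len(board) and 0 <= bx < len(board[0])):
                return True
            if board[by][bx]:
                return True
    return False


def place(board: Sprite, word: Sprite, x: int, y: int) -> None:
    """Copy the word's ink into the board at (x, y)."""
    for dy, row in enumerate(word):
        for dx, pixel in enumerate(row):
            if pixel:
                board[y + dy][x + dx] = 1
```

The cost of a check depends on the size of the candidate word, not on how many words have already been placed.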

I'd like to try a hierarchical sprite version, where you compare against
coarse-grained sprites first of all. This would essentially be a quadtree. The
implementation would be a bit trickier because I'm also compressing blocks of
32 1-bit pixels into 32-bit integers, which also helped with performance.
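
A rough sketch of the bit-packing trick, again in Python for illustration. The helper names (`pack_row`, `shift_row`, `rows_collide`) and the bit order (leftmost pixel in the highest bit) are assumptions, not taken from d3-cloud. The point is that one bitwise AND compares 32 pixels at once.

```python
from typing import List

WORD_BITS = 32
MASK = 0xFFFFFFFF


def pack_row(pixels: List[int]) -> List[int]:
    """Pack 0/1 pixels into 32-bit integers, leftmost pixel in the top bit."""
    words = []
    for start in range(0, len(pixels), WORD_BITS):
        value = 0
        for i, pixel in enumerate(pixels[start:start + WORD_BITS]):
            if pixel:
                value |= 1 << (WORD_BITS - 1 - i)
        words.append(value)
    return words


def shift_row(words: List[int], offset: int) -> List[int]:
    """Shift a packed row right by `offset` pixels (0 <= offset < 32).

    The result has one extra integer to hold bits pushed past the end.
    """
    out = []
    carry = 0
    for w in words:
        out.append((carry | (w >> offset)) & MASK)
        carry = (w << (WORD_BITS - offset)) & MASK if offset else 0
    out.append(carry)
    return out


def rows_collide(board_row: List[int], word_row: List[int], x: int) -> bool:
    """Test one packed word row against a packed board row at pixel column x.

    Bits that fall past the end of the board row are ignored in this sketch.
    """
    first, offset = divmod(x, WORD_BITS)
    for i, w in enumerate(shift_row(word_row, offset)):
        idx = first + i
        if w and idx < len(board_row) and board_row[idx] & w:
            return True
    return False
```

The shift is what makes a hierarchical version trickier: a word rarely starts on a 32-pixel boundary, so each row has to be realigned before it can be compared.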

------
brianstaats
Well done Jason! Finally, a tag cloud written in an accessible technology.
Thoughts on using an invisible bounding container for the words? Example:
silhouettes, shapes, words. What other features have you contemplated but have
not implemented?

~~~
jasondavies
Thanks!

Yes, I've thought about using invisible silhouettes, which presumably is how
<http://www.tagxedo.com/> works. The placement algorithm might be a bit
different though, perhaps starting at the centroid of available areas and so
on. I think it would work better to have a smaller pool of words and allow
reuse, or alternatively they could be sized randomly (perhaps with some
weighting).
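
One way the "centroid of available areas" starting point might be computed, as a hedged Python sketch. The function `free_centroid` and the grid representation are hypothetical; this is not how Tagxedo or d3-cloud is known to work.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # rows of 0/1 cells


def free_centroid(silhouette: Grid, occupied: Grid) -> Optional[Tuple[float, float]]:
    """Centroid (x, y) of cells inside the silhouette that are still free."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(silhouette):
        for x, inside in enumerate(row):
            if inside and not occupied[y][x]:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # the silhouette is completely filled
    return (sum_x / count, sum_y / count)
```

A placement loop could start its search for each word from this point and recompute it as the shape fills up.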

The sprite collision code could certainly be reused for this, though!

~~~
bazitov
Great work Jason!

I was amazed by the beauty of Wordle, but then I loved Darth Vader (Figure
3-11, "Do not underestimate the power of the randomized greedy algorithm").
Word reuse would definitely need to be allowed, but the result would be very
beautiful when applied to a smoothly changing contour (shape).

Is there a problem if I try to reimplement your code in C++ (especially with
openFrameworks)?

~~~
jasondavies
Thanks!

Yeah, I'll definitely try the randomised greedy algorithm when I get time. I
think it should be fairly straightforward now that I have the sprite collision
primitives in there.

No problem at all if you want to reimplement in another language; the license
is BSD: <https://github.com/jasondavies/d3-cloud/blob/master/LICENSE>

I'm guessing it will be blazingly fast using the bitwise operations in C/C++.
:)

------
NelsonMinar
So many details to love about this. The little angle control at bottom middle
is beautifully interactive. Also the transitions between clouds as you change
parameters or choose new words.

------
indubitably
This is awesome.

Curious detail: any idea why searching for "Islam" seems to make things go
haywire?

[http://www.jasondavies.com/wordcloud/#http%3A%2F%2Fen.wikipe...](http://www.jasondavies.com/wordcloud/#http%3A%2F%2Fen.wikipedia.org%2Fwiki%2F{word}@Islam)

Perhaps it's to do with the Arabic script corrupting the SVG somehow?

~~~
jasondavies
Seems to be working fine for me.

I occasionally get gzipped data back from Wikipedia (even though I explicitly
set Accept-Encoding in the proxy) - I think it's due to their intermediate
caches not respecting the Vary headers. I plan on adding gzip support soon to
rectify this, but until then, you can get around it by adding ?foo to the end
of the custom URL.
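
For illustration, a small Python sketch of what gzip support in a proxy could look like: inspect the first two bytes of the body for the gzip magic number and decompress only if they match. The function name `maybe_gunzip` is hypothetical; the actual proxy code is not shown in this thread.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(body: bytes) -> bytes:
    """Return the decompressed body if it looks gzipped, else the body unchanged."""
    if body[:2] == GZIP_MAGIC:
        return gzip.decompress(body)
    return body
```

Checking the payload itself avoids relying on response headers, which may not match the body when an intermediate cache misbehaves.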

