
Shape Catcher: Find Unicode characters by drawing them - molenzwiebel
http://shapecatcher.com
======
cedex12
Similar tool for mathematical symbols:
[http://detexify.kirelabs.org/classify.html](http://detexify.kirelabs.org/classify.html)

------
brudgers
I could not get it to identify a British Pound symbol after several attempts.
The top proposed glyph was much more obscure and the following ones were
increasingly obscure from there.

I suspect that the training corpus may have been a table of Unicode glyphs
rather than text from the wild.

~~~
Psyvire
I think it might be that it's using a font that styles it in a way that's
different that's what you're drawing. I got it to recognise it even with my
rather crude drawing after I changed the style:
[https://imgur.com/a/e9B5p](https://imgur.com/a/e9B5p)

------
throwanem
This is kind of a missing piece, in a lot of ways. With such a large character
set as Unicode's, discovery can be a real pain - when you see a novel
character, how do you find out what it's called, so you can find out how to
type it?

Unless you're using something like Emacs which lets you point at a character
and ask the editor to tell you everything it knows about what's there, this
kind of identification becomes a daunting task to contemplate. Shapecatcher
does an excellent job of it; as long as you can draw something roughly
approximating the glyph you have in mind, it'll very effectively winnow down
the search space to a very manageable list of possible matches.

~~~
anonymouz
> Unless you're using something like Emacs which lets you point at a character
> and ask the editor to tell you everything it knows about what's there, this
> kind of identification becomes a daunting task to contemplate.

If you already have the character in some text file, it would be much easier
and reliable to copy and paste it into some unicode table lookup tool (e.g.
[https://unicode-table.com/en/](https://unicode-table.com/en/)).

------
ardacinar
The search isn't really perfect. I tried drawing a (pretty good, IMO) Hiragana
"no" and that result was in the third place (First was, a latin small m. の
looks nothing like an m). Then tried Greek small sigma (σ) but not perfectly
(I draw ny sigmas in a weird way, looks like this:
[http://imgur.com/a/XYVHO](http://imgur.com/a/XYVHO)), the top result I got
(Malayalam fraction one quarter: ൳) kind of looks like the thing I drew, but
the rest of the results are not really resembling it and there's no sigma
there.

~~~
darkengine
In the box on the right, it says "Japanese, Korean and Chinese characters are
currently not supported", which is disappointing, but probably why you're not
seeing it in the results.

------
Silhouette
Interesting idea. It seems to struggle a bit with some types of characters.
For example, drawing a lowercase pi would return many characters with more
than two legs, which showed up ahead of pi itself and other characters that do
have the two. Does clicking on the good/bad feedback links in cases like this
help to train the algorithm in some way?

------
tyingq
Really well done, and handled my crappy drawings just fine.

I did see the link to your thesis on captcha, but a specific higher level blog
post on how this works would likely be popular.

Edit: One piece of feedback...it's hard to draw dots. You have to drag the
cursor with the button down, or drag your finger in mobile to get a dot. So
dots end up more like little lines. Also, an "Undo" to remove the last "cursor
down / draw" event would be nice. Starting over for every line is the only
current option.

------
hsivonen
This is cool, though I was a bit disappointed to notice the part about no
support for CJK characters after trying to draw one and not having it
recognized. It seems to me that looking up Unihan ideographs is an area where
a tool like this could be particularly useful.

~~~
chch
I've been using Shape Catcher for a while for my non-CJK needs, but the best
tool I've found that for Unihan characters has been the one from LINE[1].
Specifically, unlike others I've used, it tends to do a good job with semi-
cursive/calligraphic writing. That makes it faster to look up complicated
characters (because you can scribble a bit), plus it's useful when you don't
actually know exactly what the "original" form of a character is, and have to
just draw it from sight.

e.g. if you see [2], and don't recognize it, but can draw it into the tool,
you can get 而 correctly, without having to draw the very specific stroke
order[3], like in other tools.

It's made for Chinese, but I often use it for Japanese, too. :)

[1]
[http://ce.linedict.com/dict.html#/cnen/home](http://ce.linedict.com/dict.html#/cnen/home)

[2]
[http://www.ryuurui.com/uploads/1/5/4/8/15489306/6738815.jpg?...](http://www.ryuurui.com/uploads/1/5/4/8/15489306/6738815.jpg?468)

[3]
[https://upload.wikimedia.org/wikipedia/commons/2/2a/%E8%80%8...](https://upload.wikimedia.org/wikipedia/commons/2/2a/%E8%80%8C-order.gif)

------
m-p-3
It reminds me the special character finder in Google Docs, very well done.

------
justinclift
I use this occasionally when trying to find a new glyph. There are some
drawbacks though:

• Last updated in 2012:
[http://shapecatcher.com/news.html](http://shapecatcher.com/news.html)

• No way to draw straight lines except pixel-by-pixel (really tedious). This
turned out to be a pain when trying to draw various arrow types (made of
straight lines).

I'm hoping the author, Benjamin Milde, picks the project up again and keeps it
updated, or makes it Open Source, then someone else does.

------
hashhar
This is really great. Works perfectly and solves a very practical problem for
me. Unicode really should do something about discoverability though.

~~~
yorwba
Using Fcitx on Linux, any Unicode character can be typed using the hotkey for
the Unicode addon [1] and typing the description. So if you ever need the
"arabic ligature bismillah ar-rahman ar-raheem" (﷽), it's right at your
fingertips!

[1] [https://fcitx-im.org/wiki/Unicode](https://fcitx-im.org/wiki/Unicode)

~~~
hashhar
Wow. That is awesome. Thanks for leading me to it.

------
eriknstr
Pretty neat. Would be useful to be able to restrict the blocks that are
searched. For example I might know that the character I'm looking for is
Japanese, so if I could let it know that I was looking for is Japanese then it
could restrict itself to Katakana, Katakana Phonetic Extensions and other
blocks if any that apply to Japanese specifically.

------
trhaynes
Love this tool! (Btw title should be one word: "Shapecatcher")

------
nfriedly
Android Wear does something like this for emojis - I've gotten pretty good at
drawing a "thumbs up" to respond to text messages and such.

------
tigerBL00D
Pretty cool! I wonder why the recognizer is not very good at differentiating
among types of faces (sad face, happy face, etc.)

------
hughes
𝗇ìϲе homoglyph search tool you've made :)

(it found all the letters of the word "nice" quite well!)

------
runnr_az
That's really fun!

~~~
dbcurtis
for sure! I did about three serious test cases then got distracted into making
random squiggles just to discover new wacko unicode characters.

Now that Python supports unicode identifiers.......

