Seems to be matching against one particular typeface, and is very sensitive to how closely what you write resembles that typeface. The first thing I tried was a lowercase "a", which in my handwriting is a one-story "a", and it had no idea what it was. Likewise it guessed poorly at my lowercase "f", because the top of my "f" curves back down and its "f" doesn't, etc. Seems like it would benefit from a dataset with more variation in how characters are actually written in practice.
I tried multiple times to match an ampersand, and thought I did a fairly accurate job.
I eventually managed to get it as a third match, but this seems to be what it matches against [0], which isn't the ampersand I'm used to (nor the one you get in a google image search).
Only halfway related, but a long time ago I had an idea for a character encoding that was stroke based.
In other words, based on minimal stroke primitives (line, arc, circle, etc.) that were placed not with any exact coordinates, but simply in relation to each other conceptually and with crude size/position categories. E.g. "downward stroke, top to bottom" is a capital "I" while "downward stroke, middle to bottom, dot closely above first stroke" would be a lowercase "i". And then for matching forms with different meanings, there would be a final "family" selector, e.g. to distinguish an em-dash from the Chinese character for "one", or an en-dash ("punctuation" family) from a minus sign ("math symbol" family).
And then a suitably compressed bit encoding for the instructions. So in the end, something like "I" might just be 3-4 bits long, while a complex Chinese glyph might be 60 bits.
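To make the shape of the idea a bit more concrete, here's a rough toy sketch in Python; every primitive, field width, and family name below is invented purely for illustration, not any kind of real proposal:

    from enum import IntEnum

    class Stroke(IntEnum):      # 3 bits: which stroke primitive
        VLINE = 0; HLINE = 1; ARC = 2; CIRCLE = 3; DOT = 4

    class Extent(IntEnum):      # 2 bits: crude size/position category
        TOP_TO_BOTTOM = 0; MIDDLE_TO_BOTTOM = 1; TOP_TO_MIDDLE = 2; FULL = 3

    class Relation(IntEnum):    # 2 bits: placement relative to the previous stroke
        FIRST = 0; ABOVE = 1; BELOW = 2; RIGHT_OF = 3

    class Family(IntEnum):      # 2 bits: disambiguates identical-looking shapes
        LETTER = 0; PUNCTUATION = 1; MATH = 2; IDEOGRAPH = 3

    def encode(strokes, family):
        """Pack (stroke, extent, relation) triples plus a family tag into one int."""
        bits = 0
        for stroke, extent, relation in strokes:
            bits = (bits << 7) | (stroke << 4) | (extent << 2) | relation
        return (bits << 2) | family

    # Capital "I": one downward stroke, top to bottom.
    capital_i = encode([(Stroke.VLINE, Extent.TOP_TO_BOTTOM, Relation.FIRST)],
                       Family.LETTER)

    # Lowercase "i": downward stroke, middle to bottom, dot closely above it.
    lower_i = encode([(Stroke.VLINE, Extent.MIDDLE_TO_BOTTOM, Relation.FIRST),
                      (Stroke.DOT, Extent.TOP_TO_MIDDLE, Relation.ABOVE)],
                     Family.LETTER)

    # Identical geometry, different family: en-dash vs. minus sign.
    en_dash = encode([(Stroke.HLINE, Extent.FULL, Relation.FIRST)], Family.PUNCTUATION)
    minus   = encode([(Stroke.HLINE, Extent.FULL, Relation.FIRST)], Family.MATH)

    print(bin(capital_i), bin(lower_i), bin(en_dash), bin(minus))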
But the main feature would be that a font renderer could always draw a primitive version of any glyph, even one you don't have in any font, because the character code itself encodes the shape. Character codes then wouldn't be something totally arbitrary assigned by Unicode, but inherently meaningful, and anyone would be free to invent any character they wanted, knowing it would always be drawn by any software, no Unicode gatekeepers needed.
Obviously it's not terribly practical, for a whole host of reasons. But I still sometimes think about how elegant it would be to have a "geometric" self-describing character encoding, and to get away from all of the political decisions around language scripts and where they get put in Unicode and in which version.
Problem is, how do you define these representations in the first place? For one thing, some characters have multiple different forms… like a/ɑ, or g/ɡ [0]. Then you have characters which differ only minutely — like Thai ช/ซ, or Ethiopic ሀ/ህ, or Cherokee Ꭺ/Ꭿ. And, worse, those characters can also look entirely different between fonts (see e.g. [1] for Thai). So by the time you’ve finished working through all those choices, and created a format which can distinguish them all, you’ve effectively created yet another font — just in a very lossy vectorised format.
Or if you go the opposite direction, and expect each individual character to make its own choices in this regard — well, that basically has the same problems as PDFs: it may look good, but it’s totally impossible to process programmatically, since it overspecifies the visual details at the expense of semantics. And although that might be fine and even desirable for certain use cases, it does limit the places where this format could be used.
That being said, I can certainly see possibilities and places where this could be useful to me. Perhaps I’ll have a go at implementing this someday.
[0] In case those look the same in your font, here they are again:
a/ɑ g/ɡ
And in case those look the same too, then… well, have a look at the codepoints in the Unicode reference charts, I guess!
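Or, from Python, the standard unicodedata module will spell out the difference:

    import unicodedata
    for ch in "a\u0251g\u0261":   # a, Latin small alpha, g, Latin small script g
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
    # U+0061  LATIN SMALL LETTER A
    # U+0251  LATIN SMALL LETTER ALPHA
    # U+0067  LATIN SMALL LETTER G
    # U+0261  LATIN SMALL LETTER SCRIPT G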
Yup, I didn't say it would be easy. :) But my initial thought is simply that characters would be defined "canonically" (for programmability), with fonts free to vary stylistically as desired, just as they are now (for the classic single- and double-story 'a' and 'g' you mention, for instance). Minute differences within any one language would be handled by the strokes themselves, or else by the final "family" selector I mentioned when the strokes are identical (is it an en-dash or a minus sign?).
Again, it's more of a thought experiment about how character encoding might have gone a different way in the past. And because canonical forms would be required for interoperability, you'd still need a coordinating body (like Unicode) to standardize them, but people would still be free to encode their own meaningful characters outside the standard (such as rare or ancient family-name characters in Chinese that aren't in Unicode).
What's really fun to think about is how rendering libraries might even use machine learning to draw glyphs when no font glyph is available, in the style of an existing font, whether a Garamond serif or a brush-style Chinese script.
It's an interesting idea. My first thought is how you make it machine readable. Like, if you're writing a browser, how do you translate something that "looks" like google.com and know you need to go to google.com and not googIe.com?
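For comparison, here's the distinction today's encodings give you for free, since the codepoints differ even when the rendering barely does (quick Python illustration, nothing specific to the stroke idea):

    legit = "google.com"   # the 'l' is U+006C LATIN SMALL LETTER L
    spoof = "googIe.com"   # the 'I' is U+0049 LATIN CAPITAL LETTER I
    print(legit == spoof)                          # False: different codepoints
    print(hex(ord(legit[4])), hex(ord(spoof[4])))  # 0x6c 0x49
    # A purely geometric code would have to decide how finely strokes are
    # specified before you could even say whether these two are "the same".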
Maybe you don't bother (i.e., don't try to parse bytes in this encoding as plain text) but that has a bunch of consequences too.
This is basically the point (or one of the points) I was trying to make in my sibling comment; thank you for expressing it much more clearly than I did!
This doesn't appear to even try to match to Chinese characters? I drew the ones for convex and concave (凸 and 凹, respectively), but only got back dominoes and APL symbols as matches. Drawing a box returns the Japanese katakana ロ, but not the Chinese hanzi 口.
Tried it with β and with ξ but did not get a match.
(Disclaimer: I did eventually get the β to work, but my first attempts only gave other results such as P, ρ or Բ.)
I'm using these characters for documentation; Monodraw is only useful up to a point, and references to these characters are scattered across different pages with no organization that helps for drawing purposes.
I tried drawing a \, but shapecatcher only shows "\", and only if it's drawn fairly close to 45°.
Edit: Thanks @mapierce2, Detexify seems to work better for this purpose, but the results seem to be images, not text.
Snowman didn't work for me, but the other characters I tried did. I had some trouble with "therefore" because there were a lot of virtually identical symbols, but I can't blame it for that.
It mentions that not all characters are in the database.
Neat concept though!