
I made a talking emoji using regular emojis and JavaScript - maury91
https://hackernoon.com/how-i-made-a-talking-emoji-using-regular-emojis-and-javascript-fe20e62ba10
======
_nalply
I'm Deaf and read lips.

Thanks for this extremely intriguing idea to use emoji to depict a talking
person visually.

You know, some animated films get it right. Sometimes I clearly see utterances
like «Thank you!», «Okay», but more often not. Especially older animations
don't care and just let the character move the mouth in very simplistic ways:
«Babbabbabaa Baababba ba baaa.»

Similarly for that emoji. It is moving the mouth in quite arbitrary ways. It
looks like: «Sobbabbabee be <grin> seebee <grin> <frown> babbaa».

To solve this problem we need about twelve emoji for utterances: «ah», «ay»,
«ee», «oh», «oo», «b», «d», «f», «l», «n», «r» and «s». The «r» emoji must be
animated (so we see the trilling tongue). The other sounds are either
invisible like «k» or cover several different sounds like «b» also covering
«m» and «p».

«Okay» would be rendered by: «oh», «oo», «oo» (standing for «k», which is
invisible), «ay» and «ee».
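
A minimal sketch of that scheme in JavaScript, not a finished design. The names `VISEME_OF`, `INVISIBLE` and `toVisemes` are hypothetical, and the table only contains the groupings named above (`b` also covers `m` and `p`, `k` is invisible). An invisible sound simply holds the previous mouth shape:

```javascript
// Hypothetical viseme table: only the groupings mentioned above are filled in.
const VISEME_OF = new Map([
  ["ah", "ah"], ["ay", "ay"], ["ee", "ee"], ["oh", "oh"], ["oo", "oo"],
  ["b", "b"], ["m", "b"], ["p", "b"], // «b» also covers «m» and «p»
  ["d", "d"], ["f", "f"], ["l", "l"], ["n", "n"], ["r", "r"], ["s", "s"],
]);
const INVISIBLE = new Set(["k"]); // sounds with no visible mouth movement

function toVisemes(sounds) {
  const out = [];
  for (const s of sounds) {
    if (INVISIBLE.has(s)) {
      // Hold the previous mouth shape; skip if there is none yet.
      if (out.length > 0) out.push(out[out.length - 1]);
    } else if (VISEME_OF.has(s)) {
      out.push(VISEME_OF.get(s));
    }
    // Unknown sounds are dropped in this sketch.
  }
  return out;
}

console.log(toVisemes(["oh", "oo", "k", "ay", "ee"]).join(" "));
// prints: oh oo oo ay ee
```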

Edit: Add «th». This is an extremely simple sound to lipread. I remember my
delight when I learnt English in my youth and, in a movie, suddenly realised
that an actor said «Thank you».

~~~
twiss
Halfway through the article, he shows a picture with mouths for a-z, sh and
th:
[https://cdn-images-1.medium.com/max/1600/0*gwbWfYm-iQIPEXXP%...](https://cdn-images-1.medium.com/max/1600/0*gwbWfYm-iQIPEXXP%2E)
There are others if you search images for "lip sync animation". It would be
interesting to see if the result is understandable if you simply replace the
emojis with those images.

~~~
nurettin
That image does not map to letters, it maps to the phonetic alphabet. English
has different phonetics for words when compared to their written counterparts.
For example "money" needs to be converted to "mahnee" before being converted
to emoji counterpart. Otherwise it is gibberish.

~~~
yesenadam
'mah-' is US English, not English :P It's 'mah' in one country.

p.s. Seems every day on here I'm reminding people from the US that their
country is not the world, their version of English is spoken and spelt in just
one country, etc etc etc. Maybe I should give up doing that.

The most classic maybe was a linguist (yes, amazing) who on her blog was
gloating that her variety of US English was particularly rich, in having
almost every vowel sound in English. I looked at her list.. it didn't include
the 'o' sound in 'pot'..

~~~
viridian
Either you don't understand that he means the phonetic alphabet, or you say
money in a way that is foreign to almost any person in the UK as well as the
US. Do you say mooney or mohney? If not, I think your phoneme almost certainly
must be mah.

~~~
namelost
British English has at least two pronunciations of money that I know of,
/mʌni/ and /mʊni/

~~~
viridian
The former is the most common way for Americans to pronounce it, and matches
the mouth shape of ah, and the latter, while not common in the US, also
matches the ah mouth shape. I really don't know how yesenadam is pronouncing
it such that his mouth doesn't take the ah shape.

~~~
yesenadam
Ohhh. I'm sorry everyone. Yes, you're right of course. It didn't occur to me
until I tried it that the _bu-_ in _bun_ is _bahhh_. _Goes back to drawing
board_

------
exogen
One improvement might be to use a pronunciation dictionary and map the emojis
to actual sounds rather than characters. Having recently needed such a thing,
I found that there are two widely available datasets: CMUdict and Moby
Pronunciator:

[http://onlinebooks.library.upenn.edu/webbin/gutbook/lookup?n...](http://onlinebooks.library.upenn.edu/webbin/gutbook/lookup?num=3205)

[https://github.com/cmusphinx/cmudict](https://github.com/cmusphinx/cmudict)

The problem is that they'd be pretty hefty payloads to load on the client, so
you'd want to do the text -> phone mapping elsewhere. Then use your character
mapping as backup for words that aren't in the dictionaries.
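
A rough sketch of the lookup-with-fallback part, assuming the dictionary lookup has already happened elsewhere and only a small table reaches the client. `SAMPLE_DICT` and `wordToUnits` are hypothetical names; the single entry is written in CMUdict's ARPAbet style, and dropping the stress digits is my assumption that stress doesn't matter for mouth shapes:

```javascript
// Tiny stand-in for a pronunciation dictionary (hypothetical sample data).
// A real setup would resolve words against the full dataset off the client.
const SAMPLE_DICT = new Map([
  ["money", ["M", "AH1", "N", "IY0"]],
]);

// Returns phones when the word is known, otherwise falls back to its letters.
function wordToUnits(word, dict = SAMPLE_DICT) {
  const key = word.toLowerCase();
  const phones = dict.get(key);
  if (phones) {
    // Strip trailing stress digits (0, 1, 2); assumed irrelevant for mouth shapes.
    return { source: "dictionary", units: phones.map((p) => p.replace(/\d$/, "")) };
  }
  // Backup path: one unit per letter, as in the character mapping.
  return { source: "letters", units: [...key].filter((ch) => /[a-z]/.test(ch)) };
}

console.log(wordToUnits("Money")); // dictionary hit: M AH N IY
console.log(wordToUnits("emoji")); // fallback: e m o j i
```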

------
chrismorgan
> an emoji, in reality, looks like this: “\xF0\x9F\x98\x81”

… well, that’s one encoding of the UTF-8 encoding of the Unicode code point.

> This is because setInterval(_=>{ },99) executes the function every 99ms

This is categorically wrong. You can’t trust setInterval to heed your request
precisely at all: it’s a request that browsers take as a minimum only. Most
browsers will call your function after the number of milliseconds you
requested plus up to 16ms more, but some might wait even more than that (I
think Safari in power saving mode doubles its tick time, to operate at 30fps;
and background tabs typically won’t fire more than once a second these days).

Try running this snippet in your dev tools; it logs the number of milliseconds
between calls:

    
    
      t=Date.now();setInterval(()=>console.log(-t+(t=Date.now())), 99)
    

On Firefox I’m getting mostly 108–111, with the odd one a little higher, and a
couple of 99s after running it for a few minutes.

(Some trivia on similar techniques for this measurement: `+new Date` is rather
slow, `Date.now()` is about 7× faster in at least Firefox and Chrome, and
`performance.now()` gets you microsecond precision (it returns a floating-
point number in milliseconds, tied to an unspecified epoch instead of real-
world time), and is a little slower than Date.now().)
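
If the animation needs to stay on schedule despite that jitter, one common approach (a sketch, not what the article does) is to derive the frame from elapsed time rather than counting callbacks. `frameAt`, `startAnimation` and `render` are names I made up for illustration:

```javascript
// Pick the frame from elapsed time instead of counting timer callbacks,
// so late callbacks skip frames rather than slowing the animation down.
function frameAt(elapsedMs, frameMs, frameCount) {
  return Math.floor(elapsedMs / frameMs) % frameCount;
}

// Hypothetical browser usage: `render` is whatever draws frame i.
// Not called here; returns the interval id so the caller can clearInterval it.
function startAnimation(render, frameMs = 99, frameCount = 6) {
  const start = performance.now();
  return setInterval(() => {
    render(frameAt(performance.now() - start, frameMs, frameCount));
  }, frameMs);
}

console.log(frameAt(0, 99, 6), frameAt(108, 99, 6), frameAt(600, 99, 6));
// prints: 0 1 0
```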

~~~
didymospl
Besides that, 99 % 6 = 3, so if the browser fired the setInterval function
precisely every 99 ms, only two emojis would be displayed.

~~~
maury91
Thanks for noticing, I will update the article

    99 % 6 === 3
    ((99 % 6) + 99) % 6 === 0
    ((((99 % 6) + 99) % 6) + 99) % 6 === 3
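
The same arithmetic as a quick check, assuming the frame index is a tick timestamp taken modulo the number of frames (that is what the remark above implies; the article's actual code may differ). `framesVisited` is a hypothetical helper:

```javascript
// Which frame indices are ever reached if a tick lands exactly every stepMs?
function framesVisited(stepMs, frameCount, ticks) {
  const seen = new Set();
  for (let i = 0; i < ticks; i++) {
    seen.add((i * stepMs) % frameCount);
  }
  return [...seen].sort((a, b) => a - b);
}

console.log(framesVisited(99, 6, 100));  // [ 0, 3 ]
console.log(framesVisited(101, 6, 100)); // [ 0, 1, 2, 3, 4, 5 ]
```

A step that shares a factor with 6 only ever reaches some of the frames; a step coprime with 6, such as 101, reaches all of them.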

------
giancarlostoro
Makes me think of Microsoft Comic Chat[0]. This is kind of cool looking,
though, and definitely different.

[0]:
[https://en.wikipedia.org/wiki/Microsoft_Comic_Chat](https://en.wikipedia.org/wiki/Microsoft_Comic_Chat)

------
cornholio
Good God, copy-pasting emoji in production code is now a thing. Damn you,
millennials, damn you to code maintenance Hell!

~~~
throwanem
Your editor can't type emoji?

~~~
cornholio
ycombinator can't:

There should be emoji after the colon; I can type them fine in Chrome. Truth
be told, they would make great return values, making yourself indispensable
to your employer.

~~~
throwanem
The HN platform deliberately discards characters beyond IIRC 0xFF - it's been
a while since I looked at the source, I'm probably wrong about where it cuts
off. An odd design choice, but a design choice all the same - it doesn't bear
on anything else in any way I can see.

~~~
nitrogen
There was a time when people discovered a lot of text renderers break down
with lots of stacked combining characters, and people were posting strings to
HN that IIRC either drew a stack of accents all the way up and down the page,
or crashed the browser.

Maybe that has something to do with the character range limits. Not all non-
ASCII is forbidden, though: Français should work AFAIK, and I've seen Chinese
and Japanese.

------
DiThi
> Instead iterating through every byte like ""[n]

Pedantic note: JS iterates through UTF-16 2-byte code units. It's that way for
compatibility reasons (used to be UCS-2), but it's the worst of both worlds:
not as good as UTF-8 for ASCII, can't do operations in O(1) time like UTF-32.
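
A quick way to see the difference in a console, using the emoji from the article's byte example (written as an escape here so it survives posting):

```javascript
const grin = "\u{1F601}"; // the emoji whose UTF-8 bytes are F0 9F 98 81

console.log(grin.length);      // 2 (UTF-16 code units)
console.log([...grin].length); // 1 (string iteration goes by code point)

// Indexing and charCodeAt see the two surrogate halves.
console.log(grin.charCodeAt(0).toString(16), grin.charCodeAt(1).toString(16)); // d83d de01

// codePointAt reassembles the pair.
console.log(grin.codePointAt(0).toString(16)); // 1f601

// The UTF-8 encoding of the same code point.
const utf8 = Array.from(new TextEncoder().encode(grin), (b) => b.toString(16));
console.log(utf8.join(" ")); // f0 9f 98 81
```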

~~~
chrismorgan
Calling it UTF-16 code units as distinct from UCS-2 is dubious. The water is
_very_ muddy there, because bad surrogates are allowed, thus the strings are
arguably better considered UCS-2 rather than UTF-16.

See
[https://mathiasbynens.be/notes/javascript-encoding](https://mathiasbynens.be/notes/javascript-encoding)
for a discussion of the topic.
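
A small illustration of the bad-surrogate point, as I understand it: a string holding half a pair is perfectly legal in JS, and the problem only shows up when you convert it to a real Unicode encoding.

```javascript
const lone = "\uD83D"; // high surrogate with no low surrogate after it

console.log(lone.length);                     // 1
console.log(lone.charCodeAt(0).toString(16)); // d83d

// UTF-8 cannot represent a lone surrogate, so TextEncoder substitutes U+FFFD.
const bytes = Array.from(new TextEncoder().encode(lone), (b) => b.toString(16));
console.log(bytes.join(" ")); // ef bf bd
```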

I’m perpetually sad that a couple of big players decided to go all in on
UCS-2/UTF-16 when it _should_ already have been apparent that UTF-8 was
roughly uniformly superior and that UCS-2/UTF-16 would have major issues. (I
speak as one who was a child at the time, but has read a fair bit about the
state of things back then. So don’t trust my yearning optimism.)

The worst part about UTF-16 is how it poisoned Unicode with surrogate pairs.
UTF-32 and UTF-8 are both good because they didn’t need any special
consideration from Unicode to make them work. But UTF-16, extending UCS-2 as
it does… ugh.

------
iverjo
This idea originated in the jsgolf community and dwitter (a site where you
post 140 character js snippets that generate interesting visuals). See
[https://www.dwitter.net/u/aemkei](https://www.dwitter.net/u/aemkei)

------
gdiocarez
This is an interesting take on making javascript do animations.

------
bathtub
Such a nice idea paired with a descriptive and non-click-baity title.

