Pink Trombone: Speech Synthesis Simulation in JavaScript (dood.al)
325 points by glitcher 69 days ago | 52 comments

This very cool video demonstrates what you can do with it on a touchscreen: https://www.youtube.com/watch?v=7LGnozlwU1o

Also, if you like these kinds of interactive thingies, you might enjoy the explorables subreddit: https://www.reddit.com/r/explorables/

Thanks a lot for the subreddit!

Would this be a good way to learn a language? Seems like a good way to show how to pronounce consonants/vowels that we may have trouble discerning from similar sounds.

For example, I would love to be able to refer to this when watching Erik Singer describe accents: https://www.youtube.com/watch?v=NvDvESEXcgE

All this needs is keyboard controls, and then it'll be a great substitute for QWOP.

Oh, you mean like the Voder! ;-) https://www.youtube.com/watch?v=5hyI_dM5cGo The lady playing Auld Lang Syne would probably be an excellent QWOP player.

I've had the Voder's version of Daisy Bell stuck in my head for years https://www.youtube.com/watch?v=41U78QP8nBk

This is hilarious and very good. How do I make it say a word though? Is it possible?

You can make most pulmonic voiced consonants, except laterals, trills, dentals, and retroflexes, so /m n ɲ ŋ ɴ b d ɟ g ɢ ʡ ʔ ɣ z ʒ ʑ β ʋ ɹ j ɰ ʁ ʕ ɦ ⱱ ɾ/ are all possible, as well as all vowels. If you're really quick with your mouse you can then make pretty decent sounding words using any of these sounds. Some relatively easy ones I managed to do are "man", "no", "guy", "me", and "why".

Some require co-articulation and diphthongs. For example, "why" requires putting the tongue to the top left, /ɯ/, rounding the lips, making it /u/, then quickly releasing the lips and moving the tongue to the bottom/bottom-right, /a/, then sliding it to the top right, /i/, so that you end up with /uai/ which is pretty close to the correct English /waɪ/.

I figured out "mama" pretty quickly (click in the front of the nasal cavity). Also, I figured out why it's such an easy word for babies to learn.

Similarly, "papa": click on the upper "lip".

I can get it to say "yeeeeaaaahhhh"

Love the tie-in with the physical science / diagram. It isn't over the top trying to be sterile and perfect; it's more a beautiful tool for delivering a "conceptual" perspective through interaction. So very, very cool.

As somebody else noted, wisely so, I'd love this as a free VST DLL. Could fit right in over at KVR Audio.

I'll third this!

Oh wow. The next time I've got a difficult conference call where all sense and reason has been lost, I know what I'm going to do.

"Ben, do you think we can implement this feature successfully in time for the client demo?" "Ahhhuhhhuhhahhahuhuhuhhhuh!"

I've done a C port of this code and hooked it up to my own MIDI controller. Much fun. Happy to share the code with anyone who wants it.

Funnily enough, my port was so brainless that I didn't notice the last bit (the Perlin noise part) was originally written in C and then ported by HN's very own josephg, even though Joseph was a friend of mine at uni. I only noticed all this once I had finished the port and went to check that the licensing stuff was fine.

How difficult would it be to wrap your code in a VSTi plug-in and produce a DLL? This would make it easy to hook up to a MIDI controller and other devices using any realtime VST-capable host.

I would assume trivial. I'm a Linux person, so I've just wrapped it in JACK stuff, but this is what these plugin APIs are designed for.

I prefer LV2 plugins as they integrate nicely into the rest of the system. As I dabble in sound/music only rarely, JACK's all-or-nothing approach doesn't suit me at all. Also, it requires 5+ windows just to set everything up.

github please!

Hi, thanks! How did you get it running with your MIDI controller?

It should be scriptable... It would be a whole other level

I had always wanted to do a webaudio port of:


but never got around to it. This is great!

Amazing, but do they know that pink trombone is a euphemism for the penis (when performing fellatio)?

To be fair, everything, including the visual length of your comment can be a phallic euphemism. (Look, your 'comment' even has a tip!)

Isn't everything?

That's oboe. You might be thinking of brown?

I want a brown trombone that synthesizes farts. ;)

Also you should be able to synthesize a raspberry by dragging the tongue out between the lips.

brown trombone? is that a thing?

"Miss Simpson, do you find something funny about the word 'tromboner'?"

What a great time to start learning linguistics and phonetics!

As a trombonist, I'm thoroughly offended by the name. This is way too silly for serious trombonists!

Super cool. Is there a gender button somewhere? It sounds very male to me, but there's no difference in voice/sound production between genders other than the exact pitch, right? Can you get it to sound like a woman, or only a man?

There is (typically) a difference other than pitch. Formants will be in slightly different locations [1]; the energy in the spectrum is balanced differently; there is more aspiration noise [2]. You can model the latter somewhat with this tool by moving the voice source box down.

I've been creating a speech synthesiser recently and it does seem like simple approaches produce a voice that is more passably 'male' than 'female', even accounting for the pitch. Aspiration added with white noise sounds better but doesn't quite get that breathy quality across. I think more sophisticated techniques may be needed.

[1]: https://www.researchgate.net/publication/220531604_Phonetic_...

[2]: https://www.ncbi.nlm.nih.gov/pubmed/8653179
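
For anyone who wants to hear what "aspiration added with white noise" means in practice, here is a minimal sketch of the idea, not the code of this tool or of the synthesiser mentioned above. It mixes a naive sawtooth voice source with uniform white noise; the name `breathy_source`, the `breathiness` mix parameter, and the optional pitch-synchronous shaping of the noise are all assumptions made for illustration.

```python
import math
import random


def breathy_source(f0_hz, breathiness, duration_s=0.5, sample_rate=16000,
                   modulate=True, seed=0):
    """Return samples of a crude voice source with aspiration noise mixed in.

    breathiness is a hypothetical mix control: 0.0 is pure sawtooth,
    1.0 is pure noise. All parameter names are illustrative.
    """
    if not 0.0 <= breathiness <= 1.0:
        raise ValueError("breathiness must be between 0 and 1")
    rng = random.Random(seed)  # seeded so repeated runs give the same noise
    samples = []
    phase = 0.0  # position within the current pitch period, in [0, 1)
    for _ in range(int(duration_s * sample_rate)):
        voiced = 2.0 * phase - 1.0          # naive sawtooth in [-1, 1)
        noise = rng.uniform(-1.0, 1.0)      # uniform white noise
        if modulate:
            # Assumption: shape the noise once per pitch period with a
            # raised-cosine window instead of leaving it constant.
            noise *= 0.5 - 0.5 * math.cos(2.0 * math.pi * phase)
        samples.append((1.0 - breathiness) * voiced + breathiness * noise)
        phase = (phase + f0_hz / sample_rate) % 1.0
    return samples


if __name__ == "__main__":
    modal = breathy_source(110.0, 0.05)
    breathy = breathy_source(220.0, 0.35)
    print(len(modal), len(breathy))
```

Something like `breathy_source(220.0, 0.35)` would still need to be run through a vocal tract filter before it sounds like a voice; on its own it only shows the mixing step.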

Thank you! Super interesting and a lot of info...

Can't you just turn the pitch all the way up?

Or way down to synthesize vocal fry!


The length of the canal would be different, wouldn't it?

Vocal cord thickness too.
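
A rough way to see the effect of canal length: if you treat the vocal tract as a uniform tube closed at the glottis and open at the lips, its resonances fall at odd multiples of c / (4L). A minimal sketch under that textbook simplification; the speed of sound (350 m/s) and the two tract lengths are illustrative assumptions, and `tube_formants` is a hypothetical helper, not something taken from the tool.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air


def tube_formants(length_m, count=3, c=SPEED_OF_SOUND_M_S):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * L).
    """
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


if __name__ == "__main__":
    # Illustrative lengths only, not measurements.
    for label, length in (("17.5 cm tube", 0.175), ("14.5 cm tube", 0.145)):
        formants = ", ".join(f"{f:.0f} Hz" for f in tube_formants(length))
        print(f"{label}: {formants}")
```

Under these assumptions the 17.5 cm tube resonates at roughly 500, 1500 and 2500 Hz, and shortening it to 14.5 cm pushes every resonance up by the same ratio (about 20%). A real vocal tract is not a uniform tube, so this only indicates the direction of the shift.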

I haven't examined how it works, but in the same vein:


How does it compute the frequency response of the various configurations? Is there an aerodynamics solver at the heart of this?

When using a multitouch interface and turning off "always voice", I was amazed at how expressive it is.

Drag over the circle, and while still dragging, move your mouse over the square. Now you can control both at once.

It would be neat if this took some kind of data input in sequence to manipulate the pieces. It would be neat to decode Speex or Codec 2 data as input into this to watch the animation ....

I'm curious, what glottis wave-table are you using? Is it randomized at all?

I had a lot of fun making deep throat blowjob noises...Thanks!

Isn't a pink trombone...

A pink oboe is a thing. So is a rusty trombone. Your disgusting brain has somehow combined the two.

All sounds sound Asian.

Oh dear....
