Singing synthesis as a new musical instrument (2012) (ieee.org)
33 points by teleforce on Sept 6, 2023 | 15 comments



Here's a fun little formant synthesiser that runs in the browser: https://dood.al/pinktrombone/


Pedantic nitpick, but Pink Trombone uses articulatory synthesis, which is a distinct technique from formant synthesis.

Articulatory synthesis creates a physical model of the vocal tract (usually by treating it as a tube of varying diameter).

Formant synthesis simulates peaks in the spectral signature (i.e. formants) by directly constructing bandpass filters with the appropriate center frequency and bandwidth.
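To make the formant approach concrete, here is a rough sketch, not how any particular synthesiser does it. It assumes one two-pole resonator per formant (the form used in Klatt-style synthesisers), chained in series and driven by a bare impulse train standing in for the glottal source. The function names and the "ah"-like formant values are illustrative guesses, not measured data.

```python
import math


def resonator_coeffs(center_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2] (unity gain at DC)."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * center_hz * t)
    a = 1.0 - b - c
    return a, b, c


def resonate(signal, center_hz, bandwidth_hz, sample_rate):
    """Run a signal through one resonator, boosting energy near center_hz."""
    a, b, c = resonator_coeffs(center_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(pitch_hz, sample_rate, n_samples):
    """Crude stand-in for the glottal source: one impulse per pitch period."""
    period = int(round(sample_rate / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def synthesize_vowel(formants, pitch_hz=110.0, sample_rate=16000, duration_s=0.5):
    """Shape the source with one resonator per (center_hz, bandwidth_hz) pair."""
    n_samples = int(sample_rate * duration_s)
    signal = impulse_train(pitch_hz, sample_rate, n_samples)
    for center_hz, bandwidth_hz in formants:
        signal = resonate(signal, center_hz, bandwidth_hz, sample_rate)
    peak = max(abs(v) for v in signal) or 1.0
    return [v / peak for v in signal]  # normalize to the range [-1, 1]


# Illustrative (assumed) formants for an "ah"-like vowel: (center Hz, bandwidth Hz)
AH_FORMANTS = [(700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0)]
samples = synthesize_vowel(AH_FORMANTS)
```

Changing the centre frequencies in AH_FORMANTS changes the vowel, while pitch_hz changes the note. An articulatory model would instead derive those resonances from the simulated tube shape rather than setting them directly.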


Thanks for pointing that out! It's not a pedantic nitpick at all. It's my mistake. I didn't really know what to call Pink Trombone, so I plucked 'formant synthesiser' out of thin air. The distinction seems really obvious in retrospect. I've always liked physical modeling synthesis. It's a shame that there aren't many commercial synth products featuring models like this.


This is amazing. Will keep me busy for a while.


To show something more modern, there are hardware devices capable of singing synthesis, e.g.: https://sonicstate.com/news/2022/01/21/casio-releases-two-vo...


What, no mention of Imitone[1]? I got a license years ago, and I still get frequent "new version available to download" from SendOwl. It doesn't work perfectly, but it's still pretty fun to use as a MIDI controller and doo-doo-doo to play shredding metal guitar.

1: https://imitone.com/


What was old is new again.

https://en.wikipedia.org/wiki/Talk_box

In 1939, Alvino Rey, amateur radio operator W6UK, used a carbon throat microphone wired in such a way as to modulate his electric steel guitar sound. The mic, originally developed for military pilot communications, was placed on the throat of Rey's wife Luise King (one of The King Sisters), who stood behind a curtain and mouthed the words, along with the guitar lines. The novel-sounding combination was called "Singing Guitar", and employed on stage and in the movie Jam Session, as a "novelty" attraction, but was not developed further.


These are not the same thing. A vocoder or talk box effect modulates an input signal with a carrier signal. It doesn't actually digitally synthesize sung speech from scratch.

Singing synthesis is different, and Vocaloid is the most notable example. It's like a piano roll for singing. With Vocaloid, you can use preset voices, input MIDI notes and words/lyrics, and actually synthesize the sung speech completely in the box, with significantly more realism because the synthesizer has been specialized to do that exact task. It lets a songwriter who is not a singer write songs with full vocals and instrumentals that they couldn't produce otherwise, in the same way that a DAW/MIDI sequencer enables a different way of writing music that never has to be played to exist.


For anyone interested, you should check out Holly+:

https://youtu.be/5cbCYwgQkTE?feature=shared&t=334


Requires a login. And the paper is from 2012.

I found this video about a singing synthesis instrument also by Yamaha (as the author): https://www.youtube.com/watch?v=2hk-kOPdHAY


Pocket Miku is an offshoot toy product. The paper is about Yamaha VOCALOID, the software synth that Hatsune Miku (and other voice banks) runs on. Vocaloid has been around for about 20 years now, leaving a massive subculture of music, art, games, hologram concerts and merchandise in its wake. The 16th anniversary of Hatsune Miku, easily the most popular voice/character, was this past week.

https://en.wikipedia.org/wiki/Vocaloid

https://en.wikipedia.org/wiki/Hatsune_Miku

It's definitely a major part of internet history as well; even Nyan Cat is a small offshoot of the greater Vocaloid community. (The music, by DaniwellP, was originally created using Hatsune Miku, then covered with another synthesized voice, and later someone paired it up with the famous cat graphic.)

Edit: here's a sample of what someone skilled with Vocaloid can do. https://youtube.com/watch?v=KmvydnVTriE


I'm fairly sure the concept of synthesis is much older as well; the wiki page is enlightening: https://en.wikipedia.org/wiki/Music_technology_(electronic_a... It mentions the VODER, a Bell Labs invention, an electromechanical speech synthesis thingy that later evolved into the VOCODER, used a lot in early electronic music.


I can't read the article, but I see in the first few sentences it acknowledges this and starts to point out that Vocaloid (the main technology in question) is the first speech synthesis technology to actually see widespread use in a subculture of music production.

Most creative use of speech synthesis prior to this was limited to vocoders, which are really more speech transformation (they require speech input). That's the sound you hear from Kraftwerk, Daft Punk, etc.

Though the VODER demonstrations are actually insanely impressive, especially for the time period.


Sounds like the voices used in Animal Crossing:

https://youtu.be/pMkeZBpXZiM


Nobody here had a Hero Jr. growing up in the 1980s?



