Cute as a JavaScript hack, but not going to compete with Vocaloid or Festival Singer.
Somebody really needs to crack singing synthesis. Vocaloid from Yamaha is good, but it works by having a live singer sing a prescribed set of phrases, which are then reassembled. Automatic singer generation is needed.
Figure out some way to use machine learning to extract a singer model from recorded music and generate cover songs automatically. Drive the RIAA nuts. Get rich.
Thanks for the feedback, and yeah I agree that it's pretty primitive at the moment (it can't even do legato!). I'm working on improving it though, and hopefully it demonstrates some of the potential for Web Audio applications; I actually originally made this for a demo session at this year's Web Audio Conference [0].
Modeling singers using machine learning would be really neat; I'm not too hip to the current research around that, although the idea brings to mind WaveNet [1], which seems like it'd be absolutely fascinating to try with pitched audio / using musical parameters.
Way back in the mid-1980s in the United Kingdom (and there were few places more 80s than that), Superior Software produced Speech!, a software speech synthesis program for the BBC Micro, a 6502-based machine running at 2MHz with no PCM audio. It could read out ordinary English text reasonably reliably in a fairly robotic voice.
It was an utter sensation (appearing, among other places, as the computer voice on Roger Waters' Radio K.A.O.S.).
It's obviously not going to win awards, being barely intelligible, but if you can achieve that with a table of 49 phonemes, each of 128 4-bit samples, then producing basic speech isn't that hard. I think mespeak.js, which this demo is built on (and which is pretty cool, BTW), works on the same principle, although with obviously better samples.
(Unlike producing human-sounding speech, which is appallingly difficult.)
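Just to illustrate the principle in Web Audio terms (the phoneme names, the 128-sample length and the empty table below are all made up for the sketch, not how Speech! or mespeak.js actually store their data):

    var ctx = new (window.AudioContext || window.webkitAudioContext)();

    // Each entry would hold one phoneme's samples, decoded from 4-bit to floats.
    var phonemeTable = {
      'h': new Float32Array(128),
      'e': new Float32Array(128),
      'l': new Float32Array(128),
      'o': new Float32Array(128)
    };

    function speak(phonemes, sampleRate) {
      var buffer = ctx.createBuffer(1, phonemes.length * 128, sampleRate);
      var data = buffer.getChannelData(0);
      phonemes.forEach(function (p, i) {
        data.set(phonemeTable[p], i * 128); // crude concatenation, no blending
      });
      var src = ctx.createBufferSource();
      src.buffer = buffer;
      src.connect(ctx.destination);
      src.start();
    }

    speak(['h', 'e', 'l', 'o'], 8000);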
Original author here; good catch, I've fixed it now so it'll sing whatever lyrics + notes are in the grid when you press "Set Voices". Thanks for pointing that out!
This is great, nice job! I'm working on a MIDI player in JavaScript; it would be interesting to use this as the sound font, maybe by assigning certain words to certain pitches. https://github.com/grimmdude/MidiPlayerJS
What back end code have you integrated this with? Have you tried "flocking"? http://flockingjs.org/
(I have not, I'm just wondering about somebody else's experience with this sort of thing in JS)
I'm still pretty new to this (modern computer synthesis). I had a couple electronic music (synthesis) classes back in the day, but that day was back in the late 80s. We didn't even have any digital equipment the first time I took the class - it was analog gear with literal patch cords between LFOs, envelope generators, oscillators, filters and such. The second time we actually had some digital stuff to do FM and sampling layers.
There's no back end code used with this (assuming you mean server side). I've never tried flocking but it looks interesting. The library I wrote, MidiPlayerJS, just emits JSON events and in my demo I'm feeding those into a sound font library (https://github.com/danigb/soundfont-player). But my thought was to switch that out for this singing synth library and see what kind of sounds it can produce.
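Roughly what I'm imagining, with the caveat that playNote() is a stand-in for whatever API the singing synth exposes, and the event fields are from my recollection of what MidiPlayerJS emits:

    // Sketch: route MidiPlayerJS note-on events into a hypothetical singing synth.
    var lyrics = ['la', 'la', 'da', 'dee'];
    var lyricIndex = 0;

    var player = new MidiPlayer.Player(function (event) {
      if (event.name === 'Note on' && event.velocity > 0) {
        var lyric = lyrics[lyricIndex++ % lyrics.length];
        singingSynth.playNote(lyric, event.noteName); // hypothetical call
      }
    });

    player.loadDataUri(midiDataUri); // base64 data URI of a MIDI file
    player.play();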
Project author here, just want to say thanks gattilorenz for sharing (was quite the pleasant surprise to see this on the front page!) and everyone for the feedback + fascinating projects, ideas, links etc. Really cool to see so much enthusiasm for speech+singing synthesis and Web Audio!
It's been a good year for the English singing synthesizer world, with the launch of chipspeech. (https://www.plogue.com/products/chipspeech) But I'm pretty interested in whether more realistic singing synthesizers will be made, since there are a few new voices from Acapela Group and others recently, although those were developed for non-singing speech.
This is great! I'm in the very early stages [0] of creating a framework to automate and control physical instruments through hardware & software. Never thought voice would be possible; I'll have to check out integrating this! Thanks!
On Safari, instead of using the normal "AudioContext" constructor you must create a "webkitAudioContext"; a feature-detection check for this would be a nice addition.
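Something along these lines is the usual way to paper over the prefix (just a generic snippet, not tied to this project's code):

    // Use the prefixed constructor where the standard one isn't available (Safari).
    var AudioContextClass = window.AudioContext || window.webkitAudioContext;
    if (!AudioContextClass) {
      throw new Error('Web Audio API is not supported in this browser');
    }
    var audioCtx = new AudioContextClass();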
EDIT: This issue has now been fixed. However, it's led me to notice some (unrelated) timing problems in both Safari and Firefox, which will take some deeper digging to figure out. Seems like the browser compatibility rabbit hole never ends!
---
Thanks for this; I've fixed that issue and started on getting it compatible with Safari, but it turns out there are some other errors regarding Float32Array mapping and support for AudioBuffer.copyToChannel(). I'll have to look more into this, but rest assured I'll push the changes when I get it working in Safari!
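For the copyToChannel() gap, the workaround I'm planning to try is the usual getChannelData fallback; a rough sketch (not yet tested in Safari):

    // Fallback for browsers whose AudioBuffer lacks copyToChannel (e.g. older Safari):
    // write the Float32Array straight into the channel's sample data.
    if (typeof AudioBuffer !== 'undefined' && !AudioBuffer.prototype.copyToChannel) {
      AudioBuffer.prototype.copyToChannel = function (source, channelNumber, startInChannel) {
        this.getChannelData(channelNumber).set(source, startInChannel || 0);
      };
    }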