Hacker News
Show HN: Borg Voice Generator (jaxcore.github.io)
71 points by dsteinman on Nov 27, 2019 | 35 comments



I've always wondered if there is something like this for the voice of Majel Barrett-Roddenberry (Computer). I know her voice was recorded phonetically before she died (https://io9.gizmodo.com/the-voice-of-star-treks-computers-co...), but I don't think there's public access to that.

Perhaps transfer learning could be used to copy the style, using something like SV2TTS.


As far as I know they have not released those phonetic recordings. But even without those recordings it might be possible to use those deepfake voice fingerprinting systems to build a TTS engine from sound clips from the show.


IIRC there are a lot of sound clips voiced by her on the Star Trek Encyclopedia CD: https://en.wikipedia.org/wiki/The_Star_Trek_Encyclopedia


I've been looking into that too.

Where I'm at now: there's an audiobook she recorded, which is on the edge of being a large enough corpus for a third party to generate a model. I haven't listened to it yet though, so I don't know if it's in the right flat intonation to be easily usable.


There was an HN thread within the last few weeks discussing research showing that ML needs only five seconds of someone's speech to replicate their voice.

Edit: https://news.ycombinator.com/item?id=21525878


I'd like to have something like this for the HAL 9000 voice. Does anybody have any suggestions?


Very cool effort, but I think it sounds more like Daleks from Dr. Who.

https://www.youtube.com/watch?v=YQLbwOGT8eM


Incidentally the Dalek voice is an effect that can be accomplished really cheaply in hardware or software, called a ring modulator:

https://webaudio.prototyping.bbc.co.uk/ring-modulator/
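
For anyone curious what that looks like in code, here is a minimal sketch. It assumes an ideal ring modulator (the input simply multiplied by a sine carrier) rather than the diode circuit a hardware unit uses. The 30 Hz carrier, the 220 Hz test tone standing in for a voice, and the function names are illustrative assumptions, not taken from any of the linked projects.

```python
import math


def sine_wave(freq_hz, sample_rate, num_samples):
    """Generate a mono sine tone as a list of floats in [-1, 1]."""
    return [
        math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
        for n in range(num_samples)
    ]


def ring_modulate(samples, sample_rate, carrier_hz=30.0):
    """Multiply every input sample by a sine carrier (ideal ring modulation)."""
    return [
        s * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        for n, s in enumerate(samples)
    ]


sample_rate = 8000  # assumed sample rate, samples per second
voice = sine_wave(220.0, sample_rate, sample_rate)  # one second of a stand-in "voice"
dalek = ring_modulate(voice, sample_rate, carrier_hz=30.0)

# Multiplying a 220 Hz tone by a 30 Hz carrier leaves only the sum and
# difference frequencies (250 Hz and 190 Hz), which gives the metallic sound.
```

With real speech you would read the samples from a recording instead of generating a tone, and tune the carrier frequency by ear.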


Dr Who used cheap special effects?! Say it ain't so!

I loved the Tom Baker version as a kid.


There are a handful of Dalek effects for the Teensy/Teensy Audio platform:

- https://pastebin.com/dYf49wp5 (from this demo: https://www.youtube.com/watch?v=2TyN__5MuJs)

- https://forum.pjrc.com/threads/47157-Distort-Voice-to-Someth...

Teensy info is here: https://www.pjrc.com/teensy/

Teensy audio board is here: https://www.pjrc.com/store/teensy3_audio.html (heads up, there's a rev D board to use with the Teensy 4, has the same functionality but different pinouts to match the Teensy 4).


I have a friend who is very handy, very into building vintage stuff. Cars, bikes, etc. One day he rang on some unrelated subject, I could hear he was on the motorway, I asked him where he was going.

"Oh I'm just off to get a voice box for my dalek". Like that's the most normal thing in the world.

Sure enough, next time I visited, a certain iconic malevolent alien, that fears only stairs, was sitting in the corner of the garage, happily murderously chuntering away every time he pressed the button.


Reminded me of the Vogons from the original radio show of Hitchhiker's Guide to the Galaxy. But that might just be my memory playing tricks.


Agreed, the Borg voices are much deeper and certainly don't have a British accent.


This is very spot on, especially if you use some standard Dalek lines. A good example to test: "The Doctor is detected: terminate, terminate! Seek, locate, destroy."


Fun project, I'm sure, but sounds nothing like the Borg I remember.


I always liked the Borg voice in Voyager. It had the reverberating "many voices" feel, e.g. in https://youtu.be/4bkw69E_C4g



Makes it more interesting, in a way - who knew there was an uncanny valley between humans trying to sound cybernetic and software trying to sound like humans trying to sound cybernetic.


I agree, this is how I remember it: https://www.youtube.com/watch?v=AyenRCJ_4Ww


Same here. It's obvious from the other clip provided that the voice evolved over time; but it's not something I noticed at the time. The voice you linked to is the one that is most memorable to me.


> It's obvious from the other clip provided that the voice evolved over time

I'd be surprised if this wasn't intentional, based on the premise.


Aye, it sounds like the Borg assimilated the late Prof Hawking's speech synthesizer.


It doesn't work for me in Safari on Mac


Safari has all kinds of issues with audio. Every report just leads to a rdar: url and then silence, both from Apple and from Safari haha (cry)

For a while I couldn't send streamed but redirected audio through the Web Audio API, on Safari only. The workaround was to manually catch the redirect, but in the latest Safari that doesn't help.

As with WebGL, I don't think Apple wants Web Audio to work. They've got several outstanding bugs in WebGL (3yrs+) and their non-existent WebGL2 support has not seen a single commit in > 3yrs. Web Audio appears to be the same. It's frustrating.


They do have their own Apple Music streaming platform, perhaps you can just do what they do


I've had trouble identifying why. Safari has a largely compatible AudioContext API, and there are no errors, but the audio never starts playing and there's no "onended" event when it's supposed to be done. So I'm a bit stumped at the moment.


Same for Safari on iPhone


Same for me (safari on iOS)


This is cool but I'm curious why it sounds so much worse than the built-in text to speech API

http://greggman.github.io/fanfictionreader/

Which voices are available is browser and OS dependent, and there's no "borg" voice anymore. There used to be several alien and/or non-human voices, but Apple removed them from the OS, and most browsers just call the OS's text to speech API

--correction--

You need to go into the VoiceOver Utilities and add all the novelty voices back in

https://recordit.co/ZGgw9MhepW


This wasn't made with the window.speechSynthesis API, it's using 2 older systems (espeak and sam) that have been ported to JavaScript. They don't sound as good but they generate AudioContext data which can be processed, mixed, and visualized in the browser. I don't think it would be possible to make this kind of Borg voice using the speechSynthesis API -- I did it by generating the speech using 6 voices, 3 in each channel.

I totally agree the built-in OS speech systems sound better overall, and I may end up adding window.speechSynthesis support to the API I made so it'll expose more voice profiles, but those ones will lack the visualization ability.
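
To illustrate the "6 voices, 3 in each channel" layering, here is a rough sketch of the mixing step only. It is not the project's actual code: it assumes each voice has already been rendered to a list of mono samples, and the function names `mix` and `borg_stereo` are hypothetical. How the six voices differ from each other (pitch, speed, and so on) is not covered here.

```python
def mix(tracks):
    """Average several mono tracks into one, truncating to the shortest track."""
    if not tracks:
        return []
    length = min(len(t) for t in tracks)
    return [sum(t[i] for t in tracks) / len(tracks) for i in range(length)]


def borg_stereo(voices):
    """Put the first three voices in the left channel and the last three in the right."""
    if len(voices) != 6:
        raise ValueError("expected exactly six voices")
    left = mix(voices[:3])
    right = mix(voices[3:])
    # Each output frame is a (left, right) pair of samples.
    return list(zip(left, right))


# Toy two-sample "voices", just to show the shape of the data.
voices = [
    [1.0, 0.0], [0.5, 0.0], [0.0, 0.0],    # left-channel voices
    [0.0, -1.0], [0.0, -0.5], [0.0, 0.0],  # right-channel voices
]
stereo = borg_stereo(voices)
```

Averaging rather than summing keeps the mixed signal within the same range as the inputs, so no extra clipping step is needed in this sketch.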


Can this be modified to include the original voices from the 1998 Microsoft Sam TTS Generator, or is that voice technology not open-source?

ex: https://tetyys.com/SAPI4/


I like the way that one sounds. But it looks like it's using a server-side script to generate the audio:

view-source:https://tetyys.com/SAPI4/scripts/tts.js


It would be fun if you could generate a link with a hash of a message so you can send it to your friends and coworkers with a silly message that autoplays.


It already does this, when you click the "say" button it generates a base64 url. You can share that url. The problem is when someone loads the URL the browser will not autoplay the clip. You have to click a button (or some other user interaction) to start the Web Audio API, it's a really annoying limitation that I wish Firefox and Chrome would change to a one-time popup confirmation. So what I did was hide the text box until after playing the audio.
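
As a rough illustration of the shareable-link idea described above, this sketch packs a message into a base64 URL fragment and reads it back. The base URL, the use of the `#` fragment, and the function names are assumptions made for the example; the site's real URL format may differ.

```python
import base64

# Hypothetical base URL, not the project's real address.
BASE_URL = "https://example.invalid/borg/"


def message_to_url(message):
    """Encode a message as URL-safe base64 and append it as the URL fragment."""
    token = base64.urlsafe_b64encode(message.encode("utf-8")).decode("ascii")
    return BASE_URL + "#" + token


def url_to_message(url):
    """Recover the original message from a URL built by message_to_url."""
    token = url.split("#", 1)[1]
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")


link = message_to_url("Resistance is futile.")
decoded = url_to_message(link)
```

Decoding the link only recovers the text. Playback still has to wait for a click or other user interaction, as described above.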


scary!



