James Betker, in the tortoise-tts repo (which is similar), says he spent $15k on his home rig. I can't find right now how long the Tortoise model took to train, but I feel like I read him saying weeks/months somewhere. Obviously there are all kinds of variations depending on coding efficiency and dataset size, but it's another data point.
https://nonint.com/2022/05/30/my-deep-learning-rig/
https://github.com/neonbjb/tortoise-tts
Open source Tortoise-TTS has been able to do this for 6+ months now, and it's based on the same theory as DALL-E. From playing with Tortoise a bit over the last couple of weeks, it seems the issue is not so much accuracy anymore, rather how GPU-intensive it is to generate a voice clip of any meaningful duration. Tortoise takes ~5 seconds on a $1000 GPU (P5000) to produce one second of spoken text. There are cloud options (Colab, Paperspace, RunPod), but still https://github.com/neonbjb/tortoise-tts
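To get a feel for what that ~5 s of compute per 1 s of audio ratio means in practice, here's a back-of-the-envelope sketch. The speaking rate (~150 words/min) and the function name are my own assumptions for illustration, not anything from Tortoise itself:

```python
WORDS_PER_MINUTE = 150          # rough average speaking rate (assumption)
COMPUTE_PER_AUDIO_SECOND = 5.0  # seconds of P5000 time per second of speech (figure from above)

def estimated_generation_seconds(word_count: int) -> float:
    """Estimate wall-clock GPU seconds to synthesize `word_count` words of speech."""
    audio_seconds = word_count / WORDS_PER_MINUTE * 60
    return audio_seconds * COMPUTE_PER_AUDIO_SECOND

# A 1,000-word article -> ~400 s of audio -> ~2,000 s (~33 min) of GPU time.
print(round(estimated_generation_seconds(1000)))  # 2000
```

So even a medium-length article is a half-hour GPU job on that card, which is why the cost discussion above matters.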
Heh you might want to use an equivalent gaming GPU for the price comparison. Surely a thousand dollars spent on an RTX 4000 series card (Hopper) would outperform a P5000?
I agree though. Tortoise TTS did a lot of similar work, IIRC by a single person on their multi-GPU setup. Really impressive effort. Did they get a citation? They deserve one.
edit: reading other comments, it seems there is a misconception that the model takes 3 seconds to run? That isn't the case: it requires "just" 3 seconds of example audio to successfully clone a voice (for some definition of success).
The RTX 4000 only has 8 GB of memory, which means reducing the batch size (much slowness) and/or limiting how much text you can give it at once (meaning you have to break the text up at points other than sentence breaks).
The RTX 5000 maybe, but I'm not sure how much of a value improvement there is.
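The chunking problem itself is easy to sketch: with a per-call text budget, you'd ideally pack whole sentences into each chunk and only split mid-sentence when a single sentence exceeds the budget. This is a generic illustration of that idea, not Tortoise's actual splitting code, and the regex sentence split is deliberately naive:

```python
import re

def chunk_text(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.
    Falls back to a hard split only when one sentence is itself too long."""
    # Naive sentence split on ., !, or ? followed by whitespace (assumption).
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        if len(s) > max_chars:
            # Oversized sentence: flush what we have, then hard-split it.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(s[i:i + max_chars] for i in range(0, len(s), max_chars))
        elif len(current) + len(s) + 1 <= max_chars:
            current = f"{current} {s}".strip()
        else:
            chunks.append(current)
            current = s
    if current:
        chunks.append(current)
    return chunks

text = "Short one. Another short sentence here. A third."
print(chunk_text(text, 40))
```

With a generous budget this keeps every split on a sentence boundary; it's only when the budget is tiny (as a small-VRAM card effectively forces) that you get the ugly mid-sentence cuts described above.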
The commenter you're responding to is talking about Lovelace-architecture GeForce RTX 40x0 products. The Quadro line hasn't even been released yet on this architecture. You are talking about the specific Quadro RTX 4000 product, which is a TU104 (Turing arch, 2 gens behind, with 2560 processors and 8GB memory). The commenter you're responding to is referring to something like a GeForce RTX 4090, which sports an AD102 (Lovelace arch, with 16384 processors and 24GB memory).
You were merely an unfortunate casualty of Nvidia's product marketing scheme (and a commenter's slightly imprecise reference to it) here.
I'm pretty sure we all lost heh. Thanks for clarifying. Indeed, there were slight errors in my description and the other commenter was reasonable in assuming those other cards were in discussion.