ge96's comments on Hacker News

Hehe I'm waiting right now, should have been reviewed yesterday but I'm like alright, I'll just chill then.

So instead of a keel/weight it uses downforce? That would be neat

Maybe can generate roblox games, one gets picked up like the next skibidi, boom rake in the money

Time to make a Pokemon Go app for morticians

MacGyver pouring sap on a pinecone

Hiya! (Grenade)


Day 7 of using Claude Code: here are my takes...

"Day 7" would be amazing - all that I see YouTube recommending is "I tried it for 24 hours"

I was listening to an "expert" on a podcast earlier today, right up until the interviewer asked how long his amazing new vibe-coded tooling had been in production, and the self-proclaimed expert replied, "actually we have an all-hands meeting later today so I can brief the team, and then we'll start using the output..."


I'm pro-OVH, granted I've not used Hetzner before and only used DO briefly. I too remember the burn, but they gave credit. I like the emails saying "your service is under DDoS attack".

Raspberry Pi? Say a 4B with 4GB of RAM.

I also want to run vision like Yocto and basic LLM with TTS/STT


I've been trying to get speech-to-text to work with a reasonable vocabulary on Pis for a while. It's tough; all the modern models just need more GPU than is available.

For ASR/STT on a budget, you want https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3 - it works great on CPU.

I haven't tried on a raspberry pi, but on Intel it uses a little less than 1s of CPU time per second of audio. Using https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/a... for chunked streaming inference, it takes 6 cores to process audio ~5x faster than realtime. I expect with all cores on a Pi 4 or 5, you'd probably be able to at least keep up with realtime.

(Batch inference, where you give it the whole audio file up front, is slightly more efficient, since chunked streaming inference is basically running batch inference on overlapping windows of audio.)

EDIT: there are also the multitalker-parakeet-streaming-0.6b-v1 and nemotron-speech-streaming-en-0.6b models, which have similar resource requirements but are built for true streaming inference instead of chunked inference. In my tests, these are slightly less accurate. In particular, they seem to completely omit any sentence at the beginning or end of a stream that was partially cut off.
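The throughput figures above (~1s of CPU time per second of audio, ~5x faster than realtime on 6 cores) can be sanity-checked with a quick back-of-envelope calculation. This is just a sketch; the near-linear core scaling and the ~83% parallel efficiency are assumptions inferred from the numbers in the comment, not measured properties of NeMo:

```python
def realtime_factor(cores: int,
                    cpu_sec_per_audio_sec: float,
                    parallel_efficiency: float = 1.0) -> float:
    """Rough estimate of speed over realtime: seconds of audio
    processed per wall-clock second, assuming the work splits
    evenly across cores at the given efficiency."""
    return cores * parallel_efficiency / cpu_sec_per_audio_sec

# ~1s CPU per second of audio, 6 cores, ~83% scaling efficiency
# roughly reproduces the observed ~5x-faster-than-realtime figure.
print(round(realtime_factor(6, 1.0, 5 / 6), 2))
```

By the same rough math, a Pi 4's four slower cores would land near realtime only if its per-core cost stays close to 1s of CPU per second of audio, which is optimistic; hence the "at least keep up with realtime" hedge above.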


Whisper?

For wake words I have used Picovoice Rhino

I want to use these I2S breakout mics


I used to live on $20K/yr working a restaurant job; now I'm in tech making six figures and still living paycheck to paycheck. It's a lifestyle/personal-choice thing in my case - I'm dumb and waste money.

My burner account here, I be stupid so I learn
