
Skip the speech-to-text step and grab the subtitles from YouTube.



Good one! It's not always reliable, but pretty good (it doesn't seem to work with this video; it says there are no subs). I'm not sure whether Whisper.cpp (yes, I did mean OpenAI Whisper) is any better, though, or whether it has similar issues on the edge cases I've hit with YouTube subs.

Here's the output from Whisper.cpp:

> $ whisper --verbose False --language English --output_format txt transcribe_me.wav

> [...] (removed, comment too long)

> $ cat transcribe_me.txt | mods -f "summarize this in Simple English"

> The speaker discusses how Amazon accused a customer of racism due to something a robotic doorbell said. They also talk about the concept of companies limiting how fixable devices are, and how this limits the freedom and sovereignty of consumers. They mention that Amazon canceled their seven-year affiliate program account, claiming that they violated terms of service by having employees and friends use the links to buy items. The speaker argues that Amazon overreacted and that this demonstrates the problem of companies becoming less accountable as they gain more power. They call for more scrutiny of such behavior.

> $ cat transcribe_me.txt | mods -f "summarize this in English"

> The creator discusses how Amazon accused a customer of racism because a delivery driver claimed that he heard something racist from a robotic doorbell. The creator explains they often discuss the concept of "right to repair" and how companies are making devices less fixable over time, requiring customers to buy new ones or rely on the manufacturer for service. The creator talks about how having a system in your house that can be turned off by Amazon if an employee claims that your device said something racist to them is concerning. They also receive an email canceling their Amazon Affiliate account that has been open since 2016, claiming that the purchases resulting from special links are in violation of the operating agreement. The creator disputes these claims and argues that such actions demonstrate the problem of companies becoming less accountable as they become more powerful.

Note: mods requires the OPENAI_API_KEY environment variable to be set.


Just FYI, use:

yt-dlp --write-auto-sub --sub-format json3 --sub-lang en

I don't know why automatic subs require a different flag, but they do.
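For what it's worth, here is a rough Python sketch of flattening that json3 file into plain text. It assumes the json3 layout is a top-level "events" list whose entries carry their text in "segs" items under a "utf8" key; that is what YouTube output tends to look like, not a documented contract, and json3_to_text is just a name made up for this example.

```python
import json


def json3_to_text(raw: str) -> str:
    """Flatten a json3 subtitle document into one plain-text string.

    Assumes (not guaranteed) the layout:
    {"events": [{"segs": [{"utf8": "some text"}, ...]}, ...]}
    """
    data = json.loads(raw)
    pieces = []
    for event in data.get("events", []):
        # Some events only carry timing or window info and have no "segs".
        text = "".join(seg.get("utf8", "") for seg in event.get("segs", []))
        text = text.strip()  # drops events that are only a newline
        if text:
            pieces.append(text)
    return " ".join(pieces)
```

You would read the downloaded .json3 file, pass its contents to json3_to_text, and pipe the result on to whatever summarizer you like.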

What's that "mods" CLI?


Nice, yeah. I suppose I would use the vtt format and then convert that to text, e.g. with webvtt-py [1].

mods is a CLI for GPT-4 [2].

[1] https://webvtt-py.readthedocs.io/en/latest/

[2] https://github.com/charmbracelet/mods
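If you'd rather not pull in a dependency for the vtt-to-text step, a stdlib-only sketch in Python might look like the following. To be clear, this is not the webvtt-py API, just a simplified hand-rolled parser: it assumes cues are separated by blank lines, treats any block without a "-->" timing line as a header or note, strips inline tags, and drops consecutive duplicate lines (auto-generated subs tend to repeat them). vtt_to_text is a hypothetical name.

```python
import re

# Matches inline markup such as <c>, </c>, or <00:00:01.000>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Extract cue text from a WebVTT string as one plain-text string."""
    out = []
    normalized = vtt.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # The timing line contains "-->"; cue text follows it.
        timing = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if timing is None:
            continue  # WEBVTT header, NOTE, or STYLE block
        for line in lines[timing + 1:]:
            line = TAG.sub("", line).strip()
            # Skip empty lines and immediate repeats of the previous line.
            if line and (not out or out[-1] != line):
                out.append(line)
    return " ".join(out)
```

The result can then go straight into mods the same way as the Whisper transcript above.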



