Hacker News new | past | comments | ask | show | jobs | submit login
TL;DW: Too Long; Didn't Watch Distill YouTube Videos to the Relevant Information (tldw.tube)
328 points by pkaeding 62 days ago | hide | past | favorite | 192 comments



What I'd really like is a service that edits down YouTube videos by removing all the stock footage and talking head crap, then speeding up the audio to fit over the remaining novel information—whether that's new battlefield footage, electron micrographs, demonstrations of machining techniques, or just elephant toothpaste. The talking head filler seems like it would be easy to recognize, but stock footage recognition presumably would have a significant false negative rate, which is okay.

This would reduce some videos to just a transcription, which would be the ideal outcome, I think. The less of my limited time on Earth I waste watching some dumbshit reading a script at a camera, the better. Summarizing the transcript further like this site does might be occasionally useful, of course.


I use SponsorBlock (be sure to enable all categories of blocking such as filler content, not just sponsors), DeArrow (de-clickbaits thumbnails and titles) and a video speed changer extension to enable much of your stated functionality, though of course not all. I've saved likely years of watching due to this combination.


One of the best things in SponsorBlock is Highlight segments. If the video is 10 minutes of filler building up to a single interesting/exciting moment you can often see exactly on the timeline where to jump to.


That's also present in the regular YouTube video scrubber. Maybe less prominently?


I think that only shows up when a video hits a critical number of views so it knows where people watched and rewatched.


How did your viewing experience change after using DeArrow? I'm curious because DeArrow has a great reputation, yet my gut feeling tells me that I should avoid watching clickbait videos altogether.


I tried DeArrow for a bit but I found that for some channels it actually made things worse - I mean it removed the clickbait, but some clickbait gave me a better idea of what the video was actually about than DeArrow's boring titles.

Overall it didn't improve the experience really.


DeArrow lets me make an informed decision on whether to watch a video or not. It's true that sometimes the clickbait will extend into the first few seconds of the video, but you can still stop watching then if you think it's not for you. Myself, I enjoy knowing what it's really going to be about.


Then you'd be in luck as it makes me less likely to click on those videos. I only watch them if I'm truly interested in the topic as shown by the revised thumbnail and title, which makes my viewing quality much higher on average.


If you avoid watching clickbait thumbnailed videos then you will miss out on a lot of good content. Perhaps you are fine with this but if the implication is that a clickbait thumbnail implies the video is not good, this is an indicator that, at this point has almost no correlation with the video quality. This is because content creators, pretty much have to use clickbait thumbnails to get views or traction, largely because of the way YouTube's "algorithm" works.


I love the DeArrow experience. The problem with clickbait on YouTube is that even good creators are compelled to use it because it works. Whether it's due to the algorithm or just human behaviour I don't know.

There's a little blue button next to the video titles to toggle it on and off and it's very interesting to compare the community title and thumbnail with the original. Good original titles are often left largely intact, maybe with some extra context. Betteridge law is no longer relevant. It's nice.


Tom Scott and LTT did research on this and you are absolutely correct. It just “works” for engagement. Even those who are already your top viewers.


Just looking at their own marketing material, in some cases it's just making the titles longer and more boring without actually making them more descriptive.


When you need 4 extensions (counting ublock as well) to make a site useable, maybe it is time to reconsider what site you spend your time on?


Ya we are sick and addicted.

Not only do I need the above 4, I need a separate browser because YouTube breaks with Firefox, private tabs to not taint my preferences when I watch one-ofs, and other addons such as a channel and video blocker.


Interesting, I'm using Firefox, what breaks with YouTube for you? I also like PocketTube for subscription management as I find that much better than trying to trawl through the mountains of BS on the homepage.


It’s two things on YouTube. Certain videos (seems to not occur for super popular videos) dont load. Just a white browser screen and black area for the video, but no video, suggested videos, or text at all. The second is searching. Certain terms that seem unremarkable and unrelated will leave out the channels I’m after. Eg. “Cop body cam videos” will only suggest videos from 1-3 channels, or “Hardcore WoW clips” won’t include the channel I’m after (the most obvious and largest for this search). Using Brave or Chrome produces the expected/desired behavior (but isn’t fixed with some other browsers or local YouTube clients).


There's an unsolved problem of some content only existing in youtube. Until that's solved, using 4 extensions is one of the best workarounds to making it usable.


What a silly point of view. Instead what should they do? Post comments like this on HN?

They clearly enjoy watching and digesting information from YouTube- the biggest platform for this. But people need to make money from it hence the ailment these these plugins remedy


>Instead what should they do?

Read a book, touch grass, etc.

>They clearly enjoy watching and digesting information from YouTube- the biggest platform for this.

I'd probably enjoy heroin but I still choose not do take it because I know it's not good for me. What's your point?


> Read a book

I wanted to say something flippant, but dang with the halo on my shoulder urges me otherwise.

so here goes: books can also full of filler crap and useless and outright wrong material. just because it's a different medium doesn't make it any better for wasting time. in fact, a bad book can waste weeks of your life whereas a bad video on YouTube could waste about 10 minutes.

ultimately I approve of any tools that allow one to extract the information that they are after without paying the cost of giving more attention to somebody than they've earned.


Being a different medium does in fact make it much better for wasting time. Skipping a paragraph in a book takes 100 milliseconds. Referring back to a previous paragraph to understand a seeming contradiction does too. Pausing a book requires no effort, and no effort to return to the place where you were. Books can be excerpted and quoted, not just in other books, but also in conversation or in videos, in a way that videos just can't. (Talking-head filler scripts can be quoted that way, but the video can only be quoted in video or still images.) Errata sheets, errata websites, and later editions of books can correct errors in earlier editions. Books are much smaller and therefore easier to archive than videos, making them less likely to get lost.

Neil Postman's Amusing Ourselves To Death is an extended and very persuasive reflection on precisely this difference in the tendencies of these two media, although particularly attuned to the form of video that was popular at the time, TV, which had commercials and didn't even have pause and rewind.


somehow I doubt that the person I was replying to would consider ebooks as a book.

videos can be viewed in double speed, and they can be rewound. they can be watched over and over if needed. they can describe/demonstrate information in ways that books can only project into 2 dimensions. a picture is worth a thousand words - well imagine 24 of those per second.

certainly with STEM subjects, nothing beats a good animation.

even for things like history/geography, I find animations and progressive disclosure to be much more engaging and easier to remember than dry words on a page.


Books at least go through a minimal amount of vetting before they are published, and there are meaningful differences between forms of media. As McLuhan wrote: "The medium is the message".


Lol, no they don't.

If you have a pdf with enough words in it, you can go from that to a published paperback on Amazon in 15 minutes.


Those aren't books. They're printed blogs you pay for.

Every book can be published as a pdf. That doesn't mean that every pdf is a book.


> Those aren't books. They're printed blogs you pay for.

If you are going to make an outlandish claim like this, at least try to post some criteria to defend. Not doing so has led to you not realizing that you can't without also excluding works of classical literature like Sense & Sensibility (which was self-published by Jane Austin).

There's also more direct problems with the arbitary line you are trying to draw, like best-selling book The Martian, which was originally published as a blog, then self-published as an e-book, then officially published as a book by a major publisher.


1. Some blogs are better than most books.

2. Some carefully-written ebooks now are self-published, and I don't care if their cover design and promotion campaign aren't as professional.


What a textbook (sorry, printed blog) example of a No True Scotsman.


No true Scottsman!



Maybe in the 1900s. Not any more


so if they're good, they're books. if not, they're not _really_ books.


>Read a book, touch grass, etc.

There's a ton of useful information that's only available in Youtube videos and not books such as DIY tutorials, repairs, etc.

Unfortunately, many helpful videos also include useless fluff. E.g. a "good" 10 minute video on how to disassemble a Samsung dryer to replace the heating element also has 2 extra minutes of "click my link to Nord VPN" and "please smash that Like button and subscribe". Therefore, tools using AI to remove the extra time-wasting fluff can help. There is no book in the public library or Amazon that will show the tricky steps to repair a Samsung dryer. It's Youtube videos uploaded by random people that has information like that.

My friend learned how to use her new Apple Watch by watching Youtube videos. Yes, there's official Apple documentation but it's hard to learn from a wall of text with static screen shots. In contrast, watching youtubers do live demonstrations with their fingers manipulating the screen accompanied by voiceover narration is easier to understand.

EDIT reply to: >When you've saved "years of watching" by using sponsorblock and 2x speed, that's not how you use YT. Sponsorblock saves 25% at most.

Your math is incomplete in your characterization of the gp you replied to because he also mentioned "DeArrow (de-clickbaits thumbnails and titles). That tool saves 100% of the watch time when the re-titled videos instantly informs him he can totally skip it.


When you've saved "years of watching" by using sponsorblock and 2x speed, that's not how you use YT. Sponsorblock saves 25% at most.

To be clear, I am not saying that youtube has no use, but the commenter I originally responded to is clearly in excess of whatever that reasonable level is.


How exactly does one "use YouTube," in your opinion then? You also forgot the effects of the other extensions than just SponsorBlock, even 2x speed (I watch at higher than that usually) saves 50% immediately. You presume I don't also do those things you listed, I now have more time to do them because of the time saved, not less.


You watch and enjoy the content that you like, find useful or otherwise enjoy instead of min-maxing your way to not supporting the platform or creators that are spending their time creating that content.


If the platform and creators are incentivized to waste my time, then why would I support that? I watch and enjoy the content even more knowing that I am not wasting my time.


This is where the snide touch grass comment above was awkwardly pointing. If you feel like you're wasting your time and can't be bothered to do so in a holistically responsible way (supporting the creators at the very least) then perhaps your time would be better spent doing something else.

i.e. you may be wasting your time already.

I probably shouldn't have wandered into a thread that's ostensibly about ripping off content creators to begin with. I'll take my moral high horse elsewhere lol.


Why would I care about responsibility when I waste my time, much less "holistic responsibility?" Sounds like your moral systems are different from others' in this thread.


I totally agree, you're absolutely right about that. have a good one.


They shared a list of plugins that might be useful to other people, so the comment has value. I recently found out about the "press to 2X speed" and I love it. Does that mean I should go touch grass because I use it in a non-stock way? Or are you the ultimate decider on the maximum limit of plugins or hacks?


It’s the seventh commandment, don’t you know?

7. Thou shalt not use more than two browser plugins.


I agree with you about there not being anything wrong with watching YT videos (no virtue signaling about "grass" here) but at the same time, couldn't you just watch the video on 1.5x - 2x? You can easily skip the VPN ad reads, I don't know, how valuable is your time, really, at the level of minutes? I know that there will be people on here that have autistic levels of min-max optimization of their entire lives like some kind of techno-vampires or something but if you're watching a couple dozen videos, plus or minus, a day is this really enough time to be worth a bunch of micro-optimization?


“Touch grass”

This turn of phrase says everything to be honest.


YouTube has a speed changer built in. I'm curious what your extension would add.


Higher speeds than 2x. I usually watch at 3x and sometimes more (sometimes with captions turned on). Keyboard shortcuts as well, as others have mentioned.


the built in one has keyboard shortcuts too: < and >


There are more advanced keyboard shortcuts than just that when using an extension like YouTube Enhancer which includes the aforementioned higher speeds.


SponsorBlock specifically blocks sponsor segments in YT videos.


RIP the Ad Accelerator, which would up the speed upon detection of an ad. Youtube has found a way around it and it no longer works.

https://github.com/rkk3/ad-accelerator


with youtube enhancer (there's also an open source one i wanted to check out but haven't yet) f.e. one can add a speed icon in interface on which the scroll wheel works, allowing dynamic speed adjuistment which, accompanied with [cursor left] to jump back a bit and [number keys] to jump to 0-90% position is amazing for extracting info. Also, muting videos, turning on captions and rushing through them without the mindfsck is a soul saver often times. Our current attention economy with its commercialization incentives hopefully is just a toxic glitch in history..


Not the op but the extension I'm using adds 3 very mportant features :

* default to my preferred speed (2×)

* keyboard shortcuts to change speed on the fly as needed

* arbitrary speeds up to 4×


> keyboard shortcuts to change speed on the fly as needed

Speed up: >

Slow down: <


Undiscoverable, and not configurable.


I don't watch YouTube but if I would / I would if I'd cut out anything with faces or speech & use an LLM to summarize what's technically relevant from the transcript in a way that fits length of what remains.

Pipeline such content, but use weighted random videos, with low weights for types of content with clickbait headings & perhaps blacklist for words like meme or lol in transcript to cut out things with stock footage. I am not sure of exact best way to remove it, actually, other than "using the transcript for some computational technique of probabilitistic stock footage prediction" which I bet would be most effective.


You described SponsorBlock, really a game changer. Also works on mobile!


how would you remove stock footage? B-roll has voice over it, it's kind of a necessity.


By "speeding up the audio to fit over the remaining novel information", as I said. "B-roll" doesn't necessarily imply worthless stock-footage filler; in https://www.youtube.com/watch?v=woj4vfMLpao or https://www.youtube.com/watch?v=DdF_nzMW_i8 or https://www.youtube.com/watch?v=Eu_crbcBdNM, for example, the B-roll is the remaining novel information. The third of these also includes the kind of talking-head filler I'd like to remove.


Or ask the original creator if he can publish the script used.


If that's what you want then why would you want a service like this? Surely there would be a non-video news sources of electron micrographs or elephant toothpaste for which there would probably be hundreds or thousands of LLM TL;DR things.


I think you don't have a very clear idea of what I am talking about. You cannot present electron micrographs as text. You can present individual electron micrographs as still images, but not animation. Similarly, video of elephant toothpaste can only be presented as video, even if still images can be arresting. There is no sense in which a textual description of machining techniques or a Ukrainian battlefield is a substitute for video footage of them. In 5 seconds they can convey information that no amount of linguistic description can. Sometimes that information is even true.

What I hate is when I'm trying to find such irreplaceable information and instead my search results are full of vapid stock footage and hubristic talking heads overconfidently reading a script out loud as they gaze at a video camera. It's like AI slop without the creativity.


Cool.

What's stopping you from doing it?


Watching too many YouTube videos, probably.


Wow, I'm really surprised that my comment describing 95% of YouTubers as "some dumbshit" got voted up to +19. I guess I'm not the only old man shaking his fist at the surveillance capitalism incompetent confident shouty bullshitter cloud?


What if we summarize all the information in the world into a few hundred volumes of human knowledge, then summarize those into a 10,000 pages book, then that into a 10 long form essays, then those into a 100,000 chars blog post, then that into a pamphlet and finally we summarize one more time into a single tweet.


Not a single tweet, but 10 brief sentences, as per the AI overlords:

1. The universe is vast, mostly empty, and runs on fundamental laws that we barely understand but exploit well.

2. Life is a self-replicating, entropy-defying phenomenon that emerged through chemistry, evolved through selection, and adapts through intelligence.

3. Humans are social primates who dominate the planet through cooperation, tool-making, storytelling, and an insatiable drive for meaning.

4. Societies form through shared beliefs, laws, and trade, but oscillate between progress and collapse due to power, greed, and ignorance.

5. Technology is humanity’s amplifier, accelerating knowledge, comfort, and destruction in equal measure, with unintended consequences at every turn.

6. Economies are trust-based systems of resource distribution, prone to cycles of boom, bust, innovation, and inequality.

7. Morality is a human construct, evolving with culture, often conflicting between collective well-being and individual freedom.

8. Knowledge is a fractal—deeper the dive, more there is to know—yet most wisdom is rediscovery of old truths in new contexts.

9. The future is uncertain but shaped by the tension between human ingenuity and our own worst tendencies.

10. The meaning of life? Whatever gets you up in the morning and lets you sleep at night.


> entropy-defying phenomenon

entropy-exploiting phenomenon

is a much better description as life does not defy any fundamental laws.


can you please share the prompt? and in general reproduction steps? many thanks!


nice lifr tldr


Some times I think that would actually be useful for some politicians that do not care about history and prior knowledge.

* Rule of law is a good idea

* Dictatorship is a bad idea

* Allowing Germany to occpy Sudetenland in the Münich appeasement 1938 was a bad idea. [1]

* ...

[1] https://snyder.substack.com/p/appeasement-at-munich?triedRed...

But that said! If this service works I think I could use it. I can handle long articles, but have no time to watch YouTube clips.


> into a single tweet

It would say something like, "This text attempts to summarize the entirety of human knowledge".

Still, IMO summarizing videos is useful. Even if the summary is not accurate or a 1:1 representation of the content, you can mostly get the gist of what is being said without being baited into watching advertisements.

Although, this site doesn't seem to do a great job at summaries. Kagi's universal summarizer has much better results, https://kagi.com/summarizer/index.html . However, it requires transcripts to be available for videos.


I think a lot of people are sort of missing the benefit of something like this.

How do you read a book effectively? You skim the table of contents. You skim the contents of each chapter and mark interesting paragraphs. Then you go through the book another 1-2 times, each time getting deeper into the text and cross-referencing information between different parts of the book.

What tools like this will do is allow us to apply this same workflow to videos, which can greatly enhance our understanding of videos we're interested in and help us contextualise it with the rest of our knowledge.

I've already been doing this and it's helped me expand my knowledge and understanding in ways that wouldn't have been possible without an unreasonable investment of time and effort.


Tried asking Claude to do that, ended up with something pretty beautiful:

Everything is made of atoms & energy, life evolves, math describes reality, knowledge builds on itself, humans need each other & Earth to survive – test ideas, learn from mistakes, be kind, stay curious.


The answer will be a single number, 42.


Then summarize it one last time into a single bit. I like to think it'd be '1'.


1 in base 42.


Then this super-condensed 1 explodes and there is no information left, just noise


…but who will have the question?


Gemini's output:

Our understanding of reality is fundamentally shaped by the power of stories and narratives.

Humanity constantly seeks to impose order and structure on the world through systems and frameworks.

The inherent human drive to create and innovate defines our art, technology, and design.

We are bound by the complex interplay of connection, conflict, and cooperation in our relationships.

Time's relentless flow drives change, progress, and the unfolding narrative of history.

The vastness of the unknown perpetually challenges and defines the limits of human knowledge.

The search for purpose, values, and meaning is a central and ongoing human endeavor.

Abstract concepts and models are powerful tools for understanding and navigating reality.

All living things are interconnected within a complex web of life and ecological relationships.

The future of humanity presents both boundless potential and significant challenges to overcome.


That's really bad, but also excellent in a particular way, which we might call glibness.


This reminds me of the famous Library of Babel story, where the entire corpus of a language is imagined to live in a library. Like, every permutation of the characters of an alphabet for pages of a certain number of characters in books of a certain number of pages.

The reducto ab asurdum of this library is an alphabet of 0 and 1, a page size of 2 characters and a page count per book of 2.


I know you’re making a joke, but more seriously I think most yt videos have atrocious signal/noise ratios so information compression is likely very useful. Less so for many academic papers (although they have some pretty awful filler sometimes).


I was on YouTube a few weeks ago and saw a 20 minute video with a title that looked interesting. Under it was an AI summary that saved me 20 minutes, and had me skip the video completely. I wish that was under every video.

This week I got a notification about the AI added to YouTube to allow users to ask questions about a video. I haven’t had a chance to use it yet, but I can see that also being useful to get the main points from a long video. Up until now, I mainly use the popularity indicator on the progress bar. Since I watch most videos on my TV, it’s harder to use the AI, as I would need to pull out my phone, open the same video, and ask… that’s a bad workflow.

I do find it a little ridiculous that we need AI to summarize long videos full of fluff, when the only reason they are full of that fluff in the first place is YouTube’s own monetization policies which pushed the average video from 2-4 minutes to 10 minutes.


This is exactly the problem. There are so many 20 minute videos that should have been 2 minutes.

In a way, it's much easier to make the 20 minute video. Just hit record, rant an rave, stop recording and publish.

There are indeed justified long videos stuffed full with knowledge, insight and witty comments to make it fun.

Then there are "slow" videos but magical. Paul Sellers has a 30 min video on how to make mortise and tenons joint with hand tools. Just you and him in real time. You get a (recorded) private lesson from a master craftsman. It's magic. Every minute of it is knowledge transfer.

https://m.youtube.com/watch?v=aBodzmUGtdw


Some people inflate their video durations intentionally, but I think the majority of people truly think they're using the time wisely. Have you ever tried making a quick travel vlog of a vacation and ended up with a 15 minute short film? That B-roll at the airport was definitely critical to include!

I think the reality is that there are a lot of amateur video creators. Elevating the few talented creators through social engagement metrics isn't perfect, but I think it works well enough. Or at least more so than what these anodyne summarizations would give us.



42


Like "the book"?


Hi HN! I'm the author of this service. Thank you for your support.

There may have been some temporary downtime due to residential proxy running out of bandwidth. I have purchased additional bandwidth. (I run this service for free.)

There also may be some errors with particular videos because they are not accessible in certain regions. For now all requests to YouTube originate from United States, but open to change in the future to some kind of round-robin or fallback system.

I know it's not perfect. I developed the tool originally for my own use. It's open source and I'm open to any patches or pull requests.

Enjoy!


Hey this is really cool, I literally had the same idea about a month ago but ultimately decided to not pursue it. Glad someone else did.

A few quick Qs -

1) Do you use the available auto-generated transcripts from youtube? Or do you do any audio parsing? I know transcripts aren't always available.

2) Do you have any plans to monetize in some way, do you think it would be possible? It's definitely a neat product but a tad generic and replicable, so I'm curious.


1.) We do no TTS of our own. We either use the original transcripts uploaded manually by the YouTuber or we use the auto-generated ones supplied by Google.

2.) No, I plan to keep it free as the operational costs are relatively minimal.


Thanks for your response. Yeah that's nice and simple. I wonder how much you'll burn keeping it up for free, are we talking like on the order of $20/mo? If you use models like 4o-mini it might even be less, that thing is insanely cheap and not terrible. Cheers


How does this compare to gemini thinking experimental with apps?


Out of curiosity which residential proxy service do you use?


how expensive is the bandwidth if you're buying from residential proxies?


Cheap. Less than $10/GB and since we only scrape metadata and transcripts, the traffic usage is low.


> open-source

> OPENAI_API_KEY

Choose one.


The service itself is open source ; but it relies on a closed source service. Both are not incompatible.


If you don't like it, just edit the open-source code to point to a locally hosted open-source llm.


> OPENAI_BASE_URL

> OPENAI_API_HOST


Tried it on 3 random videos I watched, and the results were... mostly good, albeit mixed.

On the one hand, it got my video about a Mario & Luigi: Brothership glitch dead right, immediately listing where you'd need to die to get an item early and what you'd get out of it.

It also did an okay job summarising a Zelda dungeon analysis video by someone I'm subscribed to, with some info on why that dungeon was well-designed that clearly came from the video.

Unfortunately, it did a poor job at summarising a video about plagiarism in the YouTube speedrunning essay space, associating the problem with smaller creators rather than the person the video was about and leaving out far too many details to be useful.

This seems to confirm my assumptions about how an AI summariser would work in general; if the original media is a straightforward piece about one easily understandable topic, then it'll do fine and work about as well as a human would. If it's a longer piece with multiple points backed by various examples, then it'll struggle to summarise it in a way that makes sense.


> If it's a longer piece with multiple points backed by various examples, then it'll struggle to summarise it in a way that makes sense

I've found the same problem with humans too, so it's not like an improvement over humans.


So what is everyone doing with all this free time they've now accumulated from not reading, watching, or listening to media?


Chatting with AI bots.

I agree that this mentality of works being “too long so don’t ingest it”, is not a healthy way to go about life and thinking in general.


It's not that, it's "too long with low information density so ingest it more efficiently." That way we can spend our time on things that are more productive or enjoyable.


Videos are typically already summarized substitutes for complex topics--topics for which you might need to read text or literature to get the full context of. Now we want to min-max and summarize the video themselves. Then what? If that video summary is too long, we throw it into another LLM to summarize the summary of the summary? To what extent does this end?

There's more to learning than just information density. There's visuals, presentations, explanations. And if you want more proof, then a video played a 2x speed is twice the information density, yet we all know that many videos would be extremely hard to retain anything from at that speed.


Lots of youtube videos are not the well-organized presentations you describe, but instead have a minute or two of good information with ten minutes of meandering asides, background you already know, and other fluff. Some are well-disguised clickbait. A good summary prevents a lot of time wasting.

As for the good videos, I can't watch them all. If I can skim a good summary, I can decide whether, for me, this video is worth watching, or ignoring, or just reading the summary more carefully.


> That way we can spend our time on things that are more productive or enjoyable.

That seems like an arms race, how do you know there aren't more productive ways to enjoy your time? Can something less productive be more enjoyable?


I don't, and yes. I suspect this varies by person.


Sleeping. Occasionally hitting a drum.


we automate so much, we end up making human redundant


I tried a couple videos in both this site and Kagi's summarizer, both were decent but each time Kagi did better.


Whoa, I've been using Kagi for two years and didn't realize the summarizer could do videos!


It only works when there’s a transcript. It’s not “watching” the video. That said, it works very well most of the time for me.


Idea hackneyed since LLM's appeared. Cool that implementation is open-source, though yt automatic captions are sometimes completely off-point, especially when people talking in the video don't have a diction of a tv show host.

I wonder if an idea found it's niche after all? Do you guys summarise you videos to short texts and that leaves you satisfied? For me video is video, I can relax, sit and watch/listen to it. With text it is different, it is a mental exercise to read and process it, so turning video into text feels like an essential downgrade. I would prefer watching at 1.5/2x speed instead of text summary if I want to finish it faster.


If I want to watch a video for the sake of watching a video, then no, a text summary would not at all be the same thing.

But most of the time I don't want to watch a video, I just want to get information. A text summary then would be strictly superior.


> Idea hackneyed since LLM's appeared.

It’s an idea that’s been around long before LLMs. Check out Yahoo under Marissa Mayer acquiring a news summary app. Though it is still hackneyed.

https://finance.yahoo.com/news/yahoo-acquires-summly-app-150...


> For me video is video, I can relax, sit and watch/listen to it. With text it is different, it is a mental exercise to read and process it, so turning video into text feels like an essential downgrade.

Exact opposite for me. Reading goes at my pace, in silence. Video is much more invasive, so I avoid it except for the highest quality stuff.


Interesting, I was always thinking that audio/visual information is naturally much easier to consume. For instance: I can watch a video and count to 10 in my head at the same time – I will still get everything what was in that video – but with text it's a much harder task since the head is fully occupied with "narrating" the text what I'm reading, so reading in the end turns into podcast inside the head before actually get consumed.


I find it much easier to skim texts to find the information I'm looking for. I can't do that with videos unless they have clearly marked chapters.

When I read about a new technology and they have a video instead of a blog post, I just close the browser. If it is useful, someone will probably post a blog on HN at some point in time


2x video speed is still a lot slower than text.


I tried with a few Thunderf00t videos. He has good analysis, but the guys repeats everything too many times. Many are about silly impossible "inventions" / scams, but this is an experiment that he published in Nature Chemistry:

https://tldw.tube/?v=LmlAYnFF_s8 "High speed camera reveals why sodium explodes! --> "Coulombic explosion. (Sodium and water reaction)"


I used to enjoy watching his stuff, but he became hyperbolic, and now has too much rage bait for the algorithm.


That was the first thing that came to my mind, all his 20 minutes videos contains at best 2 minutes of information.


Seems to use gpt-4o under the hood. I wonder if using Deepseek would make any difference in quality.

I used to have nice and detailed summaries with this app lasting many paragraphs. It used to be totally free and you could submit as many links as you wanted. However they started forcing you to wait dozens of hours between summaries or pay for credits. I haven't found a YouTube summarizer as high quality since.

https://play.google.com/store/apps/details?id=com.emote.yout...

I'm thinking one could replicate such a service pretty easily and be able to plug in your Deepseek API token instead. It would be convenient if such services let you "bring your own API key" so to speak, so you'd only have to manage one bank of credits. But it's understandable that lots of people want in on a slice of the AI pie right now.


Something that could be interesting is processing all videos in some subgenre, like say "game development", rank them by popularity, summarize them, and then analyze for patterns. This would be valuable information for anyone wanting to make a video with higher chances of being shown to people.


Or one could not do that and focus on quality instead.


That was actually my initial thought for what I would test with a tool like this. How much does quality matter? Quality of course has many definitions, but in the light of this tool, the measurable quality could be "how much novel information does this video have?" I'd hope a trend that pops out is that content with more information around a focused topic would perform better. But as it is, I only have my personal preferences and my suggestions list (based on my personal preferences) to go by. I'm not a content creator myself, but if I was, I might find this kind of analysis interesting enough to pay for.


Pretty cool. The summary at first was helpful, but then delves into repeating itself. I tried it on John wick. Here’s the latter part of the summary.

The video calls out the cliché tropes and logical inconsistencies in "John Wick," showing how they detract from the film's emotional impact. The critique outlines the logical flaws in "John Wick," focusing on cliché tropes that undermine the film’s emotional depth. The video critiques the logical inconsistencies and clichés in "John Wick," highlighting how they reduce its emotional impact. The critique points out clichés and logical flaws in "John Wick." The video notes clichés and flaws in "John Wick." The video critiques "John Wick."


I made a shortcut for iPad to do more or less the same thing and promised myself to keep “engineering” the prompt but work pretty well as it is.

Link: https://www.icloud.com/shortcuts/fbb5a315cb354fbf903bcfa4f40...

What the shortcut does is taking the transcript of the video and asking your ChatGPT app to extract main ideas, quotes, and facts.

The prompt comes originally from fabric CLI tool https://danielmiessler.com/blog/fabric-origin-story


For specific questions I get the transcript from https://tactiq.io/tools/youtube-transcript then copy paste to a LLM and append my question


Tried Andrej Karpathy's latest video -- https://www.youtube.com/watch?v=7xTGNNLPyMI

It said "too long video"

Exactly, duh ;D


I saw that too. I'm able to use Eightify for longer videos https://eightify.app/


An error occurred: tuple indices must be integers or slices, not str

https://www.youtube.com/watch?v=Ks-_Mh1QhMc&list=PLWDUzz3hCD...


This should be fixed now. There was temporary outage due to proxy running out of bandwidth.


The sweet spot of most YouTube videos was supposedly under 10 minutes. What happened? These days, the typical YouTube videos are almost 30 minutes long. I'm fine with documentaries and the like. Is there a way to bring back the idea that under 10-minute videos work the best?


8 minutes is the cutoff for "mid-roll" ads (you can insert ad breaks into the middle of your content, not just the beginning and end).

But people also realized, especially in the past couple years, YouTube seemed to tweak the algorithm to favor watch time more strongly than in the past, so people started "fluffing up" videos to 20, 30+ minutes. As long as they could get a click with a decent thumbnail and title, and enough people to watch for at least a few minutes, the average watch time would go up, which was a very positive signal. Plus, they could insert like 3+ ad breaks (some big creators insert 6+!) and make more money.

However, in the past year it seems like YouTube's tweaked the algorithm again to not favor watch time as strongly over some other metrics, leading to shorter videos being able to get more views sometimes (and not just music videos). (But they're still "less monetizable".

I think this is why podcasting (and especially extremely long podcasts) have gotten so popular. A few thousand viewers watching for 2 hours gives a lot more ad revenue (especially if you fill in that space with read ads) than like 1 million views on a 3 minute clip.


It's just a natural evolution of more people watching on their TVs and Mobile. Long slop and short slop with nothing in between. Must adapt to the TV and phone brain rot zombies.


Non-fiction text isn't that different. You have 5 to 10 minute blog/article reads and 250 page books from publishers with not a lot in between. (Yes, you can and I have, self-published books of an intermediate length but publishers don't generally want anything under 250 pages or so.)


You're reading the wrong non-fiction books then. Find books that are closer to academia than whatever you find in the non-fiction shelf of an airport bookstore.


You would have to convince YouTube that 10 minute videos generate more revenue.

They’ve been tuning this beast for decades now.


I tried it to summarize some lecture videos. And the summary ranged from average to bad. Nothing I couldn't already get from the description. Even ChatGPT 4o spits out far better content.

So far my method has been to take the transcript and use an LLM with customized prompt for summary.


This feels like the inevitable outcome of the youtube algorithm favouring longer videos.


How can anyone make a generalized statement about “the algorithm” when it is by definition personalized?

All YouTube cares about is keeping you on the platform as long as possible.


because there are generalized algorithms in use as well. The 'show ad' logic that is used at specific points along the way.


does it? i get pushed so many shorts and 15 minute video essays


15 minutes actually is longer than what early videos were. The fact that it seems short is a reflection of the algorithmic push for longness.

Shorts are a whole different thing. But there's a void of videos from 90 seconds to 7 minutes that just answer a question.


Yeah, for many years it didn't allow uploading videos of over 10'.


Getting a "Too long video" message in response to a query is frustrating to the user, and, well, redundant information, so contrary to the purpose of increasing information density.


Interestingly, YouTube also lends itself for this sort of thing for movies. You can watch scenes from movies and get the gist, while only spending ~30min on a movie. It's a great way to watch mediocre movies - they're not so horrible (1), and you do get the entertainment value without being exposed to the shlocky-est parts.

(1) Eg, if you skip the beach scene, Terminator Dark Fate is... palatable! But yeah, the reasons for mediocrity typically still shine through a bit.


I tried using it on a 2-hour video and it said that… the video is too long?

Kind of like hiring a painter to paint your house and him refusing the job because your house is unpainted.


great analogy.


This is so fun!

Could you make it compatible with openrouter please? I see you use the openai binding that allows you to specify a different base_url so that would work out of the box and integrates well with the .env

``` from openai import OpenAI client = OpenAI( base_url="https://openrouter.ai/api/v1", api_key="<OPENROUTER_API_KEY>", ) ```


Text summarisation already frequently misses things, sometimes gets them completely backwards. Can't convince myself this will be any better.


I see the value in this, and thank you for posting it.

That said, it really makes me wonder at how insanely inefficient information transfer over the internet can become. This is like... doing OCR on screenshots of PDFs to send messages, rather than just passing around text. Nice, searchable, parseable, indexible, extremely compressible, editable, readily versionable text.


That's pretty useful. I think yt itself is experimenting with similar feature as I see some videos have AI summaries in the official app.


Yeah. "Some." I've seen one in the past two months.


It seems to have some reasonable length limitations. It refused to distill this epic analysis of dungeons across 3 games, since the video approaches 4 hours long: https://m.youtube.com/watch?v=PajArJbPfpE


What is really cool about this is that my attention span is kept after reading the summary.

Like the following video: https://m.youtube.com/watch?v=JBkAKCJJuFQ

I am not getting emotionally invested to the point of losing the end focus of the speaker. Actually now I am looking forward to it.


My struggle with all these tools is that it’s been universally easier to type “summarize this video <link>” into chatGPT than to even keep track of the link to service. I desperately want a openAI-powered business to be viable, but it’s a snake eating its tail


Most videos would do well with a summary of the summary of the generated transcript. Should take 30 seconds to go through.



This is not the only service of its kind. There is also the Video Gist:

https://www.videogist.co/

I can't speak to the differences between the two. Once casual observation is that the one posted by the OP seems to be a lot faster..


I thought maybe I'd finally be able to through a Wendigoon video with this, since I'm generally interested in the things he talks about but can't stand some of his linguistic tics. Unfortunately looks like most of his videos are too long for this, which I guess is ironic considering the name.

TL;DP


You could try my app https://github.com/rmusser01/tldw

Supports arbitrary length videos and also lets you choose what LLM API to use.


Here’s one video above 1 hour and it works with Scribe https://www.appblit.com/scribe?v=FQUo2r-ow-k


I use YouTube's transcript feature or 2x speed for slow talkers.

Especially with political commentary many YouTubers stretch their content or talk slowly for more ad deliveries. It is annoying, because often there is an important five minute section hidden in a one hour video.


Does it just parse the captions like every other service or does it analyze the audio too?


> An error occurred: tuple indices must be integers or slices, not str


This should be fixed!


There's also a funny one like https://tldw.tech/ which makes everything into a The Verge video.


Cool! I had thought about doing the same but in a Chrome extension that would use Llama 3.2 with WebGPU, but when I tested it, it was very slow and sometimes crashed the browser.


Got a chuckle from that recent commit message. I feel you



Nicely done! Thanks for open sourcing it as I've been able to run it locally and experiment with the prompt.


I like the idea, but I get the message: "Too long video"... So I am wondering about its application if it only works for short videos


I would like to see one of these that produces an outlined summary then has timestamps as citations for the outline's individual entries.


That would be brilliant. I notice that in math/programming videos it will often just say “using a mathematical trick/concept” or “using bit manipulation” when it could just include the actual formula/equation. It would be nice to be able to go to the relevant section in the video to find it.


Will I still get to learn about Brilliant?!


It would be nice if it also gave the videos better titles. Then make it an extension to replace the titles automatically.


Ask and ye shall receive. Not only titles, but thumbnails as well!

https://dearrow.ajay.app/


Change the prompt to "in bullet points" to increase the value by about 80%.

(And add "by section" to get to 100%!)


Bit of a waste to host this locally and still need an openai api key. This would be great with Ollama


How is this different from using Gemini to summarize a video given a YouTube URL?


I tried a video 1:42 hours long and it told me it was too long. Increase the limit?


Curious how well this compares to the same functionality built into Gemini.


Can’t wait till futurama style streaming-into-my-brain is available


too long even this site cannot "watch" it


From some light testing, the results here are pretty lackluster. I've used videosummarizerai.com in the past via ChatGPT and the ability to specify the format of the information you're looking for is a huge plus.

Here's the summary of this video [https://www.youtube.com/watch?v=W1WkG8WuRcg (AoE2 video)] from tldw.tube:

> The video discusses the most powerful team combinations in the "10x Shared Civ Bonus" mod for Age of Empires II. It highlights how certain civilizations, when combined strategically, can create overpowering unit types and tactics. The focus is on four-player team compositions that leverage unique benefits: for instance, the combination of Franks or Incas for free castles, Japanese and Tatars for rapid trebuchet fire, and Celts for enhanced onagers. The video also explores various combinations for cavalry and infantry, including the Goths for swift unit production and a range of secondary civilizations to amplify strengths. Commenters contributed ideas for clever combos, emphasizing the limitless potential for exploiting the mod's mechanics to develop broken strategies that can quickly overwhelm opponents by accelerating unit production or enhancing existing attributes, thereby showcasing the humorous and chaotic nature of the mod's gameplay.

and here's the summary of the same video from videosummarizerai.com via ChatGPT with the prompt "Give a summary of all civ combos mentioned in order":

> Here’s a summary of the civilization combinations mentioned in the video in order:

    Castle Spam Team:
        Incas: Free castles and massive villager armor bonuses.
        Teutons: Adds +30 range to castles.
        Celts: Castles fire at four times the usual rate.
        Sicilians or Spanish: Increased castle build speed (castles built in 30 seconds).

    Ultimate Trebuchet Team:
        Japanese: Faster fire rate and instant packing/unpacking for trebuchets.
        Tatars: Adds +20 range to trebuchets.
        Britons: Adds splash damage and perfect accuracy with Warwolf.
        Saracens or Celts: Additional attack bonuses or faster firing trebuchets.

    Best Onager Combo:
        Celts: Increased fire rate and durability for onagers.
        Mongols: Super-speed movement for onagers.
        Koreans: Adds +10 range and removes minimum range.
        Slavs: Makes onagers free.

    Scorpion Team:
        Khmer: Adds range and additional projectiles.
        Romans: Reduces cost and increases attack rate.
        Ethiopians: Adds torsion engine tech for better area damage.
        Chinese: Adds huge damage to projectiles for almost quadrupled effectiveness.

    [goes on to list remaining 4 team combos]


"Too long video" lol


I tried it on music videos, and with most of them I got the distinct impression it's drawing from the description or external info, maybe PR, because it was very generic, the funniest I saw being https://tldw.tube/?v=kxstMJY2Q28

I was about to give up because after all, this isn't for music videos, tried something it could not possibly do well with, and I'm impressed, against my will.

https://tldw.tube/?v=7X62OJG6Whg

This isn't just an acceptable summary, this is really good, considering how much could be missed (like it did with https://tldw.tube/?v=6k7kk1lUnis for example).




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: