Show HN: EnfinBref- {GPT3-5|Mistral-7B} YouTube summaries, segment by segment (enfinbref.io)
26 points by bclavie 7 months ago | 11 comments
A neat (in my opinion) little side-project I've been working on, both to get somewhat basic React skills going, and to work with LLMs on even more cool projects to build.

It should work for most major languages and output English summaries (or French summaries, if using the main https://enfinbref.io page instead of the /en/ subpage), no matter the input language.

Currently planning on expanding in various directions, including some nice new features like choosing a summary type, better video type identification and LLM routing, and bullet-point exec summaries. Pretty basic on functionality at the moment, and relying on a few tricks. The key stack:

- FastAPI + Python backend, with some extra libs for type validation (Pydantic), translation and YouTube transcript fetching.
- Chained LLM calls with logic: identify the video type with a light model, break the video down into segments and sections, parallelise as much as possible, plus general high-level summaries.
- Models are a mix of a Mistral fine-tune and GPT-3.5, with prompts tailored to the identified type of content and the current context.
- Front-end is my first foray into React + Tailwind, with my last front-end experience before that being jQuery.
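As a rough illustration of the "chained LLM calls" bullet, here is a minimal sketch of how such a pipeline could be wired with asyncio. The function names and the stubbed model calls are hypothetical stand-ins, not the project's actual code; a real version would replace the stubs with calls to the chosen models.

```python
import asyncio

# Hypothetical stand-ins for real model calls; names and behaviour are assumptions.
async def classify_video(transcript_head: str) -> str:
    """Pretend 'light model' call that labels the video type."""
    await asyncio.sleep(0)
    return "sports" if "match" in transcript_head.lower() else "general"

async def summarise_segment(text: str, video_type: str) -> str:
    """Pretend LLM call; a real one would send a prompt tailored to the type."""
    await asyncio.sleep(0)
    return f"[{video_type}] {text[:40]}"

async def summarise_video(segments: list[str]) -> dict:
    # Step 1: identify the video type once, from the start of the transcript.
    video_type = await classify_video(segments[0] if segments else "")
    # Step 2: summarise all segments concurrently; gather keeps the input order.
    segment_summaries = await asyncio.gather(
        *(summarise_segment(s, video_type) for s in segments)
    )
    # Step 3: build a high-level summary on top of the segment summaries.
    overview = await summarise_segment(" ".join(segment_summaries), video_type)
    return {
        "type": video_type,
        "segments": list(segment_summaries),
        "overview": overview,
    }

result = asyncio.run(summarise_video(["Intro to the match", "Second half recap"]))
```

The only design point this tries to show is that the type identification happens once up front, while the per-segment calls are independent of each other and can run in parallel.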

Inspired by a post a while back about Summary Cat, but with a more in-depth approach: all summaries are segment-by-segment to get a more in-depth view of potentially complex videos. Segments are defined as being 3 minutes long for short videos, 5 minutes for longer ones. Anything above 45 minutes is broken down into 45-minute sections, both for ease of context length handling (solidly into gpt-3.5-16k territory, which is already more annoying to run than Mistral-7B, and any further would require GPT-4) and because things get a bit murkier to handle in terms of clarity when going above that limit.
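The segment and section rule described here could be sketched roughly as follows. The 3-minute, 5-minute and 45-minute values are from the post; the cut-off between a "short" and a "longer" video is not stated, so `SHORT_VIDEO_MAX_SECONDS` is a made-up value, and the function names are hypothetical.

```python
SECTION_SECONDS = 45 * 60          # from the post: 45-minute sections
SHORT_SEGMENT_SECONDS = 3 * 60     # from the post: 3-minute segments for short videos
LONG_SEGMENT_SECONDS = 5 * 60      # from the post: 5-minute segments for longer ones
SHORT_VIDEO_MAX_SECONDS = 20 * 60  # ASSUMPTION: the post does not say where "short" ends

def split_range(start, end, step):
    """Return (start, end) windows of at most `step` seconds covering [start, end)."""
    return [(s, min(s + step, end)) for s in range(start, end, step)]

def plan_video(duration_seconds):
    """Return a list of sections, each a list of (start, end) segments in seconds."""
    if duration_seconds <= SHORT_VIDEO_MAX_SECONDS:
        step = SHORT_SEGMENT_SECONDS
    else:
        step = LONG_SEGMENT_SECONDS
    sections = split_range(0, duration_seconds, SECTION_SECONDS)
    return [split_range(s, e, step) for s, e in sections]

plan = plan_video(60 * 60)  # a one-hour video
```

With these assumed values, a one-hour video ends up as two sections (0 to 45 minutes, then 45 to 60 minutes), each cut into 5-minute segments.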

(the name is from a common French idiom for "anyway")




Cool I really like this, I've used https://www.summarize.tech/ very often, will try your site next time I need it, too. Thanks


How do you break down the segments/sections? Is it just fixed time? What happens if there is more than one topic discussed in the segment?

Are you using both chatgpt and mistral? Do you use them for different tasks?


> How do you break down the segments/sections? Is it just fixed time? What happens if there is more than one topic discussed in the segment?

Currently it's just a dumb fixed time rule, based on video length (3- or 5-minute segments). I played around a bit and it's the easiest thing to implement that still works remarkably well. If there are multiple topics, there are a few branching paths in the code, but a lot of it comes down to believing in the LLM's ability to make sense of it. I've got some ideas to improve, but they would need a bunch of work to implement well.

> Are you using both chatgpt and mistral? Do you use them for different tasks?

There's a degree of A/B testing (well, "A/B testing", since we're not collecting feedback) where some of the summaries are GPT, some of them are Mistral, mixed together for the same video. Mistral being superbly fast means it's also really useful to support the branching code logic (e.g. something I'm working on right now is having an entirely different summarisation style if a video is about sports, and while a logistic regression would do that pretty well, it's not particularly robust, and won't tell me what sport it is if the transcript is full of typos) or to clean up the video transcripts.
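To make the mixing and branching a bit more concrete, here is one way it could look. This is only a sketch under assumptions: the backend labels, the prompt templates and the hash-based assignment are all hypothetical and not taken from the actual service.

```python
import hashlib

# Hypothetical backend labels and prompt templates; none of these come from the post.
BACKENDS = ("gpt-3.5", "mistral-7b-finetune")
PROMPTS = {
    "sports": "Summarise this {sport} commentary, focusing on key plays:\n{text}",
    "general": "Summarise this video segment:\n{text}",
}

def pick_backend(video_id: str, segment_index: int) -> str:
    """Deterministically spread the segments of one video across both backends."""
    digest = hashlib.sha256(f"{video_id}:{segment_index}".encode()).digest()
    return BACKENDS[digest[0] % len(BACKENDS)]

def build_prompt(text: str, video_type: str, sport: str = "unknown sport") -> str:
    """Pick a template from the identified video type, defaulting to 'general'."""
    template = PROMPTS.get(video_type, PROMPTS["general"])
    return template.format(text=text, sport=sport)

prompt = build_prompt("Great serve to open the set.", "sports", sport="tennis")
```

Hashing the video id and segment index keeps the assignment stable across reloads, which would matter if feedback were ever collected per summary.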


Now this could actually be useful.

From time to time information is only provided in the form of video. But watching video is much less convenient for me. Even if useful, it just doesn't vibe with office work.


Thanks! I agree -- I find it much easier to skim a few paragraphs than to skim through a video when trying to consume information quickly if I'm not sure I want to commit to a full, long vid. Hoping to make it useful enough that it ends up paying for its own server costs so I can keep it around!


Pardon my French, but merde, this is impressive. I've tried it on 20-40min French videos and the summary and section-dividing is spot on.


Merci! It's early on but I'm quite happy with how the first prototype turned out.


Impressive, thanks. How could one run something like that on local videos?

Btw I love the name.


> Impressive, thanks. How could one run something like that on local videos?

It depends how involved you'd want it to be really. You can get a very simple summary using something like Whisper to transcribe a video and having basic LLM calls. More involved summaries/segment breakdown/fine-tuned models would be a lot more work, but might not be needed for something quick with local vids?
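For the "very simple" route, a minimal sketch could look like the following. It assumes the open-source `openai-whisper` package (and ffmpeg) is installed; the helper names, the 3-minute window and the prompt wording are illustrative choices, and the actual LLM call is left out since it depends on whichever model you use.

```python
def transcribe(path: str, model_name: str = "base") -> list[dict]:
    """Transcribe a local file with the openai-whisper package."""
    import whisper  # imported lazily so the helpers below work without it
    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["segments"]  # each item has "start", "end" and "text"

def group_by_window(segments: list[dict], window_seconds: float = 180.0) -> list[str]:
    """Concatenate transcript pieces into fixed-length windows, keyed by start time."""
    windows: dict[int, list[str]] = {}
    for seg in segments:
        index = int(seg["start"] // window_seconds)
        windows.setdefault(index, []).append(seg["text"].strip())
    return [" ".join(windows[i]) for i in sorted(windows)]

def build_prompts(chunks: list[str]) -> list[str]:
    """One basic summarisation prompt per chunk, to send to any LLM."""
    return [f"Summarise this part of a video transcript:\n{c}" for c in chunks]

# Hand-written data in the shape Whisper returns, so this runs without the model.
fake_segments = [
    {"start": 0.0, "end": 5.0, "text": " Hello"},
    {"start": 100.0, "end": 110.0, "text": " world"},
    {"start": 200.0, "end": 210.0, "text": " Next part"},
]
chunks = group_by_window(fake_segments)
```

On a real file you would call `group_by_window(transcribe("my_video.mp4"))` instead, where the file name is just a placeholder.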

> Btw I love the name.

Thank you! This is actually a project I had back in 2018ish, which fizzled out because I didn't have enough time to get good enough summaries going during the pre-LLM era. I let the domain name expire and a few weeks ago realised it was still free, so got building again and re-bought it!


Thanks for the advice! I think I'll subscribe as soon as I find a job again :)


Haha thanks! Take your time, for now the whole thing runs on free cloud credits, so the premium button is purely decorative. Good luck with your job search!



