terryops's comments

I tried to have it add punctuation to a list of words from ASR, but it changes the words no matter how I tweak the prompt. lol.


lol mind sharing an example?

also, in case you haven't already, you can try other models too by changing the model in the code to see if they do better


Breakdown of SubEasy.ai

SubEasy.ai is an all-in-one platform where you can create automatic subtitles, AI translations, and transcriptions with speaker names, chat with the transcription, and export the result as a video or a text document.

Transcribe:

1. Powered by Whisper: We leverage OpenAI’s Whisper model, which supports many languages with high accuracy, especially in multilingual scenarios. This gives us a competitive edge over ‘traditional’ transcription services.

2. Enhanced Accuracy and Readability: Whisper isn’t perfect, so we aimed to maximize its potential. We implemented the following:

  - Clear +: Whisper can pick up background noise in audio and video, like the voices of passersby, music, and even honking. With Clear +, we remove this noise using Demucs and normalize the audio before sending it to Whisper for transcription.

  - Subtitle Reflow: Many audio/video-to-subtitle applications group large blocks of text within the same timeframe, resulting in overly long subtitles on the screen. Our exclusive Subtitle Reflow feature gives you context-aware cutting and time-aware segmentation, improving the viewing experience. If you’re interested in the tech specs, we actually use smaller NLP models to achieve this. (Just to say: don’t use LLMs everywhere. They’re too expensive and very unpredictable.)

3. Enhanced Transcription View: We turn audio into well-constructed articles with punctuation, sentences, and paragraphs, useful for previewing podcasts, long audio and video files, and meeting minutes.

  - Speaker Recognition: This feature identifies different speakers in a multi-speaker conversation, making it easier to follow who’s speaking. We use the NVIDIA NeMo toolkit for state-of-the-art accuracy in speaker recognition.
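
To make the Subtitle Reflow idea a bit more concrete, here is a minimal sketch of time-aware segmentation over word-level timestamps. It is only an illustration, not how SubEasy.ai does it: the post says smaller NLP models are used for the context-aware part, while this sketch swaps in simple punctuation and pause rules. The names `Word`, `Cue`, `reflow`, and the thresholds `max_chars`, `max_duration`, and `max_gap` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


SENTENCE_END = (".", "?", "!")


def reflow(words: List[Word], max_chars: int = 42,
           max_duration: float = 5.0, max_gap: float = 0.6) -> List[Cue]:
    """Group timed words into short subtitle cues (rule-based stand-in)."""
    cues: List[Cue] = []
    current: List[Word] = []

    def flush() -> None:
        # Emit the words collected so far as one cue.
        if current:
            cues.append(Cue(" ".join(w.text for w in current),
                            current[0].start, current[-1].end))
            current.clear()

    for word in words:
        if current:
            text_len = len(" ".join(w.text for w in current)) + 1 + len(word.text)
            too_long = text_len > max_chars                          # line would be too wide
            too_slow = word.end - current[0].start > max_duration    # cue would stay up too long
            long_pause = word.start - current[-1].end > max_gap      # speaker paused
            if too_long or too_slow or long_pause:
                flush()
        current.append(word)
        # Hypothetical stand-in for "context-aware cutting": break at sentence ends.
        if word.text.endswith(SENTENCE_END):
            flush()
    flush()
    return cues
```

Feeding it a list of `Word` items (for example, word timestamps from a transcription step) returns cues that respect a character limit, a duration limit, and pauses in speech. A real system would likely replace the punctuation rule with a model that scores candidate break points.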

What Makes it Next-Gen?

1. Context-Aware AI Translation: Most translation services work sentence by sentence, missing context-specific meanings. Using modern AI models, we create context-aware and highly accurate translations. We also introduced a second round of refinement and proofreading, launching AI Plus translation, which can sometimes outperform human translators.

2. Chat with the Transcript: We integrated GPTs with our platform, allowing users to interact with their documents in natural language. You can summarize and rewrite transcripts, and much more, in ChatGPT. Since ChatGPT has now rolled out many previously Plus-only features to free users, you can use this feature at no extra cost!

3. Integrated AI Companion: You can create summaries, meeting minutes, show notes, and social media content with one click without leaving the page. Regardless of the transcript language, you can always get AI content in English (or another language you prefer).
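
For anyone wondering what "context-aware" can look like in practice compared with sentence-by-sentence translation, below is a rough sketch of one common approach: give the model a few neighbouring subtitle lines as context and ask it to translate only the target line. This is an assumption for illustration, not a description of SubEasy.ai's pipeline; `build_context_windows`, `render_prompt`, and the window sizes are hypothetical, and no real translation API is called.

```python
from typing import Dict, List


def build_context_windows(lines: List[str], before: int = 2,
                          after: int = 1) -> List[Dict[str, object]]:
    """Pair every subtitle line with a few neighbouring lines as context."""
    windows: List[Dict[str, object]] = []
    for i, line in enumerate(lines):
        windows.append({
            "before": lines[max(0, i - before):i],   # up to `before` earlier lines
            "target": line,                          # the line to translate
            "after": lines[i + 1:i + 1 + after],     # up to `after` later lines
        })
    return windows


def render_prompt(window: Dict[str, object], target_language: str) -> str:
    """Build a plain-text prompt; sending it to a model is left to the caller."""
    parts = [
        f"Translate only the TARGET line into {target_language}.",
        "Use the CONTEXT lines to resolve meaning; do not translate them.",
    ]
    for text in window["before"]:
        parts.append(f"CONTEXT: {text}")
    parts.append(f"TARGET: {window['target']}")
    for text in window["after"]:
        parts.append(f"CONTEXT: {text}")
    return "\n".join(parts)
```

Each prompt keeps a one-to-one mapping between source lines and translated lines, which matters for subtitles because the timestamps stay attached to the original cues.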

What Makes the Product More Than Good:

We offer a WYSIWYG video preview with multiple subtitle styles, a lightning-fast subtitle/transcript editing interface, a document management system, search, video output, multi-format document output, and more. We believe we have the best overall performance and experience in this specific field.

Final Thoughts

Creating SubEasy.ai has been an incredible journey, inspired by a simple yet profound desire to make my wife's viewing experience more enjoyable. It started as a personal project but quickly evolved into something much larger, driven by the potential to help others facing similar challenges with transcriptions and subtitle translations.

For those who need reliable transcription and translation services, I invite you to give SubEasy.ai a try. You might be pleasantly surprised by its capabilities and the seamless experience it offers. Whether you're curious about the technical aspects, the cost, or just want to provide feedback, I'd love to hear from you. Your insights will help us continue to improve and innovate.

Thank you for taking the time to read about our journey and the creation of SubEasy.ai!


I created it, and it does an excellent job with subtitle segmentation and translation. Feel free to try it!

