
Show HN: SpeechBoard – Edit Podcasts from the Transcript - craigcannon
http://speechboard.co/
======
craigcannon
Hey HN!

Craig from YC here. Ramon Recuero and I built SpeechBoard.

Here's how it works: you record a podcast and upload it to SpeechBoard. We run
it through a few speech-to-text APIs to generate a transcription for you. From
there you can delete words from the transcript and we cut them from the audio.

Then you can download three files: your edited audio, your original file with
cuts marked in metadata for importing to Audition/Audacity, and labels for
importing into Audacity.
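
To make the transcript-to-audio step concrete, here is a minimal sketch of how deleted words could be turned into cut ranges and an Audacity label file. It is not SpeechBoard's actual code: the `Word` class, `cut_ranges`, and `audacity_labels` are hypothetical names, and the word timestamps are assumed to come from whatever speech-to-text or alignment step is used. The only format detail relied on is that Audacity label tracks import as tab-separated `start`, `end`, `label` lines.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def cut_ranges(words, deleted):
    """Merge the time spans of deleted words into contiguous cut ranges.

    `deleted` is a set of indices into `words`. Adjacent deleted words
    are merged into a single (start, end) range.
    """
    ranges = []
    prev = None
    for i in sorted(deleted):
        if prev is not None and i == prev + 1:
            # Extend the previous range to cover this neighbouring word.
            ranges[-1] = (ranges[-1][0], words[i].end)
        else:
            ranges.append((words[i].start, words[i].end))
        prev = i
    return ranges


def audacity_labels(ranges, label="cut"):
    """Render ranges as an Audacity label track (tab-separated lines)."""
    return "".join(f"{s:.6f}\t{e:.6f}\t{label}\n" for s, e in ranges)


words = [
    Word("so", 0.0, 0.2), Word("um", 0.2, 0.5), Word("uh", 0.5, 0.9),
    Word("hello", 0.9, 1.4), Word("like", 1.4, 1.6), Word("world", 1.6, 2.0),
]
ranges = cut_ranges(words, {1, 2, 4})
print(audacity_labels(ranges), end="")
```

In practice the hard part is the one mentioned below: the in/out points of each word are only as good as the timestamps, which is why exporting labels for manual adjustment is useful.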

Nailing the in/out points of words was the hardest part, which led us to
create the Audition/Audacity import feature and now I think that's the best
part :)

Let us know what you think!

~~~
AndrewUnmuted
This is great! You've nailed one of the major pain-points in creating and
producing high quality podcasts/audiobooks.

The other major pain-point that I've identified after spending 10 years in the
podcast and audiobook industries is audio mastering. As a mastering engineer,
I know how troublesome it can be to master audio so that it is at a
competitive dynamic range.

Would you be interested in implementing an "auto-master" feature that I've
developed? If so, please reach out! My email is in my HN profile.

~~~
jv22222
I think you can pass the output audio file through
[https://auphonic.com/](https://auphonic.com/) and that will get the job done
for you.

------
dacohenii
This reminds me of a recent Radiolab episode [0] in which they discuss a new
technology that would allow you not only to remove words from audio, but also
to add new ones in the speaker's voice -- and likeness, in the case of video.
They made an example video [1], which was not super-convincing, but a very
good demonstration of what it may be able to do in the future.

Of course, the focal point of the episode isn't the technology itself, but the
implications it could have for society once it gets good enough that its output
is virtually indistinguishable from real video (i.e. fake news in the form of
convincing-looking videos).

Highly recommended if you have the time.

[0] [https://www.radiolab.org/story/breaking-news/](https://www.radiolab.org/story/breaking-news/)
[1] [https://www.futureoffakenews.com/](https://www.futureoffakenews.com/)

~~~
rrecuero
Take a look at Lyrebird as well [https://lyrebird.ai/](https://lyrebird.ai/)
;)

------
binaryjason
Very interesting application. I am not sure if you guys have looked into this,
but there is a Python library that can detect timestamps on word level if
given the audio and transcript. It's pretty accurate for English:
[https://github.com/readbeyond/aeneas](https://github.com/readbeyond/aeneas)

~~~
craigcannon
Thanks! Yeah, we have used it and can confirm it's pretty good :)

------
levistoddard
I've been working on a similar problem, and I'm excited to share as well.

Sample:

[https://reader.listensynced.com/ycombinator-jessica-livingston-on-how-to-build-the-future.html](https://reader.listensynced.com/ycombinator-jessica-livingston-on-how-to-build-the-future.html)

As a podcast junkie, I've often run into issues with searching and sharing.
Linking transcript and audio is the first step to solving this...

~~~
craigcannon
Neat! I know that guy :)

------
graham1776
There is a podcast I listen to every time it comes out (Uhh Yeah Dude), but
the creators don't create transcripts. One idea I had was directly targeting
podcast creators and creating a service whereby you create searchable
transcripts for listeners looking for an old podcast.

I think the transcript creation service in itself must be worth something for
these guys.

~~~
froindt
I would be quite interested in that, especially if it could integrate with my
podcast app.

Since August of 2016 I've listened to 30 days worth of podcasts and saved
another 22 days worth of time by skipping introductions, listening faster than
1x, etc. The most annoying thing I'm facing is that I've heard _hundreds_ of
stories and the audio cannot be indexed easily. If I want to send a friend to
a specific episode for a certain story, I have no good way to remember if it
was the Freakonomics podcast, This American Life, Story Collider, Planet
Money, or one of the 20+ other podcasts I listen to.

I'd love a system which would make available a searchable transcript of every
podcast. I couldn't pay for transcribing all of them, but I'd pay 50
cents/podcast. Google tells me $1.75/minute will get a transcription from the
top listed service, so if we had 210 people like me, we could transcribe an
hour of audio.
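
The arithmetic behind that figure, as a quick check. The variable names are just for illustration, and the currency is assumed to be dollars; the numbers are the ones quoted above.

```python
import math

price_per_minute = 1.75   # quoted transcription price per minute of audio
episode_minutes = 60      # one hour of audio
pledge_per_listener = 0.50

# Total cost to transcribe one hour, and how many listeners are
# needed at 50 cents each to cover it.
episode_cost = price_per_minute * episode_minutes
listeners_needed = math.ceil(episode_cost / pledge_per_listener)

print(episode_cost, listeners_needed)
```

So an hour costs 105 at that rate, and 210 listeners at 50 cents each would cover it.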

~~~
craigcannon
Roger that. Thanks!

------
rectangletangle
I like how the core functionality/UI is immediately accessible above the fold.
No log in, or other barriers to simply trying it out.

Great job!

~~~
craigcannon
Thanks! :)

------
sturmen
This is excellent. Looking forward to the full release with the hope that
pricing is affordable for hobbyists. :)

~~~
craigcannon
Thanks! Can you email us?

human@speechboard.co

We're looking to chat with hobbyists to see what you'd like out of it.

------
mistercow
This is really cool. There's something magical about just editing text, and
then having real spoken audio change to reflect it.

I hit an error when I trimmed the text down to:

> Hey this is a different original.

> The text cuts into

Maybe I was too aggressive?

Some undo support would also be really helpful. Have you considered just
having a free-form text field, then using something like wdiff to produce the
edits? That might make the UX easier, since you wouldn't have to manually
reinvent the text editing tools people expect (although you'd have to handle
invalid edits, like people adding new words).
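
A rough sketch of that idea, using Python's standard `difflib` as a stand-in for wdiff (a different tool, same word-level diff concept). The function name `deleted_word_indices` is hypothetical. It compares the original and edited transcripts word by word, returns the indices of deleted words, and rejects any edit that adds or changes words.

```python
import difflib


def deleted_word_indices(original, edited):
    """Return indices of words in `original` that were removed in `edited`.

    Raises ValueError if the edit inserts or replaces words, since only
    deletions can be mapped back onto the audio.
    """
    orig_words = original.split()
    new_words = edited.split()
    matcher = difflib.SequenceMatcher(a=orig_words, b=new_words, autojunk=False)

    deleted = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            deleted.extend(range(i1, i2))
        elif tag != "equal":
            # "insert" or "replace": the user typed words that are not
            # in the recording.
            raise ValueError(f"edit adds or changes words: {new_words[j1:j2]}")
    return deleted


print(deleted_word_indices(
    "Hey this is a different original recording",
    "Hey this is original",
))
```

The returned indices could then be looked up against per-word timestamps to decide what to cut. Note this does nothing about the "cutting words in half" problem; a half-deleted word would show up as a replacement and be rejected.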

~~~
craigcannon
Thanks!

Yeah, without looking at your logs I suspect you cut too much. :)

We were using a free-form text field before and it led to a couple issues:
cutting words in half and typing in new text. Both of those basically break it now, so we
went for a slightly less convenient but mostly functional demo.

I totally agree though, this needs a lot of polish on the UX side.

------
onuralp
Hi Craig,

This looks very interesting. You might be the right person to ask about
something related that I am currently working on: do you know of any app that
would extract keyword / name based parts of audio? For example, extract only
the parts where Elon Musk speaks given audio input (podcast, YouTube etc.)?
Alternatively, extract only the parts (-30 and +30 seconds) when a specific
word is mentioned.
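
For the keyword case, the windowing logic itself is simple once a word-level timestamped transcript exists. A minimal sketch, assuming the transcript is a list of `(text, start, end)` tuples sorted by start time; `keyword_windows` and its parameters are hypothetical names, not part of any existing tool.

```python
def keyword_windows(words, keyword, pad=30.0, duration=None):
    """Return merged (start, end) windows around each mention of `keyword`.

    `words` is a list of (text, start, end) tuples sorted by start time.
    Each mention is padded by `pad` seconds on both sides, clamped to
    [0, duration], and overlapping windows are merged.
    """
    target = keyword.lower()
    windows = []
    for text, start, end in words:
        # Ignore case and trailing punctuation when matching.
        if text.lower().strip(".,!?") != target:
            continue
        lo = max(0.0, start - pad)
        hi = end + pad if duration is None else min(duration, end + pad)
        if windows and lo <= windows[-1][1]:
            windows[-1] = (windows[-1][0], max(windows[-1][1], hi))
        else:
            windows.append((lo, hi))
    return windows


transcript = [
    ("Tesla", 10.0, 10.5), ("launch", 20.0, 20.4),
    ("tesla,", 50.0, 50.5), ("Tesla", 200.0, 200.5),
]
print(keyword_windows(transcript, "tesla", duration=220.0))
```

The resulting windows could be passed to any audio tool that can slice by time range. Extracting only the parts where a given person speaks is a harder problem, since it needs speaker identification rather than keyword matching.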

Thanks!

~~~
craigcannon
Hey!

Audiogrep may be able to do that for ya -
[https://github.com/antiboredom/audiogrep](https://github.com/antiboredom/audiogrep)

~~~
frik
> This looks very interesting. You might be the right person to ask about
> something related

Hi Craig, do you know an app/code that can split the audio/transcript based on
persons? Detect different persons in a podcast and group the transcript by
person. Thanks!
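
To illustrate the grouping half of the question: if some diarization step has already produced speaker turns, attaching transcript words to speakers is straightforward. This sketch assumes both inputs exist; the turn and word tuples and the `group_by_speaker` function are hypothetical, and the hard part (detecting who speaks when) is not shown.

```python
def group_by_speaker(words, turns):
    """Group transcript words into consecutive per-speaker passages.

    `words` is a list of (text, start, end) tuples in time order.
    `turns` is a list of (speaker, start, end) tuples from a
    diarization step. Each word is assigned to the turn containing
    its midpoint, or "unknown" if no turn covers it.
    """
    def speaker_at(t):
        for speaker, start, end in turns:
            if start <= t < end:
                return speaker
        return "unknown"

    grouped = []
    for text, start, end in words:
        speaker = speaker_at((start + end) / 2)
        if grouped and grouped[-1][0] == speaker:
            grouped[-1][1].append(text)
        else:
            grouped.append((speaker, [text]))
    return [(speaker, " ".join(texts)) for speaker, texts in grouped]


turns = [("A", 0.0, 2.0), ("B", 2.0, 4.0)]
words = [
    ("Hi", 0.0, 0.5), ("there", 0.5, 1.0),
    ("Hello", 2.2, 2.8), ("Craig", 2.8, 3.4),
]
print(group_by_speaker(words, turns))
```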

~~~
craigcannon
Hey!

That's something we're also interested in.

You can read up on the subject and see a few projects here -
[https://en.wikipedia.org/wiki/Speaker_diarisation](https://en.wikipedia.org/wiki/Speaker_diarisation)

[https://dsp.stackexchange.com/questions/3119/library-to-differentiate-people-by-their-voice-timbre](https://dsp.stackexchange.com/questions/3119/library-to-differentiate-people-by-their-voice-timbre)

But to answer your question, I have yet to try an app that can do it well.

~~~
raja
Speechmatics diarisation is pretty good.
[https://www.speechmatics.com/](https://www.speechmatics.com/)

~~~
craigcannon
Cool. Will check it out.

------
vermontdevil
My question is about the availability of transcripts for deaf people and
others to use. Is this possible as another feature of your service?

~~~
craigcannon
What other features would you need to make it work well for you?

~~~
vermontdevil
Hi. Not for me. Just thinking out loud: if the transcripts are generated
automatically by your service, that would be a way for podcasters to provide
them along with their audio recordings.

~~~
craigcannon
Ah. Gotcha.

------
dogruck
I would like an (automated) “IMDB for Podcasts.” Specifically, I would like to
be able to find, and be notified of, every podcast where a given person
speaks.

Similarly, I’d like automated data on what ads are run/read.

Essentially, I’d like rich automated metadata, in addition to timestamped
transcripts.

~~~
craigcannon
Yes! I feel that way too. So far I've found Breaker has the best guest search.
Definitely lots to work on in the podcast space :)

------
jtbayly
Won't let me upload a file in Safari, so I tried Chrome. Upload works, but
then I just get "An error has occured. We are looking into it."

Like the demo. Wish I could try it out on some other audio.

~~~
craigcannon
Try incognito in Chrome. We're trying to sort out that bug. Thanks for your
patience!

~~~
jtbayly
No dice, but I signed up for your mailing list. Will look forward to the final
product.

------
orliesaurus
Somewhat relevant (but more enterprise-y) project based in Austin, TX for
anyone local: [http://clarify.io/](http://clarify.io/)

------
patwalls
Hey! This is awesome.

Side question: I just need (good) transcription of audio. I've never been able
to find a good service for the price.

Does anyone have any recommendations?

~~~
GarethX
I’ve used rev.com for a while. They’re quick and accurate.

~~~
patwalls
$1 a minute is just way too expensive for what I need. That sounds like a
human is doing it...

Any services doing this automated?

~~~
thenomad
I just discovered [https://trint.com](https://trint.com) .

No idea if they're any good, but they're certainly cheaper than Rev.

------
geetfun
Really magical to see audio editable like this. Love it.

~~~
craigcannon
Thanks!

------
orliesaurus
What languages are you supporting out of the box?

~~~
craigcannon
We've only tested with English so far but it should be able to handle a bunch:
Arabic, English, Spanish, French, Portuguese, Japanese, and Mandarin.

If you test another language out definitely let us know how it performs.

------
phirschybar
cool idea

~~~
craigcannon
Thanks!

------
hn_hates_tor3
Is this a YC project or a personal project? Are you applying to YC with this
to get funding by any chance? Nice one by the way!

