
Show HN: MP3 to Text - sabbakeynejad
https://www.veed.io/tools/mp3-to-text
======
remram
"MP3 to Text" seems very inaccurate since you can only upload video files. In
fact uploading an .mp3 file shows "File type not supported".

edit: I get it, OP just keeps submitting his service with different
descriptions until one gets some upvotes. Only took 25 tries to get 30 points.
Shameful.

~~~
arteez
Just goes to show that people upvote anything if it sounds cool, and don't
bother checking it out.

------
fredley
MP3 to text? Why does it ask me to upload a video?

~~~
sabbakeynejad
Oops, that's a UX problem. Will get this fixed now.

------
rock_artist
Very cool, but how do I know which languages are supported? It says "VEED is able to
recognise and transcribe languages from all over the world - English, Spanish,
French, Chinese, and many more".

From my experience with NLP/ASR, the tricky part is getting good models for some
less common languages.

~~~
sabbakeynejad
This is true, we support over 55 languages. The more popular the language the
better the results.

------
rubyron
What’s the pricing? What speech-to-text engine is being used?

Clicking on the Sign Up button on iOS Safari does nothing.

Clicking on the Get Started button takes me to an Upload Video form - not what
I expected from an MP3-to-text service.

~~~
remram
Apparently you're limited to 50 MB for free, which is pretty short if you
can't send audio files but only videos.

------
mxuribe
This would have been genius in the napster days of yore; why seek out and
download mp3s yourself? Just sit back and have people send stuff *to you*!
I kid, I kid! ;-)

------
remram
Is there even a good offline version of this? There are some open-source tools
for speech-to-text but what about batch processing of audio files?

~~~
synesthesiam
You may be interested in voice2json for offline batch processing:
[https://voice2json.org](https://voice2json.org)

Here's an example using GNU parallel:
[http://voice2json.org/recipes.html#parallel-wav-recognition](http://voice2json.org/recipes.html#parallel-wav-recognition)
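
If you would rather script the batch step in Python than use GNU parallel, here is a minimal sketch. It assumes the transcription tool takes an audio file path as its last argument and prints the transcript to stdout; `run_on_file` and `batch_process` are hypothetical helper names, not part of voice2json.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_on_file(command, path):
    """Run `command` with `path` appended as the last argument; return its stdout."""
    result = subprocess.run(
        [*command, str(path)], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def batch_process(command, directory, pattern="*.wav", workers=4):
    """Apply `command` to every matching file and return {filename: output}."""
    paths = sorted(Path(directory).glob(pattern))
    # Threads are enough here: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = pool.map(lambda p: run_on_file(command, p), paths)
        return {p.name: out for p, out in zip(paths, outputs)}
```

Something like `batch_process(["voice2json", "transcribe-wav"], "recordings/")` would be the intended use, assuming `transcribe-wav` accepts a WAV path on the command line; check the voice2json docs for the exact invocation.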

~~~
yorwba
> voice2json is optimized for:

> Sets of voice commands that are described well by a grammar

> Commands with uncommon words or pronunciations

> Commands or intents that can vary at runtime

Doesn't sound like what you'd want for a generic transcription service.

~~~
synesthesiam
It supports open-ended transcription too:
[https://voice2json.org/commands.html#open-transcription](https://voice2json.org/commands.html#open-transcription)

Users have reported good accuracy with the English Deepspeech profile:
[https://github.com/synesthesiam/voice2json-profiles](https://github.com/synesthesiam/voice2json-profiles)

------
Lemmih
How does this work? And is it more accurate than YouTube's automatic captions?

------
asah
Multi speaker?

------
236dev
Reminds me of Descript.

