You're likely to find people using this to build solutions for remote depositions (in the US). It seems like a space fairly ripe for disruption, and the pandemic is only increasing the demand.
Is there a recorded demo of it somewhere? Would be nice to see it in action as I'm having a little trouble understanding the workflow.
While we were trying to build up a corpus of transcription data for our company, I often thought we should build something like this and open-source it. We ended up building a one-off hacked-up thing to do it instead, but I'm really glad this exists for any future people in our shoes.
Annotating speech data is super tedious and anything that improves the process even 10% is a huge, huge win.
We used to do the same. That's why we developed this tool: to address a lot of the pain points earlier tools had. Thanks for sharing your experience, and we'd love to hear how this tool works out for you.
How does this compare to ELAN (https://archive.mpi.nl/tla/elan) in regards to doing the actual annotations/transcriptions? Or could ELAN/EAF-files perhaps be considered for input formats in future releases?
At our lab, we work extensively on problems involving speech data, including speech recognition, speech scoring, emotion recognition, topic detection and speaker diarisation. Some of these tasks have public data available, but for tasks like speech scoring and low-resource speech recognition, the data available for supervised learning is fairly limited. Hence, we developed this annotation tool to generate corpora for our needs.
In case it's still not clear: it does not do the transcription, it does not. Oh hi Mark. It asks you to annotate the audio manually (in case you want to prepare a training dataset for your algorithm); it's not an AI algorithm.
This is the most helpful comment here. I still don’t understand what the tool is for though. Up until now I assumed it would allow me to get automatic transcriptions, including breaking them down by speaker.
I was looking into that space recently and have used otter.ai for transcriptions; it gives you 6,000 minutes/month for 8 USD, which is insanely cheap in that space. Their British English model is quite good as well.
I've bulk-exported generated srt/vtt files from my favourite podcasts and, using tinysearch (which was posted here recently) together with AblePlayer, built full-text audio search for my Jekyll-published podcast posts, with clickable timestamps that jump playback to the matched phrase.
Whenever I want to know what a podcaster has to say on a specific subject, a quick search makes such a difference!
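In case it helps anyone wanting to do something similar: the srt/vtt-to-search-index step is the only mildly fiddly part. Here's a rough TypeScript sketch of it; the file names and output shape are made up, and you'd still need to feed the JSON into tinysearch and wire the timestamps to AblePlayer yourself.

```typescript
// Rough sketch: turn a WebVTT file into { start, text } records that a
// client-side search index (e.g. tinysearch) could consume. File names and
// the output shape are illustrative, not from any particular tool.
import { readFileSync, writeFileSync } from "fs";

interface Cue {
  start: string; // "00:01:23.450", usable later as a seek target for the player
  text: string;
}

function parseVtt(vtt: string): Cue[] {
  const cues: Cue[] = [];
  const blocks = vtt.split(/\r?\n\r?\n/); // cues are separated by blank lines
  for (const block of blocks) {
    const lines = block.split(/\r?\n/).filter((l) => l.trim() !== "");
    const timingIdx = lines.findIndex((l) => l.includes("-->"));
    if (timingIdx === -1) continue; // skips the WEBVTT header and NOTE blocks
    const start = lines[timingIdx].split("-->")[0].trim();
    const text = lines.slice(timingIdx + 1).join(" ").trim();
    if (text) cues.push({ start, text });
  }
  return cues;
}

// Example: build a JSON corpus for one episode.
const cues = parseVtt(readFileSync("episode-042.vtt", "utf8"));
writeFileSync("episode-042.cues.json", JSON.stringify(cues, null, 2));
```

On the playback side, each search hit just needs its `start` converted to seconds and assigned to the audio element's `currentTime`.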
Interesting use case! Currently, the tool allows creating data points along with reference transcripts. From what I understand, you want to fix up the subtitle start and end times while keeping the transcription for each segment the same. If so, we plan to add an enhancement that lets you pass in annotations, i.e. segments with transcripts. That should cover your use case.
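To make that concrete, the kind of input we have in mind would look roughly like this (illustrative only, not a final import format):

```typescript
// Illustrative only: pre-existing segments (e.g. from an srt/vtt file) that
// would be loaded into the tool so the annotator only adjusts boundaries.
interface Segment {
  start: number;      // seconds from the beginning of the audio
  end: number;        // seconds
  transcript: string; // reference text kept while the times are fixed up
}

const segments: Segment[] = [
  { start: 0.0, end: 3.2, transcript: "Hello and welcome to the show." },
  { start: 3.2, end: 7.8, transcript: "Today we are talking about speech data." },
];
```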
That would be awesome. My hacky solution was a waveform and start and end sliders. It would just iterate through and you could accept, reject, or modify the times and text.
Thank you for your suggestion! We're working hard to get a demo video out and intentionally left space for it. But sure, we can add a placeholder image till then.
Actually, it wasn't bad when I looked at it. I've seen much, much worse, easily 5x worse.
A few font-awesome, testing-library, ESLint & React imports. Some of those broader libraries have been broken up so you don't have to import the whole enchilada.
But yeah, in a larger project, mixing and matching the versions of some of those components can get tricky. This repo's direct dependencies seem reasonable; the dependencies of dependencies, on the other hand, can be crazy in any project these days.
yarn.lock is just under half a megabyte, and lists 1461 packages that it installs. (232 of them are second or subsequent versions of the same package, which typically indicates unmaintained software. It has five versions of kind-of, and four versions of ten other packages.)
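(For anyone who wants to check a lockfile themselves: assuming the classic v1 yarn.lock format, a throwaway script along these lines is one way to tally duplicate versions. It's a sketch, not how the numbers above were produced.)

```typescript
// Rough sketch: count how many distinct resolved versions of each package a
// v1 yarn.lock pins. Assumes the classic (pre-Berry) lockfile format.
import { readFileSync } from "fs";

const lines = readFileSync("yarn.lock", "utf8").split(/\r?\n/);
const versions = new Map<string, Set<string>>();

let current: string | null = null;
for (const line of lines) {
  if (line && !line.startsWith(" ") && !line.startsWith("#") && line.endsWith(":")) {
    // Header like: kind-of@^3.0.2, kind-of@^3.2.0:
    const selector = line.slice(0, -1).split(",")[0].trim().replace(/^"|"$/g, "");
    current = selector.slice(0, selector.lastIndexOf("@")); // handles @scope/name@range
  } else if (current) {
    const m = line.match(/^\s+version "(.+)"/);
    if (m) {
      if (!versions.has(current)) versions.set(current, new Set());
      versions.get(current)!.add(m[1]);
    }
  }
}

const dupes = [...versions.entries()].filter(([, v]) => v.size > 1);
const total = [...versions.values()].reduce((n, v) => n + v.size, 0);
console.log(`${total} resolved package versions across ${versions.size} names`);
console.log(`${dupes.length} packages resolved to more than one version`);
for (const [name, v] of dupes.sort((a, b) => b[1].size - a[1].size).slice(0, 10)) {
  console.log(`  ${name}: ${[...v].join(", ")}`);
}
```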
I think you should refer to package.json for the actual direct dependencies. But yes, I agree that the tool's dependencies themselves pull in a lot of transitive dependencies. I'll evaluate and trim them where possible.
That being said, the gzipped JS bundle is fairly small (under 200 KB).