We have been building a DeepSpeech model with our data for the past year, and we recently hit 95% accuracy on the LibriSpeech dataset. That puts us close to the published results for DeepSpeech 2. However, our data is conversational audio, and on our own internal dataset our model does much better than PaddlePaddle. Here's a blog post on the method we followed to build our models.
We have been using this internally in our service, and it saves a ton of time and effort during the typing stage. It is not yet near the accuracy our transcribers can achieve, but we are getting closer. We are offering automated transcripts free for a limited time. Please do try it out.