
There shouldn't be any copyright issues, as long as you aren't redistributing the original work. Otherwise nearly all neural networks would be illegal, since most training data is copyrighted.

As for errors in the subtitles, the data is still good enough. As long as the machine learning model can deal with uncertainty, it will simply fail to fit the bad examples and learn from the ones that are correct. It might even learn to abbreviate sentences itself!
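
To make the noisy-label point concrete, here is a rough toy sketch, not anything subtitle-specific. Everything in it is my own assumption for illustration: the 1-D data, the 20% random label flips standing in for bad subtitles, and plain logistic regression as the model. Real subtitle errors may be systematic rather than random, in which case this argument is weaker.

```python
import math
import random


def train_on_noisy_labels(n=2000, flip_rate=0.2, epochs=200, lr=1.0, seed=0):
    """Fit 1-D logistic regression on labels where a fraction are flipped.

    Hypothetical toy setup: the true label is 1 when x > 0, and each
    training label is flipped independently with probability flip_rate.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    clean = [1 if x > 0 else 0 for x in xs]
    # Simulate erroneous examples by flipping some labels at random.
    noisy = [1 - y if rng.random() < flip_rate else y for y in clean]

    # Full-batch gradient descent on the log loss, using only noisy labels.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, noisy):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n

    # Compare predictions against both the clean and the noisy labels.
    preds = [1 if w * x + b > 0 else 0 for x in xs]
    clean_acc = sum(p == y for p, y in zip(preds, clean)) / n
    noisy_acc = sum(p == y for p, y in zip(preds, noisy)) / n
    return w, clean_acc, noisy_acc


if __name__ == "__main__":
    w, clean_acc, noisy_acc = train_on_noisy_labels()
    print(f"w={w:.2f} clean_acc={clean_acc:.3f} noisy_acc={noisy_acc:.3f}")
```

If the claim holds, the fitted model should agree with the clean labels more often than with the noisy ones it was actually trained on, because the random flips pull in both directions and largely cancel out.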
