No, that wouldn't work too well. It's for YouTubers who stream their desktop screens, and I need to extract some information from those streams so I can process it automatically. The desktop streams always look very similar, so I don't need advanced AI/neural nets to extract it.
Hmmm... what does "with OCR added" mean? That if there is text in the video (e.g. a street sign), it can also be searched?