I am looking for applications that can perform OCR on scanned images containing Indic scripts (Devanagari, Tamil, etc.) and produce a searchable PDF as output. Several applications can extract text from images, but is there one that can create a searchable PDF?
https://products.aspose.app/pdf/searchable
I think it should be possible to extend this to Devanagari on your local machine using Tesseract and Aspose.Pdf, starting from a C# code snippet like this:
CallBackGetHocr recognizeText = (System.Drawing.Image img) =>
{
    string tmpFile = Path.Combine(outputFolder, Path.GetFileName(Path.GetTempFileName()));
    using System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(img);
    bmp.Save(tmpFile);
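Separately from the Aspose route, the Tesseract command-line tool can write a searchable PDF by itself when given the `pdf` output config. The following is a minimal sketch, not a tested recipe for your scans: the file names `scan.png` and `scan_searchable` are hypothetical, and it assumes the `hin` (Hindi, Devanagari) and `tam` (Tamil) traineddata files are installed.

```shell
#!/bin/sh
# Hypothetical input and output names; adjust to your own scans.
INPUT="scan.png"
OUTPUT_BASE="scan_searchable"   # Tesseract appends ".pdf" to this base name
# Assumed language packs: hin = Hindi (Devanagari), tam = Tamil.
LANGS="hin+tam"

if command -v tesseract >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    # The trailing "pdf" config asks for an image PDF with an invisible text layer.
    tesseract "$INPUT" "$OUTPUT_BASE" -l "$LANGS" pdf
else
    # Fall back to printing the command when Tesseract or the input is missing.
    echo "would run: tesseract $INPUT $OUTPUT_BASE -l $LANGS pdf"
fi
```

If the run succeeds, the result should be `scan_searchable.pdf` next to the input. Recognition quality for Indic scripts depends heavily on scan resolution and on which traineddata variant is installed, so it is worth checking the text layer by searching the PDF for a known word.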