There are many things not going well in India, and the current ruling party is not able to handle the situation either. It is true that Modi makes things worse because he has no vision or architecture to propose. Modi is known as a friend of a few very rich people while ignoring the rest of the issues, which anybody in India can see, like the elephant in the room. So it falls to the people of India to support other political parties.
By the way, he came to power on false promises, and he knows how to manage the mass media by controlling it. But this kind of management is not going to solve any problem, now or in the future.
Location: Any
Remote: Optional
Willing to relocate: Yes
Technologies:
● 14+ years of proven expertise (more than 6 years in Big Data and cloud technologies) across multiple business domains and technology areas. Writer of 200+ articles on Big Data technologies for a wide community on LinkedIn, Hortonworks, Medium, Logika and Ammozon.co.in.
● Multi-cloud experience with end-to-end project deliveries: two GCP, two Azure and one AWS cloud platform project. Designed and implemented four Big Data projects and two Machine Learning projects.
● Certified Azure Data Science, Docker, Kubernetes and Big Data professional, with a good understanding of cloud virtualization, networking, storage and data security.
● Experience with various Hadoop distributions: Hortonworks Data Platform (HDP), IBM BigInsights and Cloudera Distribution of Hadoop (CDH). Expert with distributed components such as HDFS, Hive, Pig, Tez, Spark, HBase, Cassandra, Oozie, YARN, Sqoop, MapReduce, Storm, Kafka and Flink.
● Expert in Lambda Architecture for real-time Hadoop and streaming applications using Flume, Kafka, Spark, Hive, HBase, Solr and Apache Flink.
● Good understanding of statistical modeling, Machine Learning algorithms and data analysis using Python, R and Azure Machine Learning Studio.
● Working experience with programming languages such as Python (IPython, lambda, pandas), Java, JRuby and VB.NET, and with data visualization tools such as the Banana dashboard, D3.js, Tableau, Crystal Reports and Business Objects. Built automation tools and utilities in Java, Python, JRuby, Perl and Unix shell.
● Administration and performance optimization on Hadoop, NoSQL and Oracle. Expert in application data modeling, database design, data aggregation, data lakes, scheduling and monitoring.
Résumé/CV: https://www.linkedin.com/in/mukesh-kumar-bigdata-ml/
Email: OracleMukesh@rediffmail.com
Around a couple of years ago I was working on a home project and used Tesseract and Leptonica for OCR, with HDFS, HBase and SolrCloud for storage and search over the extracted text. You can find the details on my website. I was very impressed with the conversion of handwritten PDF docs at 90% readable accuracy. I named it Content Data Store (CDS): http://ammozon.co.in/headtohead/?p=153 . The source code is open, and you can find installation and usage steps here:
http://ammozon.co.in/headtohead/?p=129
http://ammozon.co.in/headtohead/?p=126
A short demo
http://ammozon.co.in/gif/ocr.gif
I didn't get time to enhance it further, but I am planning to containerize the whole application. See if you find it useful in its current form.
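Since the plan is to containerize the application, a minimal Dockerfile sketch might look like the following. This is only an assumption of how CDS could be packaged: the base image, package choices and the `ocr_pipeline.py` entrypoint are illustrative, not taken from the actual repo.

```dockerfile
# Hypothetical container for the CDS OCR step. On Debian-based images,
# installing tesseract-ocr pulls in Leptonica as a dependency.
FROM python:3.11-slim

RUN apt-get update && \
    apt-get install -y --no-install-recommends tesseract-ocr && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir pytesseract

# Entrypoint name is illustrative; the actual CDS launcher may differ.
CMD ["python", "ocr_pipeline.py"]
```

The storage and search backends (HDFS, HBase, SolrCloud) would stay outside this image and be reached over the network.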
I had a similar problem and ended up using AWS's Textract tool to return the text as well as bounding box data for each letter, then overlaid that on a UI with an SVG of the original page, allowing the user to highlight handwritten and typed text. I plan to open-source it, so if anyone's interested let me know.
Not a fan of the potential vendor lock-in though, so it's only really suitable for those already in an AWS environment who aren't worried about Amazon harvesting their data.
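The overlay step above comes down to scaling Textract's geometry onto the rendered page: Textract returns each block's `BoundingBox` as ratios of the page dimensions (`Left`, `Top`, `Width`, `Height`, all 0 to 1), so placing an SVG rect is a simple multiplication. A minimal sketch, where the function name and page dimensions are illustrative:

```python
# Convert an AWS Textract BoundingBox (expressed as ratios of page size)
# into pixel coordinates for an SVG <rect> overlay.

def bbox_to_svg_rect(bbox, page_width, page_height):
    """Textract geometry is relative (0..1); scale it to the rendered page."""
    return {
        "x": bbox["Left"] * page_width,
        "y": bbox["Top"] * page_height,
        "width": bbox["Width"] * page_width,
        "height": bbox["Height"] * page_height,
    }

# Example geometry, shaped like response["Blocks"][i]["Geometry"]["BoundingBox"]
bbox = {"Left": 0.1, "Top": 0.2, "Width": 0.5, "Height": 0.05}
rect = bbox_to_svg_rect(bbox, page_width=1000, page_height=2000)
print(rect)  # roughly x=100, y=400, width=500, height=100
```

Because the coordinates are relative, the same response works at any render size; you only rescale when the user zooms.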
Location: India
Remote: Yes
Willing to relocate: Yes
Technologies: Expert in RDBMS, from development to database design to performance tuning; Hadoop, Hive, HBase, Spark, Kafka and other Hadoop components. Working experience with GCP and Azure Cloud.
Résumé/CV: https://www.linkedin.com/in/mukesh-kumar-bigdata-ml/
My Website: http://ammozon.co.in/
Email: mukeshfromharyana@gmail.com
Big Data, Machine Learning, GCP and Azure professional with proven experience across multiple domains.
I have made some progress on this as a home project using the same compression and scanning. I call it DFA (Digital File Analytics): data, images and scanned documents are sent remotely via Kafka to Hadoop, where OCR extracts the text and compression is applied. If a document is more than 10 MB it goes to HBase, otherwise HDFS. Near-real-time streaming with Spark and Flink is done too. Visualization uses a Banana dashboard, which is not so polished but shows word counts, storage locations, images and tags. Next I would like to do analytics on top of the extracted data using ML.
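The routing rule described above (over 10 MB to one store, smaller documents to the other) can be sketched in a few lines. This keeps the threshold and direction exactly as stated in the post; the function name is illustrative, and the actual HBase/HDFS writer clients are omitted:

```python
# Size-based routing for incoming scanned documents, as described above:
# documents over 10 MB go to HBase, smaller ones to HDFS.

SIZE_THRESHOLD = 10 * 1024 * 1024  # 10 MB

def choose_store(doc_bytes: bytes) -> str:
    """Return the storage target for one incoming document."""
    return "hbase" if len(doc_bytes) > SIZE_THRESHOLD else "hdfs"

print(choose_store(b"x" * 1024))               # small doc -> hdfs
print(choose_store(b"x" * (11 * 1024 * 1024)))  # 11 MB doc -> hbase
```

In a Kafka consumer this decision would run per message, with the chosen target recorded alongside the extracted text so the dashboard can show the storage location.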
I learned how to carry out a leader's vision with persistence and dedication. In my IT career I came to know that I can make any vision successful. I follow leaders I believe are doing good for the world, and my new leader wants to turn my idea and his idea into a product.
Note: I am an employee working for half the salary I earned at my last organization. Obviously, money is not always the most important thing.
Having this kind of basic abstraction in place gives us a way of gluing together disparate data systems and processing real-time changes, as well as being an interesting system and application architecture in its own right.
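The abstraction in question boils down to an append-only log that multiple consumers read at their own pace, each tracking its own offset. A minimal sketch of that idea (class and method names are illustrative, not from any library):

```python
# Minimal append-only log with independent per-consumer offsets, the
# "glue" abstraction the paragraph describes.

class Log:
    def __init__(self):
        self.records = []   # append-only sequence of records
        self.offsets = {}   # consumer name -> next offset to read

    def append(self, record):
        """Append a record; return its offset in the log."""
        self.records.append(record)
        return len(self.records) - 1

    def poll(self, consumer):
        """Return this consumer's unread records and advance its offset."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:]
        self.offsets[consumer] = len(self.records)
        return batch

log = Log()
log.append({"op": "insert", "id": 1})
log.append({"op": "update", "id": 1})
print(log.poll("search-indexer"))  # both records
print(log.poll("search-indexer"))  # [] -- already caught up
```

Because each downstream system keeps only an offset, a cache, a search index and a warehouse can all consume the same change stream independently, which is what makes the log useful for gluing disparate systems together.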