If you just throw everything into a neural network, you won't really understand the breadth of the problems you're solving, and you'll therefore be ignorant of the limitations of your hammer. While NNs are incredibly useful, I think a deep understanding of the core problems is essential to know how to use NNs effectively in a particular domain.
After getting a grip on those concepts, Szeliski's Computer Vision: Algorithms and Applications (http://szeliski.org/Book/) had some really amazing coverage of CV in practice. Mastering OpenCV (https://www.amazon.com/Mastering-OpenCV-Daniel-Lelis-Baggio/...) was very useful when actually implementing some algorithms.
Before the Deep Learning craze started in 2011, more classical Machine Learning techniques were used in CV: Support Vector Machines, Boosting, Decision Trees, etc.
These were (and still are!) used as a high-level component in areas like recognition, retrieval, segmentation, and object tracking.
But there's also a whole field of CV that doesn't require Machine Learning at all (although it can benefit from it in some cases). This is typically the area of geometrical CV, like SLAM, 3D reconstruction, Structure from Motion and (Multi-View) Stereo: generally anything where you can write a (differentiable) model of reality yourself using hand-coded formulas and heuristics and then use standard solvers to obtain the model parameters given the data.
Whenever it's too hard to do that (for example trying to recognize many different things in images) you need a data-driven / machine learning approach where the computer comes up with the model itself after seeing lots of training examples.
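To make the "write the model yourself, then solve for its parameters" idea concrete, here is a minimal sketch in plain Python. It assumes a toy 1D pinhole model, u = f * X/Z + cx, with known 3D points; the function name and the synthetic data are made up for illustration. Real SfM/SLAM pipelines use much richer models and nonlinear solvers, so treat this only as the smallest possible instance of the pattern.

```python
def fit_pinhole_1d(points_xz, pixels_u):
    """Least-squares fit of focal length f and principal point cx
    for the hand-written model u = f * (X / Z) + cx.

    The model is linear in (f, cx), so the closed-form line fit
    (slope = f, intercept = cx) is the least-squares solution.
    """
    a = [x / z for x, z in points_xz]          # normalized image coordinate X/Z
    n = len(a)
    mean_a = sum(a) / n
    mean_u = sum(pixels_u) / n
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    cov_au = sum((ai - mean_a) * (ui - mean_u) for ai, ui in zip(a, pixels_u))
    f = cov_au / var_a
    cx = mean_u - f * mean_a
    return f, cx


# Synthetic, noise-free observations generated from assumed "true" parameters.
true_f, true_cx = 800.0, 320.0
points = [(-1.0, 4.0), (0.5, 2.0), (1.0, 5.0), (2.0, 4.0)]   # (X, Z) pairs
pixels = [true_f * x / z + true_cx for x, z in points]

f_est, cx_est = fit_pinhole_1d(points, pixels)
print(f_est, cx_est)
```

With noisy measurements the same code returns the best fit in the least-squares sense rather than the exact parameters.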
As for resources the other answers are already giving a great overview. Use Karpathy's course for an intro to Deep Learning for CV but don't expect it to be comprehensive in terms of giving you an overview of CV.
Learn OpenCV for more low level, non-ML and generally more "old-school" Computer Vision.
A personal recommendation of mine is http://www.computervisionblog.com/ by Tomasz Malisiewicz. It's an excellent resource if you want to get an overview of what's happening in the field.
I would argue that kinetic or geometrical Computer Vision problems (tracking, mapping, reconstruction, depth estimation) are best suited to classical approaches like VO, SFM/MVS, SIFT/SURF, HOG, etc., and are a separate category of CV problems from object recognition/detection/segmentation, which are much more amenable to ML because dimensionality is reduced.
But there's also a whole field of CV that doesn't require Machine Learning at all (although it can benefit from it in some cases).
In fact, Machine Learning has made almost no progress on most of what you mention, specifically SLAM and Multi-View Stereo. It takes completely rethinking how those are done when they are approached from the Deep Learning perspective.
He has articles on solving actual problems with OpenCV, dlib and TensorFlow. I subscribe to the blog and try to do some of the tutorials myself.
Udacity is another great resource. Their self-driving and robotics nanodegrees are great.
I am on the same path as you trying to pivot my career from full stack engineer and add CV + ML skills to it.
When we have decent robot hardware, I want to be the one programming them, not the one getting replaced by them.
Also, for a general, high-level introduction to neural networks, I wrote "Learning Deep Learning in Keras" (http://p.migdal.pl/2017/04/30/teaching-deep-learning.html), focusing on visual tasks.
For example, he goes through a few examples where a neural net has too many weights, or too little data or improperly connected nodes. All three result in problems, but the problems exhibit themselves in slightly different ways and with expertise you can start identifying them.
Fun and trendy though it may be, I would not focus on deep learning / convolutional neural networks to start off. Deep learning is a small subset of computer vision. I would focus more on understanding the basics of image processing, camera projection geometry, how to calibrate cameras, stereo vision, and machine learning in general (not just deep learning). Working with OpenCV is a good place to start for all of these topics. Set yourself a project with tangible goals and get to work.
A good online reference is: http://opencv-python-tutroals.readthedocs.io/en/latest/py_tu...
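As a small taste of the stereo vision basics mentioned above: for a rectified stereo pair, the textbook relation between depth and disparity is Z = f * B / d, where f is the focal length in pixels, B the baseline between the cameras and d the disparity in pixels. The function name and the numbers below are illustrative assumptions, not taken from any particular library.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (in meters) of a point seen by a rectified stereo pair.

    Uses Z = f * B / d. A disparity of zero or less means the point is
    at infinity or the match is invalid, so we refuse to divide.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Assumed setup: 800 px focal length, cameras 10 cm apart.
z_near = depth_from_disparity(40.0, focal_px=800.0, baseline_m=0.1)
z_far = depth_from_disparity(4.0, focal_px=800.0, baseline_m=0.1)
print(z_near, z_far)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones, which is one reason calibration quality is worth understanding.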
I was also referred to this URL by somebody, and I got excited after their extended intro about why and how their course is different from and better than any other.
Though after 5 videos I know nothing more than I did before from any other ML/AI guide on the internet. 99% is only related to image classification, and I'm simply seeing too many guides for that.
If anybody has some good links/videos on ML/AI on structured data, please comment and I'll be thankful and happy to click them :)
"Certainly I'd pick DL over more linear models for most problems. But I'd pick random forests over DL for most structured data problems."
"Deep learning is best for unstructured data, like natural language, images, audio, etc. It sounds like you may be dealing more with structured data, in which case the Coursera ML course would be a better option for you"
More discussion here - https://www.reddit.com/r/MachineLearning/comments/5jg7b8/p_d...
A lot of 'traditional' computer vision methods (e.g. the Hough detector) are simply inferior to deep learning approaches.
Plus, it's a lot easier than you'd think to get up and running, especially when you leverage pre-trained models...
disclaimer: not related to any of these
In regard to OP's original question, I'm actually working on solving your very problem right now. About 1.5 years ago I created the PyImageSearch Gurus course (https://www.pyimagesearch.com/pyimagesearch-gurus/) with the aim of bridging academia with actual real-world computer vision problems. The course has helped readers in their academic careers, such as securing grants (http://www.pyimagesearch.com/2016/03/14/pyimagesearch-gurus-...), and has helped students become practitioners and land jobs in the CV startup space (http://www.pyimagesearch.com/2017/06/12/pyimagesearch-gurus-...).
Within the next month I'll be launching PyImageJobs which will connect PyImageSearch readers (especially the Gurus course graduates) with companies/startups that are looking to hire.
Finally, I'm also working on my upcoming "Deep Learning for Computer Vision with Python" book (https://www.pyimagesearch.com/deep-learning-computer-vision-...) which is now 100% outlined and I'm on to the writing phase.
Definitely take a look and if you have any questions, please let me know or use the contact form on my website if you want to talk in private.
Looking forward to your book. Keep up the great work.
This is the most comprehensive book I know of on Computer Vision. The diagrams in the book (including captions) themselves do a great job of explaining things.
Measure the speed or count the number of cars passing by your street. Try to implement OCR for a utility meter. There are lots of applications you can train yourself on, and I guarantee that you will learn a ton from each and every one of them.
Princeton CS598F Deep Learning for Graphics and Vision
Stanford CS331B: Representation Learning in Computer Vision
UVa CS 6501: Deep Learning for Computer Graphics
GaTech CS 7476 Advanced Computer Vision
Berkeley CS294 Understanding Deep Neural Networks
Washington CSE 590V: Computer vision seminar
UT Austin CS 395T - Deep learning seminar
Berkeley CS294-43: Visual Object and Activity Recognition
UT Austin CS381V: Visual Recognition
And best of luck to you!
Disclaimer: Never worked with any technology related to Computer Vision, just a bloody beginner Python programmer.
Successful products usually don't use OpenCV. Facebook Ads has dedicated research engineers implementing their real-time photo analysis algorithms.
In addition, you can try to work through a serious project related to computer vision to help you solidify concepts. I used Style Transfer as my motivating example: https://harishnarayanan.org/writing/artistic-style-transfer/
However, let's keep in mind that the field of computer vision is much vaster than that. Deep learning approaches have been very successful at solving problems in computer vision, but not all of them and not without drawbacks. I believe any course on classic computer vision will give him more insight as to what challenges computer vision aims to solve, how, and what approach might solve what problem.
I am using OpenCV to process documents, and I'm curious whether I am missing out on a chunk of CV algorithms designed specifically for scanned administrative documents (financial, personal documents).
So, I was wondering if those algorithms work better for natural images (buildings, people, things, etc.) than document images (text, graphics) and, if so, whether there are algorithms for processing such documents that I am unaware of.
I would recommend starting with one of the many OpenCV tutorial books, and maybe work your way through a few of those. Then move into books that cover more of the algorithms behind the library like "Multiple View Geometry" by Hartley and "Machine Vision" by Davies, among many others.
However, I can't tell you if OpenCV is still the framework of choice and/or widely used in the field you want to go into.
Nodes with a map will go to other mind maps with resources. :)
This book: http://shop.oreilly.com/product/9780596516130.do has a number of worked examples that explain things well.
It does touch on Machine Learning, but it focuses much more on the fundamentals of computer vision, like feature detection, that allow things like SLAM to exist.
You'll notice that all of the top contenders use Neural Networks, but I would bet that many of them use at least some traditional CV techniques to transform the images at various steps. That said, many of the more modern deep learning approaches are ditching CV altogether, just feeding in raw pixels without any normalization or transformation, leaving fewer parameters to tweak.
The best example I have is a pulse rate detector that I put together, which uses OpenCV for video frame extraction and display but bare numpy/scipy for the rest.
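For a rough idea of what the non-OpenCV part of such a detector can look like, here is a sketch that is not the poster's actual code. It assumes you have already reduced each video frame to a single brightness value (for example the mean green channel over a face region, which is an assumption on my part) and then looks for the strongest frequency in a plausible heart-rate band. It is written in plain Python so it runs without dependencies; with numpy you would normally reach for an FFT instead of this explicit loop.

```python
import cmath
import math


def estimate_bpm(samples, fps, min_bpm=40, max_bpm=180):
    """Return the integer beats-per-minute whose frequency carries the
    most power in `samples` (one brightness value per video frame)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]      # remove the constant offset

    best_bpm, best_power = min_bpm, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq_hz = bpm / 60.0
        # Correlate the signal with a complex sinusoid at this frequency.
        acc = sum(c * cmath.exp(-2j * math.pi * freq_hz * k / fps)
                  for k, c in enumerate(centered))
        power = abs(acc) ** 2
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic stand-in for a real video: 10 s at 30 fps, 1.2 Hz pulse (72 bpm).
fps = 30.0
signal = [100.0 + 0.5 * math.sin(2 * math.pi * 1.2 * k / fps)
          for k in range(300)]
bpm = estimate_bpm(signal, fps)
print(bpm)
```

On real footage the signal is noisy and drifts with lighting and motion, so some detrending or band-pass filtering before this step is usually needed.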
Even if you end up using neural nets, understanding how to think about the problems is useful.
I started out wanting to develop AR apps during my undergrad. Here are the best resources I have found to date:
Computer Vision is very theoretical and experimental, so the more hands on, the better! My approach has been to go top-down, overview the landscape and slowly progress deeper.
Begin with the best library for CV in my opinion: OpenCV. The tutorials are amazing!
Python tutorials: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_tutorial...
C++ tutorials: http://docs.opencv.org/3.0-beta/doc/tutorials/tutorials.html
Immerse yourself in these and build any apps you think of!
Then go into:
pyimagesearch tutorials http://www.pyimagesearch.com/
and aishack.in http://aishack.in/,
tons of great tutorials to learn different topics of vision with coding walkthroughs. Understand the examples and rewrite applications.
Then Dive Deep:
Get the new OpenCV3 book, a nice deep overview of many topics in computer vision. https://www.amazon.com/Learning-OpenCV-Computer-Vision-Libra...
And watch this course on youtube:
I feel like then, you will have so much exposure that when you dive into formal classes and textbooks, you will really understand and be enlightened.
This was the general way I learned computer vision, and recently I completed a CV internship for nanit.com. I was not hired for my formal knowledge, but they were impressed by all the various projects I've done and the knowledge I had on many vision topics.
I also recently took a formal course on vision at Cornell:
All the assignments have starter code in python and opencv.
This was an amazing class as it dove deep into 3D computer vision, which is so relevant to augmented reality!
Also, here is a link to OpenCV examples for iOS: https://github.com/Itseez/opencv_for_ios_book_samples
And here is a link to OpenCV examples for Android: https://web.stanford.edu/class/ee368/Android/
Hope this helps! Shoot me a dm if you or anyone has more questions!