Computer Vision: Algorithms and Applications (2010)

colincsl · on Sept 27, 2016

For those interested in learning vision from a machine learning perspective I would suggest "Computer Vision: Models, Learning, and Inference" [1]. Szeliski's book is also great but gives a more classical overview of computer vision.

[1] http://www.computervisionmodels.com/

weavie · on Sept 27, 2016

Most of my attempts at getting into CV fail because my knowledge of Maths is incredibly rusty. I did Maths up to university level, but haven't used it since.

Are there any good courses / books or other resources that would specifically help at getting up to speed with the maths knowledge required to understand a book like this?

randcraw · on Sept 27, 2016

I can't recommend Szeliski's or Prince's book as a good first text in CV. Szeliski's is unnecessarily math heavy and lacks worked examples or explanations of concepts. It's essentially a reference book. Prince's is (reputedly) excellent at approaching CV from a ML perspective, but it doesn't really cover the main concepts in CV, so it's a poor source for the fundamentals.

Davies' book "Computer and Machine Vision" is the least math intensive CV text I've found. Its emphasis is more on MV (industrial vision) than CV, but it covers the basics of CV pretty well.

Klette's recent "Concise Computer Vision" (2014) has good coverage of CV, but it is concise (explanations are often short). DO NOT buy the Kindle version: it fails to render a custom font used by the author, so some of the most important characters are invisible.

Trucco and Verri's "Intro Techniques for 3D Vision" (1998) is clear and short though a bit dated, but it covers the basics well. Likewise Shapiro & Stockman (2001) is a well written survey but also dated. I like both.

Avoid the Forsyth & Ponce book unless you're a masochist. Very unclear exposition.

I like the GaTech CV video course (the free videos are part of their online MS in CS, OMSCS). Essa and Bobick's presentations are clear and lively, the coverage broad. They don't use a text. Also, the videos from Shah's CV course at Central Florida are available on YouTube. He offers good clear coverage of the topic, though without enthusiasm.

https://www.omscs.gatech.edu/cs-4495-computer-vision

joshvm · on Sept 27, 2016

I wouldn't recommend Szeliski for that - the book is basically a literature review (and is now a little out of date given how fast CV moves). It's great if you need a background survey on how things are done, but not for learning from.

The level of maths required varies a lot. Linear algebra is used heavily everywhere, you see a lot of optimisation (e.g. Levenberg-Marquardt) and in some places graph theory. However, if you're just using tools like OpenCV then you can get by with a fairly poor understanding of the maths (say 2nd year undergrad of an engineering degree).

Part of the challenge is reading past the maths. If you read a paper from Pattern Analysis there's a lot of, frankly, obtuse notation. So you see images described as discrete mapping blah blah. A lot of it is fairly straightforward set theory. It's necessary for proving things, but when it comes to actually implementing this stuff it's nowhere near as complicated as it looks.
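To make the "discrete mapping" point concrete, here is a minimal Python sketch (an illustration only, not taken from any particular paper). The names `image`, `I`, `domain` and `above_threshold` are made up for this example. A paper might write an image as a function I from a pixel domain to intensities and a thresholded region as the set of (x, y) with I(x, y) > t; in code that is just a nested list and a set comprehension.

```python
# A tiny 3x3 grayscale "image": rows of intensity values in [0, 255].
image = [
    [10, 20, 200],
    [30, 220, 210],
    [15, 25, 35],
]

def I(x, y):
    """The 'discrete mapping' I: (x, y) -> intensity. x is the column, y is the row."""
    return image[y][x]

# The domain of the mapping: every valid pixel coordinate.
domain = {(x, y) for y in range(len(image)) for x in range(len(image[0]))}

def above_threshold(t):
    """The set {(x, y) in domain : I(x, y) > t}, written as a set comprehension."""
    return {(x, y) for (x, y) in domain if I(x, y) > t}

bright = above_threshold(128)
```

Here `bright` ends up holding the coordinates of the three pixels whose intensity exceeds 128.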

weavie · on Sept 27, 2016

Yes, I think the notation is quite a stumbling block. E.g., as soon as I see an integral sign I am lost. I have no idea how to transform that into an algorithm.
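As a rough sketch of how an integral sign usually turns into code: on a computer the integral becomes a sum over small steps, and for images the "steps" are simply pixels or samples. The Python below is an illustration under that assumption; the function names `integrate` and `convolve1d` are made up here and are not from any library.

```python
def integrate(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    dx = (b - a) / n
    # Sum f at the midpoint of each of the n slices, times the slice width.
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def convolve1d(signal, kernel):
    """Discrete form of (f * g)(t) = integral of f(s) g(t - s) ds.

    Only positions where the kernel fully overlaps the signal are returned.
    """
    k = len(kernel)
    flipped = kernel[::-1]  # the g(t - s) term flips the kernel
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

area = integrate(lambda x: x * x, 0.0, 1.0)   # close to 1/3
smoothed = convolve1d([1, 2, 3, 4], [1, 1])   # sums of neighbouring samples
```

The same pattern carries over to 2D: a double integral over an image region becomes two nested loops over rows and columns.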

joshvm · on Sept 27, 2016

You might be better off looking through the code of a mature image processing library like OpenCV or this one (recent HN post) written in Go which is very clean:

https://github.com/anthonynsimon/bild

weavie · on Sept 28, 2016

Good idea! Thanks.

fitzwatermellow · on Sept 27, 2016

Aude Oliva's course at MIT is also a good resource:

6.869: Advances in Computer Vision (Fall 2015)

http://6.869.csail.mit.edu/fa15/index.html

Or just jump right into the deep end with OpenCV:

http://opencv.org/

stared · on Sept 27, 2016

The one risky thing is that in computer vision right now things change every 6 months. So, one will never know which techniques are still state of the art, and which were outclassed by some ConvNets 4 years ago.

emanuelev · on Sept 27, 2016

Regardless of the impact machine learning techniques are having on the field, I still think it is valuable to know and understand the classic methods that they are allegedly replacing.

rsp1984 · on Sept 27, 2016

It is actually not about "classic" vs. "modern". A lot of what's described in the book is as relevant as ever.

Mostly what's changed is that what used to be heuristics and hand-crafted features is now getting replaced by proper learned models. Also some generative models are getting replaced by discriminative models since deep learning does a very good job in creating those.

stared · on Sept 27, 2016

Sure. Just it's nice to know which pieces are:

- considered obsolete (e.g. complicated and nasty, and still outclassed by NNs),

- building blocks of current techniques,

- an inspiration for current techniques,

- still state of the art.

rsp1984 · on Sept 27, 2016

A lot in the book is about geometry and Structure from Motion. It's an area that's becoming augmented but not superseded by Neural Nets. For example if you want to generate a 3D model from images you can't just throw a NN at it. Same for tracking etc...

emanuelev · on Sept 27, 2016

Although that might change real soon. There are a number of ongoing works that attempt to use CNNs to estimate both geometry and movement.

zump · on Sept 28, 2016

Got a link?

SLAM using SfM (from monocular vision) using Deep Nets would be huge. It would reduce the specifications for LIDAR in SDCs (self-driving cars), for example.

Kenji · on Sept 27, 2016

Things like principal component analysis, eigenvalues and filtering are timeless and fundamental tools that won't go away in the near future.

Even if you don't go into computer vision - these have a wide range of applications, for example in sound or signal processing which are closely related.
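For a feel of what principal component analysis and eigenvalues look like in practice, here is a minimal pure-Python sketch. It assumes 2D points and uses power iteration to find the dominant eigenvector of the covariance matrix; `principal_axis` is a made-up name for this illustration, and a real project would more likely call a linear algebra library.

```python
import math

def principal_axis(points, iterations=100):
    """Return (unit direction, eigenvalue) of the largest-variance axis of 2D points.

    Assumes the points are not all identical and that the arbitrary start
    vector is not exactly orthogonal to the dominant eigenvector.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix [[cxx, cxy], [cxy, cyy]].
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    vx, vy = 1.0, 0.5  # arbitrary start vector
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise.
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    # Rayleigh quotient v^T C v gives the matching eigenvalue.
    eigenvalue = vx * (cxx * vx + cxy * vy) + vy * (cxy * vx + cyy * vy)
    return (vx, vy), eigenvalue

# Points on the line y = x: the principal axis should point along (1, 1).
direction, variance = principal_axis([(0, 0), (1, 1), (2, 2), (3, 3)])
```

The same idea applies to higher dimensions and to signals other than images, which fits the point about these tools being reusable across fields.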

qntty · on Sept 27, 2016

I used this book in my digital image processing class (a class for EE majors). A lot of the book is applications of classic signal processing techniques.

ansgri · on Sept 27, 2016

This is maybe the best textbook on the basics of image processing, computer vision and computational photography. The author is from MS Research and they do know this stuff.

zanalyzer · on Sept 27, 2016

He is at Facebook now

zump · on Sept 28, 2016

Geez, I wonder what offers these notable researchers are getting.

rawnlq · on Sept 27, 2016

Given how fast this field moves, is this still a recommended textbook (other than being free)? If not, what would a good supplement for it be?