The topics touched on in the first 30 minutes of the video include: AWS, Jupyter Notebooks, neural networks, tmux, and a few others. I understand that this is the reality of the situation today (a very large up-front cost of setting everything up), but it would be better to not even touch on something like tmux because it's not absolutely essential and just results in information overload. You can replace it with something like "I like using tmux to save my terminal sessions, check out the wiki to learn more" instead of "here's this 3 minute tutorial on tmux in the middle of how to use Jupyter notebooks". Very few people are smart enough to concentrate on following what's going on with AWS/Jupyter notebooks, then pause that, process the tmux stuff, and then context switch back to AWS/Jupyter.
There's a reason why the wiki/forums are so invaluable. There's definitely some really good information in the videos, so if you guys had the time I really hope you edit the videos into "abridged" versions that focus on only one topic instead of jumping around so much.
The lessons are not designed to stand alone, or to be watched just once straight through.
The details around setting up an effective environment are a key part of the course, not a diversion. We really want to show all the pieces of the applied deep learning puzzle.
In a more standard MOOC setting this would have been many more lessons, each one much shorter - but we felt the upside of having the live experience with people asking questions was worth the compromise.
I hope you stick with it, and if you do I'd be interested to hear if you still feel the same way at the end. We're doing another course next year, so feedback is important. On the whole, most of our in-person students ended up liking the approach, although most found it more intensive than they expected. (We did regular anonymous surveys.)
Personally I enjoy seeing how others set up and use their tools, and feel this is underappreciated in many courses; attempting to fill that gap is a particular bias of mine.
Interesting - that explains a lot, thanks. Sorry to say, but I for one hate courses / tutorials where the tutor tries to force me to use his/her pet technology, totally unrelated to the topic at hand. I have always wondered what made them do it, and I guess they just want to help.
Still, if you want to be helpful, show me this favorite tech of yours and optionally point me to a separate resource for learning it, I might check it out sometime. But I didn't come here to learn tmux, I came for deep learning. My time is precious so please don't waste it. I would understand if you wanted to teach me basics of TensorFlow (and TensorBoard) or similar, but tmux, vi, emacs, bash,... are all outside the scope. IMHO of course. :)
Do tmux, then notebooks, then whatever, instead of splicing one into the other.
Mixing topics just makes the material less effective for both and harder to follow.
However my reading of the literature on this topic has led me to believe that the standard treatment is not the best approach for most students. I'm particularly influenced by David Perkins on this point - although I'm sure I could be applying his theories a lot better!
My collaborator, Rachel Thomas, has written about these ideas: http://www.fast.ai/2016/10/08/teaching-philosophy/ . We're learning as we go and we're spending a lot of time talking to our students to understand what's working and what's not. I hope each lesson sees some improvements...
It's okay to teach a little tmux, a little notebooks, and a little AWS each lesson, rather than the course be topic by topic. I agree with your point that it's a better way to teach material in a course, because it allows you to get people started doing things.
My point was that within a lesson, you should separate topics, because people don't handle the topic switches well when trying to listen -- it's basically just a cache/stack smash. Diversions are distracting, even when materials are interwoven in the course.
Why would it matter if one of their students prefers something else besides tmux? The course isn't a comparative analysis of session managers.
CMU, UT Austin, Georgia Tech, UCSD..
I am not in the US. I thought getting an MS at one of these would boost my chances of getting into something like Google Brain or OpenAI.
Is it a waste of time and money in your opinion?
It's better to go for a Ph.D.
Remember that to get into OpenAI or Google Brain you need to be among the top even after a Ph.D.
OpenAI and Google Brain, like most other more research driven deep learning institutions, are more interested in the results you can produce rather than the accreditation you hold. Publications obviously count but well used or written deep learning projects / packages would too. Many PhDs who come out having spent many years in academia still wouldn't get an offer from these places and many of the talented people I know in these places don't have a PhD either.
To the parent of this post, I'd also look into what I'd refer to as "Masters in industry", i.e. the Google Brain Residency and other similar opportunities. From their page: "The residency program is similar to spending a year in a Master's or Ph.D. program in deep learning. Residents are expected to read papers, work on research projects, and publish their work in top tier venues. By the end of the program, residents are expected to gain significant research experience in deep learning." This is likely an even more direct path than most institutions would provide. Though obviously the competition is fierce, many of my friends who participated in this ended up with a paper in a top tier conference by the end of the program.
Thanks for letting us know.
Recommendations on how to go from basics (being able to fine-tune pretrained ImageNet/Inceptionv3 with new data etc) to a real project? I'd like to play with semantic segmentation of satellite images (hyperspectral). Any pointers?
(We'll be looking at more segmentation techniques in the next part of the course next year.)
Are you still at Enlitic btw? The radiology use case looked really promising for machine learning
The wiki (http://wiki.fast.ai) has links to necessary learning resources for each lesson.
Are there any others you would recommend learning and/or brushing up on in a more general sense for ML/DL?
Thanks for making this available by the way!
This is really refreshing to hear. Most of the books I have looked at seem to be math-first, and very densely so at that. As much as I appreciate math, it's not always the warmest welcome.
I am looking forward to working through your course and seeing part 2 as well. Cheers.
But I guess this all applies quite similarly in the neural network sense in DL/ML as well?
Sometimes reading and understanding papers requires that background, I guess.
That said, I strongly disagree with your disagreement. There was a recent paper whose abstract I read that made me think of homotopies between convolutional networks. Unfortunately I lost the paper behind some stream and never got to read it proper. In the context of that, I realized that the search for convnet design will likely soon be highly automatable, obsoleting much of the work that many DLers are doing now.
What will be future-proof is understanding information theory, so that loss functions become less magical. Information theory is needed to understand which aspects of the current approaches to reinforcement learning are likely dead ends (typically to do with getting a good exploration strategy, also related to creativity). Concentration of measure is vital to understanding so many properties we find in optimization, dimensionality reduction, and learning. Understanding learning stability and the ideal properties of a learner/convergence means being comfortable with concepts like Jacobians and positive semi-definiteness, for a start.
Probability theory is needed for the newer variational methods, whether in the context of autoencoders or a library like Edward (the likes of which I think are the future). Functionals and the calculus of variations are becoming more important, both in deep learning and for understanding the brain. There's lots of work in the game theory of dynamical systems (think evolutionary game theory) that can help contextualize GANs as a special case of a broader category of strategies.
Much to the contrary, the topics I mentioned are both the future of deep learning and future proof in general. This blog post by Ferenc captures my sentiment on the matter: http://www.inference.vc/deep-learning-is-easy/
However this course is about teaching how to use deep learning today to solve applied problems. Most participants in the in-person course worked in business, medicine, or life sciences. It is to that audience that my comment was directed.
So we're perhaps each guilty of assuming the person asking the question wishes to study deep learning in the way we are focused on ourselves. :) Hopefully between the two of us we have now answered the question more fully than either of us individually!...
If you can, try and organize or join study groups where the levels of skill are varying. Having a group will help in those times when it seems all too much and you're ready to quit.
Finally, don't try to compete with the well-heeled industry titans and their GPU factory farms. Find an understudied but important niche where your lack of knowledge is less of a setback: even if the available mental tools differ, everyone is equally ignorant of the terrain.
In America you can be a graduate in computer science without knowing calculus, and in some other country they teach calculus/probability in high school.
So the above person might have studied the chain rule and entropy in high school. So it's not exactly graduate-level math for everyone.
This certainly isn't true at my university in the US. As a matter of fact, I'm not sure why you named a country at all seeing as this varies by university. Anyway, I would definitely agree that on the scale of reputable international universities this is not grad-level math. A graduate school in a well-qualified university would expect students to either know this material or be able to learn it on their own. They are intro classes for the math major at my (and many other) universities.
I don't know how. Every CS program I've ever seen requires calculus.
>and in some other country they teach calculus/probability in high school
Practically every high school in America offers calculus. Not everyone is required to take it, but most of the students going on to study CS do.
For instance, it's extremely easy to set up an MNIST clone and achieve almost world-record performance for single-character recognition with a simple CNN. But how do you expand that to a real example, for instance license plate OCR? Or receipt OCR? Do you have to use two models, one to perform localization (detecting license plates or individual items in a receipt) and then a second model which performs OCR on the regions detected by the first? Or are these usually done with a single model that can do it all?
I'm not sure if answering these questions is a goal of your course, or if they're perhaps naive questions to begin with.
For this particular question, a model that does localization and then integrated classification is called an "attentional model". It's an area of much active research. If your images aren't too big, or the thing you're looking for isn't too small in the image, you probably won't need to worry about it.
And if you do need to worry about it, then it can be done very easily - lesson 7 shows two methods to do localization, and you can just do a 2nd pass on the cropped images manually. For a great step by step explanation, see the winners of the Kaggle Right Whale competition: http://blog.kaggle.com/2016/01/29/noaa-right-whale-recogniti...
(There are more sophisticated integrated techniques, such as the one used by Google for viewing street view house numbers. But you should only consider that if you've tried the simple approaches and found they don't work.)
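To make the "second pass on the cropped images" idea concrete, here's a minimal sketch, assuming you already have a trained Keras classifier and some localizer that returns a bounding box (the find_bbox function here is hypothetical, and no input preprocessing is shown - apply whatever your model expects):

    import numpy as np
    from keras.preprocessing import image

    def classify_crop(img_path, find_bbox, model, size=(224, 224)):
        """Crop the region found by a localizer, then classify just that crop."""
        img = image.load_img(img_path)
        arr = image.img_to_array(img)
        x0, y0, x1, y1 = find_bbox(arr)              # hypothetical localizer output
        crop = arr[int(y0):int(y1), int(x0):int(x1)]
        crop = np.array(image.array_to_img(crop).resize(size), dtype='float32')
        crop = np.expand_dims(crop, axis=0)          # add batch dimension
        return model.predict(crop)                   # class probabilities for the crop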
MNIST is considered a simple toy example, and it has 50k images spread across 10 classes.
ImageNet has 1m images spread across 1k classes.
One of the things that has made image recognition in the form of categorisation easier is that using a network pre-trained on ImageNet, and then finetuning it to your task actually works pretty well and requires far fewer images.
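As a rough illustration of that recipe, here's a sketch assuming Keras and its built-in ImageNet VGG16 weights; the head sizes and class count are placeholders for whatever your task needs:

    from keras.applications.vgg16 import VGG16
    from keras.models import Model
    from keras.layers import Dense, Flatten

    # Load an ImageNet-pretrained base and freeze its convolutional layers
    base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    for layer in base.layers:
        layer.trainable = False

    # Attach a small classifier head for the new task
    num_classes = 10                                 # placeholder for your class count
    x = Flatten()(base.output)
    x = Dense(256, activation='relu')(x)
    out = Dense(num_classes, activation='softmax')(x)

    model = Model(inputs=base.input, outputs=out)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    # model.fit(...) then trains only the new head on your (much smaller) dataset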
The struggle with doing something like license plate OCR is that it's unlikely you can transfer the learning from ImageNet to your target task.
So, in reality your struggle is going to be more around the data than the model. If you already had a system deployed that was getting data in, and you were getting feedback on when your model failed, this problem would be easily solved; but if you're building from scratch, this is going to be your biggest problem.
And since you don't necessarily know ahead of time how easy or hard your problem is, you don't know how many samples you will need or how much it will cost you.
So, if you did actually want to build a license plate reader using deep learning, my suggestion would be to artificially create a dataset by generating images that look like license plates and sticking them into photos in the state you expect to see them in (e.g. blurred, at weird angles, etc), and then training a neural net to recognise them. That would give you a sense for how hard the problem is, and how much data you will need to collect.
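A crude sketch of that synthetic-data idea, assuming Pillow is available, plus a font file and some background photos on disk (the paths are made up):

    import random
    from PIL import Image, ImageDraw, ImageFont, ImageFilter

    CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

    def make_plate_image(background_path, font_path='DejaVuSans-Bold.ttf'):
        """Paste a fake 6-character plate onto a background photo, lightly degraded."""
        text = ''.join(random.choice(CHARS) for _ in range(6))
        plate = Image.new('RGB', (180, 40), 'white')
        draw = ImageDraw.Draw(plate)
        draw.text((10, 5), text, fill='black', font=ImageFont.truetype(font_path, 28))
        plate = plate.rotate(random.uniform(-15, 15), expand=True)            # weird angles
        plate = plate.filter(ImageFilter.GaussianBlur(random.uniform(0, 2)))  # blur
        bg = Image.open(background_path).convert('RGB')
        x = random.randint(0, max(bg.width - plate.width, 1))
        y = random.randint(0, max(bg.height - plate.height, 1))
        bg.paste(plate, (x, y))
        return bg, text    # training image plus its label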
In terms of the model; I would probably just try having 6 outputs with 36 classes per output corresponding to the characters/digits in order. I don't know if it will work well, but it's a good baseline to start with before trying more complicated things like attention models or sequence decoders (https://github.com/farizrahman4u/seq2seq )
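A minimal sketch of that baseline in Keras (six 36-way softmax outputs, one per character position; the layer sizes and input shape are illustrative, not tuned):

    from keras.models import Model
    from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

    inp = Input(shape=(64, 192, 3))                  # plate-shaped input, sizes arbitrary
    x = Conv2D(32, (3, 3), activation='relu')(inp)
    x = MaxPooling2D()(x)
    x = Conv2D(64, (3, 3), activation='relu')(x)
    x = MaxPooling2D()(x)
    x = Flatten()(x)
    x = Dense(256, activation='relu')(x)

    # One 36-way softmax per character position
    outputs = [Dense(36, activation='softmax', name='char_%d' % i)(x) for i in range(6)]

    model = Model(inputs=inp, outputs=outputs)
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    # Labels would be a list of six one-hot (36-dim) arrays, one per character.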
For me the key takeaway here is that someone who has been a consistent top performer in Kaggle competitions for two years, and is the founder of an ML company, is teaching a "hands-on" course that fills a gap (between tech talks and step-by-step hands-on material), and I think I can live with this method of teaching.
Or should I take a more theoretical course such as Andrew Ng's to get into ML?
Anyhow, it was a great introduction, and light on the calculus (with more emphasis on probability and linear algebra). If you have Matlab or Octave experience, it will also help (I didn't - the revelation of having a vector primitive was wonderful once I got the swing of it, though).
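(For anyone who hasn't hit that "vector primitive" moment yet, the same idea in Python/numpy terms - a toy example of my own, nothing from the course: a whole-array expression replaces the explicit loop.)

    import numpy as np

    x = np.arange(1000000, dtype=np.float64)

    # Loop version: one element at a time
    total = 0.0
    for v in x:
        total += v * v

    # Vectorized version: the array itself is the primitive
    total_vec = (x * x).sum()

    assert np.isclose(total, total_vec)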
Note again, though, that I took the ML Class - not the Coursera version; I have heard that they are identical, but it has been 5+ years since I took it, too.
The Udacity course has a very different aim - it covers much less territory and takes much less time. If you're just wanting to get a taste of deep learning, it's a very good option, but it's not a great platform to build longer term capability on IMHO.
Last night I spent an hour or so getting my system (Ubuntu 14.04 LTS) set up to use CUDA and cudnn with Python 3; setting up the drivers and everything for TensorFlow under Anaconda - for my GTX-750ti.
That wasn't really straightforward, but I ultimately got it working. It probably isn't needed for the course, but it was fun to learn how to do it.
I would like to take this fast.ai course as well, but so far the Udacity one is eating all of my (free) time. Maybe I can give it a shot in the future.
If it's supported, then the major difference is GPU memory, which limits the size of the network you can train. The newest models are faster than ones from 1-2 years ago, but older hardware does the job fine.
What do you think of this? Will this work fine?
I am wondering, why you chose to leave Enlitic and start fast.ai?
In short, my wife got sick and needed brain surgery while she was pregnant, and I ended up being away from Enlitic for nearly a year. It made me reassess what I really wanted to do with my time.
Now that I spend all my time coding and teaching, I'm much happier. And I think that making deep learning more accessible for all will be more impactful than working with just one company. Deep learning has been ridiculously exclusive up until now, on the whole, and very few people are researching in the areas that I think matter the most.
Finally, I think I achieved what I set out to do with Enlitic - deep learning for medicine is now recognized as having great potential and lots of folks are working on it.
I really hope that you get past the wall! If you do find yourself getting stuck, demotivated, etc, please do come join the community on the forums, since they can really help overcome any issues you have: http://forums.fast.ai/
Is an NVidia GPU a must-have... or can I work around it?
It's worth spending the $0.90 per hour to use AWS, or less if you get spot instances.
You can do much of your prototyping on just a CPU, and only run on the GPU with more data once it's working well.