However, the first lesson took a bit of stamina to get through. Much of it introduced basic Unix/AWS/shell/Python things I know intimately and have strong opinions and deeply set ways about. Shell aliases, how to use AWS, which Python distribution to run, running Python from some crazy web tool called notebooks (and not Emacs), etc.; it felt like I was being forced to learn a random selection of randomly flavored tools for no good reason.
Yes, it's a random selection of tools. The good reason to bear them is that you'll learn how to implement state of the art deep learning solutions for a lot of common problems.
So, I ended up viewing the lessons not as "this is how you should do it", but rather as "here's one way to do it". And it does get much easier after internalizing the tools in Lesson 1.
Just something to keep in mind when branding this as "deep learning for coders". Coders have deep opinions about the tools they use :)
You're partially right. But I would like to mention that in many ways, their way is the "standard" way to do things.
Just as an example, the "crazy web tool called notebooks" is pretty much the standard way of working in many different areas of machine learning/deep learning (and others). It doesn't mean you necessarily have to do it yourself, but tons of material out there will be in this format, so it's valuable in and of itself to know it.
Also, you really should give notebooks a chance; they're a game-changer for productivity, IMO. (Although I'm a vim user, not emacs, so maybe you shouldn't listen to me. ;)
Also the python distribution in question (anaconda) is becoming standard for data science - pytorch, for instance, (covered in part 2) assumes you're using anaconda for its default install method.
I come from the opposite side of the spectrum. I know nothing about UNIX/AWS/shell and have no intimate or strong opinions on any of the tools. I am confused by basics such as how to make a special request for P2 access, since the setup video is out of date. The git cloning recommendations on the wiki are also confusing to me. I'd rather see a video on how to do the git cloning, or step-by-step instructions (with pictures) on how to do this and what to expect when it has been done properly.
I felt like some of the material was outdated, and it wasn't clear how to work around that. I ended up getting frustrated with the setup videos and git cloning instructions, so I skipped them entirely. I hope future iterations of this class spell things out for beginners like me. Otherwise, I can't even begin the class, since I don't know how to request access to P2, clone the git repo, etc.
I am not uneducated. I come from a pure mathematics background and don't know anything about this set up business. I can code, know theoretical CS, but when it comes to setting up the tools (along with the outdated material) I am utterly lost.
At this point I'm just watching the videos. I can't actually do any of the coding stuff since I don't have the tools set up, but I like their top-down approach and am learning a lot despite these obstacles.
We (including the rest of the learning community) would love to help you - could you post on the forums the problems you are having and we'll try to help sort them out? If you've already posted, please at-mention me there so I don't miss it.
It's a fine line between helping too much vs too little in lesson 1! For the next time we run part 1, we're thinking we'll have an optional weekend workshop to teach the necessary (non deep learning specific) software pieces - python, numpy, AWS, shell, etc. That would end up being a separate mini-MOOC I suspect.
Based on the discussions on the forums it seems most students are between these two extremes, and largely follow the video advice, branching off sometimes where they need to do some additional research, or are already familiar with some alternative approach.
Edit: actually, thinking more about it - probably your best bet is to simply use http://crestle.com . All the data, notebooks, and software are pre-installed, so you can start coding right away.
Thanks for the reply. I had not (yet) posted on the forum, as I was methodically searching through the previous replies to ensure my question had not already been answered. Thanks for the Crestle recommendation; I will look into it now. This should simplify my tool setup. If I still have questions about tool setup after posting, I will at-mention you. I think the mini-MOOC (for people like me) is a great idea. I get lost in basic tooling (that many experienced devs can skip), so if you include a mini-MOOC on tool setup, it would benefit people like me tremendously and help us get up to speed with everyone else.
Will the revamped course in October be offered online (like the current version is) for public viewing?
I'm doing part 1 at the moment, and the outdated AWS setup instructions on the site frustrated me a bit as well.
Maybe an idea would be to leverage the community a bit more? Make it easier for MOOC students - the ones watching/reading the material later on - to edit the wiki. And maybe refer to that wiki more on the site.
I end up using the forums a lot for trying to find info on stuff that is not working, or I don't understand. But it's not an easy platform to find information since it isn't structured well.
It's been a while since I watched the course, but IIRC the setup video had an overlay that said to simply put "fast.ai MOOC" (or something similar) into the P2 request form. Worked perfectly for me. Did that not show up for you?
Yeah, that showed up for me. It was more about how to use the setup scripts; they seemed to have changed a bit.
I thought that if the resources around the course evolve over time, then the material should as well. It would be easier to keep a wiki up to date and refer to that, instead of a video (and site). This is for setup and "sidenotes" specifically; I'm not saying the lessons should be re-created in the wikis.
Although the homework and lesson notes in the wikis were indispensable.
The downside is you don't get as much control over your environment (you get a terminal in the browser, but no SSH or sudo access, for example). If you have the inclination and ability to manage your own instances, EC2 is more flexible.
Just wondering: what kind of infrastructure are you running on? If I were to build something like this, I'd probably run it on top of EC2 or GCE, so pricing would never be lower than theirs.
Most of the time I, too, ended up just watching the videos, picking up the concepts I didn't yet know about, and experimenting with them in my own way. I did perform all the Kaggle submissions, though, to kind of calibrate my level.
I wonder how big the crowd between our two extremes is — the people who actually do run the commands exactly as explained in the videos? If the material is outdated, beginners cannot. Experts will do their own thing. How many people will follow the actual instructions?
I started watching the videos in Part 1 and was disappointed to find out that I was going to have to create an AWS account. I really don't want to deal with accidentally racking up a large bill on AWS.
If you have the hardware, you can do without an AWS account (though you might need to be an "expert" to adapt the instructions to run on your own hardware instead of AWS). I used my gaming rig with a GTX 1080 Ti card.
I've been trying to get it up and running using the latest versions of the libs. I have Theano finding my GPU now and running basic CUDA code, but I cannot get the cuDNN libraries to be recognized, even after installing a few versions from the Nvidia site. Ugh. Giving up for now.
You can do it on a poorer graphics card, but only if it has enough GPU RAM, and only if it supports a relatively recent compute version.
You _can_ run it on a CPU, and that can work fine for very simple models (like the Tensorflow demos with MNIST) but anything much larger and things really start to take forever. Like, you'll be waiting for months.
A GTX 1060 with 6 GB of RAM is a good intro card. You can find them for < $350.
The AWS P2 instances recommended for the course ($0.90/hour) have half a K80, which has access to 12 GB of VRAM. The list price for that card is $5000.
The GTX 1080 Ti has almost as much VRAM as half a K80, 11GB, and can be found for around $700.
Better to install linux. Not all tools support Windows, and for those that do it is still a second class citizen and kind of annoying, e.g. look at installing CUDA/cuDNN on Windows vs Linux.
I agree that the focus on specific tooling was tricky. That said, I really do like the notebook environment. It's mostly the AWS setup that I found a bit fragile.
Highly recommended! The first course was the first thing I came across that helped me contextualize the DL field into something that might be relevant for my work. It's a great way to get your hands dirty.
One point of comparison is Cameron Davidson-Pilon's Bayesian Methods for Hackers; it has a similar vibe: practical, applied advice from a field that tilts towards the academic...
In fact that book inspired me to create a spreadsheet that implements MCMC in order to make it easy to understand and visualize - we're planning to start an "Introduction to Machine Learning" course in a couple of months where I hope to show off the result of this...
Thank you so much for this. For me, Deep Learning Part 1 was a top-notch course that really helped me learn by actually doing things across a variety of topics (e.g. competing in Kaggle, creating spreadsheets to understand collaborative filtering & embeddings, sentiment analysis using CNNs and RNNs, etc.).
I found the top-down approach very effective in keeping me motivated as I worked my way through the course.
It took me 6 months of watching (and rewatching) the videos and working on problems to get comfortable.
I have done a few MOOCs (Andrew Ng's machine learning, the Coursera ML specialisation, edX Analytics Edge), and all of them were good learning experiences, but fast.ai's deep learning part 1 really stood out.
For me, the combination of Deep Learning Book + Fast ai MOOC + CS231n (youtube videos & assignments) cover almost everything I want to learn about the subject.
@jph00, I'm half way through neural style transfer and I am loving it.
I somehow forgot to mention in the post - we're teaching a totally updated part 1 course (keras 2, python 3.6, Tensorflow 1.3, recent deep learning research results) starting end of October in San Francisco. Detail here: https://www.usfca.edu/data-institute/certificates/deep-learn...
I'll go edit the post with this info now - but figured I'd add a comment here for those that have already read it.
I felt like the setup for the first part was at times a little frustrating, since I started it during a period when Keras had switched to a newer version that wasn't compatible with some of the utility code that was written. Add to this the newbie factor with notebooks, and it was a pretty rough first week or so to get set up and get actual learning done. It took me a bit of time to realize notebooks are more like repeatable trains of thought than well-written production code.
The other thing is that some of the supplementary material was really long and at times made me wonder: why take this course instead of just going through a course mentioned in the supplementary material (e.g. CS231n wrt CNNs)? I think I ended up spending hundreds of hours reading/watching/practicing CNNs by reading papers, watching Karpathy's 231n videos, and doing a couple of tutorials from data scientists who elaborated on a specific problem they were solving. I guess at times when watching Part 1's videos and doing the notebooks, I didn't feel like I was 'getting it' as much or as fast as when I was getting the information by other means.
While the forum discussions can be helpful, it also meant wading through a ton of unstructured content. And the service they used for the forums remapped the find shortcut to their own built-in search, which was a little annoying. I don't know a great solution for getting more structured data, but perhaps adding some of the answered questions to each lesson's wiki would help. Or maybe splitting the technical issues from the high-level concepts.
Lastly, I think it was either HN or /r/MachineLearning but someone had suggested a book regarding Machine Learning and hands-on Tensorflow usage which I picked up, and I felt like my pace of learning really sped up afterwards. I think part of it was Tensorflow has a lot more written about it so when you encounter an odd problem, chances are someone else has something to say about it.
All criticisms aside, I think I'll try going through Part 1 a second time around prior to going through Part 2.
I had the opposite experience. I'm basically as non-math as you can get and still be in the sciences, and I found the classes quite intelligible (on a fast watch without the notebook, and then a slow watch with the notebook for each class).
FWIW, I think the supplementary material wasn't strictly necessary from a 'using the libraries' point of view. I'll never contribute to this field, but I feel like Jeremy's explanations were conceptually helpful, if not rigorous.
For me, the order was: lesson, notebook+lesson, wiki + supplementary material if something wasn't making sense, and the discussion board if all else failed. That discussion board is basically useless unless you're taking the class in real time I think, which has been my experience for all MOOCs.
If you stick to the setup approach in the videos (using the AMIs we created for the course) all the software will be the correct version to go with the notebooks and videos. If you try to use different versions of the software, it'll be a lot more work.
CS231n is great and I'm glad you checked it out. There isn't really anything nearly as good online unfortunately for the other areas we covered (NLP, collaborative filtering, etc). CS231n is not as code-oriented as the fast.ai course, and doesn't try to get you to the point you can replicate state of the art results - but it's got great conceptual content and Karpathy is a terrific technical communicator. I think the two courses go hand in hand quite nicely!
Totally agree on the unstructured content in the forums. I would suggest they open up the wikis more for editing by MOOC-students, and work to keep that up-to-date. Wading through forums for those nuggets of information to fix your problems is time consuming and boring.
Pro-tip: Press CMD+F (or CTRL+F) a second time to "override" the shortcut for search.
The n00best path to data science and machine learning state of the art is now complete, no excuses! 2015: Andrew Ng's Coursera MOOC; 2016: Kaggle competitions with xgboost and ensembles; 2017: deep learning code-oriented courses with fast.ai and GPU hardware for the masses. Thanks, very lucky to witness and try this.
jph00, I found the first course hard to follow because of some broken links and poorly organized content. One link that was necessary kept taking me to a password-protected page. This was about a month ago.
It would be good if someone could revisit part 1 and make those minor editorial fixes, if they haven't already done so.
I might be being too precious about my time, but I also found the first video about your teaching philosophy somewhat gratuitous; I wish I hadn't watched it.
The move from platform.ai to files.fast.ai could have been communicated better - sorry this impacted you. (We tried to highlight everywhere we could, but we can't change the video itself on YouTube unfortunately.)
We're redoing the whole of part 1 starting in October so this problem will be fully resolved then. Until then, follow the links on course.fast.ai or the forums, rather than what you see in the part 1 videos, or just remember to always replace platform.ai with course.fast.ai.
Oh and about the teaching philosophy video - until we posted that we had quite a few students express confusion about the top-down approach. After posting it, we've received a lot of positive feedback about it. I understand it's not helpful or interesting to everyone, but overall it seems to have been a successful addition for most.
I like the philosophy and I liked that you communicated it. I think you could have communicated it much more simply & compactly. I suggest you either hire an editor or take an editorial perspective to your content for the redo of part 1.
The former is better than the latter, because an editor will bring a different perspective from yours and might be able to give feedback you haven't considered.
Thanks. I'd also like to suggest publishing a Docker image for the course software. Like you, I like to read and experiment when travelling, and an offline workflow can often be handy.
It might be worth it to do an updated version of course #1 at some point, many of the examples will not work now if you try to follow along with the video because libraries (for example Keras) have been changed or upgraded.
(I know you have plenty to do, so really no pressure, this is simply a nice to have for newcomers that 'tuned in late'.)
Recently started the first course and wanted to say I am really loving it. Thank you and Rachel for your time and effort into making this topic more accessible.
Getting things set up for lessons 1/2, I noticed there were a few confusing hiccups along the way. I was wondering: how do I get access to edit the wiki, to smooth out a few issues I struggled with?
The forums are generally helpful but it's a lot to wade through and overwhelming for newer participants.
Earlier this year, the lessons stopped working for a while because the setup scripts ended up installing Keras 2, but the code was built for Keras 1.2. I was bitten by this in March, but I've been told everything works fine now (the setup scripts now use Keras 1.2 specifically).
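For anyone managing their own environment rather than using the course AMIs, one way to avoid this class of breakage is to pin versions explicitly. A hypothetical requirements.txt sketch (the exact version numbers are illustrative, not taken from the course scripts):

```
# Pin to the API the course notebooks were written against,
# so a fresh install can't silently pull in Keras 2
keras==1.2.2
theano==0.9.0
```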
Upgrading the lessons to Keras 2 is of course nice; I wonder if the videos will need to be re-recorded for that?
I had high hopes for them, but it turns out it just puts a tiny 'i' icon in the corner of the screen, with no way (AFAICT) to draw attention to it.
When we redo part 1 in October, we'll try to find a way to incorporate a clear link to an errata page or FAQ in each video, so that anyone with issues has one place to go.
Is this a volunteer effort on your part? I don't mean to be unnecessarily harsh in that case, but otherwise:
I'm not sure I can say any more strongly that you need to get this URL into these videos using whatever tools you can, whatever tools the platform provides. If someone misses it because of the ineffectiveness of the platform, you still get credit for trying!
I assume it should also be the very first thing in the video description at this point. You are losing many more people who would never take the time to share this issue with you due to stale content! (There's a reason it's half-way to the top on your announcement of part 2...)
Afraid I may have missed the window on the chance to provide feedback to jph00 via this channel, but here goes.
Am watching Part 1 now and only two sessions in, but there are some tweaks I would love to see. First the positive: I really appreciate the approach of hands-on and teaching theory only as it's needed and in conjunction with applied work.
Would love to see a tiny bit of time spent on setting up tools for people who already have good Nvidia GPU systems. My Ubuntu system has python (2.7) and python 3.5 both installed, but no Anaconda... I don't know if I'm going to totally screw up my system if I install Anaconda over those working existing python installations, for example.
It would be great to hear the questions. I can barely hear a faint voice in the background as Rachel reads the questions (presumably from online) but it seems like it would be a very easy tweak to have her closer to a microphone. Maybe this happens in later sessions and I just haven't gotten to them yet.
It would be great if so many things weren't abbreviated in the code variable and function names. Examples: nb for notebook, t for ?, a for array(?), U, s, and Vh for ?, ims (?), interp (interpretation or interpreter or interpolation?), sp, v, r, f, k, trn (train or turn or something else?), pred (predicate or prediction?), vec_numba (?)... the list goes on. Yes if I knew the field these might be obvious but for some of them I'm still learning. "np" I understand since that's standard practice and you explained it. It would be really really easy to just spell out words in the code, as well as being a good practice in general imho, and, since you are trying to teach stuff, it would seem appropriate.
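To illustrate the point, here's a hypothetical before/after of the same helper with abbreviated vs. spelled-out names (none of these names are taken from the course code):

```python
# Terse style, as often seen in notebooks: what are 'nrm' and 'a'?
def nrm(a):
    return [x / max(a) for x in a]

# Same logic with descriptive names, readable without field context.
def normalize_scores(scores):
    """Scale a list of scores so the largest becomes 1.0."""
    highest = max(scores)
    return [score / highest for score in scores]

print(normalize_scores([1, 2, 4]))  # → [0.25, 0.5, 1.0]
```

The two functions behave identically; the second just costs a few more keystrokes and saves every future reader a guess.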
Those nitpicks aside I'm really stoked about the course and really appreciate everything you have been putting into it!
> Would love to see a tiny bit of time spent on setting up tools for people who already have good Nvidia GPU systems. My Ubuntu system has python (2.7) and python 3.5 both installed, but no Anaconda... I don't know if I'm going to totally screw up my system if I install Anaconda over those working existing python installations, for example.
Anaconda lives in its own folder (usually in $HOME). You can't screw anything up by installing it, and in fact you can hardly tell it's there. You need to set your path to actually use Anaconda's programs, and you shouldn't do that in .bashrc, but just in the shells where you are actively using it, with something like:
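A minimal sketch, assuming the default install prefix ~/anaconda3 (adjust if you chose a different location):

```shell
# Put Anaconda's bin directory at the front of PATH for this shell only;
# other shells and system scripts are unaffected.
export PATH="$HOME/anaconda3/bin:$PATH"

# Any 'python' launched from this shell now resolves through that
# directory first. Show the resulting search path:
echo "$PATH"
```

Because the change lives only in the current shell session, closing the terminal restores your system Python untouched.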
I haven't started the course yet, but I have experience with Anaconda. On every fresh Linux install, I install Anaconda first. It won't do anything to your system. It'll ask for a path, install all the necessary files there, and then add that directory to the start of PATH, so that when you type python in a terminal the Anaconda python will start. You can still access the previously installed Python versions by typing python2.7 or python3.5.
Honestly can't recommend this course highly enough.
It's definitely not perfect - the notebooks are not commented and the material does tend to jump around a bit - but what it does do, it does extremely well.
This course will teach you how to actually build deep learning systems and build the kinds of things you read about PhDs doing...
Latently (SUS17) also provides a more self-directed path to learning deep learning focused exclusively on implementing research papers and conducting original research: https://github.com/Latently/DeepLearningCertificate
That seems interesting, but there are so many papers and little indication of which ones are more important (or which ones to implement first). I realize this is for advanced learners, but some guidelines, or at least a section pointing to survey papers would be really helpful as a starting point.
This is exciting! I went through Part 1 a few weeks ago (probably have to cover embeddings and RNNs again...) and felt it was totally worth it.
Part 2 seems equally strong in content (if not stronger). It's a beautiful time to be n00b in deep learning & AI, and learn via material like these. No excuses. Knowledge is power.
I've been looking to do part 1, so this is really cool - looking forward to this too! On http://course.fast.ai/part2.html the thumbnail for lesson 8 has specs for building a PC, with advice to use pcpartpicker. For part 1 I liked the idea of using AWS and only paying for a few hours, does part 2 have a hard requirement of a >$100s investment in hardware?
There is no requirement, but if you are spending a meaningful amount of money per month on AWS (over $100) and plan on working on DL projects for the next year or two it might make sense to make the initial up front investment.
For folks who've gone through parts 1 and 2: do you think the course provides enough material to tackle tasks like deep learning OCR [1] or custom object detection in images?
What is the goal of these trainings? To get a taste so you understand the conversation? There is a lot more to data science than neural networks, and I worry that teaching just one family of models will create a set of implementers who don't compare and contrast solutions.
Based on the feedback I'm reading here about Part 1, I'm going to start recommending these courses to non-academic friends who have expressed interest in learning more about Deep Learning.