This script allows one to batch download videos for a Coursera class. Given a class name and related cookie file, it scrapes the course listing page to get the week and class names, and then downloads the related videos into appropriately named files and directories.
Why is this helpful? Before this, I was using wget, but I ran into the following problems:
1. Video names have a number in them, but this does not correspond to the
actual order. Manually renaming them is a pain.
2. The names on the syllabus page are more informative than the video file names.
3. Using wget in a for loop picks up extra videos which are not posted/linked,
and these are sometimes duplicates.
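As a rough illustration of the naming idea behind points 1 and 2, here is a minimal sketch of how week and lecture titles taken from the syllabus page, in page order, could be turned into numbered directories and file names. This is not the script's actual code: the function names `sanitize` and `plan_paths`, the `.mp4` extension, the `downloads` root, and the shape of the `syllabus` input are all assumptions made for the example.

```python
import re
from pathlib import Path


def sanitize(title):
    """Collapse every run of characters that is not a word character or '-' into one underscore."""
    return re.sub(r"[^\w\-]+", "_", title).strip("_")


def plan_paths(syllabus, root="downloads"):
    """Map each lecture to a numbered path under root.

    syllabus is a hypothetical structure: a list of
    (week_title, [lecture_title, ...]) tuples in the order they
    appear on the course listing page.
    """
    paths = []
    for week_no, (week, lectures) in enumerate(syllabus, start=1):
        # The numeric prefix preserves the syllabus order when sorted by name.
        week_dir = Path(root) / f"{week_no:02d}_{sanitize(week)}"
        for lecture_no, lecture in enumerate(lectures, start=1):
            paths.append(week_dir / f"{lecture_no:02d}_{sanitize(lecture)}.mp4")
    return paths


if __name__ == "__main__":
    example = [("Week 1: Intro", ["Course Overview", "Regular Expressions"])]
    for path in plan_paths(example, root="nlp"):
        print(path.as_posix())
```

Because the numbering comes from the position on the syllabus page rather than from the number embedded in the video URL, the files sort in viewing order without manual renaming.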
Inspired in part by youtube-dl (http://rg3.github.com/youtube-dl), with which I've downloaded many other good videos, such as those from Khan Academy.
Let me know if you like it.
In that case I'll still have a weekend project.
I see what you mean, though; it's not really full NLP either way. I just used that term in place of regular expressions because it was in the NLP class that I learned about them (the first homework is a phone and email scraper). Probably my fault for using semantics wrong.
By the way, congratulations on your script, jplehmann! Wish I had found yours before losing time doing mine...
It's back up, must have been a small glitch. Might I add that I love the fact the script picked up on the video I dropped earlier.
I think Coursera really needs to come out with a native solution and a standard way of numbering/organizing videos.
For me, the simplest and quickest option was this bookmarklet.
Edit: Tested, yeah it does, and won't skip a video if it was only half downloaded.
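The thread doesn't say how the script tells a finished file from a half-downloaded one. One plausible way to get that behaviour, sketched here purely as an assumption, is to compare the size of the local file with the size the server reports for the video. The helper name `needs_download` and the `expected_size` argument are hypothetical.

```python
import os


def needs_download(path, expected_size):
    """Return True if the file is missing or its size differs from expected_size (in bytes).

    expected_size is assumed to come from somewhere like the server's
    Content-Length header; obtaining it is outside this sketch.
    """
    if not os.path.exists(path):
        return True
    # A partial download is smaller than the full video, so it is fetched again.
    return os.path.getsize(path) != expected_size
```

With a check like this, a complete file is skipped on a rerun, while a file that stopped halfway is downloaded again.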