
Coursera courses preserved by Archive Team - mihaitodor
https://archive.org/details/archiveteam_coursera
======
white-flame
It's great that there's a central place for this, at least once it's organized
sensibly.

The stupidest, most counter-productive aspect of all these MOOC systems is the
artificial schedule imposed on the course. While I've been able to take a
couple to completion, I've had to let some fall by the wayside due to scheduling.
Once that happens, you're disincentivized to catch up, because you're behind on
deadlines and those affect scoring. When I've gone back to finish courses that I
had to leave by the wayside for the moment, sometimes I've lost access to the
materials because the course has shifted to its next "semester". There's
absolutely no reason for that. While there are a few courses like music or
writing where you are collaborating or cross-reviewing other people's work,
most of them are standalone lectures, homework, and tests.

~~~
beams_of_light
That aspect is frustrating. I took all of Dr. Chuck's "Python for Everybody"
courses, and have lost access to the videos. There have been times when I
would have liked to have reviewed the videos, because while I did learn the
material and perform the work at the time, I hadn't used that knowledge much
in the months since. I feel as though I'm losing some of it, and would like to
review. Having paid $300+ for the courses, I don't feel I should have lost
access to the materials.

~~~
0xmohit
This seems to be what was indicated in
[https://news.ycombinator.com/item?id=11881767](https://news.ycombinator.com/item?id=11881767)

I was under the impression that they would remove access to old courses by June 30
for those who hadn't paid. Removing access for those who've paid goes too far.

~~~
chii
this is why i always (try to) work out a way of downloading an offline copy of
the videos/course notes.

------
dhawalhs
For those unaware, Coursera shut down their old platform on Jun 30th [1].

Many of the courses on the old platform are slowly coming back on the new
platform. When I built the list [2] of courses on the old platform the course
count was 472; now it's around 390. Some of the notables that I was excited to
see come back are:

Neural Networks for Machine Learning with Geoffrey Hinton [3]

Computer Architecture from Princeton [4]

Programming Languages from UW by Dan Grossman [5]

Introduction to Natural Language Processing by Dragomir Radev [6]

Many of these courses were last offered a couple of years ago. Hopefully more
courses from the list [2] start coming back.

[1]
[https://www.class-central.com/report/coursera-old-platform-s...](https://www.class-central.com/report/coursera-old-platform-shutdown-download-courses/)

[2]
[https://www.class-central.com/collection/coursera-old-stack-...](https://www.class-central.com/collection/coursera-old-stack-all)

[3]
[https://www.coursera.org/learn/neural-networks](https://www.coursera.org/learn/neural-networks)

[4]
[https://www.coursera.org/learn/comparch](https://www.coursera.org/learn/comparch)

[5]
[https://www.coursera.org/learn/programming-languages](https://www.coursera.org/learn/programming-languages)

[6]
[https://www.coursera.org/learn/natural-language-processing](https://www.coursera.org/learn/natural-language-processing)

------
Rondom
If you want to help Archive Team in its efforts to preserve disappearing
content and have some bandwidth to spare: run an "Archive Team Warrior"
appliance! This way you can help download all this content that is
about to disappear!

Or do you have some DigitalOcean promotional credits left that are about to
expire? Spin up a (few) VMs with Docker containers running the warrior on
DigitalOcean!

[http://archiveteam.org/index.php?title=Warrior](http://archiveteam.org/index.php?title=Warrior)

~~~
yq
I started the warrior VM after seeing this message.

However, it seems whatever they are crawling cannot handle this type of mass
distributed crawler; for the past few hours my computer hasn't done anything.

------
peatfreak
You know what would be AMAZING? If there was a Coursera course (or some other
MOOC course, books, etc) that explains how archive.org works from the
foundational technologies upwards. So, like, you could build your own mini
version of archive.org as an exercise. It'd be a fascinating project and could
be a great case study in web archiving techniques and information retrieval.
Does anything like this exist already?

~~~
Godel_unicode
And then archive.org could archive it, and you would achieve inception.

------
nym
Archive.org accepts all kinds of donations.

[https://archive.org/donate/](https://archive.org/donate/)

Credit Card, PayPal, Bitcoin. Brewster is an amazing Steward of the Internet
Archive.

Video Tour: [https://vimeo.com/59207751](https://vimeo.com/59207751)

------
sharmi
All the names are 'Coursera Curses' instead of 'Coursera Courses'. Someone
there must have been really upset with Coursera's approach :) .

~~~
beardicus
Most Archive Team efforts are given a punny name (at least for the associated
IRC channel), usually the offending organization's name twisted into something
mildly negative. Coursera > #cursera, Soundcloud > #soundclown, etc.

------
marai2
This is incredible, I was just on the Coursera website trying to go to my old
courses to continue from where I've left off, but I couldn't get to them. The
link to my previously enrolled courses was taking me to the newer version of
those courses which haven't started yet. So I thought I'd search HN because I
remember reading that someone was archiving these ... and boom! it's the top
link on the front page! Yay HackerNews!!! There is some Voodoo-AI going on at
HN!

~~~
marai2
If anyone wants a quick index of the courses in those 45G webarchive files
here is the index of course slugs. Unfortunately it looks like the webarchive
slugs don't match the Coursera slugs anymore so I couldn't pair the names with
more readable course titles:

[https://gist.github.com/marai2/b548ce70b6af4789522c6ef5e54c6...](https://gist.github.com/marai2/b548ce70b6af4789522c6ef5e54c6bbf)
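
A minimal sketch of how a slug index like this could be built from a list of archived URLs. It assumes old-platform course pages lived under `https://class.coursera.org/<slug>/...` and that the URLs have already been pulled out of the archive files. The function names are hypothetical, and this is not necessarily how the gist above was produced.

```python
from urllib.parse import urlparse

# Assumption: old-platform course pages were served from this host,
# with the course slug as the first path segment.
OLD_PLATFORM_HOST = "class.coursera.org"


def course_slug(url):
    """Return the course slug from an old-platform URL, or None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() != OLD_PLATFORM_HOST:
        return None
    # Drop empty segments caused by leading or trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    return parts[0] if parts else None


def index_slugs(urls):
    """Collect the distinct slugs seen in an iterable of URLs, sorted."""
    return sorted({slug for slug in map(course_slug, urls) if slug})
```

Calling `index_slugs` on every URL recorded in an archive gives a de-duplicated, sorted list that can then be compared against other slug lists.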

~~~
mihaitodor
Those slugs that you extracted don't seem to be correct. I wanted to match
them with the ones from here:
[https://docs.google.com/spreadsheets/d/1kaWxZG3krI83WfdzlExW...](https://docs.google.com/spreadsheets/d/1kaWxZG3krI83WfdzlExWL-_S5UBai-CYuofGR5DZcd0/edit?pref=2&pli=1#gid=1433118913)

------
chris_wot
Whilst I hate some aspects of MOOCs, the fact is that I spent about $120 to
learn the basics of SAP, whereas if I'd gotten "proper" training on the same
subject matter it would have cost me thousands.

I'd love to see a site that specialises in user contributed content along the
lines of Wikipedia. It's funny though - take SAP as an example: I'd be just as
happy reading a book that explains it all better than what is out there right
now! A book that assumes you are into technology but have few skills or little
knowledge of the business processes that SAP gets involved in, and which gives
you a rundown of this before giving a detailed rundown on how SAP implements
these processes.

Sadly, no such thing exists, but happily for me I stumbled upon
[http://www.accountingverse.com/](http://www.accountingverse.com/) and
[http://www.accountingcoach.com/](http://www.accountingcoach.com/) (no, I'm
not affiliated with them in any way) and it turns out they didn't cost
anything and I finally "get" double-entry bookkeeping, financial transaction
concepts like the general ledger, journal, accrual method and the fundamental
accounting equation. Wish I'd known this earlier to be honest - as I say, I
lament that there are no books on SAP core modules that go from concepts to
the nuts and bolts of how SAP does things :-(

------
ipsum2
Archive.org does amazing work, I would highly recommend donating to them if
you can.

~~~
ris
The actual _work_ done here was by Archive Team, which is not the same thing.
Archive.org is just hosting the end results.

------
tgarma1234
I have taken a couple of Coursera courses on R and Stats. They basically give
you a brief outline of some topics you might want to pursue more in depth and
they give you access to a discussion forum. I haven't found that this method
of learning/teaching is very useful. There seems to me to be a huge
opportunity waiting to be developed if someone can make a site like that but
with more interactive elements AND where the learning/teaching is based on
sound educational principles that can be demonstrated to effectively result in
skills mastery. As it is now, Coursera is basically skimming cash off of the
internet's insatiable Google searching for information. For example, someone
might Google "Learn R" and then fall into the trap of paying $49 for a class
that consists of nothing but videos, without having a clue about whether or
not the videos really work to communicate knowledge or even whether they touch
on anything meaningful. If it hadn't been for the
"Johns Hopkins Data Science Course" branding on the class I signed up for I
wouldn't have fallen for it I am sure.

~~~
mindcrime
_If it hadn't been for the "Johns Hopkins Data Science Course" branding on
the class I signed up for I wouldn't have fallen for it I am sure._

Just to provide a counterpoint: I've taken 5 or 6 of the classes in that
sequence, and have found them well worth every penny I've spent so far.
Probably you could argue that the same information is available elsewhere for
free, but the classes have worked for me, and the way I study and learn.

Obviously YMMV, but they've been a bargain from my perspective. I think
because, if nothing else, they provide some structure, sequencing and a token
measure of accountability... whereas if I just said "Hey, I'm going to teach
myself R from this book" it would be a lot easier to loaf around, waste time
reading HN instead of studying, etc.

That said, I don't argue against the idea that online learning could still be
_better_. In fact, I don't think we've even come close to tapping the full
potential of this stuff.

~~~
jclos
I agree with you. While they are far from the optimal way to consume material,
there is always some need for alternative ways of learning, if only to address
the variety of people who would like to learn. I find that I retain content
better when I learn it in multiple ways (explore using MOOCs and podcasts
about it, go deeper into it using books and practical work if applicable) and
I'm sure it's the same for most people.

~~~
mindcrime
Yeah, same here. I'm doing the Johns Hopkins Data Science classes on Coursera,
as well as the Duke "Statistics with R" series, but I've been supplementing
that with both dead-tree books ( _R in Action_ among others) and other videos
(like the Professor Leonard ones on YouTube), reading Wikipedia articles,
etc., etc. Gaining a good understanding definitely involves attacking the
problem from multiple angles in my approach. :-)

------
govindpatel
How can I use this? Those files are so big. Is there any way I can download
only the courses I want?

------
philippnagel
Is there a way to torrent/mirror the data from archive.org? Storing all of it
in a central repository seems counter-intuitive to me.

~~~
detaro
Individual pieces have torrent links. (And there probably are scripts to fetch
an entire category somewhere...)
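
A rough sketch of what such a script might start from, not an existing tool. It assumes the collection identifier `archiveteam_coursera` from the link at the top of the thread, archive.org's `advancedsearch.php` JSON endpoint for listing items, and the usual `<identifier>_archive.torrent` naming for per-item torrents. The helper names are hypothetical, and the code only builds URLs; fetching them is left out.

```python
from urllib.parse import quote, urlencode

ARCHIVE_BASE = "https://archive.org"


def search_url(collection, rows=50, page=1):
    """Build a search URL that lists item identifiers in a collection."""
    params = [
        ("q", f"collection:{collection}"),
        ("fl[]", "identifier"),  # only return the identifier field
        ("rows", rows),
        ("page", page),
        ("output", "json"),
    ]
    return f"{ARCHIVE_BASE}/advancedsearch.php?" + urlencode(params)


def torrent_url(identifier):
    """Build the per-item torrent URL (assumed naming convention)."""
    safe = quote(identifier)
    return f"{ARCHIVE_BASE}/download/{safe}/{safe}_archive.torrent"
```

A full script would request `search_url("archiveteam_coursera")`, read the identifiers from the JSON response, and hand each `torrent_url(identifier)` to a torrent client.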

------
unreal37
This is a collection of torrent links of copyrighted material? Is that right?

I guess I'm asking, how is this legal?

------
RCortex
Does anyone know if they archived the webpages, assignments, and quizzes too?
Or did they just manage to download the lecture videos? I'll try downloading
it myself and checking, but I don't have the fastest internet connection.

~~~
mihaitodor
I believe they grabbed everything. Do a quick search through this script:
[https://github.com/ArchiveTeam/coursera-grab/blob/master/cou...](https://github.com/ArchiveTeam/coursera-grab/blob/master/coursera.lua)

If you manage to download one of the archives, please let me know what exactly
is contained in it.

------
616c
I will now commit to myself and make it known here:

If I am forced to buy one of these new Coursera certs, I will donate every
time to the Archive Team.

------
united893
Is anyone able to navigate this at all? I'm looking for Pedro Domingos's
machine learning course.

[https://class.coursera.org/machlearning-001](https://class.coursera.org/machlearning-001)

I know there are others but I really like his method of teaching, and can't
seem to be able to find an archive of it.

------
avodonosov
I would pay several dollars to keep some courses I took in their original form
(even archived form, no new edits or posts). I guess many other course
participants would too.

That might be a source of income for Coursera, probably enough to cover
operational expenses of running the old platform with old content.

------
enraged_camel
Maybe I'm missing something here... but where are the actual course titles?

~~~
deftnerd
I think this is just the unorganized data dumps direct from the archive team.
Most likely someone will now go through and combine all the dumps and make it
more usable and put it at a different URL

~~~
enraged_camel
I see. That makes sense, thanks. :)

------
Joof
Thank god. Geoffrey Hinton's RMSProp for deep neural networks is still cited
in papers by reference to a slide from his Coursera course (the only place it
was published AFAIK). It would be a shame to lose that forever.

------
satyajeet23
That's really amazing.

------
suyash
How can you tell which class is which, though? The titles don't give it away.

~~~
mihaitodor
Have a look here:
[https://news.ycombinator.com/item?id=12062989](https://news.ycombinator.com/item?id=12062989)

------
reachtarunhere
It is great that someone decided to act on it.

~~~
awqrre
There are many people who acted on this, but I'm glad that
[http://archive.org](http://archive.org) exists and hope that it never
disappears or gets forced to delete content...

~~~
chris_wot
Couldn't sci-hub mirror this?

------
gameofdrones
Welcome to encouraging bitrot.

------
coderdude
Has this archive been licensed properly? None of the other comments brought
this up that I saw and the site doesn't appear to mention licensing whatsoever
(which is very odd). I like archive.org but they've always seemed to have had
a _whatever_ stance towards licensing and copyright. Is this sanctioned or just
a wild west effort?

~~~
beardicus
The Internet Archive is a library, and thus has a bit more flexibility
regarding copyright issues. They also tend to have a "save it first and answer
takedown notices after" philosophy. The Archive Team (not an official Internet
Archive organization) doesn't give a shit about copyright, or rather they
ignore it in a pragmatic effort to save more content.

