Big Data University: Free Database And Hadoop Courses (bigdatauniversity.com)
35 comments

No one has said it yet, but this is awesome. As someone working through bunches of books at the moment, it's great to see some kind of structure to at least guide me while I go about my own things.

Even if I don't use the courses fully, I'll certainly take bits and pieces as I go forward.

I'm interested in dabbling in this and am wondering which books you're looking at or would recommend.

Sorry about the confusion - 'Big Data' is something I haven't delved into at all yet - I'm following a discrete maths course along with JUST getting into Ruby/Rails & Android development. I'm new to CS, and only recently followed along with www.cs50.net last fall. I did, however, ask my brother-in-law, whose main focus is big data in the medical field, about it. I'll post his recommendations if any later on - so check this in a day or so if you can.

If you're taking CS in school, make sure to take Linear Algebra. If you're self-taught, the khanacademy.org Linear Algebra playlist is invaluable.

Thanks for the heads up, I downloaded a cache of courses that were recorded and made available online some time ago - one of those being Linear Algebra. I'll have to work through it soon!

It really depends on what your background is / what context you're learning in.

My background's in CS and I've mostly been a web developer for several years. Only know the normal SQL stuff but wouldn't consider myself an expert. I'm also interested in OLAP but I can't seem to find anything on it other than what it is.

If you ever wanted to dabble in Hadoop, take a look at Hadoop Fundamentals I: http://bigdatauniversity.com/courses/course/view.php?id=301. And if you'd rather do hands-on exercises in the cloud instead of installing all this stuff on your laptop, take the Hadoop on Amazon Cloud course, http://bigdatauniversity.com/courses/course/view.php?id=309, and get $25 from Amazon. Free courses plus a $25 bonus is a pretty good deal.

Just to clarify, you get a $25 credit for Amazon Web Services which is meant to offset the cost of the AWS you use during the course. It's not like a $25 Amazon gift card.

Just remembered, Cloudera has some nice tutorials & videos on Hadoop, MapReduce, etc.


I know Hacker News comments tend to go stale quickly, but this should be in here as well:


Cloudera was my number one source of Hadoop information while I was working on getting my first cluster going and useful. I would highly recommend them as a resource.

Could you give us more insight in the differences between the Cloudera videos and the BDU courses?

I haven't had time to watch the BDU courses, so I can't give any opinion on them, though I plan to watch them when I get some time tomorrow. From a quick glance, though, the BDU ones appear to have more hands-on material. The Cloudera videos are parts of their on-site course, which would have the hands-on stuff, but that has been trimmed from the videos.

I guess it depends on how you best learn. For me, the Cloudera videos gave me a good overview and understanding of the various pieces to prepare me to dig in deeper. Combined with the Hadoop book, I was able to set up and run a small (12-node) cluster and use it for data storage and report generation.

Thanks for providing these amazing resources, and completely FREE at that. Woot!

But the UX on the site is terrible. If & when time permits, please take a look at the Coursera, Udacity, Codecademy, Udemy, and Lore (lore.com, formerly Coursekit) sites.

User engagement is directly proportional to usability & a smooth, pleasing UX (user experience).

FWIW, this is running on Moodle, a very popular e-learning platform.

Does anybody know how to get access to the courses that require an "enrollment key"? One example is "spreadsheet like analytics": http://bigdatauniversity.com/courses/enrol/index.php?id=462

It seems that as the amount of data being produced continues to expand at an unprecedented rate, it will become essential to master Hadoop, which seems to be the gold standard for managing big data. Would anyone disagree with this?

Well, you will encounter data in many forms. Sometimes it will already have been "hadooped-down" by someone else, and you can analyze it on a single machine. Don't underestimate what a single machine can do these days if you have, say, 16 cores and 32 GB of RAM.

Or you can set up a system that will incrementally summarize the data, and then you could do smaller queries against those summaries. That is the goal of Storm AFAIK.

I think that is a better model for a lot of applications. The model of having your production systems save terabytes of raw data and then analyzing it in a big batch job leaves a lot to be desired. It works, but it's not very flexible and it has a latency problem.
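
To make the incremental-summary idea concrete, here is a minimal sketch in plain Python. It is not Storm's API; the `RunningSummary` class, the event names, and the latency values are all made up for illustration. The point is only that each event is folded into a small per-key summary as it arrives, and queries are answered from that summary instead of from the raw data.

```python
from collections import defaultdict


class RunningSummary:
    """Hypothetical per-key running count and sum, updated one event at a time."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def update(self, key, value):
        # Fold one new event into the summary; the raw event is not stored.
        self.count[key] += 1
        self.total[key] += value

    def mean(self, key):
        # Answer a query from the summary alone, without rescanning raw data.
        n = self.count.get(key, 0)
        return self.total[key] / n if n else None


# Made-up sample events: (page, latency in milliseconds).
events = [("checkout", 120.0), ("search", 35.0), ("checkout", 80.0)]

summary = RunningSummary()
for page, latency_ms in events:
    summary.update(page, latency_ms)

print(summary.mean("checkout"))  # 100.0
```

The summary stays the same size no matter how many events pass through, which is why queries against it are cheap compared with a batch job over the raw data.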

Hadoop is good in that it's the only open source solution I know of that can churn through hundreds of terabytes of data. But I wouldn't say it's a complete solution for "managing big data". It's part of one.

While it is sometimes important to be able to manipulate large datasets, every single competition on Kaggle can fit in a laptop's memory.

I wouldn't be surprised if Hadoop (or another Map/Reduce implementation) becomes a sort of "assembly language" for big data, with higher-level abstractions built on top. You can already start to see this happen with Hive and Pig.
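
A rough illustration of that "assembly language" point, in plain Python rather than the Hadoop, Hive, or Pig APIs: the function names below are hypothetical, and `collections.Counter` merely stands in for a higher-level abstraction. The same word count is written once with the map, shuffle, and reduce steps spelled out by hand, and once as a single declarative call.

```python
from collections import Counter, defaultdict
from itertools import chain


def map_phase(line):
    # Emit a (word, 1) pair for every word in one input record.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group values by key, as a framework would between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Collapse all values for one key into a single result.
    return key, sum(values)


lines = ["big data is big", "data is data"]

# Low-level version: map, shuffle, and reduce written out explicitly.
pairs = chain.from_iterable(map_phase(line) for line in lines)
low_level = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Higher-level version: one call expresses the same job.
high_level = Counter(word.lower() for line in lines for word in line.split())

print(low_level == dict(high_level))  # True
```

Both versions compute the same counts; the second just hides the plumbing, which is roughly what the higher-level tools do on top of Map/Reduce.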

Not to be a nitpicker, but you've spelled "enroll" incorrectly several times. As an education-focused initiative, you should probably change this.


Enrol is acceptable in non-US English.

Interesting - was unaware of that. Thanks.

As a foreigner I'm aware of the neighbor/neighbour distinction, but I would definitely consider 'enrol' an error.

He probably didn't see the connexion to British-English.

I believe that's the British/Canadian spelling.

I logged in with Google auth, and it still sent me an e-mail to confirm my e-mail address. That's unnecessary.

I'm really glad a resource like this exists, though. Looking forward to working through the Hadoop Fundamentals course.

Has anyone taken the Cloudera courses who can compare them to this? My employer is going to spring $2800 for a Cloudera Hadoop training. I wonder if it's worth it.

This is really interesting, but who's behind this? I can't find any clue in the about / contact us pages.

To answer my own question: the articles and downloads all point to the IBM website, and the first lesson teaches you to "Get started with Hadoop-based data analytics on IBM Cloud", so it's clearly IBM pushing their Hadoop-based BigData solution.

oh man. moodle pains.

