

Introduction to Data Analysis in R - danso
http://f.briatte.org/teaching/ida/index.html

======
danso
As someone who's teaching data-analysis-oriented classes, I peruse a lot of
open syllabi...and this was one of the most well-organized and easy to read
that I've found. I'm strongly hesitant to teach R...preferring Python because
it's closer to languages that I'm used to, and for its general-purpose
utility...but reading through these lessons, it's hard not to be awed by what
R can do, particularly in visualization.

This demonstration of using R to geocode (via the ggmap extension of ggplot2)
was particularly cool (and also, as an example of the OP's organized notes,
includes a copy of the data since the original link went dead):
[http://f.briatte.org/teaching/ida/101_geocoding.html](http://f.briatte.org/teaching/ida/101_geocoding.html)

~~~
baldfat
I would strongly recommend looking at R. I started out using Python and Pandas, and
when I ran into issues with work requiring M$ Office documents, R just amazes.

Also, R has seen amazing growth in just the last few years:
[http://www.tiobe.com/index.php/content/paperinfo/tpci/R.html](http://www.tiobe.com/index.php/content/paperinfo/tpci/R.html)
(I know that a ranking is not the greatest argument for a language, but it does
show (somewhat) its growth.) Specifically, the flexibility of R (12 ways to do
one thing) has allowed it to evolve quickly, and the libraries are just
amazing. RStudio has changed R with Hadley Wickham's ggplot2, dplyr, reshape2,
tidyr, etc. It just makes the language do so much and change so
quickly.

I used to be in love with all things Python, and I still respect Python and
Pandas, but I have kind of gone to more domain-specific tools.

~~~
SharpSightLabs
I also highly recommend R.

dplyr and ggplot2 (noted by baldfat) are exceptional.

I recently wrote a tutorial on dplyr here:
[http://www.sharpsightlabs.com/dplyr-intro-data-manipulation-...](http://www.sharpsightlabs.com/dplyr-intro-data-manipulation-with-r/)

To put this simply, dplyr's syntax is set up to create streamlined workflows.
All of the major data management tasks (sort, subset, group, summarize) are
easy to do. And they can be "chained" together (much like using pipes in
Unix).

ggplot2 (another R package) is an amazing data visualization tool. The syntax
has a deep underlying structure, based on the Grammar of Graphics theoretical
framework. I won’t go into that too much, but suffice it to say, when you
learn the ggplot2 syntax, you’re actually learning how to think about data
visualization in a very deep way. You’ll eventually understand how to create
complex visualizations without much effort.

ggplot2 and dplyr are the reason I settled on R (instead of Python). When you
use them together (again, using "chaining") you can explore your data rapidly
and also create really high quality analyses.

------
misframer
I'm a little confused about what exactly this is. Take the time series
section, for example. The entire section seems to only have examples. The only
reading listed is the "R Cookbook," and it's also mainly examples.

From the syllabus:

> _The aim of the course is to show how to perform elementary data analysis in
> the social sciences._

I feel like the time series section doesn't teach basic time series analysis
at all. For example, they show plots of the ACF and PACF without going into
detail about what those are and how they're different. I don't think that's
helpful.

This looks like a nice set of examples, but far from an actual course!
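
For anyone wondering what the difference is, here is a minimal sketch (plain Python rather than R, and not taken from the course) of what the two plots measure. The ACF at lag k is the correlation between the series and itself k steps back; the PACF at lag k is that correlation after removing what the shorter lags already explain, computed here with the Durbin-Levinson recursion. The function names and the simulated AR(1) series are assumptions made up for illustration.

```python
import random


def simulate_ar1(n, phi, seed=0):
    """Hypothetical helper: simulate x[t] = phi * x[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(x, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    rho = acf(x, max_lag)
    out = [1.0]      # lag 0 is 1 by convention
    phi_prev = []    # AR coefficients of the order (k-1) fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


if __name__ == "__main__":
    series = simulate_ar1(2000, 0.7)
    for lag, (r, p) in enumerate(zip(acf(series, 5), pacf(series, 5))):
        print(f"lag {lag}: acf={r:.3f}  pacf={p:.3f}")
```

For an AR(1) series like this one, the ACF should decay gradually across lags while the PACF should be close to zero after lag 1, which is the kind of contrast the plots are meant to show.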

