
R is hands-down the best language for data manipulation, analysis, and visualization: it's a language truly centered around treating data as a first-class citizen. That focus does make some traditional programming workflows more error-prone (helpful interactive data analysis features like vector recycling, flexible automatic type conversion, and non-standard evaluation provide lots of footguns), but the last decade of language improvements (stringsAsFactors = FALSE!) and R packaging ecosystem improvements have made the situation much nicer. The flexibility and lispy expressiveness of the language make it really fun to develop in, once you've gotten over the initial quirks.
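For readers who have not run into these footguns, here is a minimal sketch of three of them in base R. The variable names are made up for illustration, and `stringsAsFactors` is passed explicitly so the result does not depend on the default, which changed to FALSE in R 4.0.0.

```r
# Vector recycling: the shorter vector is silently repeated
# when its length divides the longer one evenly.
x <- 1:6
y <- c(10, 20)
recycled <- x + y            # y is reused as 10, 20, 10, 20, 10, 20

# Automatic type conversion: c() coerces everything to the most
# general type present, here character.
mixed <- c(1, "2", TRUE)

# stringsAsFactors: set explicitly so the behaviour does not
# depend on the default of the installed R version.
as_factor <- data.frame(id = c("10", "20"), stringsAsFactors = TRUE)
as_text   <- data.frame(id = c("10", "20"), stringsAsFactors = FALSE)

# Classic surprise: converting a factor to an integer returns the
# internal level codes, not the labels.
codes  <- as.integer(as_factor$id)
values <- as.integer(as_text$id)
```

Printing `recycled`, `mixed`, `codes`, and `values` at the console shows each effect; none of these lines raises a warning, which is what makes them easy to miss.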



100% agree, especially on the lispy expressiveness. I love that I can build analysis pipelines in a functional style, which has always clicked with me more than other paradigms.

Tidyverse is a godsend for at least getting initial data transformations sketched out and for gently introducing new users, but I do believe one should gain an understanding of how to do all of these things in plain R.
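As a rough illustration of a functional-style pipeline in plain R, here is a small sketch using the built-in `mtcars` data set. The helper names `keep_powerful` and `mpg_by_cyl` and the 100 hp cutoff are invented for this example, and the native `|>` pipe assumes R 4.1 or later.

```r
# Small single-purpose functions (hypothetical helpers for this sketch).
keep_powerful <- function(d) d[d$hp > 100, ]      # filter rows
mpg_by_cyl    <- function(d) split(d$mpg, d$cyl)  # group mpg by cylinder count

# Compose them with the native pipe (R >= 4.1); vapply() applies
# mean() to each group and guarantees a numeric result.
result <- mtcars |>
  keep_powerful() |>
  mpg_by_cyl() |>
  vapply(mean, numeric(1))
```

`result` is a named numeric vector with one mean per cylinder count. The same steps in the tidyverse would be `filter()`, `group_by()`, and `summarise()`, so writing it out in base R first is one way to see what those verbs do.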


I agree with this. I wonder what it would take for R to spread beyond its niche and become a more popular data science language. My worry is that with polars coming along, Python is catching up where it's behind and staying ahead where it's ahead.


Have you used the Wolfram Language and if so, how would you compare the two?


I have. R is far less verbose and maps far better to data analysis. The Wolfram Language is far more expressive and powerful for symbolic computation. So basically, Wolfram for doing math research, R for applied stats.


thx!


I have not. I started using R because its codebase is open source, so I can audit and understand exactly what it's doing under the hood. Being able to see how statistical formulae were implemented in code was invaluable in understanding and interpreting a package's analytical output.


thx!


R gives you a relatively simple set of tools that you can combine in powerful ways. The Wolfram Language seems to have a specialized function for everything, which is nice sometimes but it takes me longer to get started when doing exploratory data analysis, since I have to remember more nuanced stuff.

I absolutely love R. Once you get your head around data types and the 20 most important functions, you can do amazing things.


Is there a decent tutorial or book on getting over the hill? I can do some basic stuff in it but it's just not catching like other languages do.


My personal favorite resource is "R for Data Science" by Hadley Wickham. It covers lots of nice data manipulation and visualization examples, and provides a good introduction to the tidyverse, which is a particular dialect of R that's well-suited for data analysis. It's available for free at:

https://r4ds.hadley.nz/

For more specialized analytical methods there are lots of textbooks out there that provide a deep dive into packages for a specific field (e.g. survival analysis, machine learning, time series), but for general data manipulation and visualization it's hard to beat R4DS.


[1] will give you a more programming language-focused perspective, as opposed to many other R books.

--

[1] https://adv-r.hadley.nz/


An alternative to the Hadley book that also covers some nice statistical methods is Statistical Rethinking by McElreath. It's not really available for free, but it's an interesting read.



