
R generation - kgwgk
https://rss.onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2018.01169.x
======
FranzFerdiNaN
I've tried both R and Python for data analysis and data wrangling and the
experience with R was a whole lot better for me (I work as a business
intelligence consultant). Python is really steeped in programmer culture,
which you notice in everything from its user base to its manuals. That is
great for people who want to learn programming or study CS.

But I'm not a programmer and I don't want to be. My biggest program ever
written is maybe 150 lines to convert a mess of Excel files to something
usable. Programming is a necessary means to an end for me to do my job, but
every time I tried to learn actual programming it bored me to tears. R, while
I fully admit it has a syntax that is clunky at best (although the tidyverse
really helps here), does exactly what I need. Its users are all active in
roughly the same space (data analysis in all its forms) so finding help is
easy. The community is incredibly friendly, easily found on Twitter with the
#rstats tag. It has packages for everything I would ever need. Basically, R
helps me get things done, which is all it needs to do for me.

~~~
fritkot
I think your post illustrates well why both languages are important to have. I
also work in the business intelligence space, but my job is usually more
focussed on setting up data pipelines and developing a small specialized web
app here and there. So if I'm already doing that, and I do analysis on top of
that, of course I'm going to pick Python, because I can stay in the same
ecosystem and I already have a better feeling for how things are going to
work. But this wouldn't make sense at all for someone who generally works more
on strategy and decision making and wants to dig a little deeper from time to
time.

That said, when people ask me for a recommendation on what to learn, and
they're actually interested in programming and building things themselves, I
always recommend that they at least give Python a shot and see if they like
it, as it also opens up a lot of other things they can do with the language.

------
clircle
I'm a statistician who knows just enough about computer programming to be
dangerous. It's hard to overstate how incredibly important R is to my
discipline. No other language allows me to express my thoughts and prototype
statistical ideas quickly and effortlessly.

If statistical programming were still stuck in the SAS era, I think there is a
non-zero probability I would have lost interest in the profession years ago.

~~~
et2o
Glad you're still interested! I agree–R is light-years ahead of the
competition (Python, essentially) in most statistical domains. It seems to fit
the workflow better.

~~~
skydaddy
Light years ahead? It doesn't even support 64-bit integers. The R language
cannot count past 2.1 billion. In 2018. Don't get me started on the package
system.
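
For context on the 2.1 billion figure: it is the ceiling of a signed 32-bit
integer, 2^31 - 1. Below is a minimal sketch in Python (not R) that shows the
limit. The helper names are hypothetical and exist only for illustration. The
second helper models the common workaround of keeping whole numbers in a
64-bit double, which stays exact only up to 2^53; that R's default numeric
type is such a double is background knowledge, not something stated in this
thread.

```python
import struct

# Largest value a signed 32-bit integer can hold: 2,147,483,647.
INT32_MAX = 2**31 - 1


def fits_int32(n: int) -> bool:
    """Return True if n can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<i", n)  # "<i" = little-endian signed 32-bit int
        return True
    except struct.error:
        return False


def double_is_exact(n: int) -> bool:
    """Return True if n survives a round trip through a 64-bit float."""
    return int(float(n)) == n


if __name__ == "__main__":
    for value in (INT32_MAX, INT32_MAX + 1, 2**53, 2**53 + 1):
        print(value, fits_int32(value), double_is_exact(value))
```

So counting past 2.1 billion is possible with doubles, but only as floating
point values, and exactness is lost for whole numbers beyond 2^53.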

~~~
clircle
I'm always hearing about these and other problems that programmers have with
the R language. In practice, none of them seem to be a big deal to
statisticians.

64-bit integers? Never had a need for this, but maybe there is a package that
can add them in?

What's wrong with the package system? I maintain an R package on CRAN and I'm
quite smitten with the package system.

~~~
srean
> I'm always hearing about these and other problems that programmers have with
> the R language. In practice, none of them seem to be a big deal to
> statisticians.
>
> 64 bit integers? Never had a need for this

This kind of thinking is pervasive in a wide corner of statistics, and it is
pretty much why machine learning stole its crown. That's why I get to hear
quips like "Statistics? Is that even relevant?"

~~~
nightski
Crown of what? They are complementary fields. Try doing ML without any
statistics and let's see how far you get.

~~~
srean
They are closer than complementary and that is exactly what I am trying to
point out.

How much funding, support, mind share, conferences, venture capital, yadda
yadda does stats get compared to ML? That's what I mean by losing its crown.
All that ML has now could easily have been stats' were it not for the attitude
I draw attention to.

Isn't that sad, given that ML in many cases is just stats without the baggage
of "R can solve all of my problems, and whatever R can't solve isn't a stats
problem"?

~~~
nightski
Yes I agree wholeheartedly, sorry I originally missed your point. I do think
it's more of a DL vs. the world problem than stats vs. ML but everything you
said rings true. I was kind of odd in that I started with DL and then became
fascinated with statistics and lost focus with DL. Statistics is just so much
more useful outside of the niche domains where DL is extremely effective.

~~~
srean
> Statistics is just so much more useful outside of the niche domains where DL
> is extremely effective.

This. Sorry had to.

------
amrrs
R and RStudio have been my primary bread-and-butter makers. You might argue
that RStudio has a financial interest in growing the R community, but that's
fair, and it has helped R overall as a language (not just as a statistical
one). Hadley Wickham's _tidyverse_ is an amazing example of how one can simply
get stuff done. Along with the %>% pipe, it is almost like idiomatic Python.
The ecosystem is amazing: I can write a web scraper in a couple of lines of
code, as simply as with BeautifulSoup. I think that's the strength of R. The
intuitive UI and easy setup of RStudio help someone get started with no
hassle. That's key, I believe. I have also trained non-programmers, and they
managed to do data analysis in R using the tidyverse within a month or two at
most.

------
j7ake
In my opinion, R is the best language for quickly exploring, iterating models,
validating the models, and repeating the cycle. I would be curious to hear
whether Python, which I use for preprocessing the data into flat tables, is as
good as R in this domain.

Also ggplot2 and tidyverse packages are powerful for accelerating your
iterative cycles.

~~~
curiousgal
Absolutely not! If you're given a dataset and you're trying to fit a model or
even run exploratory analysis, Python doesn't even come close to R. Even with
Jupyter, the experience is not as smooth as R+RStudio.

~~~
laichzeit0
Except if you like Vim keystrokes. The RStudio implementation (as I
understand it, it's just the Ace library) is really crap.

I find even the Jupyter notebook Vim plugin is better than the RStudio
implementation :(

~~~
j7ake
I use RStudio with its Vim implementation. I would be curious to know what is
missing compared to Vim.

~~~
laichzeit0
There's a bunch:
[https://support.rstudio.com/hc/en-us/community/posts/200879313-Better-vim-mode](https://support.rstudio.com/hc/en-us/community/posts/200879313-Better-vim-mode)

The most annoying one for me is the way x behaves. Go to the end of a line.
Press x (delete character). It's supposed to delete the character, then move
the cursor to the left, so you can delete the next character. It just stands
there. That's wrong (try Vim or Visual Studio with vim keybindings). It just
annoys me a lot because that's how I've always gone about deleting chars
instead of using backspace or using ctrl-h.
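
The expected behaviour described above can be sketched in Python. This is a
toy model of normal-mode `x` on a single line, not Vim's or RStudio's actual
code; the function name `vim_x` and the 0-based cursor index are assumptions
made for illustration.

```python
def vim_x(line: str, cursor: int) -> tuple[str, int]:
    """Model normal-mode `x`: delete the character under the cursor.

    In normal mode the cursor must sit on a character, so after deleting
    the last character of the line it is clamped one position to the left.
    """
    if not line:
        return line, 0  # nothing to delete on an empty line
    new_line = line[:cursor] + line[cursor + 1:]
    # Clamp the cursor onto the new last character (or 0 for an empty line).
    new_cursor = min(cursor, max(len(new_line) - 1, 0))
    return new_line, new_cursor


if __name__ == "__main__":
    line, cursor = "abc", 2  # cursor on the last character
    while line:
        line, cursor = vim_x(line, cursor)
        print(repr(line), cursor)
```

Pressing `x` repeatedly at the end of a line therefore eats the line from the
right, which is the habit the complaint is about.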

------
remarkEon
R was the first language I learned (I ... do not recommend this) and while I
don’t use it day to day at work, I’m always using it for fun side projects
about everything from baseball statistics to urban planning. I’m not a
professional statistician, by any means, but fully appreciate the work the
community does with this toolset.

------
zestyping
It's really interesting that there are such divergent opinions on R and
Python. I'm curious why some people seem to find R incredibly frustrating and
Python entirely sensible, and others find R miraculous and smooth and Python
inaccessible.

Is there some set of improvements to Python that would make it popular with R
users, or vice versa? Or is it a fundamental difference in taste that will
never be bridged?

~~~
krutulis
I've been programming for a long time and had to learn R a few years back for
a series of projects. The documentation and community seemed to me, as a
newcomer, to be very focused on how to solve particular statistical and
modeling problems, and I was almost always able to do what needed to be done
relatively quickly. (Python documentation and communities, in contrast, span
all kinds of applications that can be difficult even for an experienced
programmer learning Python to sort through and evaluate. I can also imagine
the migration to 3.x has been a challenge for newcomers.)

Although I enjoyed learning and using R, as a CS person I was bothered that I
understood how to do X in R, but I had no clue about what was happening when I
did X. I found this paper to be particularly useful in describing R from a CS
perspective:
[http://www.lirmm.fr/~ducour/Doc-objets/ECOOP2012/ECOOP/ecoop/104.pdf](http://www.lirmm.fr/~ducour/Doc-objets/ECOOP2012/ECOOP/ecoop/104.pdf)

------
_Wintermute
R was the first programming language I learned and I still use it a couple of
times a week for work, but it wasn't until about 2 years into using R, when I
started learning Python, that I realised programming isn't difficult and
frustrating; R is just badly designed.

------
Cieplak
If you want to sprinkle some type safety onto your R, check out this lovely
piece of kit:

[https://tweag.github.io/HaskellR/](https://tweag.github.io/HaskellR/)

------
sampo
Thinking about the long run: Is there any version of the future where the best
and largest library of statistical procedures could be somewhat language
independent, and not organically part of a small language that itself is an
unholy alliance of a MATLAB-like quick-and-dirty syntax and a hobbyist-level
scheme interpreter runtime?

~~~
disgruntledphd2
If you really want this to happen, you should probably start contributing to
Julia.

~~~
msaharia
There is an xkcd comic for this one on JavaScript Frameworks.

