

R vs Python Speed Comparison for Bootstrapping - jonbaer
http://climateecology.wordpress.com/2013/08/19/r-vs-python-speed-comparison-for-bootstrapping/

======
chewxy
The comments here are better:
[http://climateecology.wordpress.com/2013/08/19/r-vs-python-s...](http://climateecology.wordpress.com/2013/08/19/r-vs-python-speed-comparison-for-bootstrapping/)

~~~
rm999
Thanks for the link. The comment that indicates how bad the comparison was:

>Hadley Wickham: It’s a bit of an unfair comparison because you’re comparing
R’s high-level lm function with Python’s low-level sm.OLS. If you use R’s low-
level equivalent, lm.fit, you’ll find R is much much faster (~10s -> ~1s) on
my machine.
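
The effect described in that comment can be sketched without either library. What follows is a minimal, standard-library-only Python illustration, not the code from the article: `fit_full`, `fit_coefficients`, and `bootstrap_slopes` are hypothetical names, and the one-predictor regression is an assumption made to keep the sketch short. The idea is that a summary-oriented fit does extra work on every call, and a bootstrap loop multiplies that extra work by the number of replicates.

```python
import random


def fit_full(x, y):
    """High-level style fit: coefficients plus diagnostics (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Extra per-call work that a bootstrap of the slope never looks at.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r * r for r in residuals)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return {
        "intercept": intercept,
        "slope": slope,
        "residuals": residuals,
        "r_squared": 1.0 - rss / tss,
        "slope_se": (rss / (n - 2) / sxx) ** 0.5,
    }


def fit_coefficients(x, y):
    """Low-level style fit: intercept and slope only (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_slopes(x, y, fit_slope, n_boot=200, seed=0):
    """Resample (x, y) pairs with replacement and refit the slope each time."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        slopes.append(fit_slope(xs, ys))
    return slopes
```

Passing `lambda xs, ys: fit_full(xs, ys)["slope"]` or `lambda xs, ys: fit_coefficients(xs, ys)[1]` as `fit_slope` yields the same slopes for the same seed; only the amount of work per replicate differs, which is the kind of gap a benchmark picks up when it pairs a high-level fit on one side with a low-level fit on the other.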

~~~
acchow
Thanks for this. Everyone I know who uses R raves about how fast it is, so it
was surprising to see the claim go the other way around.

~~~
srean
Are you serious? Rarely have I heard "R" and "fast" mentioned in the same
sentence (the current one being one such exception :). In fact, only in the
world of R would anyone call Python super fast. That line made me chuckle. R's
strength lies elsewhere: no other programming language has such an
exhaustive, encyclopedic ecosystem of third-party statistical libraries. And of
course there is ggplot.

Regarding speed comparisons, I would also like to add that they tend to end
up as pissing matches in which both camps compare external functions
implemented in C. At that point it stops being a comparison of the core
language implementations.

I do a decent amount of statistical and machine-learning coding and
I don't like R. It's not the speed that I mind, or the lack of it. I find
programming in R very error prone. It could well be a personal failing, but I
cannot keep track of its weird inconsistencies and special edge cases. The
other gripe I have about it is that there are very few books that try to
teach it systematically; more often than not these books are styled as a grab
bag of tricks and incantations.

The speed problem of R is in the process of being addressed by quite a few
alternative rewrites. Radford Neal is writing one, Luke Tierney is writing
one, then there is Riposte.

A side comment: it always warms my heart to know that people who have made a
distinguished mark in stats theory (Neal and Tierney certainly have) also have
a lot of interest in programming. In fact, I have observed that among the
really, really good ones there seems to be a correlation, though this is
clearly anecdotal. I bring this up because there is also the "all talk, no
code" variety, and I have no problem with that if the "talk" is good. My
problem is with the schmoozing variety, sometimes inhabiting positions of
influence and prominence, who have no idea what they are talking about, their
knowledge seemingly gleaned from reading overstated popular articles about
algorithms and techniques: the cosmopolitans of research, if you will. My
advice to aspiring grad students: learn to recognize them, and stay away if
you do not want to become them. A good signal is how many recent first-author
or single-author papers the researcher has and whether he or she writes
nontrivial code. This is usually indicative of how much commitment and
excitement the person has for his or her craft.

~~~
rm999
>there are very few books that try to teach it systematically

This book helped me a lot because it goes into how the core language works,
something that is (sadly) almost always ignored by people who learn and teach
R: [http://www.amazon.com/Software-Data-Analysis-Programming-Sta...](http://www.amazon.com/Software-Data-Analysis-Programming-Statistics/dp/0387759352)

As far as speed goes, R is a lot like similar numeric tools and languages. It's
designed to be very high level, building on top of heavily optimized
subroutines written in lower-level languages. This is why the kind of
comparison in the article is silly: almost every language will end up running
at about the same speed, because they're all just calling the same basic BLAS
routines N times with virtually zero overhead (relative to the inner loop).
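
A rough way to see this from Python, using only the standard library: the sketch below calls a "kernel" many times from a thin driver loop, the way a bootstrap would. The built-in `sum` (implemented in C in CPython) stands in for an optimized routine such as a BLAS call; that substitution is an assumption made to avoid third-party dependencies, and `python_sum` and `time_kernel` are hypothetical names.

```python
import time


def python_sum(values):
    """The same reduction written in the high-level language itself."""
    total = 0.0
    for v in values:
        total += v
    return total


def time_kernel(kernel, data, n_calls=200):
    """Call `kernel` n_calls times from a thin loop; return (last result, seconds)."""
    start = time.perf_counter()
    result = None
    for _ in range(n_calls):
        result = kernel(data)
    return result, time.perf_counter() - start
```

Timing `time_kernel(sum, data)` against `time_kernel(python_sum, data)` on a large list should return matching results up to floating-point rounding while the elapsed times differ. The driver loop is the same in both cases, so any gap comes from the kernel, and two languages that hand off to the same compiled kernel would be expected to land close together.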

------
tuananh
Would you add Julia for comparison?

