

Why R Doesn't Suck - paulgb
http://paulbutler.org/archives/why-r-doesnt-suck/

======
dododo
i've used R for several large projects. it sucks in several ways for the
projects i worked on:

1. it is extremely slow for numerics. slower than MATLAB. slower than python
(with numpy). after talking to several stats people, it seems pretty much
everyone ends up writing most of their code in C when using R. (in contrast, i
didn't find this necessary in python or MATLAB for similar projects.)

2. its syntax is quite clunky. want to concatenate two strings?
paste(string_a, string_b, sep=''). radford neal has a series of blog posts on
R's design flaws:
<http://radfordneal.wordpress.com/2008/09/21/design-flaws-in-r-3-%E2%80%94-zero-subscripts/>

3. uninformative error reporting. by default, the stack trace isn't printed.
even if it is, often the errors don't really tell you what went wrong.

i don't see any advantage to R over python. yield makes a nice replacement for
the lazy evaluation of R.
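
for example, a python generator gives you the same on-demand evaluation (a
minimal sketch; `naturals` is just an illustrative name):

```python
from itertools import islice

def naturals():
    """lazily yield 0, 1, 2, ... -- values are produced only when asked for."""
    n = 0
    while True:
        yield n
        n += 1

# only the values actually requested ever get computed,
# even though the sequence is conceptually infinite
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```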

~~~
thesnark
I agree with all of your points, yet I still find myself using R on a
day-to-day basis instead of python. My reason for doing so is the large number
of statistical packages that exist for R. I know that scipy exists, but it
does not seem to have nearly the same coverage as CRAN + base libraries... Are
there other python libraries I should look at?

~~~
dododo
it's true that CRAN+Bioconductor have a lot of coverage of statistics. if you
need to fit a particular kind of model and it's not using MCMC then perhaps
it's the best you can do if someone has already written the code.

we use sage (<http://www.sagemath.org/>) which includes a lot of the common
python numerics: numpy, scipy, matplotlib, networkx. it provides nice
interfaces to R and gp/pari. i also use mayavi2 for 3d plots (something i
could never get R to do well under linux...). enthought has a lot of nice
things for python and scientific computing. there's also pymc which i've not
used (i just write the MCMC code directly).

------
ggruschow
Are useful and powerful the opposite of sucky to most people?

Not to me. Consider a parallel example:

Visual Basic (pre-.NET, think ~3.0) opened up GUI programming to a much, much
broader audience (Hypercard had done it earlier, but for a far less popular
system). VB made embedding hugely powerful applications within your own, and
making remote object calls, simple. It was the most popular programming
language.

It was incredibly successful, useful, and powerful. None of that could change
the fact that it fundamentally sucked though.

In fact, it sucked so much that Microsoft killed it. They killed their most
popular programming product ever. They killed the product that was used to
make most of the applications for their platform, their office suite, their
web server, their database server, etc. In the transition to .NET, they
apparently felt its foundations were so fundamentally flawed that they had to
redesign the language.

~~~
phaedrus
I agree with your thesis but I wouldn't classify classic VB as particularly
powerful either. Usually when people say a language is powerful, they seem to
mean that a small number of primitives can be combined to produce a large
variety of structures or that brief code can accomplish a lot of work.

Maybe a better term than "powerful" would be "power-to-weight ratio".

~~~
ggruschow
You're right: its primitives sucked. However, VB let people do a huge amount
of work _without_ code: think the GUI designer, VBX/OCX controls, and OLE
embedding. That's the power I was talking about, not a way to implement
advanced algorithms. (People used freaking SQL databases to get passable data
structures... Ugh!) Most code I've seen people write was more related to that
gluey crap than to advanced topics.

Side-note: Most of my work in the past year has been done in R.

~~~
derefr
Those weren't really parts of the _language_, though; they were parts of the
_platform_. We inconsistently compare languages (e.g. Java) to platforms (e.g.
the JVM), which is, I think, a lot of the reason we're so bad at comparing
both.

------
carbocation
> As in Haskell and O’Caml, operators are just syntactic sugar for ordinary
> functions. Enclosing any operator in backticks lets you use it as if it were
> an ordinary function. For example, calling `+`(2, 3) returns 5.

This is awesome. This is probably one of those things that I should have known
but didn't. It strikes me as being very useful in combination with the 'apply'
functions.
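
Python has a rough analogue (a sketch of the same idea, not R itself): the
`operator` module exposes `+` as the ordinary function `operator.add`, which
composes nicely with apply-style higher-order functions:

```python
import operator
from functools import reduce

# operator.add is the function behind `+`, much like R's `+` in backticks
print(operator.add(2, 3))  # 5, like calling `+`(2, 3) in R

# handy with higher-order functions
total = reduce(operator.add, [1, 2, 3, 4])            # sum via a fold
pairwise = list(map(operator.add, [1, 2], [10, 20]))  # elementwise sums
```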

~~~
masklinn
> It strikes me as being very useful in combination with the 'apply'
> functions.

It's very useful with higher-order functions in general. Even more so because
most of the languages with operators simply being (binary) infix function
calls also default to curried functions[1], which you can easily partially
apply.

Haskell also has the reverse operation (MLs probably have it as well) of being
able to use a binary function as an operator: "a `foo` b" is equivalent to
"foo a b", but sometimes reads much better.

[1] <http://en.wikipedia.org/wiki/Currying>
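
Python doesn't curry by default, but `functools.partial` gives the same
partial-application effect (an illustrative sketch, not Haskell itself):

```python
from functools import partial
import operator

# roughly what a curried section like (10 +) gives you for free in Haskell
add10 = partial(operator.add, 10)

print(add10(5))                     # 15
print(list(map(add10, [1, 2, 3])))  # [11, 12, 13]
```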

------
tel
To me it's a statistical calculator with beautiful graphing capabilities. Once
the question becomes difficult enough to consider it programming, it's time to
pull up numpy.

------
makmanalp
Anyone who thinks R sucks obviously hasn't used the ggplot2 package for it:
<http://had.co.nz/ggplot2/> !

The neat thing about R is that it supports the functional paradigm, but it
doesn't bash you on the head with it. My fellow programmers who are not
familiar with lazy evaluation, continuations, list iterators (is that the
right word? such as map / filter / fold) can still use it without feeling like
they're missing an arm.

~~~
silentbicycle
> is that the right word?

Higher-order functions, or "combinators" if you want to sound all math-y.
They're not really _list_ iterators, because it makes just as much sense to
map or fold over trees, arrays, matrices, etc. Whether you need structure-
specific versions like maplist, maptree, etc. is just an implementation
detail.
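
For concreteness, the three classic higher-order functions sketched in Python:

```python
from functools import reduce

xs = [1, 2, 3, 4, 5]
doubled = list(map(lambda x: 2 * x, xs))        # map:    [2, 4, 6, 8, 10]
evens = list(filter(lambda x: x % 2 == 0, xs))  # filter: [2, 4]
total = reduce(lambda acc, x: acc + x, xs, 0)   # fold:   15
```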

------
RK
I have never used R, but why (in the beginning) would you generate data in
python and then plot in R, rather than just plot in python (which looks much
nicer IMO)?

~~~
thesnark
Plots in R are far superior to those generated with matplotlib or other
packages. Take a look at ggplot2.

~~~
dododo
what about 3D or 3D with time in R? this is what python has:
<http://code.enthought.com/projects/mayavi/>

i'm not really sure about far superior for 2D:
<http://matplotlib.sourceforge.net/users/screenshots.html> vs.
<http://had.co.nz/ggplot2/>

~~~
revorad
I remember looking at that matplotlib page for the first time and coming away
thinking those graphs looked a lot better and sharper than the default R ones.
However, I found that the reason is that the images on that page are quite
high-resolution PNGs. If you make graphs at 200 dpi or more in R, they look
equally good. The annoying thing, though, is that you have to adjust all the
margin and character size settings in R to plot at a higher resolution.

------
agentq
My programming background is fairly extensive in non-functional languages. I
work in finance, and the company I joined uses R for a considerable amount of
model prototyping.

I'd really like to switch to Python+numpy/scipy, but I haven't been able to
find an equivalent of a data.frame, or some numeric+string data structure that
allows for easy slicing on both.

Does anybody have any suggestions?

~~~
etal
Does numpy's record array do what you want?

<http://www.scipy.org/RecordArrays>

It doesn't quite have first-class status in the library the way data.frame
does in R, but it does let you index an array using strings.
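
A minimal sketch of what that looks like (the field names and values here are
invented for illustration):

```python
import numpy as np

# structured array mixing a string field and a numeric field
arr = np.array([("AAPL", 150.0), ("GOOG", 120.0)],
               dtype=[("ticker", "U8"), ("price", "f8")])

prices = arr["price"]              # index a column by name
cheap = arr[arr["price"] < 130.0]  # boolean slicing over rows
```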

~~~
agentq
Thanks, I'll check it out!

------
sandGorgon
Given the existence of Incanter+Clojure plus the Leiningen project (similar to
R's inbuilt package management) - how does R compete? Especially considering
primitive types support in Clojure:
<http://groups.google.com/group/clojure/browse_thread/thread/c8c850595c91cc11>

~~~
dagw
R's library support is far more complete. R has been the de facto standard for
statistics research for over a decade. This means that someone has already
written an R library to handle basically any type of statistical or
probability application you care to imagine. Also, for much the same reasons,
the documentation for R is much, much better.

~~~
hyperbovine
_someone has already written an R library to handle basically any type of
statistical or probability application_

Also, frequently the person writing the library is the guy who invented the
estimator in the first place. To me this counts for something.

