
Ask HN Hackers: What's R's future? - wildanimal
The R community seems to be growing rapidly, with many favorable reviews of the language/system. For instance:

Forbes Magazine, Names You Need to Know in 2011: R Data Analysis Software:
http://blogs.forbes.com/smcnally/2010/11/10/names-you-need-to-know-in-2011-r-data-analysis-software
(and links therein)

And some of its developers are suggesting that they scrap it and start over (don't know if the whole "Core Team"'s on board tho'):
http://www.stat.auckland.ac.nz/~ihaka/downloads/Compstat-2008-Slides.pdf

Are there parallels like this in the development of other languages/environments/ecosystems (e.g., Python3, Perl6 "revisions")? How do these efforts usually end up (I guess we're still waiting to see about Python3 and Perl6...) -- and how would it affect your business's decision to develop a library in this environment?
======
goodside
I'm a non-IT employee (but with a comp-sci background) working in the
insurance sector, and I'm currently managing R adoption for a group of about
30 business analysts with minimal programming background.

Programming in the business world is screwed up beyond all imagination. The
more money a given application is responsible for, the more likely it is that
it's a house-of-cards (pun intended for MVS nerds). They're always mishmashes
of COBOL, SAS, DFSORT, and random proprietary languages that have never been
the subject of a third-party book, and were sold to a company that was sold to
a company that was sold to CA Technologies back in the 1970s. Whatever these
languages can't do is implemented through Escher-painting constructions of
Excel references and VBA macros.

So, when people say that R has some issues, I say, "boo fucking hoo".

Most businesses suffer from an unnatural separation between IT and the
business end. If business people want something programmed, they call IT. They
don't learn Python and do it themselves, because Python is a "programming
language". R is the first real language that business people are being
encouraged to learn, because it's an "analysis environment". You have no idea
how often I have to edit the word "programming" out of my presentations for
this reason.

R will win in business because it's decent, and it's been around long enough
to not be scary to managers. I'd be cautious about drawing comparisons to
other languages that have undergone big design changes, because, as far as I
can tell, the existence of a decent language in the business world is entirely
without precedent.

(Edit: In case the above came off as sounding like a "non-hackers are idiots"
rant, it wasn't meant as such. Many of the people that produce these hideous
monstrosities of SAS and VBA code have PhDs in statistics and atmospheric
sciences. You can be pretty smart without knowing how to write software well.)

~~~
SoftwareMaven
It is surprising how many business people aren't afraid to build monstrous
macro-driven spreadsheets, but cringe at the thought of programming.

Makes me wonder if it is a UI problem. In the macro spreadsheet, you have data
with code tied to it. In the programming world, you generally have code
accessing data.

The abstractions might just be wrong for business people and a "simple" change
could reduce a lot of IT pain.

~~~
jacques_chester
The reason that spreadsheets are popular is the same reason that many students
trying to solve a maths problem immediately plug in the knowns, without
rearranging terms first. It's fear of the abstract and comfort of the
concrete.

------
ahi
I predict a 10 year campaign of conquest followed by a 30 year death march. R
is a complete mess that kicks ass in its niche. There are too many data types
and the syntax seems kind of random, but two lines of R can get you
publication quality graphics.

R is really becoming huge in academia. As far as I can tell, health sciences
is the last SAS holdout. I expect it to take over business as well. Biz types
will love it because it's so powerful as a scripting environment, but the
programmers building and maintaining stuff with it will come to loathe it. R
will become the PHP of analysis; ubiquitous but hated, and no one will have
the chutzpah to fix it.

Random aside, anyone notice that the Kiwis are all over R? The original
creators and the guy who wrote ggplot2 among many others.

~~~
stevenbedrick
Yeah, in my experience, most biostatisticians (especially those involved in
public health and clinical research) are SAS folks. Some of that is inertia- a
lot of these people learned SAS at the same time they were learning stats.
However, I think that most of SAS's continuing prevalence is due to the fact
that, for all of its (many, many, many) problems, SAS is a freakin' log
chipper when it comes to statistics- it doesn't care how much data you throw
at it, or what kinds of crazy and/or exotic statistics you ask it for- if you
can decipher its syntax, you can get it to do it.

Even for stuff that a lot of other programs can do just fine, SAS often has an
edge. For example, everybody and their brother can do a logistic regression
model... but SAS can give you confidence intervals for all kinds of crazy
parts of the model that SPSS won't even bother calculating and that R will
only give you point estimates for.

The other great thing about SAS is that a lot of the good statistics books
from the last twenty or thirty years include SAS sample code- for example, I'm
currently having to do some off-the-beaten-path ANOVA stuff, and the reference
I'm using (Edwards' "Analysis of Variance for the Behavioral Sciences") uses
SAS as its language of choice.

That said, I personally find the SAS "language" to be alternately
bewildering and nostalgia-inducing (the "cards" command, anybody?). SAS is the
only language about which I can honestly say "it makes R's syntax look clean
and predictable". Also, the Windows version of SAS is an absolute abomination
from a UI standpoint. And, their licensing schemes are draconian, and
installing the damn thing can easily take an entire day, especially if (say,
for example) the installer gets confused because you've already got a JDK
installed on your computer. Not that I'm bitter, or anything...

Of course, as others have noted, in bioinformatics, R either is already the
default or is almost there. I know that in my department's bioinformatics
courses, they use R, Python, and Perl almost exclusively, and only break out
the SAS when there's something specific they need it to do.

------
msy
While there may be a new language that deals with some of the deficiencies of
R at an unspecified point in the future, R is here today. It works, it works
very well for the tasks it was designed for and is both well written and well
supported. Get coding.

------
eliben
I wonder - could R in theory be rewritten as a Python library? If not, why
not? Is there any special syntax of R that makes it more amenable to
statistical analysis than Python? Performance concerns?

It's just a shame to see a whole language popping out of something that could
just be a library.

~~~
cdavid
[caveat: I am a numpy/scipy developer]

The idea of rewriting a large body of code in a different language does not
make much sense.

Also, being a niche language has some nice consequences:

    
    
      * R has been there for a long time through its predecessor S
      * R is a specialized language: little chance to see it being screwed up by some library which wants to change everything, as it happens too often in python
      * Because it is a niche language, its behavior is consistent across platforms (it is just easier to do with R than with python, or other "real" languages).
    

Note how being a "real" language goes against those advantages. Also, most
researchers are very lousy programmers. Often, their software is super smart,
but the code quality is awful and write-only. A less powerful language may
mitigate those issues.

~~~
hadley
Would love to know why you think R isn't a "real" language.

------
roadnottaken
It is still going strong in bioinformatics/genomics. I think it's slower and
clunkier than the alternatives, but for stats and graphics it's pretty easy
for scientists to learn...

~~~
michaels0620
It's also slowly getting a foothold in Pharma. I think the large Pharma
companies would love to get out from under the expensive SAS licenses.

------
thibaut_barrere
Incanter is a project worth checking out if you use R, I believe.

~~~
sandGorgon
as is this article - <http://lambda-the-ultimate.org/node/3726>

_We propose developing an R-like language on top of a Lisp-based engine for
statistical computing that provides a paradigm for modern challenges and which
leverages the work of a wider community._ - Ross Ihaka (co-developed the R
statistical programming language with Robert Gentleman) and Duncan Temple Lang
(core developer of R)

------
cschmidt
R as a language could use some help. It is painfully slow, a big memory hog
(since it copies large objects with abandon), and has lots of language
"gotchas". We had a saying where I work: "R is really fast if you write it in
C".

However, the libraries are great. Anything you'd want in statistics is already
there. So I do use it all the time.

Just to say something nice, I do like data frames (a two dimensional matrix,
where each column can have a different type).
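
To make the data frame idea concrete, here is a minimal Python sketch. It is not R's implementation and not any library's API: `TinyFrame` and its methods are hypothetical names made up for illustration. It only shows the core idea described above, a table stored column by column where each column keeps its own type and all columns share one length.

```python
class TinyFrame:
    """Hypothetical, minimal stand-in for a data frame (illustration only)."""

    def __init__(self, columns):
        # Every column must have the same number of rows.
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = {name: list(values) for name, values in columns.items()}

    def nrow(self):
        # Number of rows, taken from the first column (0 if there are none).
        return len(next(iter(self.columns.values()), []))

    def row(self, i):
        # A row cuts across columns, so its values may have different types.
        return {name: values[i] for name, values in self.columns.items()}

    def column_types(self):
        # Report each column's type, based on its first value.
        return {
            name: type(values[0]).__name__ if values else "empty"
            for name, values in self.columns.items()
        }


frame = TinyFrame({
    "name": ["ann", "bob", "cy"],   # text column
    "age": [31, 45, 27],            # integer column
    "score": [0.5, 0.75, 1.0],      # floating-point column
})

print(frame.column_types())
print(frame.row(1))
```

A plain numeric matrix would force all three columns into a single type; keeping the columns separate is what lets text, integers, and floats sit side by side.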

------
devmonk
I'm more interested in the education vs. income vs. prestige graph in:
<http://blogs.forbes.com/smcnally/2010/11/10/names-you-need-to-know-in-2011-r-data-analysis-software/>

Looks like there might be a ceiling for prestige, and that income is not as
related to it as I would have thought. But, what are the units?

------
dlib
I would love to see Python become the standard language for scientific
computing including statistics but right now R is popular and gets it done.
Don't get me wrong, R is excellent for its purpose and I like working with it
and its many specialized libraries. However, Python is fun to hack away in and
is more of a multipurpose language.

R works for me right now so that's what I'll stick to.

------
xtho
I think it's more like "abandon Perl/php/javascript/whatever in favour of
Python/Ruby/Lisp/whatever".

As long as there aren't more than 10 books written about the yet unborn data-
cruncher saviour and as long as the brand new alternative isn't adopted in
courses, I wouldn't bother -- unless you want to be the saviour's father (i.e.
developer) of course.

------
TWAndrews
I believe that over time, it will become the standard for statistical
computing in most businesses, eventually displacing much of what SAS does
today.

It's attractive to embed into databases like Netezza, Teradata and other
analytic databases, and vastly easier to use than SAS.

Even if it were rewritten in Python, I think that would be unlikely to slow
down its adoption, which is driven by grad-students, researchers and quants
who often have no real programming background (and frequently aren't
interested in learning more than they need to generate figures for their
publications).

------
tel
Implement data frames and trellis graphics in Numpy/Scipy and I don't think
I'd go back to R for much.

------
inthewoods
I would love to use R for my latest project, but as far as I know you can't
create a compiled executable for use in a runtime, live web environment.
Anybody know of a solution to this issue?

~~~
tcc619
Rserve is a TCP/IP interface to R, so you can send R code from any other
language. I have used the rserve-ruby interface with some success.

~~~
inthewoods
I didn't see a ruby interface - where is it? Also, this doesn't deal with the
issue of speed - can R scale to handle, say 100 simultaneous connections?

------
konad
I do wish they'd change the name; R is such a difficult search term.

~~~
tcc619
Using rseek.org will help when searching for R-related pages.

