

Ask HN: Python or R? - washedup

What's your preference, and why?

More specifically, comparing R to Python libraries.
======
aaren
I don't use R. I use [pandas] / [statsmodels] in Python. I don't do
particularly special stats though.

I read this [blog] earlier and as a result I don't think I'll bother to learn
R unless I have to.

[pandas]: [http://pandas.pydata.org/](http://pandas.pydata.org/)

[statsmodels]:
[http://statsmodels.sourceforge.net/](http://statsmodels.sourceforge.net/)

[blog]: [http://www.talyarkoni.org/blog/2013/11/18/the-homogenization...](http://www.talyarkoni.org/blog/2013/11/18/the-homogenization-of-scientific-computing-or-why-python-is-steadily-eating-other-languages-lunch/)

------
svjunkie
What's the problem you're trying to solve? It's hard for anyone to give useful
advice without any more context.

That said, I recommend whichever language is easiest for you. I use R and have
not fully learned Python, so I have an obvious bias. If you're performing
complicated statistical analysis, I'd recommend R, but for more traditional
programming, I get the impression that Python interfaces more efficiently with
other languages.

------
stadeschuldt
I use both. I like R for its charting capability and the sheer amount of
packages for different use cases. I use Python to pre-process data and get it
into shape, because that is a lot easier than in R. Also, scikit-learn, NumPy
and pandas are really nice.

------
xixi77
As everyone says, it depends on what you are trying to do.

For all interactive/exploratory analysis, for statistical graphics, for more
advanced statistics, for most statistics-related research work in general, I
would definitely pick R out of these two.

If statistics is only a small part of the application, if you already know
exactly what you have to do (i.e. no data exploration), if you have to do a
lot of web/text processing -- probably Python.

Also, check which one has more/better packages related to what you are doing.

For some stats projects I would go with something else entirely though.

------
code_scrapping
They're not really directly comparable. Python is a general-purpose
programming language; R is a statistical processing tool.

You could compare R to Matlab, or R to a Python library with similar scope
(numpy or pandas).

~~~
xixi77
I'm not sure that's the distinction -- there is no shortage of general-purpose
programming code written in R, as well as statistics done in Python.

It's more about use cases -- is it all about statistics, or is actual
statistics only a small part? Is the focus primarily on research and
exploration, or on implementation and deployment?

------
floppydisk
It really all boils down to what problem you're trying to solve, what kind of
analysis you're trying to do, and what the performance requirements are. For
basic stats, R and Python will be comparable in terms of library availability
and functionality. If you start getting into more specialized and/or esoteric
statistics, you will find more R packages (libraries) than you will Python
libraries.

------
bjoerns
I use both for my data analysis (though I'm not doing anything too fancy these
days). R is great at statistics (that's what it was designed for) but a bit of
a terrible programming language. So I combine the two, do all the IO stuff etc
in Python and run the actual statistical analysis in R (RPy is a great Python
interface to R).
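
A minimal sketch of that kind of split, for illustration only. It does not use RPy; it swaps in a plain `Rscript` subprocess call so that only the Python standard library is needed. The function names, the `x`/`y` column layout, and the one-line R model are all hypothetical, and it assumes `Rscript` is on the PATH.

```python
import csv
import shutil
import subprocess
import tempfile
from pathlib import Path

# R side (hypothetical analysis): read the CSV passed as the first
# trailing argument, fit a linear model, and print its coefficients.
R_CODE = (
    "d <- read.csv(commandArgs(trailingOnly=TRUE)[1]); "
    "print(coef(lm(y ~ x, data=d)))"
)


def clean_rows(rows):
    """Python side: drop rows with a missing field and convert the rest to floats."""
    cleaned = []
    for x, y in rows:
        if x == "" or y == "":
            continue
        cleaned.append((float(x), float(y)))
    return cleaned


def write_csv(rows, path):
    """Write the cleaned rows with a header that the R code expects."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        writer.writerows(rows)


def r_command(csv_path):
    """Build the Rscript invocation; the CSV path is passed as a trailing argument."""
    return ["Rscript", "-e", R_CODE, str(csv_path)]


def run_analysis(raw_rows):
    """Clean in Python, hand off to R, and return R's printed output.

    Returns None when Rscript is not installed.
    """
    rows = clean_rows(raw_rows)
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "data.csv"
        write_csv(rows, path)
        if shutil.which("Rscript") is None:
            return None
        result = subprocess.run(
            r_command(path), capture_output=True, text=True, check=True
        )
        return result.stdout
```

Calling `run_analysis([("1", "2"), ("", "3"), ("2", "4.5")])` would drop the incomplete row before R ever sees the data. An in-process bridge such as RPy avoids the temporary file, at the cost of an extra dependency.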

~~~
bjoerns
...although even at the risk of getting laughed at, Excel is an awesome tool
for a lot of quick and dirty data analysis stuff (depending on what you do).
Throw in a little Python/R (e.g. via datanitro) and it's even more powerful.

