It is missing the quick backtick interpolation that you get with R-Markdown, and some of the nice UI stuff like inline Shiny graphs and clickable tables (but it is easy to output Org-format tables that can be piped into other languages).
Reproducible research must be reproducible by the unenlightened, after all!
I have tried R in Jupyter a few times and it was nice, but the advantages of R Notebooks are just awesome. Playing nice with Git is the best one.
I'm still clueless about the religious Python vs. R war, and about the smack talk I read claiming that "serious" work is done in Python. R works best for me.
It's already been said, but I do NLP a lot. R handles text poorly. Humans use a lot of text.
TensorFlow, neural networks, etc. are better in Python.
Between pandas, list comprehensions, Python's collections library, sklearn, and Spyder, I feel I have a lot of power at my fingertips, and it's easy to do most of the machine learning I want.
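To make the "power at my fingertips" point concrete, here is a rough sketch of the kind of thing a list comprehension plus collections.Counter gets you for text, using only the standard library. The function name top_words and the sample sentences are made up for illustration; real NLP work would want a proper tokenizer.

```python
from collections import Counter


def top_words(lines, n=3):
    """Count lowercase word tokens across lines and return the n most common."""
    # List comprehension: flatten the lines into lowercase tokens and
    # strip simple leading/trailing punctuation from each one.
    tokens = [
        word.strip(".,!?;:\"'()").lower()
        for line in lines
        for word in line.split()
    ]
    # Drop tokens that were nothing but punctuation.
    tokens = [t for t in tokens if t]
    return Counter(tokens).most_common(n)


# Made-up sample input.
sample = [
    "R handles text poorly.",
    "Humans use a lot of text.",
    "Text is everywhere!",
]
print(top_words(sample, n=1))  # [('text', 3)]
```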
Importing a package takes a meaningful amount of time in R. Several seconds; that is just unacceptable.
It's a personal matter, but R has syntax that gets on my nerves. Python list: a = [1,2,3]. R vector: a = c(1,2,3). Perhaps it's because I used other languages before, but my fingers are more adept at hitting [, which requires no shift, than (. Some people love curly braces and lots of parentheses in if/for statements; I appreciate them not being there.
I have to fight with R on scientific notation, always copy-pasting into my code: options(scipen=999)
That said, Spyder is buggy, and RStudio is fantastic. I still haven't come across a good Python IDE that is on par with RStudio.
Edit: I forgot to say, I feel PySpark is far superior to SparkR. Last I saw, SparkR only works with a VERY old version of Spark. I don't even think that version is supported anymore. This is a bit of a big deal to me.
It's certainly taken some time investment, but after bouncing around all the editors for both, with some config, Emacs (with ESS for R and anaconda mode for Python) is the best environment I've found for both languages.
On top of that, in R, argument passing in function calls is call-by-name-and-lazy-value - meaning that for every argument, the function can either just treat it as a simple value (same semantics as normal pass-by-value, except evaluation is deferred until the first use), or it can obtain the entire expression used at the point of the call, and try to creatively interpret it.
This all makes it possible to do really impressive things with syntax that are implemented as pure libraries, with no changes to the main language.
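For anyone who hasn't seen this in R, here is a rough Python imitation of the idea, not how R implements it. The Promise class and the three functions are hypothetical names invented for this sketch: the argument is carried around unevaluated, and the callee decides whether to force it (lazily, once), to look at the expression text instead (roughly what substitute() gives you in R), or to ignore it entirely.

```python
class Promise:
    """Hypothetical stand-in for an R-style promise: an unevaluated argument."""

    def __init__(self, source, env):
        self.source = source   # expression text as written at the call site
        self._env = env        # variables visible at the call site
        self._forced = False
        self._value = None

    def force(self):
        # Evaluate at most once, on first use, then reuse the cached result.
        if not self._forced:
            self._value = eval(self.source, {}, self._env)
            self._forced = True
        return self._value


def use_value(arg):
    # Treats the argument as a plain value; evaluation happens here,
    # on first use, rather than at the call site.
    return arg.force() * 2


def use_expression(arg):
    # Never evaluates the argument; only inspects what the caller wrote.
    return "caller wrote: " + arg.source


def ignore(arg):
    # The argument is never forced, so a failing expression does no harm.
    return "untouched"


env = {"x": 20}
print(use_value(Promise("x + 1", env)))       # 42
print(use_expression(Promise("x + 1", env)))  # caller wrote: x + 1
print(ignore(Promise("1 / 0", env)))          # untouched
```

In R the caller doesn't have to wrap anything by hand; every argument behaves this way by default, which is what lets libraries reinterpret expressions without language changes.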
Overall R seems a little weird at first but the more you get to know the language the more you realise it's actually pretty well thought out.
My someday project is `#lang arcket` for Racket, which would allow people to use existing R code, and mix with Racket, with appropriate data.frame data structures and whatnot.
Although now that I think more about it, it's not quite as bad as Python, because the data structures are mostly opaque pointers (at least until someone uses USE_RINTERNALS - which people do sometimes, even though they're not supposed to), so all operations have to be done via functions, where you can map them accordingly.
You'd also need to emulate R's object lifetime management scheme with Rf_protect etc.; but that shouldn't be too difficult, either.
Some more reading on all this:
Maybe aiming for "mostly compatible, with some porting work for a handful of the more popular non-R (C, Rcpp) packages" would yield a better result in the end.
Full Unicode is supported. Unicode pi is implemented to mean the pure mathematical entity, so at compile time it is turned into a memory reference to the most exact value possible.
The metaprogramming in Julia is so good I wrote a Verilog DSL that transpiles specially written Julia into compilable and verifiable Verilog - in 3 days.
I ended up learning how to use Python/pandas/IPython because I had had enough and wanted a second option on how to do data analysis.
Then the R package dplyr was released in 2013, alleviating most annoyances I had with R. dplyr/ggplot2 alone are strong reasons to stick with the R ecosystem. (Not that Python is bad/worse; as I mention in the post, both ecosystems are worth knowing.)
While the syntax of Python is "cleaner" for backend scripts, R feels more straightforward when working with dataframes (dplyr) resulting in things to report on. The syntax for ggplot2 fits the same category.
As much as having one language for both categories would be nice, using both today seems like a better option.
I've often found myself wishing for a collaborative environment like that... and there it is.
Thank you!
I'm more familiar with Jupyter than R Notebooks. I'd second the point about version control in Jupyter being... hard. There isn't really a good pattern for it yet.
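One workaround people reach for is stripping outputs before committing, so diffs only show changes to the code and text. A minimal sketch of that idea, assuming the notebook is already loaded as an nbformat-4 style dict (an .ipynb file is JSON, so json.load would produce one). strip_outputs is a hypothetical helper and the tiny notebook below is made up; dedicated tools such as nbstripout handle many more cases.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counters cleared."""
    # Cheap deep copy that works because notebooks are plain JSON data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny in-memory notebook with made-up content.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 7,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

# Stable key order and indentation keep the committed file diff-friendly.
print(json.dumps(strip_outputs(nb), indent=1, sort_keys=True))
```

This doesn't solve merging or reviewing notebooks, it only keeps the noise out of the history.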
I would note that I believe the latest version of Jupyter has prettier tables though!
Edit: Also, matplotlib makes me sad. Surely there could be something better which abandons it completely?
You have other options like Bokeh and Plotly.
Can you explain why? I've never gotten the appeal. Besides the concept, the classic implementation (ggplot) does not make nice graphs, in my opinion. To me these look wrong... I guess I'm not quite sure why. There is something about it that's too cookie-cutter, cartoonish and information-light: http://r4stats.com/examples/graphics-ggplot2/
The answer is always: math
A good book on statistical theory is harder to come by, though.
Follow it up with Elements of Statistical Learning by three of the same authors for more advanced stuff.
It won't teach you much about theoretical statistics, or even things like experiment design, but you will learn a LOT about regression, classification and model fitting which is what everyone seems to want to be able to do these days.
I think this is an excellent overview [1]. Learning probability from a measure theory angle is more difficult to grok than the frequentist approach everyone is more familiar with, but I found it much more enjoyable. (I learnt it the usual way in my computer science undergrad, but I'm now redoing it more rigorously for a master's in financial engineering.)
I wish they could use RStudio for a while and understand just how important the feature is for someone using Python for research.
