That said, R’s strength these days lies in its expressive power, through the Tidyverse collection of libraries and DSLs.
If I were to pick an “R Next” project, it’d be to focus on a better, more expressive Tidyverse for Racket that plays even more nicely with relational databases and frameworks like Spark.
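For anyone who hasn't used it, here's a minimal dplyr sketch of what that expressiveness looks like (my own toy example on the built-in mtcars dataset, not anything specific to the proposal above):

    library(dplyr)

    # group/summarise reads almost like a sentence
    mtcars %>%
      filter(cyl == 4) %>%
      group_by(gear) %>%
      summarise(mean_mpg = mean(mpg))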
I use Python instead of R unless I'm told to use R. I hate using R, even if there are better libraries written for it than the Python equivalents. I can't stand the terrible naming conventions (seriously, can't you at least be consistent with CORE FUNCTION names?) and the ridiculous number of data structures. There are vectors, lists, matrices, tables, data frames, S4 classes, environments, oh my... I've been programming in R for a couple of years now and it still takes me around 2-3 tries to figure out what's stored in a variable and how to access it. Do I need two ['s, a trailing comma inside the [ ], etc.?
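To make that concrete, here's a rough sketch of the indexing zoo with toy values (my own example):

    v  <- c(1, 2, 3)                  # atomic vector
    l  <- list(a = 1, b = "two")      # list
    m  <- matrix(1:6, nrow = 2)       # matrix
    df <- data.frame(x = 1:3, y = 4:6)

    v[2]       # vector: single bracket
    l[["a"]]   # list: double bracket for the element; l["a"] gives a one-element list
    m[1, ]     # matrix: the trailing comma means "all columns of row 1"
    df$x       # data frame: df$x, df[["x"]], and df[, "x"] all work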
Debugging R basically seems to mean "use a hack to generate stack traces."
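Concretely, the built-in options look like this (which is presumably what the parent means by "a hack"):

    f <- function(x) g(x)
    g <- function(x) stop("boom")

    f(1)                      # Error in g(x) : boom
    traceback()               # print the call stack after the fact
    options(error = recover)  # or: drop into a frame browser whenever an error occurs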
Maybe I'm just stupid, but I see _absolutely_ no reason to encourage use of R over Python. I love lisp and the ideas it espouses, but R seems to take the worst from that world.
Python doesn't even give you matrices as first-class citizens; even though I used a lot of Python before I used R, it still feels like they bolted LAPACK onto an unrelated scripting language and built things with it. More or less because that's what it is.
Personally I don't think the R language is anything special, good or bad: it's a typical sloppy interpreted language (though many of the difficulties described in the above 2010 document no longer exist). It's the package management system that makes it useful. It's not even a great package management system, especially when dumb kids use it like it's nodejs. But it's good enough to allow potentially crummy programmers (aka statisticians) to contribute meaningful and useful code to the ecosystem.
My example: if A is a matrix and b and c are variables, you don't know what the data type of A[b,c] is. I can tell you the types of A, b, and c, and that doesn't help; you need to know the actual data stored in the variables to know whether the return value is still a matrix or whether R has thrown out the dimension information and dropped back to a vector (potentially transposing the result). You have to know about the drop=FALSE option, and at that point the syntax for a complicated expression involving recursion and matrices falls apart.
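A minimal demonstration of that drop behavior (toy values; b and c here are numeric indices, as in the example above):

    A <- matrix(1:6, nrow = 2)
    b <- 1; c <- 1:3

    dim(A[b, c])                # NULL  -- R silently dropped to a plain vector
    dim(A[b, c, drop = FALSE])  # 1 3   -- still a 1x3 matrix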
The syntax is an embarrassment for working with matrices. I'd rather use a lisp-style (-> A (mmul v) (subset 1 k 1 j)), which isn't ideal, but at least it doesn't have random options being set in the middle of it.
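You can get something close with magrittr, though you still have to smuggle drop=FALSE in somewhere (a sketch, not a recommendation, reusing the hypothetical A, v, k, j from the lisp version):

    library(magrittr)

    result <- A %>%
      `%*%`(v) %>%                     # matrix multiply
      extract(1:k, 1:j, drop = FALSE)  # magrittr::extract is an alias for `[`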
That single decision should be enough to disqualify R from being a well-designed language for mathematical applications. The pig's breakfast that is the *apply() function family is a similar story.
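For readers who haven't hit this: each member of the family has its own calling convention and return-type rules (toy example, base R only):

    m <- matrix(1:6, nrow = 2)

    lapply(1:3, sqrt)             # always returns a list
    sapply(1:3, sqrt)             # "simplifies": here a vector, elsewhere a matrix or list
    vapply(1:3, sqrt, numeric(1)) # you must declare the return shape yourself
    apply(m, 1, sum)              # MARGIN argument: 1 = rows, 2 = columns
    mapply(`+`, 1:3, 4:6)         # multiple arguments, yet another signature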
The distinction between vectors, matrices, lists-of-lists, and data frames is archaic too; the conceptual model should be a single 2-d data structure that supports additional operations under certain conditions. At least that particular decision made sense at the time R was designed.
R, like Python, has far outgrown its initial scope. I don't think it was initially envisioned to be used the way it is today. But both have been kept in use as costumes for C/C++.
One of the things I've noticed the most in the last 20 years, to your point, is that the language used to be a lot more straightforward, simpler, and more predictable. Over the years a lot has been added in a haphazard way, and as a result today you have this kind of Frankenstein language that isn't what it started out as.
As for data structures, though, I don't really see R as being that different from other languages. Many of its structures are the same as elsewhere, just with different names (and I do wish the terminology were more consistent). Others have been taken up by other languages as people have come to appreciate their utility.
Being a wrapper for C/C++ can only go so far. Eventually you have to write in R (or Python) and the speed shows, if you have enough data to deal with.
Sometimes, though, the knowledge encapsulated in an R package is not trivial to understand and reimplement. The methods may rest on math you're unfamiliar with (and use as a black box), so you have to look at the package code and try translating R to Python, then attempt to refactor it into something sane (and hope you don't skip any underlying logic). That, or learn the theory of what was implemented so you can implement it yourself in Python (which may not be feasible with tight deadlines).
Several things are different now:
- Machines are faster.
- We have seen the hassle of Python 2-to-3 adoption (or non-adoption) and how hard it is to change a language.
- The model of using a slow but comfortable language for model specification and executing it via C libraries is more widely accepted now (see the sketch after this list).
- And last but not least: the Tidyverse really has momentum now.
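A tiny illustration of that split, in case it's unfamiliar: the comfortable surface is R, while the hot loop is compiled code underneath.

    fit <- lm(mpg ~ wt, data = mtcars)  # friendly formula interface in R...
    body(stats::lm.fit)                 # ...but inspect it: the actual fitting is
                                        # a .Call into compiled code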
Sure, Julia and Python are coming after R, but the ecosystem itself is far from done.
Julia, by contrast, takes a native, in-language approach (no delegating to C/C++).
Given the timing, I'm interested in what he might think of Julia, which seems to have reached a similar conclusion: statisticians need a new tool.
You can browse The R Journal (https://journal.r-project.org/archive/2018-2/) and read through what researchers have done and published via packages.