
Future of Statistical Programming - luu
http://www.science.smith.edu/~amcnamara/FoSP.html
======
babahoyo
Getting people to switch from Excel to R (or Julia) means balancing two
competing goals.

First, you need to establish that scripting is far superior to point-and-click
interfaces via reproducibility and legibility. _This_ is the fight worth
having, as it gets to the core of what it means to do science and share
results.

Second, you need to convince them that they don't have to sacrifice the
tractability and ease of use of Excel. Jupyter notebooks, for all their
faults, really sell this idea well. But the first goal should dominate the
second. Nteract and Shiny are great, but they are big tools that are
difficult to teach beginners to code up. We shouldn't say "use R because you
can still use point-and-click interfaces"; we should argue against
point-and-click interfaces altogether in this context.

I also agree with the poster below that running a script again with one
parameter changed is super easy, and you don't need a GUI to explore data
like that.
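
As a rough illustration of that point, here is a minimal sketch in Python (not R, purely for illustration). The `summarize` function, its `trim` parameter, and the data values are all made up for the example; the only claim is that re-running an analysis with a different parameter is a one-line change that stays recorded in the script.

```python
import statistics

# Made-up measurements with one obvious outlier (illustrative only).
DATA = [2.1, 2.4, 2.2, 2.8, 9.7, 2.5, 2.3, 2.6, 2.0, 2.7]


def summarize(values, trim=0.0):
    """Summarize values after dropping a `trim` fraction from each tail."""
    if not 0.0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of points dropped per tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return {
        "n": len(kept),
        "mean": statistics.mean(kept),
        "median": statistics.median(kept),
    }


# "Exploring" the data is just re-running with another parameter value,
# and the script itself documents which values were tried.
for trim in (0.0, 0.1, 0.2):
    print(trim, summarize(DATA, trim=trim))
```

Changing the tuple of `trim` values and re-running is the whole workflow; nothing has to be remembered or clicked again.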

~~~
Nydhal
I disagree that it has to be scripting XOR point-and-click. Why can't we
manipulate a dataframe using both approaches?

~~~
jhbadger
If the pointing and clicking generates code that can be seen and stored, then
fine. The reproducibility problem comes from the fact that most point-and-click
tools don't. As a computational biologist who uses R I am often frustrated by
experimental colleagues who use Excel and often can't remember how they
transformed the data months later. I don't necessarily have better memory but
I can go into my script and look.

------
gumby
> When I refer to tools for learning statistics, I mean things like applets,
> TinkerPlots, and Fathom....You can't put TinkerPlots on your resume, and if
> you actually want to apply the skills you've learned to real data, you need
> to learn another tool.

I can't really see why this is a problem as stated. You probably used a
textbook as well, and you can't put that textbook on your resume either. Just
because it's software doesn't make it different.

> So, the first thing that a future tool for statistical programming should do
> is bridge the gap between learning and doing. I think that a good tool
> should be able to ease people into programming, using some sort of visual,
> drag-and-drop interface that exposes novices to the entire trajectory of
> data analysis (data import, cleaning, new variable creation, plots, summary
> statistics, models). Then, it should have a way to make the transition to
> more traditional or textual coding. Ideally, one could look back and forth
> between the visual representation and the textual one, and a change in one
> interface would be reflected as a change in the other.

In statistics, as in programming, it's not clear a drag-and-drop pedagogical
interface helps much (believe me, I've worked with such systems since I was a
kid in the late 70s). This paragraph quoted above, and the part of the article
above it, actually have the solution: use a bunch of helper code in R (which
is what they use now) but generate the realtime results in an adjacent window.
This is the approach used by LOGO back in, yes, the 1970s: you'd type a
program (in a real programming language -- basically a Lisp) and see the
results immediately. You could go back and modify the program and see _those_
results immediately.

~~~
xte
Today's "notebook UI" seems to be popular (think about Jupyter, Mathematica,
Sage, ...), but in the end these are graphical sugar on a somewhat limited
org-mode implementation...

On "graphic programming": LOGO was a toy; Smalltalk (actually Squeak) is
another story (and the basis of the very first modern workstation, the Xerox
Alto) :-)

~~~
infinite8s
LOGO was not a toy, it was a means to teach children the basics of
differential geometry.

~~~
xte
Ok, but try to use it for anything else today...

~~~
infinite8s
Do you expect to hammer a nail with a screwdriver?

------
zzo38computer
I use SQLite (just the command-line version is good enough) for statistical
programming. Many functions are missing, but I wrote extensions with the
needed additional functions, and there are also some useful extensions that
can be found in the SQLite source tree but that are not part of SQLite itself,
so we can use those too.
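
For readers wondering what "adding a missing function" can look like, here is a minimal sketch using Python's standard `sqlite3` module rather than a compiled extension of the kind described above. The aggregate name `stddev_samp`, the `obs` table, and the helper `group_stats` are hypothetical names chosen for the example; the sketch assumes only that SQLite has no built-in standard deviation and that one can be registered as a user-defined aggregate.

```python
import math
import sqlite3


class SampleStdDev:
    """User-defined aggregate: sample standard deviation (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def step(self, value):
        if value is None:  # skip NULLs, as the built-in aggregates do
            return
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def finalize(self):
        if self.n < 2:
            return None  # undefined for fewer than two observations
        return math.sqrt(self.m2 / (self.n - 1))


def group_stats(rows):
    """Load (group, value) rows into an in-memory table and summarize by group."""
    conn = sqlite3.connect(":memory:")
    # "stddev_samp" is a name picked for this sketch, not an SQLite built-in.
    conn.create_aggregate("stddev_samp", 1, SampleStdDev)
    conn.execute("CREATE TABLE obs (grp TEXT, x REAL)")
    conn.executemany("INSERT INTO obs VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT grp, COUNT(x), AVG(x), stddev_samp(x) "
        "FROM obs GROUP BY grp ORDER BY grp"
    ).fetchall()
    conn.close()
    return result
```

Calling `group_stats([("a", 2.0), ("a", 4.0), ("b", 1.0)])` returns one `(group, count, mean, stddev)` tuple per group, with the standard deviation `None` for groups that have a single observation.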

~~~
wdkrnls
Somehow I think "statistical programming" means something different to you
than it does to statisticians and statistical scientists.

~~~
wdkrnls
But I would be very interested to learn more about what you are doing with
SQLite if it is more than finding a few means and standard deviations.

------
xte
Just as a reminder
[https://panicz.github.io/pamphlet/](https://panicz.github.io/pamphlet/) about
how R is not such an innovative solution for statistics...

~~~
FranzFerdiNaN
Yeah, let’s not reinvent everything that is available in the R ecosystem again
for some reason. R might not win a beauty prize for best programming language
but it does the trick just fine.

~~~
xte
I'm curious to understand why it was born in the first place, since we already
had many Lisps/Schemes then... R itself seems like a reinvention of the wheel...

~~~
kgwgk
According to an article published in 1996 by Ihaka and Gentleman:

“The work has been heavily influenced by two existing languages-Becker,
Chambers, and Wilks' S (1985) and Steel and Sussman's Scheme (1975). We felt
that there were strong points in each of these languages and that it would be
interesting to see if the strengths could be combined. The resulting language
is very similar in appearance to S, but the underlying implementation and
semantics are derived from Scheme. In fact, we implemented the language by
first writing an interpreter for a Scheme subset and then progressively
mutating it to resemble S.”

([https://www.stat.auckland.ac.nz/~ihaka/downloads/R-paper.pdf](https://www.stat.auckland.ac.nz/~ihaka/downloads/R-paper.pdf))

Unfortunately R’s gain was LispStat’s loss:
[https://www.jstatsoft.org/issue/view/v013](https://www.jstatsoft.org/issue/view/v013)

~~~
xte
Thanks :-)

------
mtraven
The article doesn't mention this, but after some poking around it appears that
the interactive demo is written in Morphic, the Smalltalk successor that Alan
Kay and colleagues have been working on for the last couple of decades.

