
R at Microsoft - vladiim
http://blog.revolutionanalytics.com/2015/06/r-at-microsoft.html
======
trengrj
Microsoft buying Revolution Analytics is another piece of evidence showing
that Microsoft has its A-game back. R is massively popular in the statistical
community; it is basically the PHP of analytics (huge standard library with no
namespaces but very easy to get started with and powerful for advanced users).

Additionally R is one of the few languages where you can be productive in a
non-unix environment. At my old Windows workplace, R was the only open source
language with working network NTLM authentication out of the box. Everything
can be self-contained in RStudio, an interpreter and gui package manager to
stop you jumping to the command line, and a help system to stop you using man.

By becoming closely associated with R, Microsoft is more likely to be
a component in large analytics investments, e.g. "let's get Microsoft's R
distro, oh and why not their big data platform too?", and ensures its services
and databases are well integrated with this up-and-coming language.

~~~
fixxer
"PHP of analytics"

I started laughing, then I realized you meant that as a compliment.

R has a great community and a ton of unique packages, though many of those
great packages are thin wrappers over Fortran/C libraries and are available in
a host of other languages (Python, Julia) that, in my opinion, have better
features when it comes to testing and group-based development.

In my experience, R degenerates into a rat's nest more easily than the other
candidates. I've been hired twice for this explicit reason.

BTW, I think your comparison with PHP is apt. R has many of the same strengths
and flaws.

~~~
taylorwc
Slight typo: compliment is correct here, rather than complement. Excellent
discussion.

------
IndianAstronaut
The write once run anywhere concept is intriguing. R has major scaling issues
and a lot of the algorithms and packages in the CRAN just don't scale to any
reasonable amounts of data. It really was designed in an era of controlled
experiments with minimal data.

~~~
stdbrouw
> an era of controlled experiments with minimal data

This world hasn't disappeared, y'know. Many of the specialized packages on
CRAN deal with things like missing data, removing selection bias from
longitudinal studies and distinguishing mediated from direct effects. These
concerns are by and large not even applicable to the data the average data
scientist deals with, so why even bother making them scalable?

On the ML front, on the other hand, R has never really been a forerunner.

~~~
blumkvist
I don't know why this comment is downvoted. It is true. The vast majority of
businesses will never deal with machine learning. ML has very limited
applications in the "real world", as opposed to regression analysis with some
small samples.

~~~
ska
Your point that simple regressions with small(ish) sample sets are important
is valid. This:

> "The vast majority of businesses will never deal with machine learning."

is, however, myopic. It plausibly will not hold for much longer in most areas,
and is already not true for a significant number of industries.

~~~
blumkvist
I find ML and "3d printing" very similar in the expectations around them and
the extent to which businesses will adopt them.

~~~
ska
On what basis? They are not similar in any respect that I can think of.

3D printing has had limited impact for prototyping in the many decades it has
been around, and in the expected quarters. That's pretty much what every
analysis I've seen, except a very few boosters, has predicted, for what it's
worth.

On the other hand, ML has fundamentally changed document processing, search
engines, remote sensing, retail sales, e-commerce, etc. It has also strongly
impacted areas of logistics and shipping, manufacturing (e.g. computer vision
& robotics), medicine, law enforcement, finance, defense, etc.

It's also poised to find market advantages in many non-obvious industries as
larger scale data becomes available... unclear how far that will extend.

Where's the parallel you see?

~~~
blumkvist
Poorly understood, high hopes, far fewer practical applications. Predictions
are not as applicable as one might think at first glance.

I hope you do not confuse machine learning with statistics and data mining.
They are very different.

~~~
ska
No, I do not make such a confusion. I hope you do not make the converse one,
of re-categorizing techniques "out" of machine learning as soon as we
understand how they work.

At any rate, your experience in industry seems to be quite different than
mine.

------
jordigh
Does anyone know how this works with regard to the GPL? I have heard
that they have a complete reimplementation of R. Really? What about all of the
GPL'ed packages at CRAN, do they just not ship those?

I also know that the R developers don't really enforce the GPL. Is that what's
going on here?

~~~
DannyBee
If they ship their own from-scratch-not-using-any-third-party-code
implementation, they are fine.

If they are shipping a GPL version of R, then this is a legal grey area, with
different opinions from differing lawyers, mostly on whether it's a derivative
work covered by the GPL or not.

It's honestly not worth getting into the whole discussion, because there are
no lights at the end of the tunnel, only opinions on all sides that are
usually supported by reasonable arguments.

~~~
StevePerkins
I'm surprised that no one in this thread (or on the broader Internet, as far
as I can tell from a cursory web search) has any information on the status of
this "Revolution R" package that Microsoft recently acquired.

Is it a clean-room reimplementation of R? (like the relationship between .NET
and Mono) Or is it a "distribution" of R? (like the relationship between
Debian and Ubuntu?)

Whether the R implementation here is non-GPL, or whether it actually is
running in a "fork-and-exec" separate process, I'm sure that Microsoft has
their bases covered. They know more than a thing or two about software
licensing, and certainly wouldn't take any risks of subjecting their flagship
enterprise database to the GPL.

However, I'm completely uncertain as to what the legal status of Microsoft's R
implementation means in terms of libraries that one can use from CRAN (aren't
many of those GPL'ed as well?).

For me, I work in the real world, where you're not allowed to touch the GPL
with a 10-foot pole, so this is of idle curiosity only. I'm not sure if
Microsoft is trying to appeal to academia here, or if it's just a P.R. move in
general... but if they expect to sell this to business users, then they're
going to have to put a LOT more effort into clarifying its legal status.

~~~
jordigh
> I work in the real world

I must work in the imaginary world, since we are allowed to use GPL'ed stuff
at work. In fact, I got hired at my current job to improve GPL'ed stuff. And I
don't work in academia.

Whenever I hear stories about how big companies can't touch the GPL, I always
call "bullshit". Of course they can; and in fact many large companies do.
Some, sadly, are just full of inefficient bureaucracy that feeds their
employees big fat lies about how the GPL will destroy us all.

------
jokoon
I don't know a lot about R, but apart from the fact that it's made for stats,
isn't it a great language because it's also aimed at handling embarrassingly
parallel programs, and no other language does it as well as R?

I guess R is used in machine learning fields.

Is R OpenCL-enabled, or is it planned to be?

~~~
capnrefsmmat
R doesn't have built-in parallelism features that I know of. There are a lot
of packages, though:
[http://cran.r-project.org/web/views/HighPerformanceComputing...](http://cran.r-project.org/web/views/HighPerformanceComputing.html)

~~~
grayclhn
The 'parallel' package has been included in recent versions of R and has RNGs
for parallel execution and variations of the `apply` functions. For
"embarrassingly parallel" calculations these work as drop-in replacements.
It's built-in in the same way that MASS and lattice are.
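
For anyone more at home in Python (mentioned upthread as an alternative), here is a rough sketch of what "drop-in replacement" means for an embarrassingly parallel job. It is an analogy using Python's standard `concurrent.futures` module, not R's `parallel` API, and `simulate` is a made-up stand-in for whatever independent per-item work you have.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(seed: int) -> int:
    """Hypothetical independent task: each input is processed on its own."""
    return sum((seed * k) % 7 for k in range(1000))


inputs = list(range(8))

# Serial version: the built-in map applies the function to each input in turn.
serial = list(map(simulate, inputs))

# Parallel version: same call shape, only the provider of map changes.
# Executor.map returns results in input order, so the two lists line up.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(simulate, inputs))
```

Threads keep the sketch simple; for CPU-bound work in Python, `ProcessPoolExecutor` exposes the same `map` method and would be the more likely choice.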

------
david_seah
Anyone able to read the slides? I have difficulty accessing the slides.

~~~
facorreia
They worked for me (Chrome, Linux). Perhaps you can try the direct link:

[http://www.slideshare.net/RevolutionAnalytics/r-at-microsoft...](http://www.slideshare.net/RevolutionAnalytics/r-at-microsoft?ref=http://blog.revolutionanalytics.com/2015/06/r-at-microsoft.html)

------
blumkvist
I have to say I am very disappointed in the information, after the title got
me so excited.

No announcements about Excel? I really hope R lands in Excel, it will open big
opportunities for developers and businesses.

Also, there was a guy from MS who created Python tools for VS and asked if
anyone would want R tools for VS. After he received a very warm response, we
never heard from him again. What's up with that? Parallel to that, RA has an
IDE which uses the VS shell. Will we have a more streamlined R workflow in VS?
This is important for a lot of people...

What about R.NET?

~~~
smortaz
hi there - it took a while to make our 1st hire, but i'm happy to report that
work is underway as part of the Azure ML group. as soon as the project has a
pulse i'll post on HN. current plan is to have it be free & open source like
Python Tools for Visual Studio.

i'm also hiring for the project, so if you're into dev tools (editors,
debuggers, profilers, languages, ...) and looking, pls drop me a line. looking
for devs and a technical PM.

re R.Net - no effort that i know of.

~~~
blumkvist
Glad to hear you are making progress then!

Can't help you with development, but I will be more than happy to beta test. I
make analytics applications, encapsulated in asp.net, and additionally use R
for market research.

