
How I decide when to trust an R package - simplystats
http://simplystatistics.org/2015/11/06/how-i-decide-when-to-trust-an-r-package/
======
danso
Unfortunately, looking at stars/watchers as a metric is not as effective as it
is for more popular languages. R has a lot of great software and community but
the software development scene is not at all as populated as it is for
Ruby/Python/JavaScript. Look at the repo for ggplot2, which is one of R's most
generally useful and best-in-class libraries:

[https://github.com/hadley/ggplot2](https://github.com/hadley/ggplot2)

Just 1,500 stars and 200 watchers...stars obviously aren't a reflection of the
library's quality or value, but there's at least a correlation with how many
people on Github are tracking the package. I don't know what it's
like to develop for CRAN, and it possibly has an advantage in that a
maintainer has to work harder to get it on to CRAN than on to Github...but on
the other hand, it doesn't seem as conducive to getting user feedback, which
seems critical for rooting out obscure bugs or breaking changes caused by
dependencies.

edit: Speaking of R's github obscurity and Hadley Wickham's work...dealing
with time is as frustrating as it is in every other language. Many, many hours
were wasted struggling with conversions and only randomly did I stumble across
Wickham's lubridate package, which on Github [1] has 181 stars compared to
23,000 and 2,300 for moment.js and Chronic, respectively...and I didn't find
it or think to look for it on Github...I think I stumbled across it on a
random blog post. I'm not sure how discovery works on CRAN.

[1] [https://github.com/hadley/lubridate](https://github.com/hadley/lubridate)

~~~
ellisv
I think the scale of stars/watches is simply different.

> I don't know what it's like to develop for CRAN, and it possibly has an
> advantage in that a maintainer has to work harder to get it on to CRAN than
> on to Github...but on the other hand, it doesn't seem as conducive to
> getting user feedback, which seems critical for rooting out obscure bugs or
> breaking changes caused by dependencies.

Quite a few developers are publishing on CRAN and GH. So engaged users can
contribute easily on GH and pull the most recent version to fix the latest
bugs while casual consumers of the package can still get a working version
from CRAN.

------
_Wintermute
Strange that Jeff Leek trusts packages on Bioconductor more than CRAN. Unless
they've started some more stringent checks recently, there's some truly awful
code on there -- along the lines of absolute file paths pointing to the
developer's 'C:/My Documents/' folder and trying to import packages that
haven't existed on CRAN for the past 5 years.
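
For what it's worth, the absolute-path problem is easy to screen for mechanically. A rough sketch in Python, assuming a made-up regex heuristic and a hypothetical package name (`mypkg`); this is not something Bioconductor or CRAN runs as far as I know:

```python
import re

# Heuristic pattern (an assumption, not an official check): a quoted string
# that starts with a Windows drive letter, a Unix home directory, or "~/".
ABSOLUTE_PATH = re.compile(r"""["'](?:[A-Za-z]:[/\\]|/(?:home|Users)/|~/)""")


def find_hardcoded_paths(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that appear to contain an absolute path."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Ignore R comments. Naive: this also cuts at a '#' inside a string.
        code = line.split("#", 1)[0]
        if ABSOLUTE_PATH.search(code):
            hits.append((number, line.strip()))
    return hits


# Hypothetical R source: one hardcoded path, one portable lookup.
example = '''
dat <- read.csv("C:/My Documents/results.csv")
ref <- read.csv(system.file("extdata", "ref.csv", package = "mypkg"))
'''
print(find_hardcoded_paths(example))
```

It only flags the `read.csv("C:/My Documents/...")` line; the `system.file()` call passes because it resolves the path relative to the installed package.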

------
onetwotree
Interesting. As a developer, I've never had trouble assessing the
trustworthiness of packages on github, and I see no reason to implement some
kind of trust system for language specific repos (npm, for example). The
reason is simply that there are many ways to assess a library, such as
community activity, use in production, quality of tests, and general
reputation.

Perhaps this is more of a problem for scientists who just want to download and
use a package, but don't have the know-how to assess the quality of a package
the way developers do?

~~~
scottshepard
I think trust is a bigger deal in the R community than in, say, Ruby. With a gem
you can tell pretty quickly if it works or not, and the community has a higher
standard of testing.

With R, many of the packages are for complicated math and stats. That
particular package might pass all its internal tests, but what if they
implemented a distribution wrong, or calculated a 95% confidence interval
incorrectly? That is when you have to trust the developer personally.
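
One partial way around pure trust is to check a statistical routine against simulated data where the truth is known. A minimal sketch in Python (standard library only), assuming a normal-approximation interval for a mean; `normal_ci`, `buggy_ci` and `coverage` are names made up for illustration, not functions from any package:

```python
import random
from statistics import NormalDist, mean, stdev


def normal_ci(sample, level=0.95):
    """Normal-approximation confidence interval for the mean of `sample`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * stdev(sample) / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width


def buggy_ci(sample, level=0.95):
    """Same interval with a deliberate bug: divides by n instead of sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(sample) / len(sample)
    centre = mean(sample)
    return centre - half_width, centre + half_width


def coverage(ci_function, true_mean=10.0, sd=2.0, n=200, trials=2000, seed=1):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        low, high = ci_function(sample)
        hits += low <= true_mean <= high
    return hits / trials


print("correct:", coverage(normal_ci))  # should land close to 0.95
print("buggy:  ", coverage(buggy_ci))   # far below 0.95
```

Both versions could pass a unit test that only checks "returns two numbers with low < high", which is the point: a coverage check like this catches the bug, but only for the cases you think to simulate.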

------
ellisv
I think these are pretty common heuristics among experienced R users. However,
the article missed one of the most important steps: read the code.

------
anonfunction
The scroll-jacking on that site is extremely frustrating.

~~~
asifjamil
Just came here to see if anyone else noticed it and it's funny that it's the
only comment ATM :D

I think someone should come up with a browser extension to disable custom
scroll jacking.

~~~
onetwotree
You can just disable JavaScript for the domain?

~~~
Nadya
Or better just disable `smoothscroll.js` across all domains.

It's named that 9/10 times - including this site.

