Ask HN: Is R an alternative to SQL?
4 points by nyc111 on Dec 2, 2015 | 6 comments
I'm confused why one needs R or similar software to analyze complex data. Is there anything that can be done with R but not SQL?

For instance, I think R is used by biologists, http://www.genomebiology.com/2004/5/10/R80 can the same analysis be done with SQL?




SQL is a declarative language, not an imperative one; i.e., you describe what you want rather than giving a set of instructions for extracting it. Database engines have "query optimizers" that use relational algebra to find the most efficient way to process a SQL query, and they make a huge difference when your databases get big.
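
To make the declarative/imperative distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `samples` table and its values are made up for illustration; the same per-group average is computed once by describing the result in SQL and once by spelling out the steps in a loop.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (gene TEXT, expression REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 20.0)],
)

# Declarative: state the result you want; the engine decides how to get it.
declarative = conn.execute(
    "SELECT gene, AVG(expression) FROM samples GROUP BY gene ORDER BY gene"
).fetchall()

# Imperative: spell out every step of the computation yourself.
totals, counts = {}, {}
for gene, expression in conn.execute("SELECT gene, expression FROM samples"):
    totals[gene] = totals.get(gene, 0.0) + expression
    counts[gene] = counts.get(gene, 0) + 1
imperative = sorted((gene, totals[gene] / counts[gene]) for gene in totals)

print(declarative)
print(imperative)
```

Both approaches give the same averages here; the difference is that the SQL version leaves the choice of scan, grouping strategy, and index use to the engine.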

What R brings to the table are numerous statistical methods packages and some good data visualization packages. With just a few lines you can apply some sophisticated techniques and produce beautiful visualizations.

SQL can indeed do some complex analytics, but it is very difficult to write, so R is the easier platform for statistics and machine learning. I did a Coursera class recently in which we had to do matrix multiplication in SQL, and even that was a real brain teaser. The advantages of doing analytics within a database are that (a) you leverage the query optimizer to speed up your analysis and (b) you can make the analysis available to other users of the same system via a view or stored procedure, so they don't have to repeat the work.
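
For the curious, one common way to do matrix multiplication in SQL is to store each matrix as (row, column, value) triples and join on the shared index. This is only a sketch under that assumption, not necessarily how the course framed it; the table names `a` and `b`, the view name `a_times_b`, and the 2x2 values are all made up. It runs against an in-memory SQLite database from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each matrix is stored as one row per cell (assumed layout).
    CREATE TABLE a (row_num INTEGER, col_num INTEGER, value REAL);
    CREATE TABLE b (row_num INTEGER, col_num INTEGER, value REAL);

    -- A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]]
    INSERT INTO a VALUES (0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4);
    INSERT INTO b VALUES (0, 0, 5), (0, 1, 6), (1, 0, 7), (1, 1, 8);

    -- C[i][j] = SUM over k of A[i][k] * B[k][j].
    -- Wrapping it in a view lets other users reuse the analysis.
    CREATE VIEW a_times_b AS
    SELECT a.row_num AS row_num,
           b.col_num AS col_num,
           SUM(a.value * b.value) AS value
    FROM a
    JOIN b ON a.col_num = b.row_num
    GROUP BY a.row_num, b.col_num;
""")

product = conn.execute(
    "SELECT row_num, col_num, value FROM a_times_b ORDER BY row_num, col_num"
).fetchall()

for row_num, col_num, value in product:
    print(row_num, col_num, value)
```

The join pairs every A[i][k] with every B[k][j], and the GROUP BY sums those products per output cell. In R the whole thing is `A %*% B`, which is roughly the point being made above.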


R and SQL are two completely different beasts.

You can use one in conjunction with the other.

R is a general programming language with a huge library of mostly statistical analysis routines.

SQL is a way to get data in and out of a database.

You might want to use SQL for trivial analyses, and anything more, maybe use R. But you can use SQL from R to get the data.


Thanks, got it.


R is great for fitting models and so on as explained by others. Most R functions expect a 2d matrix or table-like input.

SQL is great for choosing which columns to get out of multiple tables, somehow combined, as well as to filter which rows. If the end result is ready for R to use in modeling, that's great.

R can struggle to index and manipulate large datasets for combining/selecting columns and filtering rows, but that's the really nice stuff in SQL. They work well enough together and it's really no big deal to set both up.
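
A rough sketch of that division of labour, with Python standing in for R so the example runs with nothing but the standard library: SQL joins two tables and filters the rows, and the host language fits a simple least-squares line to what comes back. The tables, columns, and numbers are hypothetical. In R the equivalent would be something like `DBI::dbGetQuery()` to fetch a data frame and `lm(response ~ dose)` to fit the model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and data, for illustration only.
    CREATE TABLE subjects (id INTEGER PRIMARY KEY, cohort TEXT);
    CREATE TABLE measurements (subject_id INTEGER, dose REAL, response REAL);
    INSERT INTO subjects VALUES
        (1, 'treated'), (2, 'treated'), (3, 'treated'), (4, 'control');
    INSERT INTO measurements VALUES
        (1, 1.0, 3.0), (2, 2.0, 5.0), (3, 3.0, 7.0), (4, 9.0, 0.0);
""")

# SQL's job: combine tables, pick columns, filter rows.
rows = conn.execute("""
    SELECT m.dose, m.response
    FROM measurements AS m
    JOIN subjects AS s ON s.id = m.subject_id
    WHERE s.cohort = 'treated'
    ORDER BY m.dose
""").fetchall()


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# The modeling side's job: take the table-like result and fit a model.
slope, intercept = fit_line(rows)
print(slope, intercept)
```

The query result is already the 2d, table-like input that the modeling step expects, so no reshaping is needed in between.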


The general workflow is to read something into R using SQL and then do all the magic with it. Anything non-trivial is extremely difficult in SQL.


I'm not even sure how to react to this.

Could you? Probably. Would you want to? I highly doubt it.



