Hacker News

For any database that isn't huge, a library embedded in your application is going to be faster than anything that has to communicate with a server over a socket. Though both execute SQL queries, SQLite is completely different from a relational database server, and it is appropriate in many places where running a full RDBMS is not. For example, you can't run MySQL or Postgres on an iPhone, but you can use SQLite.
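The thread is mostly about R, but the in-process point is easiest to see with Python's stdlib `sqlite3` module: a minimal sketch showing that "connecting" to SQLite is just opening a file (or an in-memory database) inside your own process, with no server or socket involved. Table and data are invented for the example.

```python
import sqlite3

# SQLite runs in-process: no server, no socket round-trips — the
# "connection" is just a handle to a file or an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])
rows = conn.execute("SELECT name FROM users ORDER BY id").fetchall()
print(rows)  # [('ada',), ('grace',)]
```

Every call above is an ordinary function call into the linked library, which is why it can beat a networked RDBMS for small and medium datasets.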

I've found SQLite absolutely amazing for getting the power of SQL for R data frames.

Could you expand on some of your use cases? An R data frame, by definition, has to fit into memory, so it would seem that any sort of map/filter/group/fold operation would be fastest if performed in-memory, as well. And I assumed that e.g. joining data frames (where you would run out of memory really quickly if your datasets are large to begin with) would be uncommon... am I wrong?

Depends on what you mean by uncommon.

I, for example, often need to score/model data which doesn't fit in RAM (on my PC), so I use libraries like bigglm which can build models from out-of-memory data. One option is to read the data from SQLite, but you can also use an ODBC connection.
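The pattern bigglm relies on is chunked access: pull a bounded batch of rows from the database, fold it into a running result, repeat. A minimal sketch of that pattern in Python with stdlib `sqlite3` (computing a running mean as a stand-in for updating model coefficients; table and column names are invented):

```python
import sqlite3

# Chunked access: instead of loading the whole table, fetch fixed-size
# batches and fold each into a running statistic. bigglm updates model
# coefficients batch by batch in the same spirit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (x REAL)")
conn.executemany("INSERT INTO obs (x) VALUES (?)",
                 [(float(i),) for i in range(10)])

cur = conn.execute("SELECT x FROM obs")
total, n = 0.0, 0
while True:
    chunk = cur.fetchmany(4)  # only 4 rows held in memory at a time
    if not chunk:
        break
    total += sum(x for (x,) in chunk)
    n += len(chunk)
print(total / n)  # 4.5
```

Memory use is bounded by the chunk size, not the table size, which is the whole point when the data doesn't fit in RAM.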

Additionally, I can explore slices of the data while it resides only on disk; I don't even need to import it. I can use dplyr (a very popular package for aggregation and slicing), which maps the R syntax to SQL that SQLite executes.
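What dplyr generates under the hood is ordinary SQL run against the file. A sketch of the same idea in Python with stdlib `sqlite3`, written as SQL by hand: the data stays in a database file on disk, and only the aggregated result comes back into memory. File path, schema, and values are made up for the example.

```python
import os
import sqlite3
import tempfile

# Build a small on-disk database (stand-in for a dataset too big to import).
path = os.path.join(tempfile.mkdtemp(), "sales.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("east", 30.0), ("west", 5.0)])
conn.commit()
conn.close()

# Reopen and aggregate: the rows stay on disk; only the summary returns.
conn = sqlite3.connect(path)
result = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(result)  # [('east', 40.0), ('west', 5.0)]
```

This is roughly the SQL a `group_by()` + `summarise()` pipeline would be translated into.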

The setup is quite interesting: the SQL data frames library embeds a data frame in SQLite and lets you query it with SQL. It has been quite useful for transitioning our SQL data analysts to R.
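A rough Python analogue of that workflow, using stdlib `sqlite3` (all names and data here are invented): copy in-memory tables standing in for data frames into SQLite, then ask questions in plain SQL, joins included.

```python
import sqlite3

# Two in-memory tables, standing in for two data frames.
people = [(1, "ada"), (2, "grace")]
scores = [(1, 95), (2, 88)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE scores (id INTEGER, score INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", people)
conn.executemany("INSERT INTO scores VALUES (?, ?)", scores)

# Once the data is in SQLite, analysts can write the SQL they already know.
joined = conn.execute(
    "SELECT p.name, s.score FROM people p JOIN scores s ON p.id = s.id "
    "ORDER BY p.name"
).fetchall()
print(joined)  # [('ada', 95), ('grace', 88)]
```

The appeal for SQL-fluent analysts is that the query layer is unchanged; only the host language around it is new.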
