
Redis and SQLite: Why wait for good enough? - velly
http://blog.rgbdaily.com/2011/02/03/redis-and-sqlite-why-wait-for-good-enough/
======
xd
There is a lot in this article that shows the author has very little RDBMS
knowledge. For instance:

" _What is slow are the arbitrarily complex index interactions that come with
the relational world._ "

Index's are arbitrary if arbitrary created. It has no bearing on the
relational world, no more than a bad driver makes all cars bad.

" _Given the example above I think SQLite could be a great implementation
leveraging the power of stored procedures to implement atomic datasets._ "

1, SQLite doesn't support SPs (You can use C to write extensions however) and
2, I've got over 10 years experience in RDBMS and I've never heard of an
atomic dataset. Has anyone else? I struggle to understand how a set can be
atomic.

" _Using BerkleyDB or Innostore could remove the slowness and pain of
interacting with the data through SQL._ "

This kind of comment needs to be addressed. There is NO pain in using SQL, the
pain that is perceived is the lack of understanding and skill to use it on the
part of the user. Don't call something painful simply because you haven't
taken the time to understand it.

~~~
Joakal
[MYSQL] SPs are a pain to implement, encapsulate and maintain. Looking at
other RDBMS, SPs don't appear to be very portable and doesn't seem to offer
speed advantages unless you only use simple queries.

It seems better to write SQL and make use of the DB drivers for me. Why would
you consider SPs?

~~~
sokoloff
One application is to allow very fast patching. If we need to patch an issue
and our query is in code that will take N minutes to update, perhaps compile,
and push out to every web server, it can be a tiny fraction of the time if all
you had to do was update the SP in the DB server(s).

I can do an alter proc with literally seconds of overhead (beyond the
obviously required time to actually change the SQL). If it doesn't work, or
makes things worse, I'm seconds away from reverting or taking a second bite at
the apple. If your SQL is in the app on your web servers, most places running
at any sort of scale can't match that.

(as a result of being heavily [99.8%+] SP based, we've had to develop some
compensating technology to allow us to do releases without downtime when each
release needs its own specific SPs. Solved, but took work. We're also
predominantly MS-SQL with limited mySQL, but I doubt that changes much.)

~~~
SkyMarshal
If it's not a proprietary secret, can you elaborate on the compensating
technology you guys developed?

I prefer encapsulating database logic in SPs as well, and am always interested
in learning how others have solved some of the problems incurred with them.

~~~
sokoloff
I've written about it briefly before on HN, and it's very beneficial but not
proprietary tech. Basically we create a "shim" database containing our app's
sprocs, views and functions, and additionally views to the transactional
database tables in another database.

That means that release N can be pointed at the "real" database, while release
N+1 can be pointed at the shim DB and they're both using the same
transactional data. You can run both in parallel until you commit to N+1,
whereupon you shutdown all N app servers, update the sprocs, views and
functions in the main DB, and (optionally) point your web servers back at the
main DB. There are a few other details to take care of, but that's the gist of
it, and the details are minor, or at least we found them to be.

Disclaimer: the above is my experience on MS-SQL at a pretty good sized
eCommerce site. Other RDBMSs may not work quite as well, or your app may use
heavily features that don't work in "shim mode".

~~~
SkyMarshal
Ah, very cool, thanks.

------
arnorhs
This is a favorite topic of mine, but the post is weird.

I had a bit of a problem gasping exactly what their problem was and how sqlite
(of all things) fixes them. Some of the sentences seemed like they were edited
multiple times before ending up not making sense. Take this bullet point for
instance:

    
    
      MySQL Stored Routines Library (AKA  ”Arrays,Sets,Hashes,Queues,and Stacks” implemented efficiently in )
    

implemented efficiently in what? I'm dying from suspense! :)

What problem does sqlite solve that mysql doesn't?

 _edit:_ I can see that the author has already made some edits, so the
sentence makes sense now. As in:

    
    
      MySQL Stored Routines Library (AKA  ”Arrays,Sets,Hashes,Queues,and Stacks” implemented efficiently in stored procedures )

~~~
arethuza
It says:

 _AKA "Arrays,Sets,Hashes,Queues,and Stacks" implemented efficiently in stored
procedures_

~~~
arnorhs
Yes, I see that he has made edits. great.

------
ericflo
Wait what? Is the author suggesting that antirez stop his efforts on Redis and
switch to working on MySQL stored routines? It says, "It’s just proof that
antirez has options about where to spend his time." And then it switches gears
to talking about SQLite. And then about BDB and Membase and Innostore and
Bitcask and Riak and Solr and his MIT class number.

I'm sure this author has a point he's trying to make, but I can't seem to make
out what it is.

~~~
siculars
I think the point the author is trying to make is that antirez should not be
wasting his time re-inventing the persistence wheel. As in, there are already
many persistence plays available that are proven to work within known bounds.

~~~
velly
Bingo! (from the author). I wrote this late at night at the end of a long day,
but that's the point.

------
Roboprog
PostgreSQL has to be one of the best kept open source secrets. I'm so tired of
the attitude that if MySQL does not work for you, then you have to throw out a
relational database entirely and run with one of these klugy NoSQL things.
From everything I have seen the last few years, PostgreSQL will scale across
multiple CPUs, where MySQL will not, due to differences in the locking
mechanisms. To say nothing of the lagging feature impoverishment of MySQL.
Yes, MySQL sucks less now, but how long did they simply not bother to
implement transactions at all (no rollback, no isolation)? How long until they
had stored procedures? Referential integrity to avoid dangling foreign keys?

~~~
rbranson
Open source databases would have never gotten to where they are if there
weren't two top-notch competing projects. Without MySQL, PostgreSQL developers
would have probably not as heavily focused on performance. Without PostgreSQL,
MySQL would have languished in MyISAM or broken InnoDB world forever.

~~~
moe
_MySQL would have languished in MyISAM or broken InnoDB world forever._

Ahem. Not to rain on anyone's parade but in my world InnoDB _is_ still broken.

I maintain quite a few production MySQLs and they blow up with great
regularity.

Usually it hits individual tables, sometimes (gladly more rarely since the 5
series) it suddenly feels insecure about the contents of ibdata altogether.
And no, it's not the hardware.

Anecdote needed?

Just last week we had to drop/re-create an InnoDB-table in one project because
it would not allow to add an index anymore, no matter what we tried. If some
MySQL expert around here can shed some light, I'm all ear, the error message
was:

    
    
       Mysql::Error: Incorrect key file for table 'foo'; try to repair it: CREATE INDEX [...]
    

Google revealed a handful of familiar sounding bug-reports mentioning column-
renames, which the table had indeed undergone, but nothing really helpful, so
we resorted to the drop/create sledgehammer.

Gladly that worked. This time. And every time I wonder why people can't just
use PostgreSQL, where such issues (believe it or not) simply don't exist.

~~~
rbranson
I couldn't agree more with your last statement, but I was trying to sound
unbiased ;)

------
almost
This article makes no sense whatsoever. Redis is an in memory database, that's
what it does. Obviously what this guy wants is something entirely different.
This is like complaining that your apples lack the needed orange colour and
make don't even produce orange juice!

~~~
oomkiller
Redis by itself is an in-memory database. The author is writing about Redis-
vm, which supports paging to disk for datasets larger than ram. Basically all
he's doing is recommending some alternatives to the current approach, other
than writing your own btree implementation.

------
ajessup
Apparently stone tablets make a very effective data store. They are
persistant, replicable, don't have arbitrarily complex index interactions, and
of course don't require the pain of interacting via SQL.

~~~
metageek
And you can stub your toe on them.

------
weixiyen
The point of this article is that Redis cannot be the "one true datastore". By
"one true datastore", he means that Redis is the only data store you need in
the stack, because it has an append mode that persists data in memory to disk.

The author's mistake is that he is setting up a strawman argument. I've yet to
hear Redis claiming to be a end all be all solution for data storage. In fact,
it would make sense that most production environments use Redis as the non-
persistent cache layer backed by an ACID compliant datastore for the time
being.

The author's second point is that Redis fails "at scale in production
environments". I'm curious to see some examples here of production Redis
failing and I'd like to know how those companies were using Redis. Either way,
the scalability debate should wait until 2.2 is released, because 2.2 is
supposed to address primarily multi-server Redis instances.

I think the main problem is just believing that there can be "one true
datastore". Different technologies excel at different things. You pick what
you want your app to do, and then take a look at the tools given to you and
pick the best one for the job.

Redis happens to be a great choice for many things, but ACID compliant storage
is not really one of them at the moment, although I suppose you could set up 2
Redis instances, one for fast in-memory usage and one that syncs to disk on
every command for now.

From my experience, Redis is great for the following:

\- Session storage

\- basic LRU cache

\- Publish / subscribe

\- key value store where you want to manipulate data in the form of common
data structures

\- simple / medium complex queries

Redis is not good for:

\- ACID compliant data storage AND fast in-memory access at the same time (the
holy grail)

\- Complex sql-like queries

Right now, if you try to make Redis persistent in the event of a crash (fsync
constantly), you will get a performance hit because the writes to disk will be
too frequent.

Antirez is trying to solve the problem of persistence while still keeping fast
in-memory access. The author does not feel like Antirez will succeed here, and
became "angry", which I find is a bit unreasonable.

I personally use Redis in production for what it's currently good for. If the
persistence with speed issue is solved, all the better. It's not an all or
nothing type of situation. The worst thing the author could do in this
situation is to pick SQLite as your one and only data store for the entire
stack. There's nothing wrong with having multiple types of datastores. In fact
it is the norm and trying to fight against it at this time is kind of
pointless.

~~~
lvh
Out of interest, if I'm implementing an LRU cache, why would I use Redis
instead of memcached?

~~~
mfenniak
You wouldn't. Redis doesn't support implementing an LRU cache. Redis does
function well as a cache, and is far better than memcached for any data you
don't want randomly disappearing, but it just doesn't do an LRU cache.

Redis will only evict data when it's explicitly removed or expired, and it
will store as much data as you put into it.

memcached will evict the least-recently used data based upon memory pressure,
and will only use as much memory as you configure it to.

They overlap in functionality, but I find they work better in complement to
each other than having one replace the other. Redis for data you want to
persist, possibly with a timeout -- user sessions, for example. memcached for
data you want to cache, as long you have the free memory for it.

~~~
rapind
You can configure a redis DB using the maxmemory flag instead of setting
expiries, so yes it can be used as an LRU cache.

Probably not as fast as memcached, but one situation where it would be
appropriate is if you're already using redis in your stack for something else
(fast writes). Rather than adding another piece of complexity to your stack
you can instead get double use out of redis as a cache as well.

See: <http://antirez.com/post/redis-as-LRU-cache.html>

~~~
mfenniak
My mistake, I was not aware of those configuration options. Thanks for the
correction.

~~~
rapind
Well in your defence, I don't think they're in the current stable release, but
rather the 2.2 branch (RC).

------
nwmcsween
A little offtopic but from my brief understanding of redis is that it manages
it's own memory using LRU? Why not let the operating system handle this? It
seems redundant to handle memory with LRU when Linux, etc use LRU or ARC or
something similar.

