Looking for a technical debate: Why NOT use Redis as a primary DB?

al2o3cr · on July 9, 2021

One word: rollbacks

That's not really much of a "debate", but then again neither is "you totally don't need that feature we don't implement" from the marketing department :P

https://redislabs.com/blog/you-dont-need-transaction-rollbac...

kristoff_it · on July 9, 2021

I left Redis Labs a while ago and have been openly speaking my mind when it comes to the good and bad of my experience at the company.

https://kristoff.it/blog/addio-redis/

I stand by my words and in fact a good chunk of the company was not happy with my reasoning in that post (it's "better" to be coy than upfront about these things, according to some schools of thought), that said: you don't need rollbacks in Redis, whoever argues the opposite has never spent enough time learning how to use it.

node-bayarea · on July 9, 2021

I don't think that's the spirit of the blog post you pointed to. Redis is quite different from traditional databases, so things work differently: some features that are important in an RDBMS might not be that useful in Redis.

I think the blog covers that in two sections: "First reason to use rollbacks: concurrency" and "Second reason to use rollbacks: leveraging index constraints" explain why it's different in Redis.
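For the concurrency case, the usual Redis pattern is optimistic locking with WATCH/MULTI/EXEC: if a watched key changes before EXEC, the queued commands are never applied, so there is nothing to roll back and the client simply retries. The sketch below is a toy in-memory model of that idea, not Redis client code; `ToyStore`, `ToyTransaction` and `increment_with_retry` are hypothetical names made up for illustration.

```python
class ToyStore:
    """In-memory stand-in for a server that runs one command at a time."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of writes seen so far

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1


class ToyTransaction:
    """Models WATCH + queued writes + EXEC as a check-and-set."""

    def __init__(self, store):
        self.store = store
        self.watched = {}  # key -> version observed at watch time
        self.queued = []   # (key, value) writes waiting for execute()

    def watch(self, key):
        self.watched[key] = self.store.versions.get(key, 0)

    def queue_set(self, key, value):
        self.queued.append((key, value))

    def execute(self):
        # Abort before applying anything if a watched key has changed.
        for key, seen in self.watched.items():
            if self.store.versions.get(key, 0) != seen:
                return False
        for key, value in self.queued:
            self.store.set(key, value)
        return True


def increment_with_retry(store, key, interfere=None):
    """Read-modify-write loop; returns the number of attempts needed."""
    attempts = 0
    while True:
        attempts += 1
        tx = ToyTransaction(store)
        tx.watch(key)
        current = store.get(key) or 0
        if interfere is not None and attempts == 1:
            interfere()  # simulate another client writing in between
        tx.queue_set(key, current + 1)
        if tx.execute():
            return attempts
```

If another writer touches the key between the read and `execute()`, the first attempt is discarded as a whole and the loop reads the fresh value on the second pass.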

Going back to the car analogy, if you have an electric vehicle, some things are completely obsolete compared with a gasoline vehicle.

injb · on July 9, 2021

Generally the idea of caches is that they contain a small subset of all your data in a type of storage that's more expensive, but faster. The extra complexity of primary databases is largely down to the fact that they have to use disk storage.

BTW, you're way off the mark with the numbers in your car analogy! I know the precise numbers are not the point, but it's really not 90% vs 25% efficiency. The oft-quoted figure of 25% efficiency refers to thermal efficiency. It's a measure not of how much power is wasted by the car, but how much of the energy in the fuel is turned into useful work. To compare electric cars, you have to consider how the electricity is generated. It's mostly generated by burning fossil fuels. It's more efficient than petrol cars because it's done at such a large scale, but it's not 90%. It's more like 40-50% afaik.

node-bayarea · on July 10, 2021

Hi there, I agree that traditionally (and even now) that's how a cache works. But a system like Redis that provides that service has gotten much more powerful over the years, so you can use it as a traditional DB.

The car efficiency numbers came from Inside EVs as the article points out. I think they did a pretty thorough job. https://insideevs.com/features/392202/ice-vs-ev-inefficient-...

softwaredoug · on July 10, 2021

What you're describing sounds a lot like why NoSQL databases came into being. A cache is usually very fast at a very narrow data problem. Caches are almost always flat bits of data with no structure. They're also eventually consistent. They are purpose-built data structures for a very small piece of the problem done over and over.

An RDBMS is a centralized, consistent source of truth, with more normalized data, that can answer most questions with reasonable performance.

When you remove the RDBMS, what happens?

- You lose the ability to answer ad-hoc relational questions (what stuff does the user own? And in those things, which of them is located in Chicago?). This means building a new feature means building a new "cache" aka a brand new data structure _just for this one use case_ that might be a one-off

- You lose a centralized PoV on your data consistency. One view of the cache says the user's item is in Chicago. Another says it's en route to LA from Chicago. How do you resolve these conflicts? Are you going to build your own consistency systems? Based on what exactly? Now you've essentially built a new kind of distributed database.

- Much of the time we don't need caches. If you always have to bust your cache, the cost of constantly rebuilding a cache, just to throw it away, can greatly exceed any value from that cache. Most people can just read from the RDBMS and get what they need for 90% of the use cases in most apps.
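The first two points can be sketched with plain Python dicts and sets standing in for a key-value store. Answering "what does this user own that is in Chicago?" needs one hand-maintained index per question, and every write has to keep all of those indexes in step. `ItemStore` and its fields are hypothetical names; in Redis the indexes would typically be sets combined with SINTER.

```python
from collections import defaultdict


class ItemStore:
    """Key-value records plus one hand-built index per query we want to ask."""

    def __init__(self):
        self.items = {}                   # item_id -> {"owner": ..., "city": ...}
        self.by_owner = defaultdict(set)  # owner -> item ids
        self.by_city = defaultdict(set)   # city -> item ids

    def put(self, item_id, owner, city):
        # Every write must update every index; forgetting one of them
        # is how two "views" of the same item start to disagree.
        old = self.items.get(item_id)
        if old is not None:
            self.by_owner[old["owner"]].discard(item_id)
            self.by_city[old["city"]].discard(item_id)
        self.items[item_id] = {"owner": owner, "city": city}
        self.by_owner[owner].add(item_id)
        self.by_city[city].add(item_id)

    def owned_in_city(self, owner, city):
        # The "ad-hoc" question only works because both indexes exist.
        owned = self.by_owner.get(owner, set())
        located = self.by_city.get(city, set())
        return sorted(owned & located)
```

A new question, such as "which items changed city this week?", would need another index and another branch in `put`. In SQL the same question is a WHERE clause over existing tables.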

BTW I used Redis as a primary DB for a few years for Quepid. You can read the full story here

https://www.slideshare.net/AllThingsOpen/stop-worrying-love-...

Long story short, it was fine, but Redis didn't allow for much structure. There was a lot of "logical" relational data modeled very awkwardly. It made it hard to extend beyond the original data model.

node-bayarea · on July 9, 2021

https://redislabs.com/blog/dbless-architecture-and-why-its-t...

altdataseller · on July 9, 2021

Redis, at least the early versions (I haven't upgraded since 2018), doesn't deal with bloat very well. If you keep adding and deleting items, the amount of memory it uses could easily be twice what the data would take if you just dumped everything to a file.
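One possible reading of this "bloat" is allocator fragmentation, which Redis exposes through `INFO memory`: `used_memory` is what Redis has allocated, `used_memory_rss` is what the OS has handed to the process, and `mem_fragmentation_ratio` is the second divided by the first. The sketch below parses INFO-style `key:value` text and computes that ratio. The sample values are made up to mirror the "twice" figure above, and `parse_info` and `fragmentation_ratio` are hypothetical helper names.

```python
def parse_info(text: str) -> dict:
    """Parse INFO-style output: 'key:value' lines, '#' section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def fragmentation_ratio(info: dict) -> float:
    """RSS divided by allocated bytes; well above 1.0 suggests fragmentation."""
    return int(info["used_memory_rss"]) / int(info["used_memory"])


# Illustrative numbers only, not output captured from a real server.
SAMPLE = """\
# Memory
used_memory:1000000
used_memory_rss:2000000
"""

ratio = fragmentation_ratio(parse_info(SAMPLE))
```

Redis 4.0 and later also have an `activedefrag` setting aimed at this situation. Whether it helps with a given workload is something to measure, not assume.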

node-bayarea · on July 9, 2021

Do you mean there is some kind of memory leak? Maybe you should try upgrading to the latest Redis. It's a lot more feature-rich and powerful these days.

billconan · on July 9, 2021

can you fit all data in memory?

node-bayarea · on July 9, 2021

Yes, if you look at very large customers of ElastiCache or Redis, or just companies like Twitter, they all use very large Redis clusters to store terabytes of data. You can also use "Redis on Flash" and save a lot of money compared to DynamoDB and others for a similar size of data. https://redislabs.com/redis-enterprise/technology/redis-on-f...

billconan · on July 9, 2021

how to handle node failure?

what about data persistence?

data integrity without ACID?

complex queries?

taf2 · on July 9, 2021

Don’t use redis for those requirements

For failure use sentinel or cluster - works great
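With Redis Cluster, failover happens per shard, and which shard holds a key is decided by its hash slot. Per the Redis Cluster specification, the slot is CRC16 (the XMODEM variant) of the key modulo 16384, and a non-empty `{...}` hash tag restricts hashing to the text inside the braces. A minimal sketch of that calculation follows; the function names are illustrative and not taken from any client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed (hash tag rule)."""
    start = key.find(b"{")
    if start == -1:
        return key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:  # no closing brace, or empty "{}"
        return key
    return key[start + 1:end]


def key_slot(key: str) -> int:
    """Hash slot (0..16383) for a key."""
    return crc16_xmodem(hash_tag(key.encode())) % 16384


# key_slot("foo") -> 12182
# Keys sharing a hash tag land in the same slot, so multi-key
# operations on them stay on one shard:
same = key_slot("{user1000}.following") == key_slot("{user1000}.followers")
```

Keys in different slots can live on different nodes, which is one reason multi-key transactions and ad-hoc queries get harder once the data is sharded.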

yuppie_scum · on July 9, 2021

You probably want Clickhouse