Rearchitecting: Redis to SQLite (wafris.org)
360 points by thecodemonkey 15 days ago | 121 comments



I’m really interested in this model where each application server has a copy of a SQLite database file which is then replaced on a scheduled basis.

Here it’s being used for web application firewall rules.

Another place I’ve thought about using this is feature flag configuration. Feature flags can be checked dozens of times per request and often need the kind of queries (user is a member of group A and has an IP located in country B) which could be well served by a local SQLite - and feature flags have a tolerance for updates taking a few seconds (or longer) to roll out.
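
For illustration, a minimal sketch of what a local-SQLite flag check could look like (the schema, table, and column names are all invented, not any real flag system):

```
import sqlite3

# Hypothetical schema: which flags are enabled for which group/country combination.
conn = sqlite3.connect("flags.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS flag_rules (flag TEXT, grp TEXT, country TEXT);
    CREATE INDEX IF NOT EXISTS idx_flag_rules ON flag_rules (flag, grp, country);
""")

def flag_enabled(flag, group, country):
    # One indexed lookup against a local file -- no network hop per check.
    row = conn.execute(
        "SELECT 1 FROM flag_rules WHERE flag = ? AND grp = ? AND country = ? LIMIT 1",
        (flag, group, country),
    ).fetchone()
    return row is not None
```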


> I’m really interested in this model where each application server has a copy of a SQLite database file which is then replaced on a scheduled basis.

BTW, this is also the model used by all CDNs, where the global configuration file containing the certificates, HTTP routing rules, etc. for all customers is updated into a single-file b-tree structure*, and that "bundle" is distributed among all edge locations frequently.

* I have yet to see someone use SQLite for this purpose; it's usually DBM-style databases like LMDB or Kyoto Cabinet.


> Kyoto Cabinet

Now, that's a name I've not heard in a long time.

Are people still using Kyoto Cabinet in new projects? Are people still using DBM-style storage generally? I thought that whole branch of the evolutionary tree had sort of died out.


> Are people still using Kyoto Cabinet in new projects?

Cloudflare used to use Kyoto Cabinet[1] and moved to LMDB[2] in 2020; other implementations that I'm familiar with (but don't have a link to share) also use LMDB.

> Are people still using DBM-style storage generally?

It's fairly common in these scenarios, as well as the underlying key-value store for popular software like Consul[3].

[1] https://blog.cloudflare.com/kyoto-tycoon-secure-replication/

[2] https://blog.cloudflare.com/introducing-quicksilver-configur...

[3] https://github.com/hashicorp/consul/issues/8442


Cloudflare moved from LMDB to RocksDB for production.

https://blog.cloudflare.com/moving-quicksilver-into-producti...


Yeah, that plus ZeroMQ. Very curious. I always thought Tokyo/Kyoto Cabinet and similar would become more popular with the adoption of ORMs.


I’ve worked on a project a long time ago where we did this with BerkeleyDB files. BDB was used to store configuration data that was frequently looked up.

Periodically we would run a full sync to replace the database. Between the periodic full syncs, we had a background process keep changes applied on a rolling basis.

All-in-all, it worked pretty well at the time! The full database file sync ensured a bad database was timeboxed and we got a bootstrapping mechanism for free.


The query engine in SQLite can be weak, in particular where JOINs across multiple columns are concerned. You really do need to be aware of this when designing indexes for SQLite files.

In any case, SQLite would serve this solution, but almost certainly with a measurable level of inefficiency built in.


This is the first I've heard of SQLite JOIN performance being "weak". I just spent 10 minutes scouring the annals of the web and didn't turn up anything relevant. Are there any additional links or other information you can share about this claim?

Edit: @akira2501: SQLite comes with different tradeoffs, sometimes superior and other times less so, depending on the use case. Blanket statements without evidence are unproductive in progressing the conversation.


Weeellll... I think it's safe to say SQLite's planner is simple, and trends towards "you get what you wrote" rather than having more sophisticated runtime monitoring and adjusting.

But as with all things SQLite, the answer is generally "it's easy to predict so just improve your query" and/or "there's a plugin for that" so it tends to win in the end with a bit of care.


https://www.sqlite.org/queryplanner-ng.html

SQLite's planner is anything but simple. It runs a beam search over different plans; basically, it approximates the full NP-hard plan search using bounded space and time.


Seriously?

https://www.sqlite.org/optoverview.html

Chapter 2. Chapter 7. Chapter 8.

It should be _zero_ surprise to you that SQLite is not as powerful as other full database query engines.


I don’t understand how reading that documentation page makes it clear that other relational engines have better performance for joins.


I'm with you. I've encountered numerous situations where the query planner caused issues. I don't know if the alternatives are better, but it's definitely an issue. I've written multiple databases wrapping SQLite, and multiple bindings to SQLite. There are dragons.


I had to use SQLite once for something that would do a ton of read queries on a particular table. Sometimes just swapping the order of operands to an AND operation generated a completely different query plan. I had to scour their optimizer document and painstakingly check the query plans of, and then manually tune, each SQL statement.

But then it was quite fast.
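
For anyone who wants to reproduce that tuning loop: SQLite will show you the plan it picked, so you can catch these cases explicitly. A tiny self-contained example (the table and index are invented):

```
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (ip TEXT, path TEXT, ts INTEGER)")
conn.execute("CREATE INDEX idx_hits_ip_path ON hits (ip, path)")

# Print the chosen plan; rewriting predicates can change which index
# (if any) SQLite decides to use, so check this for hot queries.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM hits WHERE path = ? AND ip = ?",
    ("/login", "203.0.113.7"),
).fetchall()
for row in plan:
    print(row)
```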


> will be updated into a single-file b-tree structure

I'm not knowledgeable on this, but my understanding was that a b-tree is a way of organizing values that can be ordered. For example, this would be a b-tree of IDs:

```
            [8]
           /   \
      [3, 5]   [10, 12]
     / | \     / | \
  [1] [4] [6,7] [9] [11, 13]
```

You traverse by comparing your needle to the root node and going left or right depending on the results.

How is that done with configuration options? That seems like it would just be a regular hashmap which is already efficient to read. What would a b-tree of key/values even look like that wouldn't be less efficient than a hashmap?


Each number in your btree would actually be a key-value pair. So you can find the key fast, and then you have the value.

Databases, including SQLite, usually use a B+tree for tables (a variant where only the leaves hold data; the interior nodes only have keys) and regular B-trees for indexes.


A hash table makes sense in memory. If it's loaded just right for fast access, it has holes - empty entries. That makes little sense if you are building a file that will be transferred to many places over the internet. The bandwidth waste would be significant.

So it might seem that simply enumerating the data (sorted or not) would be a better option for a file. (After all, the receiver will read everything anyway.) I guess frequent updates make this inefficient, so a tree helps.



We used this model to distribute translations, feature flags, configuration, search indexes, etc. at Airbnb. But instead of SQLite we used Sparkey, a KV file format developed by Spotify. In the early years there was a cron job on every box that pulled that service's thingies; then once we switched to Kubernetes we used a daemonset & host tagging (taints?) to pull a variety of thingies to each host and then ensure the services that use the thingies only ran on the hosts that had the thingies.

In Ruby we called this “hammerspace” https://github.com/airbnb/hammerspace


In Kubernetes, pod affinities (not taints) are the typical and preferred mechanism used to ensure pods are scheduled on the same node as their dependencies.

1. Very cool

2. If you were making it today would you just use SQLite?


Yes, we would. And likely will switch to SQLite at some point in the future. (source: I work adjacent to these systems at Airbnb)


> Feature flags can be checked dozens of times per request

My strategy for resolving this is to fetch the flag value once, but to store it in the request object, so that a) you never have to take the expensive lookup hit more than once per request, and b) there's no risk of an inconsistent value if the flag is updated mid-request.
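
Roughly, in illustrative Python rather than any particular framework (`flag_store` here is a stand-in for whatever actually resolves flags):

```
class Request:
    def __init__(self, user_id, flag_store):
        self.user_id = user_id
        self._flag_store = flag_store   # whatever actually resolves flags
        self._flag_cache = {}

    def flag(self, name):
        # Pay the lookup cost at most once per request; the cached value also
        # stays consistent if the flag is flipped while the request is in flight.
        if name not in self._flag_cache:
            self._flag_cache[name] = self._flag_store.enabled(name, self.user_id)
        return self._flag_cache[name]
```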


Where is the “session object” stored?


Apologies, I meant "request object". Corrected above.


What’s the use case for re-checking the same feature flag in a single session?

I can see why you need to check multiple different flags in a session and I understand the parent point about looking in SQLite for them (effectively a function call into a library in process address space rather than a call over the network for each flag).


Sorry, s/session/request/g; corrected above.

One example is a multistep transaction processing request. The feature flag could gate several branch points.

A memory-mapped SQLite file is great too, but the strategy I describe above is less code to write, adds no new dependencies, is quicker to implement, avoids the SQLite file distribution/availability issues, and should get you a very similar performance improvement.


Having now spent enough years thinking about feature flags during an extended migration (from our own internal system to LaunchDarkly), I've become convinced that a really useful primitive would be:

* SQLite, but every change you make increments a global version counter.

* Some way to retrieve and/or apply just the changes between version counter numbers.

Then, you could layer all manner of distribution mechanisms on top of that. You could imagine gossip networks, an explicit tree-shaped cache/distribution network, etc. Anyone who has a later version than you would be able to give you the update.

What would this get you?

* You can bundle the current version of a DB into your app artifact, but efficiently update it once your app runs. This would let you fall back to much better defaults than having no data at all. (eg. For feature flags or live config, this would let you fall back to a recent state rather than code-level defaults.)

* Any kind of client can send the global counter as an ETAG and get just the delta

* Reconnections if the network blips are really simple and cheap

* If the system also let you keep a couple of minutes of history, you could annotate the counter onto a microservice call (using W3C Baggage headers, for example), and evaluate your flags/config/data/whatever at a single version number across multiple microservices. Even without evaluate-at-time functionality, logging the generation number would help a lot when debugging what happened after the fact
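
A rough sketch of the primitive described above using plain SQLite, assuming a changelog table plus a single counter row (all names invented):

```
import sqlite3

conn = sqlite3.connect("config.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS meta (k TEXT PRIMARY KEY, v INTEGER);
    INSERT OR IGNORE INTO meta VALUES ('version', 0);
    CREATE TABLE IF NOT EXISTS changes (
        version INTEGER PRIMARY KEY, tbl TEXT, pk TEXT, op TEXT, payload TEXT);
""")

def record_change(tbl, pk, op, payload):
    # Every write bumps the global counter and appends a delta row.
    with conn:
        conn.execute("UPDATE meta SET v = v + 1 WHERE k = 'version'")
        v = conn.execute("SELECT v FROM meta WHERE k = 'version'").fetchone()[0]
        conn.execute("INSERT INTO changes VALUES (?, ?, ?, ?, ?)",
                     (v, tbl, pk, op, payload))

def changes_since(version):
    # What a client would request, sending its own counter as an ETag.
    return conn.execute(
        "SELECT * FROM changes WHERE version > ? ORDER BY version", (version,)
    ).fetchall()
```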


Really neat ideas. Has something like this been implemented anywhere?

SQLite for distribution is neat. FWIW - this is at least partially inspired by your Datasette project, which we may still try and do something with later on the reporting and data exploration side of things.


Years ago I had a conversation with a friend of a friend who worked at one of the big chip fabs in their lab dept. He mentioned they made very heavy use of SQLite as a file format for distribution. This was back in the "xml all the things" era and it struck me as such a smart breath of fresh air.

I'm honestly surprised it isn't more pervasive.


I've wanted to implement this on a distributed web server environment I manage. Right now there's a centralized MySQL database that the web servers read from when rendering a web page, but there can be lots of queries for a single render (page, sections, snippets, attributes, assets, etc.), and sending that all over the wire, while fast, is slower than reading from a database running on the same host. It'd be great to be able to copy the "master" database onto each web server instance, maybe once per minute, or just on-demand when a change to the data is made. I imagine this would make reads much faster.


That's how https://github.com/backtrace-labs/verneuil 's read replication is meant to be used. There's a command-line tool to recreate a sqlite DB file from a snapshot's manifest, with an optional local cache to avoid fetching unchanged pages, or you can directly use a replica in memory, with pragmas for (async) refreshes.

The write tracking needs to intercept all writes with a custom VFS, but once registered and configured, it's regular in-memory SQLite (no additional daemon).


rqlite[1] could basically do this, if you use read-only nodes[2]. But it's not quite a drop-in replacement for SQLite on the write side. From the point of view of clients at the edge, though, they see a SQLite database being updated which they can directly read[3].

That said, it may not be practical to have hundreds of read-only nodes, but for moderate-size needs, should work fine.

Disclaimer: I'm the creator of rqlite.

[1] https://rqlite.io/

[2] https://rqlite.io/docs/clustering/read-only-nodes/

[3] https://rqlite.io/docs/guides/direct-access/


(author here) I had looked at both Rqlite and the different commercial versions of this, but I didn't pursue them as they all seemed to require running an additional service on the host machines.


Yes, that is right, it would require a new service running on the host machines.

That said, I do think it depends on what you consider important, and what your experience has been in the past. I used to value simplicity above all, so reducing the number of moving pieces was important to my designs. For the purpose of this discussion let's count a service as a single moving piece.

But over time I've decided that I also value reliability. Operators don't necessarily want simplicity. What they want is reliability and ease-of-use. Simplicity sometimes helps you get there, but not always.

So, yes, rqlite means another service. But I put a lot of emphasis on reliability when it comes to rqlite, and ease-of-operation. Because often when folks want something "simple" what they really want is "something that just works, works really well, and which I don't have to think about". SQLite certainly meets that requirement, that is true.


rqlite is a great project! Sometimes I wonder how hard it would be to embed it into a web server process, like nginx or Apache, as a module.


What would be the challenge with hundreds of read nodes?


You may want to check out LiteFS and LiteStream by benbjohnson. There was a time in 2022 where he was trending every week for his projects. The following snippet is taken from the LiteFS webpage. "LiteFS is a distributed file system that transparently replicates SQLite databases. You can run your application like it’s running against a local on-disk SQLite database but behind the scenes the database is replicated to all the nodes in your cluster."


Why SQLite and not something like RocksDB, which is optimized for NVMe?

While SQLite can be used on SSDs, it is not as heavily optimized for SSDs as RocksDB. SQLite is a general-purpose embedded database and was primarily designed for use on devices like mobile phones or lightweight embedded systems, where storage is often slower and smaller (e.g., flash storage or basic SSDs).

SQLite’s traditional B-tree structure involves more random I/O compared to RocksDB's LSM tree, which is less ideal for maximizing the performance of high-speed SSDs like NVMe.

SQLite’s limited concurrency (single-writer, multiple-readers) also means it cannot fully utilize the parallelism and high throughput potential of NVMe SSDs.


This is a high read, low write application. And SSDs offer very fast random I/O performance. SQLite is more than sufficient and the entire database may well fit into the page cache anyway.

We use something similar for our event-based databases. We project millions of events into an in-memory object (usually a dictionary), and periodically persist that state as JSON in S3. It's guaranteed to be consistent across all service instances because the persisted state contains the event stream positions to catch up from. The only drawback, of course, is that it can use up to several GB of memory depending on how big the state is, but it's extremely fast, and doing in-memory operations is trivial compared to using SQL or a library with API calls.
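
Greatly simplified, the pattern looks something like this (names are illustrative; the real system writes the snapshot to S3):

```
import json

class Projection:
    def __init__(self):
        self.state = {}      # the in-memory read model
        self.position = 0    # last event stream position applied

    def apply(self, event):
        self.state[event["key"]] = event["value"]
        self.position = event["position"]

    def snapshot(self):
        # Persisted periodically; a new instance loads this blob and catches up
        # from `position` instead of replaying the whole stream.
        return json.dumps({"position": self.position, "state": self.state})
```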


In a past life, I used this pattern in hadoop mapreduce clusters. A job would do "small-data" sql queries locally to pull configs, maps of facts related to the business domain, then pack them into sqlite db files and ship them up with the job. Hadoop already has a process called "job localization" where it can download files sent up with the job down to the PWD of the mapper/reducer. And then the mapper/reducer can use it read-only while doing big-data things.


> a SQLite database file which is then replaced on a scheduled basis.

You could look into WAL replication if you wanted an efficient way to update the copies. Something like Litestream.


litestream is great. Dead simple and rock solid IME.


I’ve utilized this in Lambda@Edge for a use case that's half feature-flag-y, half HTTP routing/proxy serving rules, as mentioned in a sibling comment. Lambdas pick up ~50-200MB of data on first boot, and keep their copy through their lifecycle.

As requests come in, we gather their features, convert them to what is effectively an int vector, filter for the matching row, and serve the request.


This is the type of architecture we use for feature flagging, but it's just a JSON file.


> Feature flags can be checked dozens of times per request and often need the kind of queries (user is a member of group A and has an IP located in country B) which could be well served by a local SQLite - and feature flags have a tolerance for updates taking a few seconds (or longer) to roll out.

This doesn't sound right. A feature flag only requires checking if a request comes from a user that is in a specific feature group. This is a single key:value check.

The business logic lies in assigning a user to a specific feature group, which in the simplest case means pre-assigning the user, and in the most complex cases takes place at app start/first request to dynamically control dialups.

Either way, it's a single key: value check where the key is user ID+feature ID, or session ID + feature ID.

I mean, I guess you can send a boatload of data to perform the same complex query over and over again. I suppose. But that requires not having invested any thought in the issue and insisting on making things very hard for you, QA, and users too. I mean, read your own description: why are you making the exact same complex query over and over and over again, multiple times in the same request? At most, do it once, cache the result, and from there on just do a key:value check. You can use SQLite for that if you'd like.


I've worked at places where the feature flag system was much more dynamic than that, considering way more than just membership in a group.

This meant you could roll features out to:

- Specific user IDs

- Every user ID in a specific group

- Every object owned by a specific user ID (feature flags might apply to nested objects in the system)

- Requests from IP addresses in certain countries

- Requests served by specific website TLDs

- Users who are paid members of a specific plan

- etc etc etc

It was an enormously complicated system that had evolved over 5-10 years.

Not saying that level of complexity is recommended, but that's what we had.

Looks like I gave a talk about this back in 2014: https://speakerdeck.com/simon/feature-flags


> This meant you could roll features out to:

That's how feature flag services work.

The whole point is that this is not a querying problem. That's a group assignment problem.

Users/sessions are assigned to a feature group according to business rules. These can be as complex as anyone wants them to be.

Once a user/session is assigned to a feature group, the problem of getting features is a key-value query.


GP's comment is talking about checking multiple feature flags, not checking a single feature flag multiple times.


GP referred specifically to queries checking if "user is a member of group A and has an IP located in country B".

The number of feature flags is irrelevant. In fact, the feature flag and A/B testing services I used always returned all default treatment overrides in a single request.


Apart from network latency, one of the behaviours I've seen with Redis is that read/write latencies are fairly linearly proportional to the number of keys queried - which seems to be shown in your chart as well.

We had a different problem, where our monolithic app used both Postgres / Redis for different use cases and worked relatively well. However - it was a lot easier to shove new functionality in the shared Redis cluster. Because Redis is single-threaded, one inconsiderate feature that does bulk reads (100K+ keys) may start to slow down other things. One of the guidelines I proposed was that Redis is really good when we're reading/writing a key, or small fixed-cardinality set of keys at a time, because we have a lot of random things using Redis (things like locks and rate limits on popular endpoints, etc).

However, in your case, I'm guessing Redis shines in the case of a naive single-key (IP address) lookup, but also doesn't do well with more complicated reads (your range query representation?). Cool write-up overall. I don't have a deep enough understanding of how SQLite performs so well when compared to a local Redis instance, so that was unexpected and interesting to observe.


My experience with Redis is similar, where it often becomes a trap because people misunderstand its strengths and weaknesses.

I think it's best to consider Redis a cache with richer primitives. It excels at this and used appropriately will be both fast and solid.

But then people start wanting to use it for things that don't fit into the primary rdbms. Soon you have a job queue, locks of various sorts, etc. And then it just becomes a matter of time until performance crosses a cliff, or the thing falls down for some other reason, and you're left with a pretty ugly mess to restore things, usually resulting in just accepting some data loss.

It takes some discipline to avoid this, because it happens easily by increments.

As for SQLite's performance, besides avoiding network overhead, a lot of people underestimate serialization and deserialization costs. Even though Redis uses a pretty minimalist protocol it adds up. With SQLite a lot of things boil down to an in process memcopy.


Somewhat related: for the Neon internal hackathon a few weeks ago I wrote a little Node.js server that turns Redis's wire protocol (RESP) into Postgres queries. Very fun hack project: https://github.com/btholt/redis-to-postgres


It sounds like a niche use case where SQLite does work quite well server-side without needing any replication, since the database is read-only.

Other alternatives may use static files loaded in-memory, but I'm guessing the data is more than you'd want to keep in memory in this case, making SQLite a nice alternative.


(article author here) - yes, 100%, and I hope it came through in the article that this is a great solution given our particular use case and that it's not a 1:1 swap-out of Redis or Postgres.


> Other alternatives may use static files loaded in-memory, but I'm guessing the data is more than you'd want to keep in memory in this case, making SQLite a nice alternative.

Ultimately an RDBMS like SQLite is what you'd get if you started by loading static files into memory and from that point onward added the necessary and sufficient features to get it to work for the most common use cases. Except it's rock solid, very performant, and exceptionally well tested.


> Further, when we exhibited at RailsWorld 2023, there was a definite "blood in the water" vibe regarding Redis and the assumption that you'd automatically need a Redis server running alongside your Rails application.

I've only worked on one production Rails application in my career (and it did use Redis!), so I'm way out of the loop – is the ecosystem turning against Redis from a business perspective (I know there have been some license changes), or is it a YAGNI situation, or something else?

IIRC we used it mainly with Resque to schedule asynchronous jobs like indexing, transcoding, etc., but it seemed like a neat tool at the time.


It's a little YAGNI - I think the biggest driver of Redis in the community was exactly what you described, async jobs, and the tool most folks reached for was Sidekiq.

The 2024 Rails community survey just came out and Redis is still listed as the top datastore that people use in their apps.

FWIW - we found that while many folks are _using_ Redis in their apps, they're just using it for things like Sidekiq and not actually taking advantage of it for holding things like real-time leaderboards, vector db functions, etc., so the actual usage is a little fuzzy.


I think it's purely a simplicity thing.

Right now, most Rails setups with decent traffic will have frontend boxes, a SQL db, a KV store (Redis or memcached), and a cache store pointed at the KV store, with, annoyingly, very different usage patterns than typical KV store usage, e.g. for maintaining API quotas or rate limiting.

Disk performance has gotten fast enough and SQL performance has gotten good enough that there's a movement to drop the KV store and split its usages (traditional KV use and backing a cache) to the SQL db and disk, respectively. Plus new NVMe disks are almost as fast and still much cheaper than RAM, so you can cache more.


I’ve found it useful in the past as basically very-smart (i.e. stuff like expiration built-in) shared memory. Potentially with clustering (so, shared across multiple machines).

In the era of k8s, and redis-as-a-service, though? It’s gonna be “shared memory” on another VM on another rack. At that point, just read & write a damn file off S3, you’ve already abandoned all hope of efficient use of resources.


More people should be using RocksDB, as SSDs these days are fast and you can have a much larger KV store running off of one.


>While Redis is "fast" in comparison to traditional RDBMS, it's still a database that you have to manage connections, memory, processes, etc., which introduces more brittleness into the stack (the opposite of what we're trying to achieve).

Every database, Relational or Nonrelational, requires approximately the same level of management and maintenance when you start dealing with non-toy levels of transactions.

The "Fast" part is a little funny. If you don't care about joins, then row inserts and retrievals are pretty damn fast too =)


> The "Fast" part is a little funny. If you don't care about joins, then row inserts and retrievals are pretty damn fast too =)

What makes SQLite exceptionally fast in a server environment is that you do not require a network call to do the query or even retrieve the data. Your remarks about joins and transactions are meaningless once you understand you're just reading stuff from your very own local HD, which is already orders of magnitude faster.


You can connect to Redis over a local socket, no network overhead.

Modern cloud architecture hates local sockets and wants to push everything to the network, but if you care about performance and are using Redis, that’s how you’d favor deploying & connecting to it.


> You can connect to Redis over a local socket, no network overhead.

I think at some point we need to stop and think about the problem, and if an idea makes sense.

Even Redis has a disclaimer in their docs on how SQLite is faster due to the way it does not require network calls. The docs also explain how it's an apples-to-oranges comparison.

https://redis.io/docs/latest/operate/oss_and_stack/managemen...


(Nit: redis incurs at least IPC overhead, not necessarily network, which would be a ton worse—but if you’re not using Redis for IPC then I don’t know why you’re using it, so IPC overhead goes with the territory)

Redis is best for “smart” shared-memory-type resources. Queues that manage themselves, cache or locks or sessions with built-in expiration, that kind of thing. Shared smart logic for heterogenous same-host clients via extensions.

Remote Redis has never made much sense to me. Clustered-mode, sure, potentially, but with remote clients as the primary method of connection? No. It’s a really good (excellent, even) product but its sweet-spot is pretty narrow (even if the need is common) and basically nonexistent in now-typical container-heavy no-two-resources-are-required-to-be-on-the-same-actual-hardware situations.


SQLite has its vacuum operation, which is kind-of like running a garbage collection. Every time I read the docs about when to run a vacuum, I end up confused.

The last time I shipped an application on SQLite, I ended up just using a counter and vacuuming after a large number of write operations.


HashBackup author here, been using SQLite for about 15 years.

Doing a vacuum after a large number of deletes might make sense. The only real purpose of vacuum IMO is to recover free space from a database. Vacuum may also optimize certain access patterns for a short while, though I have never tested this, and it would be highly dependent on the queries used. If fragmentation is a bigger concern for you than recovering free space, you can also compute the fragmentation to decide whether to vacuum by using the dbstat table:

https://www.sqlite.org/dbstat.html

Then again, computing this will require accessing most of the database pages I'm guessing, so might take nearly as long as a vacuum. The other gotcha here is that just because db pages appear to be sequential in a file doesn't mean they are sequential on a physical drive, though filesystems do strive for that.

SQLite has pragma commands to tell you the number of total and free db pages. When the percentage of free pages is greater than x% and it's a convenient time, do a vacuum. For a highly volatile db, you can add a table containing this percentage, update it every day, and make your decision based on an average, but IMO it's easier just to check for more than 50% free (or whatever) and do the vacuum.
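
As a minimal sketch of that check using those pragmas (the 50% threshold here is arbitrary):

```
import sqlite3

def maybe_vacuum(path, threshold=0.5):
    conn = sqlite3.connect(path)
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    # Only reclaim space when a meaningful share of pages is sitting unused.
    if total and free / total > threshold:
        conn.execute("VACUUM")
    conn.close()
```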

Vacuums used to be (circa 2019) pretty slow operations, but the SQLite team has sped them up greatly since then. Vacuuming a 3GB SQLite db on a SSD takes less than a minute these days. That's with the db 100% full; with only 50% used pages, it would be considerably faster.

Vacuums are done in a statement transaction, so you don't have to worry about a "half vacuum that runs out of disk space" screwing up your database.


> Every database, Relational or Nonrelational, requires approximately the same level of management and maintenance

I disagree with this statement. Surely there is a difference in the effort for "management and maintenance" if your database requires the operation of independent server processes.

Going to extreme examples, do you really believe it makes no difference whether you use SQLite or Oracle?


People reading this might be interested in Redka - Redis re-implemented with SQLite in Go: https://github.com/nalgeon/redka


Was interested and considering switching until I saw this part:

>According to the benchmarks, Redka is several times slower than Redis.

Still a cool project, don't get me wrong. But this kind of doesn't give me any incentive to switch.


We (KeyDB users; it's much faster than Redis for all our cases) use Redka for our dev machines; we develop everything on SQLite so there is nothing to install, and in prod we just switch to our MySQL, ClickHouse, Redis, etc. cluster and it all works, while keeping the dev experience light.


How are you guys using sqlite in dev instead of clickhouse? (Afaik there's a good bit of difference between the two dialects so I'm surprised it's possible without hurting dx through one compromise or another)


We have our own query language based on Prolog which compiles to efficient queries depending on the underlying db. We haven't caught any cases in about half a decade where humans could do better queries. We are in a niche market, so this is not a catch-all solution; it is specifically for our market.


And I assume you have no intention of open-sourcing that query language and its compiler? Because it sounds pretty awesome...


The slowness would be inevitable because the architecture combines the weak point of Redis (network stack) with the weak point of sqlite (disk access).

It abandons Redis' in-memory data and sqlite's in-process speed... for what?


Holy cow this is amazing, I've been looking for something like this for years!!

Thanks for sharing.


Could you clarify the Redis usage, please?

v1:

1) In v1 they had the WAF and Redis on the same server

2) The client went to the admin panel to set new rules

3) The rules went to the Redis that is on the same server as the admin panel

4) Thanks to Redis's internal synchronization mechanism, the rules were propagated to all of the Redises (which sit locally with the WAFs all over the globe)

5) When a new request came to some WAF, the WAF verified the request/IP against the updated Redis rules

Do I understand v1 correctly? The Redis infrastructure was used to spread the new rules by itself?

v2:

1) They deleted the Redis cluster

2) Every WAF server now has a SQLite db

3) They made some synchronization mechanism to spread new rules from the admin panel to every server that contains the WAF and SQLite

4) When a new request comes to some WAF, the WAF verifies the request/IP against the updated SQLite rules. And that is very fast!

Is that the case?


Best quote:

"SQLite does not compete with client/server databases. SQLite competes with fopen()."


The dataset is 1.2 million entries, which looks big, but really is not that much.

If this is uncompressed IPv4 addresses, it's just 4.8 MB; and with some trivial compression (like a 2-level trie), it could be about 2x smaller. Even if it's uncompressed IPv6, that's still just 32 megabytes.

Does Ruby support mmap? If yes, I'd suggest a direct IP list. Lots of fun to write, a big speedup over SQLite, and zero startup time.
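
For illustration, the kind of direct lookup this suggests: a sorted array of range starts plus a binary search (the ranges here are made up, IPv4 only for brevity):

```
import bisect
import ipaddress

# Blocklist as sorted (start, end) integer pairs.
ranges = sorted(
    (int(ipaddress.ip_address(lo)), int(ipaddress.ip_address(hi)))
    for lo, hi in [("10.0.0.0", "10.255.255.255"), ("203.0.113.0", "203.0.113.255")]
)
starts = [lo for lo, _ in ranges]

def blocked(ip):
    n = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(starts, n) - 1   # nearest range starting at or below n
    return i >= 0 and n <= ranges[i][1]

print(blocked("203.0.113.7"))  # True
print(blocked("8.8.8.8"))      # False
```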


It's posts like this explaining architecture decisions in detail I am reading HN for. Thank you!


(author) - It's genuinely delightful to know that you liked it.


Might want to check into this to do your SQLite db copies:

https://www.sqlite.org/draft/rsync.html


That's interesting, but it probably wouldn't work for our use case as we'd need to ship that binary utility to the platforms (unless I'm missing something).


> Benchmarking is a dark art of deceiving yourself with highly precise numbers

.


I have a hard time believing that Redis local was beat by SQLite local unless the workload was poorly fit for Redis structures, or the integration code wasn't well written.

But always happy to see a discovery of a better solution. I agree removing the network is a win.


I do agree it's somewhat fishy that such a large performance difference isn't explained by comparatively, fundamentally poor data access patterns.

However, Redis runs as an out-of-process server with marshaling and unmarshaling of data across sockets. SQLite is in-process, and with a prepared query it is basically one library call to a purpose-built data access VM. So I’m not sure why it would be hard to believe this cache- and TLB-friendly setup can beat Redis.


Thank you. The first explanation of what might be the root cause :)


In Redis, the data is a sorted set that we forced into being lexicographically ordered by setting all the scores to 0. We went through a lot of iterations of it, and to be clear, it's not _slow_, it's just not as fast as essentially `fopen`.

1 - Redis sorted sets - https://redis.io/docs/latest/develop/data-types/sorted-sets/
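
For readers unfamiliar with the trick: when every member shares the same score, Redis orders the set lexicographically, so ZRANGEBYLEX/ZREVRANGEBYLEX become range scans over the member strings. A rough sketch with redis-py, assuming a local Redis; the member encoding here is invented, not Wafris's actual format:

```
import redis

r = redis.Redis()

# All scores 0, so members sort lexicographically. Each member encodes a
# zero-padded range start, range end, and an action.
r.zadd("ip_rules", {"0167772160:0184549375:block": 0})  # 10.0.0.0 - 10.255.255.255

def lookup(ip_as_padded_decimal):
    # Walk backwards to the nearest rule whose range start <= the request IP;
    # the caller then checks whether the range end still covers it.
    # ';' sorts just after the ':' delimiter, so exact-start matches are included.
    hits = r.zrevrangebylex("ip_rules", f"[{ip_as_padded_decimal};", "-", start=0, num=1)
    return hits[0].decode() if hits else None

print(lookup("0167772161"))  # the 10.0.0.0/8 rule above
```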


Just look at their use case. They have to store a large amount of logs of website visitors or bots. Of course Redis is a very bad choice, because most servers don't have that much memory. It's really useless to store logs in RAM.


Nice post! I’m curious how the SQLite-per-instance model works for rate-limiting in the scale-out scenario. I took a cursory glance at the docs but nothing jumped out at me about how it works.


Post author and Wafris co-founder here. Conceptually, "rate limiting to prevent abuse" (what we're doing here) and "rate limiting for API throttling" have different levels of tolerance.

With that in mind, it's setting higher levels of limiting and doing the math to push that out over many machines/instances/dynos. That helps for things like scraping prevention, etc.

For issues like credential stuffing attacks, you'd want a lower limit, but also coupled with mitigations like IP bans, IP reputation, etc. to deal with the underlying issue.


I really wish there were a compatibility layer that could sit on top of SQLite and make it pretend to be Redis, so we could switch more things to use SQLite. It doesn't even need to satisfy all the distributed-systems guarantees or even implement proper pub/sub; it could just do everything with polling and a single event loop. It would be great for smaller projects that want to run something like Celery or any app that depends on Redis without needing to install Redis.


You can use smoothmq (SQS over sqlite) for a celery backend: https://smoothmq.com/



Wow amazing, thank you so much! I've spent many hours over the years looking for a project like this, but it makes sense that I haven't seen this yet as it's only 6mo old.


If you need writes, you can just use a second SQLite database.


It would be a small command log (a batch of requested changes from that client) with a completely different schema from the main database.

But if we are sending deltas to a central server, performance isn't critical: there can be a traditional web service to call, without uploading databases.


NICE!

I have not used Redis myself, but I have been using SQLite more and more over the years, and found it a perfect fit for an application I wrote with SQLite under the hood.

Powerful and convenient database system!


How large is the SQLite database you're syncing?

Is it even "worth" using SQLite at this point? What about a configuration file, and straight-up code that works with in-memory data structures?


This is something we seriously considered. The SQLite dbs are several hundred megabytes in size (millions of IP ranges), so while it would be technically doable to send around rules files as JSON or something more specifically suited, there are still a number of wins that SQLite gives us:

- Really strong support across multiple platforms (we have clients for most of the major web frameworks)

- Efficiency: sure, we have lots of RAM on servers nowadays, but on some platforms it's constrained, and if you don't have to burn it, we'd just rather not.

- When we started mapping this out, we ended up with something that looked like a JSON format that we were adding indexes to... and then we were reinventing SQLite.


I don't know how it works exactly, but I believe you can have a fully in-memory SQLite database. Bun's sqlite library and SQLAlchemy both let you operate on in-memory SQLite DBs which you can then write to disk.

Edit: reading the docs it looks like it operates the same way, just reading sections of the db from memory instead of disk

https://www.sqlite.org/atomiccommit.html
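
In Python's standard library the same idea looks like this: `:memory:` gives a purely in-memory database, and `Connection.backup` copies it to a file:

```
import sqlite3

mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
mem.execute("INSERT INTO kv VALUES ('hello', 'world')")

# Persist the in-memory database to a file when a durable copy is wanted.
disk = sqlite3.connect("snapshot.db")
mem.backup(disk)
disk.close()
```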


You can, but that's not the point. I basically asked if they should (gasp) write code that did the lookup.

See the other response from the article's author.


So is the sqlite file on disk or in memory somehow?


The SQLite db is on disk, synced down to the clients from our service.

The client is responsible for checking with our servers and, if rule updates are found, downloading a new database file. To avoid locking and contention issues, these are each uniquely named, and which DB is "current" is just updated.

Note: This is only in "managed" mode. If you'd rather, you can distribute a SQLite database of the rules alongside your app.
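
A minimal sketch of that swap, assuming a symlink-style "current" pointer (the real clients may do this differently):

```
import os
import sqlite3

def activate(new_db_path, current_link="rules-current.db"):
    # Point "current" at the freshly downloaded, uniquely named database.
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_db_path, tmp)
    os.replace(tmp, current_link)        # atomic rename on POSIX

def open_current(current_link="rules-current.db"):
    # Readers open whatever "current" points at, read-only; connections that
    # are already open keep using the old file until they reconnect.
    return sqlite3.connect(f"file:{current_link}?mode=ro", uri=True)
```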


> on disk or in memory somehow?

Due to the magic of the page cache, the answer to that can be "both".

If the SQLite database is being read often and not being written, the page cache will be valid and reads will pretty much never go to the filesystem.


I visited this site from Safari on iOS while in a Marriott hotel. I am blocked. So the WAF works.


I wish there was a repository with lots of posts like this one. Super useful to learn from!

A bit strange that they replaced Redis with SQLite rather than LMDB or RocksDB, which are key-value stores.


Is the benchmark code available somewhere / open source?

Wonder if they had indexes on their SQLite tables?

Not seeing a mention of that in the article.


The answer is "yes." We had indexes - but it's also a little more complicated than that, as we're storing IPv4 and IPv6 ranges in a single table in a format _designed_ to be indexed a particular way.

In the article, we refer to this as "decimal lexical" formatting, where we're taking the IPs and making them integers but actually treating them as strings. We're doing this in both Redis with sorted sets and then in a single table in SQLite.

I was going to explain all this in the article, but it was too long already, so it will be a future blog post.
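
A guess at what that might look like until the follow-up post lands: the integer value of each address, zero-padded to a fixed width so that string (lexicographic) order matches numeric order for both v4 and v6:

```
import ipaddress

WIDTH = 39  # max number of decimal digits in a 128-bit IPv6 address

def encode(ip):
    # Zero-padding makes lexicographic comparison agree with numeric order,
    # which is what a sorted set or indexed TEXT column needs for range scans.
    return str(int(ipaddress.ip_address(ip))).zfill(WIDTH)

assert encode("10.0.0.1") < encode("10.0.0.2") < encode("2001:db8::1")
print(encode("10.0.0.1"))
```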


Really great article, and I really appreciate seeing this "flavour" of "distributed" SQLite; I think it can be useful in many no/low-write scenarios.

But about the formatting of the data: is it completely inherent to the rest of the system / unchangeable? Off the top of my head I would have guessed that, for example, a bitfield in Redis would have performed better. Did you test any other formats?


Curious, what is the advantage of decimal? Why not base-64 or some larger, power-of-2 base?


You should definitely write an article with all tricks you used to make it fast!


> Even if the SQLite performance was significantly worse (like 2x worse) in the benchmark, it would still probably be faster in the "real world" because of network latency, even to a Redis that was in the same data center/region

... Why not run redis on localhost?


For some reason everyone connects to it over the network now.

As someone who was a relatively-early adopter of it on real servers and hand-managed VMs (i.e. we also controlled the hardware and host OS for the VMs) for a higher traffic site than most of the ones that think they need auto-scaling cloud shit from day one will ever reach, and was/is very enthusiastic about redis, I have exactly no idea why this is a popular way to use it. Cargo-culting and marketing (how else are you gonna sell it as a service?) are all I can figure as the motivations.

Connecting to it over a network is a handy feature for, like, ancillary things that need to connect to it remotely (maybe for reporting or something?) but as the main or only way of using it? Yeah I don’t get it, you’ve just wiped out a ton of its benefits.


FoundationDB

Isn’t “Redis to SQLite” effectively what FoundationDB is?

https://www.foundationdb.org/



