Please note that it's not for every application. It is not a client/server architecture. It is simply a tightly optimized piece of C code that interacts with a file (or RAM) to create a lightning-fast approximation of a SQL server that is ACID compliant.
The thing I've used it for the most is small, ad-hoc internal web applications. With the webserver as its "single user", it is incredibly speedy and reliable and holds a shockingly large amount of data (it still works pretty well when the data gets up into the terabytes).
It also has a policy of safe by default, fast at your own risk (mmap is itself an example: it is disabled by default), which is the only sane attitude when working with databases.
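For example, opting into memory-mapped I/O is a per-connection pragma. A minimal sketch using the C API (the database file name and the 256 MB limit are just placeholder values):

    #include <stdio.h>
    #include <sqlite3.h>

    int main(void) {
        sqlite3 *db;
        if (sqlite3_open("example.db", &db) != SQLITE_OK) {   /* hypothetical file */
            fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }

        /* Opt in to memory-mapped I/O for up to 256 MB of the database file.
           The default mmap_size is 0, i.e. mmap stays off unless you ask for it. */
        char *err = NULL;
        if (sqlite3_exec(db, "PRAGMA mmap_size=268435456;", NULL, NULL, &err) != SQLITE_OK) {
            fprintf(stderr, "pragma failed: %s\n", err);
            sqlite3_free(err);
        }

        /* ... normal queries here ... */

        sqlite3_close(db);
        return 0;
    }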
Contrast this with "modern" databases such as MongoDB, which is crazy fast out of the box, but once you start enabling durability, synchronous updates, etc., it becomes slower than standard DBs. The fast-by-default approach wins benchmarks, but in the long term it causes a lot of headaches and drives users away.
> it becomes slower than standard DBs
Think about a document with hundreds of embedded documents which equate to joins in SQL.
MongoDB queues inserts in memory before flushing them to disk. This means it's not durable if there's a power outage before the data is flushed. Turn on the safety of flushing first and it's more or less doing the same stuff your SQL software is doing.
Given that it's inside Chrome, Android, Firefox, Skype and other places (http://www.sqlite.org/famous.html), I'd argue that SQLite is by far the most widely deployed RDBMS in the world :-)
And, BTW, I love it too.
NoSQL provides command-line programs; SQLite is embedded in the same process. Error reporting across exec boundaries is much more complicated.
NoSQL requires standard Unix command-line programs, and in at least one case "This program has only been tested with GNU sort." This makes portability to other OSes, specifically MS Windows but perhaps also OpenBSD, harder.
Hence there's no "simple" about using NoSQL for my use case.
The practical answer is that WordPress is written in a MySQL-specific way. It doesn't officially support, say, Postgres or Oracle either.
Which probably isn't true.
SQLite is bad at concurrent writes. Your average WordPress install is almost all reads. And with 3.7.0's WAL, readers and writers don't block one another anymore.
Could you reasonably write blogging software that uses SQLite? Sure. It's just that WordPress isn't that blogging software - it's written to be paired with MySQL, for good or ill.
SQLite can be run in two modes: 1) with a database-wide reader/writer lock, or 2) in WAL mode, where readers do not block writers and writers do not block readers. Writers still block each other, though, so there can only be one concurrent writer. WAL mode provides concurrency superior to what MyISAM did.
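For anyone who hasn't tried it, switching a database to WAL mode is a one-line pragma, and the setting persists in the file. A minimal sketch with the C API (the file name is just a placeholder):

    #include <stdio.h>
    #include <sqlite3.h>

    int main(void) {
        sqlite3 *db;
        sqlite3_stmt *stmt;

        if (sqlite3_open("blog.db", &db) != SQLITE_OK) {   /* hypothetical file */
            fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }

        /* journal_mode=WAL sticks to the database file: later connections open
           in WAL mode too, and readers no longer block the (single) writer. */
        if (sqlite3_prepare_v2(db, "PRAGMA journal_mode=WAL;", -1, &stmt, NULL) == SQLITE_OK
                && sqlite3_step(stmt) == SQLITE_ROW) {
            printf("journal mode is now: %s\n", sqlite3_column_text(stmt, 0));
        }
        sqlite3_finalize(stmt);

        sqlite3_close(db);
        return 0;
    }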
1. The locks could sometimes be bypassed by inserts, but not reliably so.
But if you really need a particular table to be writable without locking the rest of the database, it's probably a fairly decent workaround.
WordPress-the-base-install has a thin abstraction layer over the database in the $wpdb object.
But plugins and themes can, and regularly do, use PHP's MySQL functions to access and manipulate the DB directly.
(That's the official WP line anyway, I'd love Postgres support myself.)
It suddenly hit me: I wondered if the sysadmins had installed DBD::SQLite... yes they had!
Why mess with all that crap when I could just use SQLite? Reasonably fast indexing, expressive search tools, and I could dump the data onto disk and use an in-memory SQLite db at the same time: caching solved. It turned weeks of agonizing work (producing fragile code) into a quick two-week coding romp. The project was a huge success and spread to a couple of other departments.
One day they asked if I could advise a group to build a more "formal" implementation of the system, as they were finding the results valuable. A half dozen developers and a year later, they had succeeded in building a friendlier web interface running on a proper RDBMS (Oracle 10something) with some Hadoop goodness running the processing stuff on 8 or 9 dedicated machines.
In the meantime, and largely because SQLite let me query into the data in more and more useful ways that I hadn't foreseen, I had extended the original prototype significantly, and it was outperforming their larger-scale solution on a 3-year-old, very unimpressive desktop with the same install of Perl and SQLite. On top of the raw calculations, it was also performing some automated results analysis and spitting out ready-to-present artifacts that could be stuck in your presentation tool of choice. Guess which one the users actually used?
Soon I had a users group, regular monthly releases, and was running a semi-regular training class on this kooky command-line tool (since most users had never seen a command line). I've since finished that job up and moved on, and last I heard my little script using SQLite was in heavy use, the enterprise dev team was still struggling to get the performance of their solution up to the level of my original script, and they hadn't even touched the enhancements yet.
SQLite was rock solid the entire time. There were a few things I wished it could do, but I usually found a way to manage. I wasn't using it for its features; the fact that it existed was the feature. SQLite sprinkled magic on my one-off hack job and turned it into a thing that people ended up using.
So many projects I have prototyped using SQLite (although for most I later upgraded to PostgreSQL). Such a great little tool, and building DBD::SQLite the way I did (including all the source) made it so easy for anyone to install.
Thank you thank you thank you thank you
from the bottom of my heart.
DBD::SQLite has been one of the most important libraries for Perl that I've ever used. Including the SQLite libraries instead of making them another dependency to install someplace makes it so unbelievably useful.
Plus it's surprisingly easy to move code over to using a more grown up database later.
This way I matched SQLite, but then I made it 3-4x faster by using COPY (SELECT *....) in BINARY format (it took some time to write the binary parser; see the sketch below) - and what was loading in 3-4 seconds from SQLite now loads in 0.8 seconds from PostgreSQL 9.2 (Windows 7 / 32 GB / 6-core / 12-CPU machine).
The point is that if you care, you can go and optimize a lot. This is not a fair comparison, though, since I used real enums, floats and ints on PostgreSQL, which saved quite a lot of space. The db was bigger than SQLite's even with that, but somehow faster... though only faster when used through multiple connections.
I could've gone the opposite way and found ways to make SQLite shine in this respect.
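For anyone curious, the COPY call itself looks roughly like this sketch with libpq (the connection string and the products table are made-up placeholders, and the decoding of the binary rows that the parent describes is left out):

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void) {
        PGconn *conn = PQconnectdb("dbname=test");          /* hypothetical DSN */
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /* Stream the result of an arbitrary query out in COPY binary format. */
        PGresult *res = PQexec(conn,
            "COPY (SELECT * FROM products) TO STDOUT (FORMAT binary)");
        if (PQresultStatus(res) != PGRES_COPY_OUT) {
            fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
            PQclear(res);
            PQfinish(conn);
            return 1;
        }
        PQclear(res);

        char *buf;
        int len;
        /* PQgetCopyData hands back one chunk per call; -1 means end of stream.
           A real client would parse the binary tuples here instead of dumping. */
        while ((len = PQgetCopyData(conn, &buf, 0)) > 0) {
            fwrite(buf, 1, len, stdout);
            PQfreemem(buf);
        }
        PQclear(PQgetResult(conn));   /* collect the final COPY status */

        PQfinish(conn);
        return 0;
    }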
One lesson from the game development industry is that it takes 5 years to develop the best-looking games at the end of a hardware cycle. And the hardware did not change even a bit for that whole time.
It takes a lot of learning, experimenting and trying out strange things until you get good. And even then, someone comes along and makes your solution look stupid :) (I hope someone comes and does that for me, so I can just sit, enjoy and smile at his/her code - I've seen it happen several times).
I needed to synchronize product data from our web application with that scanner in order to display useful info instead of just scanned bar codes.
Thankfully I found a patch for SQLite to work with the greatly reduced stack size in uClinux (the site is still around here: http://www.menie.org/georges/uClinux/sqlite.html)
Now, after 9 years, one user is still holding on to their device and we're still synchronizing product data to that SQLite database. We never had crashes or data losses (even though recovery would be trivial) and ever since then, every subsequent hardware solution we were using always relied on SQLite for storage.
I don't even want to start thinking about how much pain it would have been to write my own database for that slow ck1 monster. Seeing SQLite run on the device in my serial terminal was both a lot of fun and a great relief, as it just saved me countless hours of work.
Me, I love SQLite :-)
http://en.wikipedia.org/wiki/SQLite#History (SQLite was created for software on missile destroyers)
The operating system must have a unified buffer cache in order for the memory-mapped I/O extension to work correctly, especially in situations where two processes are accessing the same database file and one process is using memory-mapped I/O while the other is not. Not all operating systems have a unified buffer cache. In some operating systems that claim to have a unified buffer cache, the implementation is buggy and can lead to corrupt databases.
Which commonly-used operating systems lack a bug-free unified buffer cache?
Seems like one where they had trouble was OpenBSD.
That said, we have a test for it during configure, and either compile with full mmap support or with dodgy fake mmap support (which is shit - it just reads the entire file into real memory every time)
Can it be built using SQLite?
Yes => use SQLite;
No => use PostgreSQL.
Using SQLite on small/medium projects is so much better than the alternatives: it keeps all the ease of text files while giving you the full power and reliability of an RDBMS.
SQLite was a fantastic embedded DB that was trivial to integrate with the JS engine in Win32. I don't recall ever having to deal with any support issues around corrupted databases.
It looks like the IE toolbar is still using SQLite to this day - you can explore stumbledata.db from any SQLite explorer.
Richard started playing with it a couple months later, but didn't get any performance benefit at the time because he was Doing It Wrong(tm).
SQLite does a lot of excessive work; this work is justified if you're using it on an embedded processor that doesn't have a full virtual-memory based OS underneath. But on any other platform it's a waste of CPU/power/memory/time. And it's been decades since the majority of embedded applications ran on bare metal without some kind of OS.
This new mmap feature in SQLite promises "up to 2x" performance improvement, while LMDB in SQLite still yields 20x improvement over stock SQLite.
mmap() is great for reads, but much tougher for writes. This is because it is hard for the kernel to determine what it is you are trying to do. madvise() helps here, but not many projects seem to use it. Using plain old write[v]() for writes can be informative enough for the kernel to "do the right thing".
With mmap, you map a file into process memory and read and write to it as if it were memory. When you write to this memory there isn't really a way to indicate that an error occurred because there's no function call to return an error code. Instead, the process is sent a SIGBUS to indicate that something happened. This causes control to jump from the memory access into the signal handler.
Normally this would be fine and you can handle the error there. But sqlite is a library. It lives in-process with the "real" application using it. What if the program had a signal handler there? You don't want to intercept signals that they need to receive, and you definitely don't want to mask errors that they are getting on their own mmapped files, and you don't know how to handle the errors that they should be seeing instead. And what if they have some crazy green-threading going on that you've now ripped control away from by entering the signal handler?
In this particular case, SQLite's answer is to crash when that happens, taking down the entire host application with it. That's not okay for a lot of applications, which is why the feature defaults to being turned off.
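To make the problem concrete, here's a minimal sketch (this is not SQLite's actual code) of what a standalone program has to do to survive an I/O error on an mmapped file: install a SIGBUS handler and longjmp back out of the faulting access. A library living inside someone else's process can't safely claim the handler like this:

    #include <fcntl.h>
    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static sigjmp_buf jump_env;

    static void on_sigbus(int sig) {
        (void)sig;
        siglongjmp(jump_env, 1);           /* unwind past the faulting access */
    }

    int main(void) {
        int fd = open("data.bin", O_RDONLY);   /* hypothetical file */
        if (fd < 0) { perror("open"); return 1; }

        off_t len = lseek(fd, 0, SEEK_END);
        char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sigbus;
        sigaction(SIGBUS, &sa, NULL);

        if (sigsetjmp(jump_env, 1) == 0) {
            /* Any access can fault (SIGBUS) if the file shrinks or the disk
               fails underneath the mapping; there's no error code to check. */
            printf("first byte: %d\n", p[0]);
        } else {
            fprintf(stderr, "I/O error surfaced as SIGBUS\n");
        }

        munmap(p, len);
        close(fd);
        return 0;
    }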
And maybe this particular case does or doesn't apply to SQLite, but there are always cases where your application's domain-specific knowledge of how it uses disk, memory and CPU can outperform the OS's more general optimisations.
That's not the only issue, and the particular subtleties of mmap aren't important here. What's important is that when very smart people like the authors of SQLite choose not to do something, it's generally not as much of a "no-brainer" as some know-it-all types would have you believe. It's easy to read something like Varnish's "always use mmap and leave it to the OS!" manifesto and think that you now automatically know better than all of those silly plebeians who just don't get it, but usually things are more nuanced than that.
> I mean, its trivial to use if you know how to use fopen
I seriously doubt that the difficulty of implementing it is a concern
This could be an implementation difference, but I've always accessed mmapped files as files; I've only ever used it on *BSD and Solaris, and it was very straightforward and easy.
tl;dr is "Improves performance in some cases with certain caveats/risks, is disabled by default"
Is it always? I remember benchmarking sequential file reads and writes, and plain old read and write were sometimes faster, or there was no difference.
I just wish doing asynchronous queries with SQLite on Android weren't so obtuse, resulting in a bunch of boilerplate or requiring third-party libraries for a core part of the API.
On both iOS and Android the default ones are terrible. And even many third party ones aren't the best.