We use SQLite as a file format for our desktop application (based on PySide) using SQLAlchemy. I decided to use SQLite after realizing I was slowly reimplementing many features of a database. I also read this [0] article entitled "SQLite As An Application File Format," which was what finally pushed me to take the plunge.
Overall, the experience has been a joy. Our application is an engineering simulation application. Our core simulator is written in Matlab for historical reasons, and we can communicate data easily using the database. Writing GUI models and views over the database on the Python end of things is very straightforward, once you have your SQLAlchemy models set up.
I agree that migrations can be a pain, but thankfully our tables are usually small enough that we can alter tables simply by recreating them. I also found this [1] StackOverflow answer that explains how to easily change a column name in place.
Those looking for a GUI to view SQLite databases should check out sqlitebrowser [2]. It's the best I've seen, and the developers are very responsive to bug reports and pull requests. It is also cross platform, and is present in many Linux package repos.
That's exactly the use case for SQLite - as a part of a desktop application that is entirely local and needs features from relational databases. If you're running a web server, using Postgres is straightforward enough that you might as well start with that. Especially in a Rails environment.
One of the lesser-known things I love is the SQLite backup API. Using it, I can "copy" the .db from disk into memory, work on it, and "copy" it back. I do it this way because changes can be made in memory that might never need to be written back. Of course, if memory usage becomes a problem, simply switching from :memory: to some kind of temp file per process would do it.
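A minimal sketch of the pattern with Python's sqlite3 module (Connection.backup exists in Python 3.7+; the file and table names here are made up):

    import sqlite3

    disk = sqlite3.connect("app.db")
    mem = sqlite3.connect(":memory:")

    disk.backup(mem)   # "copy" the on-disk database into memory
    disk.close()

    # work entirely in memory; these changes may never need to be persisted
    mem.execute("UPDATE settings SET value = 1 WHERE key = 'dirty'")  # hypothetical table
    mem.commit()

    # only when the changes should be kept, "copy" the database back to disk
    disk = sqlite3.connect("app.db")
    mem.backup(disk)
    disk.close()
    mem.close()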
Recently I've had some success scaling reads across up to 4 threads - e.g. SELECT xx FROM yy ORDER BY zz LIMIT ll OFFSET oo - where each thread gets its own "ll" and "oo" window. This ignores the fact that the db might change between queries, so the threads won't necessarily see consistent results (I can't get the same transaction snapshot from different threads/procs as I can with PG), but it still works for some tasks.
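Roughly, the idea is something like this (just a sketch; the file, table and column names are placeholders, and each thread opens its own connection, so there is no shared snapshot):

    import sqlite3
    from concurrent.futures import ThreadPoolExecutor

    CHUNK = 25000

    def read_chunk(i, db="data.db"):
        conn = sqlite3.connect(db)
        try:
            return conn.execute(
                "SELECT xx FROM yy ORDER BY zz LIMIT ? OFFSET ?",
                (CHUNK, i * CHUNK),
            ).fetchall()
        finally:
            conn.close()

    # four threads, each reading its own LIMIT/OFFSET window
    with ThreadPoolExecutor(max_workers=4) as pool:
        rows = [r for part in pool.map(read_chunk, range(4)) for r in part]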
The recently introduced CTEs are a killer feature. We have a lot of parent/child derived column values, and we used to keep copying values from the parent into the children; with a CTE this can be evaluated every time instead (at some cost, but with less data being filled in overall).
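Something along these lines (hypothetical table and column names), using a recursive CTE to compute each child's effective value from its parent on the fly instead of copying it into every row:

    import sqlite3

    conn = sqlite3.connect("model.db")
    rows = conn.execute("""
        WITH RECURSIVE tree(id, effective_value) AS (
            SELECT id, value FROM items WHERE parent_id IS NULL
            UNION ALL
            -- children fall back to the parent's value when they have none of their own
            SELECT i.id, COALESCE(i.value, t.effective_value)
            FROM items AS i JOIN tree AS t ON i.parent_id = t.id
        )
        SELECT id, effective_value FROM tree
    """).fetchall()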
I'm a huge fan of SQLite. However, one limitation I've run across is the lack of a good UI to administer it or develop against it. Right now I use something called "SQLite Administrator" (Orbmu2k), which is really nice in some ways but super buggy...
Anyone found anything better? And, yes, I know it ships with command line tools but particularly for development it is nicer to have a GUI to quickly set up tables.
Gosh, I adore using Base, to the point where I choose SQLite over Postgres/MySQL more often than not, because its interface makes working with it a joy. Sequel Pro and PG Commander are good, but not as good (well, the former is pretty close, but then you're using MySQL, which I just find painful these days).
Firefox uses SQLite for a bunch of internal data and the bindings are accessible to extensions. So basically Firefox is a cross-OS platform that's guaranteed to be linked against SQLite, with public bindings.
It probably works better than many of the Java-based 'cross-platform' solutions to many problems...
Mozilla took a stance against WebSQL because there was essentially no standards specification for it; the standard was effectively defined by its one implementation, SQLite. Obviously SQLite is a very pragmatic solution, but Mozilla wasn't happy with something that couldn't have an alternative implementation and would therefore be permanently constrained. Instead they implemented [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_A...), which is an open standard. However, IndexedDB is not a relational SQL store and is therefore less familiar to developers. It is instead a transactional indexed key-value store on top of which it is possible to build an SQL store or some other type of store.
Using Adminer [1] and running it via `php -S localhost:8080 adminer.php` works beautifully. I also use Adminer to manage my Postgresql databases on my Mac.
I can recommend Adminer as well, it's just a single PHP file. The GUI is minimalistic but intuitive and very fast. The best part is probably the support for 8 database systems (MySQL, PostgreSQL, SQLite, MS SQL, Oracle, SimpleDB, Elasticsearch, MongoDB) and 30+ localizations, so you can use the same GUI for all your databases. Although probably not all drivers support all the features, as far as I know SQLite, Postgres and MySQL are essentially 100% covered (Don't know about others, since I don't use them).
Related... anyone know what happened to Induction? https://github.com/Induction/Induction looks like the guy who started it lost interest and no one picked it up?
IntelliJ IDEA Ultimate (or any of their other paid IDE tools) has a built-in cross-db GUI tool for editing, creating and modeling databases. IntelliJ IDEA also has support for SQLite on Android, which is a huge benefit if you've ever had to deal with SQLite on Android.
Aside from that, the IDE will also detect SQL queries in one's code and syntax check them. It also lets you run the query directly from there with user generated parameters without having to execute the code file itself.
P.S.: Did the parent get downvoted just because it's a commercial product? In this area (SQL managers) the commercial ones are better than the open source ones, by a LOT.
Yep, sometimes one has to part with a few quid to get the job done right and friction free. That aside, my mainstream MySQL utility is HeidiSQL which is in fact free and open source (although I did bung them a donation because I think it's a neat tool):
We are using a custom build of SQLite (more db columns, returned columns), so for us a UI that either calls some .exe or loads a .dll is preferred.
My favourite was SQLiteSpy, but it's statically linked, so I had to switch to sqlitebrowser and compile it with our own custom build of Qt5 where the "system" SQLite (our custom one) is used.
Maybe one good case where DLL/.so/.dylib shine (if only golang... :) )
> Maybe one good case where DLL/.so/.dylib shine (if only golang... :) )
Possibly, but it's mostly only useful if you can't recompile the app (i.e. don't have source).
And in that case, you've probably paid money for it, and want support. Which you won't get if you've swapped out a component for one it wasn't tested with.
[But I agree with your general point. DLL search path/LD_LIBRARY_PATH can be a useful 'speed hack' to drop in a different lib with an identical API and minor tweaks. e.g. overload malloc/free/realloc() in some misbehaving app to trace a problem without recompiling.]
I'm using SQLite for most of my projects, in production and for hobby work. When a Python ORM is required, I use peewee. One of the most important features of SQLite is that it's included in Python's standard library. Another very important factor is that it's also faster than many other databases out there. What? Faster? Yes, because it runs in-process and doesn't require a context switch to serve a query. So if I make 100k queries looking up information, it's probably faster with SQLite than with other databases, which run in their own process.
Timing 100k reads from database:
  MongoDB: 43.3 s
  SQLite: 19.4 s

Same test with 4 parallel threads:
  MongoDB: 29.9 s
  SQLite: 25.1 s
So as we can see, SQLite is much faster for single-threaded batch processing than most other databases. With 4 concurrent threads SQLite is still faster than MongoDB.
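For reference, the SQLite side of that kind of measurement is roughly a loop like the one below (not the actual benchmark above; the database file and table are made up, and the MongoDB side isn't shown):

    import sqlite3, time, random

    conn = sqlite3.connect("bench.db")
    start = time.time()
    for _ in range(100000):
        key = random.randint(1, 1000000)
        # single in-process point lookup, no client/server round trip
        conn.execute("SELECT data FROM items WHERE id = ?", (key,)).fetchone()
    print("100k reads:", time.time() - start, "s")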
About web sites: when using WAL mode, I can easily handle over 200 write transactions per second on a very light single-core VPS with an SSD. This means it should be trivial to handle at least a few million hits per day, each with a few write transactions. Basically, other things on that server start to block before the pure database lock, write, release cycle becomes the bottleneck.
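Turning WAL on is a one-liner per database (the journal mode is a persistent setting stored in the file); a minimal sketch with a made-up file name:

    import sqlite3

    conn = sqlite3.connect("site.db")
    conn.execute("PRAGMA journal_mode=WAL")    # readers no longer block the writer
    conn.execute("PRAGMA synchronous=NORMAL")  # common companion setting in WAL mode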
Whoah, what? You can compile BerkeleyDB to match the SQLite API? Has anyone here ever done that in production, and if so, how did it go? I can see that being amazing.
Edit:
Hold on. As per Slashdot[0] and the license itself, the open-source BerkeleyDB is licensed under the AGPL, which means the copyleft is triggered even when it's used as part of a web service. That sort of rules it out for me personally, although I'm curious how much a commercial license would cost.
If you open the SQLite source, the VDBE engine - the engine that processes SQL queries against the backend - uses a "simple" key-value store, which is SQLite's on-disk B-tree backend. So you could create a B-tree-like API over another key-value store as the backend, and it would work the same.

This way you can give a full SQL engine to any key-value store out there.

That's why SQLite4 is being designed to be more pluggable, with a shim key-value wrapper over the storage backend that can be swapped at compile time, or even at runtime given its use of C callbacks (though I don't know why anyone would want to do a runtime switch anyway).
Replacing the btree engine in SQLite3 is far from easy. BerkeleyDB did it with major surgery to the SQLite3 source tree. LMDB did the same. https://gitorious.org/mdb/sqlightning
As per the AGPL, you don't actually have to distribute stuff that is linked to an AGPLed library or other software if you are only using it internally (and/or via a network), because you can treat the combined work as GPLv3, which doesn't have the network distribution clause...
From clause 13 of the AGPL:
Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the work with which it is combined will remain governed by version 3 of the GNU General Public License.
It was a good write up. I've used SQLite to store small molecule structures for several research projects and it has never let me down.
One item I had hoped to read, which is often mentioned in SQLite reviews, is in the area of "When would SQLite not be a good choice?", specifically: "very large amounts of traffic... very large data-sets." I have always hoped for some (even wild) estimates of when that 'very large-ness' kicks in, because in my hands 17 million small molecule (InChI) structures don't even cause it to break a sweat. Will I hit a wall some day?
You are a long way from the practical limits. I have used SQLite databases with billions of rows that were still able to retrieve hundreds of arbitrary rows in less than a second. We used it to store time-series performance data and generate charts. We had tens of thousands of these databases (not all with so many rows) and they performed their job admirably.
The one thing that SQLite cannot handle performantly is deleting large numbers of rows (millions) - so don't plan on deleting any data from your tables once they get that big. It appeared to me from a cursory examination of the code that B-tree rebalancing was happening after deleting each row which makes the big deletes very expensive. We got around this problem by sharding our data into a new table and a new database for each week and then mounting all of the databases necessary for a query. When we wanted to delete data we just deleted the database file with the corresponding shard. Obviously that only worked for our particular time series data.
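A sketch of what that looks like in practice (file, table and column names are invented here): each week is its own database file, a query ATTACHes the shards it needs, and "deleting" a week is just deleting its file.

    import sqlite3

    conn = sqlite3.connect("perf_2014_w01.db")
    conn.execute("ATTACH DATABASE 'perf_2014_w02.db' AS w02")
    rows = conn.execute("""
        SELECT ts, value FROM samples
        UNION ALL
        SELECT ts, value FROM w02.samples
        ORDER BY ts
    """).fetchall()

Note that the number of databases attached at once is limited (10 by default), so a query can only span so many shards without raising that limit.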
Anyway, the bottom line is that SQLite is more scalable and has better performance than people give it credit for.
> An SQLite database is limited in size to 140 terabytes (2^47 bytes, 128 tebibytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes.
Very large amount of traffic:
> SQLite usually will work great as the database engine for low to medium traffic websites (which is to say, 99.9% of all websites). The amount of web traffic that SQLite can handle depends, of course, on how heavily the website uses its database. Generally speaking, any site that gets fewer than 100K hits/day should work fine with SQLite. The 100K hits/day figure is a conservative estimate, not a hard upper bound. SQLite has been demonstrated to work with 10 times that amount of traffic.
As a read-only database, sqlite doesn't have too many limits, since the storage mechanism is flexible enough to allow fast queries. The problem comes when you start writing, and reads have to wait for them.
Also, (I believe) it was this talk http://www.youtube.com/watch?v=ZvmMzI0X7fE that mentions SQLite being used in Adobe Lightroom, and it being faster to access thumbnails from an SQLite database.
Thank you so much for the comment, I'm glad to hear you've had good luck with SQLite in production. I switched all my personal stuff over to it a month or two ago, haven't looked back.
I still fondly remember 11 years ago when I joined a startup and was tasked with looking for a database to use for our anti-spam product (an Outlook plugin). SQLite was still very new back then, but compared to the other free and embeddable databases of the time, SQLite was better, faster, and worked on Windows 2000/XP. So we used SQLite, and the experience was so good that we also used SQLite as the database for our anti-spyware product.
For those that need a small, fast, embedded database, but don't need SQL, I'd strongly recommend looking at Google's LevelDB, which is under the New BSD License:
Did you submit a bug report? The Windows port of LMDB is developed on Windows 7 x64. It is also (naturally enough) used in Windows x64 builds of OpenLDAP, and those pass all tests.
Nope, I didn't. I have to revisit what I did, since I made a custom build rather than the official one (which in turn might be what caused the crash).
It is worth noting that recent versions of BerkeleyDB are released under the AGPLv3. This has significant legal implications, especially since the copyright holder is Oracle.
I have a couple questions about a use case. In general is it a security concern for your app that SQLite doesn't care about user permissions the way client/server DBs do? What is the best you can do? Just make sure the file perms are appropriate so that just your app can read/write it?
Also, the project I'm working on is a multi-client thing, but the vast majority of what happens would be a silo'd situation on Postgres. The webapp itself and site structure would be shared, but clients would create projects for data processing and analytics. Would it be reasonable to just create a new SQLite db file per project? In some ways that would make per-project backups easier, but dumping all data for a client would kind of be a pain. Are there other gotchas about a structure like this?
The article says that SQLite may not be a good choice for large datasets. Assuming the other constraints (concurrent writes) are not a concern, why not? Assume a 750GB-1TB dataset of time series data, which fits on an SSD — what system would work better than SQLite?
It's not that SQLite is slower; it's a question of machine resources. When you use a database cluster spanning multiple machines you are just throwing more resources at it; there's no 'magic' per se going on in other tools. There is also likely better indexing, which enables efficient querying.
Since the whole database is in one file, though 140TB may be the file-size limit, searching through an index over 140TB of data will still be a lot slower. That is the case even with client-server models, or even with Hadoop.
Most people claiming to have a big data problem actually don't have one. It's just a poor understanding of SQL, coupled with NoSQL fashion, that pushes people to opt out of SQL. One more problem with SQL is that it's a career path in itself; there is a whole industry built around it - database design, administration, etc. People who find that a high barrier to entry take the easy way out and choose NoSQL-based tools hoping they will act as a panacea, only to reimplement SQL badly at some point in their stack. But SQL has other advantages: it teaches you to think about efficient representation of data, which in turn leads to an overall better design of everything that connects to it.
For most everyday so-called 'big data' problems, SQLite will work like a charm. This covers most of the shops that claim to be doing big data work.
For the real big data problems, well then SQLite wasn't designed for it anyway.
> An SQLite database is limited in size to 140 terabytes (2^47 bytes, 128 tebibytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes.
The article mentions that databases can be up to 140TB in size, and yet it also mentions that it may not be a good choice for very large datasets - does this mean that approaching the upper limit is not recommended, or are we talking about something even larger?
I've been burnt by concurrent writes using SQLite with Owncloud and multiple clients. Of course it may have been the PHP connector and implementation, but if I remember correctly, it borked the database and Owncloud quit working. I had to set it up again with Postgres.
But I really didn't know one could do concurrent writes with Python and Sqlite. I've always been very careful after getting burnt, so this is a good article. I love Sqlite and use it all the time and wish I could use it more. The single file format and lack of complex setup is pure joy.
We are doing super fast online analytics, mixing with traditional SQL supported by SQLite. It's very easy to extend and modify. It's a programmer's dream.
I've been using SQLite in production with a few dozen kiosk-style systems (mostly OS X Mac Minis with HDDs) for a few years, and while it's generally been great to work with, I've intermittently gotten corruption with partial/total table losses after unexpected power loss.
I looked into switching to Redis AOF or some other append-only system, but adding UPSs and switching to Linux-based systems with power-loss-protected SSDs seemed to solve the problems.
SQLite is great for its mobile cross-platform availability. We recently used it for a cross-platform app (with Xamarin) on iOS, Android and Windows Phone.
However, we found the WP version of SQLite to be much, much slower than SQLite on iOS and Android, where it is built-in. On WP, SQLite cost about 3 seconds extra on application start-up. Eventually we replaced SQLite with custom JSON serialization to get the same performance as iOS/Android.
You clearly can use it, but is there a good reason to use SQLite when you could use Postgres? I've just looked, and it seems that foreign key constraints can now be enforced, but I don't think the type checking is as strong, and there aren't the JSON capabilities.
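One caveat on the foreign-key point: SQLite ships with enforcement off for backwards compatibility, so it has to be switched on per connection. For instance, from Python (file name made up):

    import sqlite3

    conn = sqlite3.connect("app.db")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default; must be set on every connection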
SQLite is great and the obvious choice for structured data embedded in an application or on an embedded platform.
As you pointed out, really not the same use case.
Firefox and Chrome use SQLite. Imagine if you had to download and install PostgreSQL just to browse the Internet.
SQLite is not meant to be a multi-user shared database. It's never going to replace a database server if that's what you need.
The reverse of your question is: is there a good reason to use PostgreSQL when you can use SQLite?
Yes but this article is about web services where you clearly can use SQLite but I wanted to understand if there were good reasons to give up Postgres' advantages to do so.
Is SQLite a good choice for learning SQL? Most of the SQL books out there use MySQL as the primary vehicle but it seems like overkill to install on my laptop just to run through a few examples from a textbook.
On the other hand, if SQLite's flavor of SQL is missing some important stuff (I have no idea what) then maybe I should stick with MySQL.
As a rule of thumb, it should be fine if your goal is to learn the basics of SQL, and not-so-fine if you're looking to learn database administration.
I would not use it to learn how to alter tables, or even how to create and enforce a schema, as it doesn't do either quite like some of the other popular dbs out there.
I think the author missed one downside in that there is no master/slave replication built in. That's not the biggest deal as it's relatively trivial to dump the DB elsewhere on some regular schedule, but if you want to minimize the effects of a server dying, it's hard to beat built in replication.
That's not quite the same as you'd still need to have some way of exposing the database on a different machine. You could use NFS or some other network storage, but that's kinda hacky and doesn't deal with network failures well. (In fact, that documentation doesn't specify what happens if one database file becomes unavailable while you're attached to it at all.)
There are workarounds, of course, but I think it's a pretty big downside that it's difficult to set up robust replication with SQLite. It's certainly one of the things that cause people to say that it's not a "real" database.
You can also just (e.g.) copy the backing file wholesale. It's portable and endian-agnostic. I guess it depends on exactly what you're trying to do and what guarantees you require.
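One caveat: a plain file copy of a live database can catch it mid-write (and misses the -wal file in WAL mode), so it's safer to snapshot through the backup API (or the .backup command in the sqlite3 shell). A sketch with made-up paths:

    import sqlite3

    src = sqlite3.connect("live.db")
    dst = sqlite3.connect("replica.db")
    src.backup(dst)   # consistent snapshot even while the source is in use
    dst.close()
    src.close()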
I totally agree, I just thought it was odd that the article didn't mention it. There was a section on when SQLite might not be appropriate, but it never mentions this. Easily avoiding data loss is a huge feature for a db that's going to have any amount of production data, so if you're proposing that people use SQLite for production workloads, this should be acknowledged.
1. Cannot rename a column; in a fast-iteration project, renaming columns in the schema is quite common.
2. No easy way to upsert data. There should be an easy equivalent to MySQL's "INSERT ... ON DUPLICATE KEY UPDATE". If you use REPLACE, the primary key (rowid) gets changed.
Just today I encountered an "insert or update" situation and handled it like this:
INSERT OR IGNORE INTO table_name (item_key, item_count) VALUES (?, 0);
UPDATE table_name SET item_count = item_count+1 WHERE item_key = ?;
I'm not familiar with "upsert", so maybe there is a difference, but this worked well for my use case (which was "increment, or insert then increment"). You could even reverse the operations (by adding OR IGNORE to the UPDATE statement) if you wanted to avoid doing two operations when the key doesn't exist yet.
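If it matters that the two statements succeed or fail together, they can be wrapped in a single transaction - a minimal sketch following the same (example) table and columns as above:

    import sqlite3

    conn = sqlite3.connect("counts.db")
    key = "example"
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO table_name (item_key, item_count) VALUES (?, 0)",
            (key,),
        )
        conn.execute(
            "UPDATE table_name SET item_count = item_count + 1 WHERE item_key = ?",
            (key,),
        )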
Postgres still doesn't have upsert either. Neither does SQL Server or Oracle. I hardly see that as a knock against SQLite. All of the above have various levels of hacky workarounds for this, though.
[0] http://www.sqlite.org/appfileformat.html
[1] http://stackoverflow.com/a/6684034/1333514
[2] https://github.com/sqlitebrowser/sqlitebrowser