Apparently the latest and greatest SQLite is sometimes 50% faster than 3.7.17 released in July of last year. Impressive!
I also love using SQLite for testing compiler optimization passes. Their TCL test suite coverage is quite good. Unfortunately, I think you have to splurge tens of thousands of dollars for the C-based test suite; I believe only that suite achieves full branch coverage.
- web interface, native or CGI
Fossil is implemented using SQLite as its storage layer, which is rather unsurprising given that both are written by the same guy.
Since fossil has the ability to open a repository (which is a single sqlite database file) in any arbitrary directory, it overlays nicely with other versioning systems without wanting to store its metadata in hidden directories within the same tree. I often add a bunch of files from /usr/src to a new .fossil file in my home directory for temporary local version control.
You can migrate to fossil from CVS by converting to git first (using e.g. uwe's excellent git-cvs converter at https://github.com/ustuehler/git-cvs), and then generating a git-fast-export stream which you can import into fossil with 'fossil import'. I once tried this with the OpenBSD CVS repository but aborted the import after 7 days. By then it had imported history from 1995 to somewhere in the year 2000, and I was convinced that fossil can't handle trees this large (or histories this deep? Not sure which). This was with a hack to the fossil import code to make it use fewer sqlite transactions; without that it was even worse (sqlite can handle many queries per second but not nearly as many transactions). So if your code base is large you should consider Subversion or (unless your code base is humongous) git instead.
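The transactions-vs-queries point is easy to demonstrate with Python's sqlite3 module. This is a hypothetical sketch (the table and row counts are made up, not from the fossil import code): each commit forces durable writes, so batching many inserts into one transaction is the standard way to speed up bulk imports.

```python
import os
import sqlite3
import tempfile

# Hypothetical demo table; the point is commits-per-second, not schema.
db = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
con = sqlite3.connect(db)
con.execute("CREATE TABLE commits (id INTEGER PRIMARY KEY, msg TEXT)")

# Slow pattern: one transaction per row (one commit each).
for i in range(100):
    with con:  # opens a transaction and commits it around the INSERT
        con.execute("INSERT INTO commits (msg) VALUES (?)", (f"rev {i}",))

# Fast pattern: thousands of rows inside a single transaction.
with con:
    con.executemany("INSERT INTO commits (msg) VALUES (?)",
                    ((f"rev {i}",) for i in range(100, 10100)))

count = con.execute("SELECT COUNT(*) FROM commits").fetchone()[0]
print(count)  # 10100
con.close()
```

The second pattern pays the commit cost once for ten thousand rows instead of once per row, which is presumably what the transaction-batching hack to the fossil importer was after.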
Databases are a ton of fun to read and talk about. If I ever have an event in my life that means I won't have to work anymore I'll spend the majority of my time studying databases.
The next time you're cussing at Microsoft Word for being slow while you type out a paragraph, think about SQLite and how it manages to do sub-millisecond queries against something that is just a file on your file system. Wow!
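As a rough illustration (not a rigorous benchmark, and the table contents are invented for the example), an indexed lookup against an on-disk SQLite file is typically well under a millisecond once the relevant pages are warm:

```python
import os
import sqlite3
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "words.sqlite")
con = sqlite3.connect(path)
con.execute("CREATE TABLE words (word TEXT PRIMARY KEY, n INTEGER)")
with con:
    con.executemany("INSERT INTO words VALUES (?, ?)",
                    ((f"word{i}", i) for i in range(50000)))

# Indexed point lookup: walks the b-tree for the PRIMARY KEY index.
start = time.perf_counter()
row = con.execute("SELECT n FROM words WHERE word = ?",
                  ("word31415",)).fetchone()
elapsed_ms = (time.perf_counter() - start) * 1000
print(row[0], f"{elapsed_ms:.3f} ms")
con.close()
```

Exact timings depend on your disk and OS cache, but the lookup touches only a handful of b-tree pages regardless of table size.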
There were some blog posts that explained the file format far more clearly, but unfortunately I've lost track of them.
Took me some time to get used to, and yes, you always have to read the entire paragraph.
In the unlikely case that this is intentional it's genius:
it doesn't let you get away with "what do I need to know to make this code compile/kinda-sorta-run", but almost forces you to at least notice the things you should know when using it (protecting you from dangerous half-knowledge).
I'm not sure if this is perfect for file format documentation, but I would guess that a more "traditional" presentation would answer certain questions faster - such as "oh! this looks easy" / "this will take me weeks to work through", or "no, they don't seem to use RFC455692673 index optimization". However, for actually using the format for something productive, the time invested would be about the same.
All in all, SQLite is a pleasure to work with, and that's something outstanding on its own.
I'd previously written code to read the Lucene index format (Python code that time), which was also a good learning experience. I ended up leveraging some of that knowledge for a work project.
And I've also played around with reading/writing git archives.
When you're weighing up what to implement, it certainly helps when there is an agreed upon technical specification.
An RFC that just explains a data format is essentially useless to me. Well, the format of a file is an important part, but it's less than half the story behind what's there, because you're missing a lot of semantics (which are not easily described in English).
There are many "campfire story" class RFCs that raise my blood pressure because they take a complex subject and describe its syntax and formatting and layout pretty well, but leave the hairy algorithmic stuff in the domain of hand-wavy and loose English.
For some reason we leave critical infrastructure open to interpretation. We don't have standards that actually help implementers make sure that what they've written is correct, instead we have documents that try to describe correct behavior and maybe provide a little guidance. For some reason it's gauche to ship working code in an RFC [I've been reading standards for 35 years and I used to work for NIST, I think I know the issues with code in an RFC -- they seem easily surmountable].
There are very important RFCs that don't have reference implementations (HTTP, I'm looking at you. If you've ever written an HTTP proxy then you know the particular hell I'm talking about, and don't get me started on PKI or -- oh Lord -- CSS).
One thing that caught my eye here was the section on the B-tree in the sqlite file format. I've written these, they are hard to get right. But the section here is an English description of a very complex data structure, and if this appeared in an RFC and you asked ten different developers to reimplement it then you'd wind up with wildly different interpretations of the text, all non-interoperable until the world finally decided to have some kind of "interop fest".
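Code pins down exactly the details that English leaves fuzzy. As a small sketch (field offsets taken from the published SQLite file format, which the document under discussion describes), here is what reading page 1's b-tree page header looks like against a freshly created database:

```python
import os
import sqlite3
import struct
import tempfile

# Create a tiny database to dissect.
path = os.path.join(tempfile.mkdtemp(), "t.sqlite")
con = sqlite3.connect(path)
with con:
    con.execute("CREATE TABLE t (x)")
    con.execute("INSERT INTO t VALUES (1)")
con.close()

with open(path, "rb") as f:
    data = f.read()

# 100-byte database header on page 1.
assert data[:16] == b"SQLite format 3\x00"        # magic string
page_size = struct.unpack(">H", data[16:18])[0]   # big-endian u16

# The b-tree page header of page 1 starts right after the file header.
hdr = data[100:112]
page_type = hdr[0]   # 2/5 = interior index/table, 10/13 = leaf index/table
ncells = struct.unpack(">H", hdr[3:5])[0]         # number of cells on page
content_start = struct.unpack(">H", hdr[5:7])[0]  # start of cell content area

print(page_type, ncells, page_size)
```

Page 1 is the root of the sqlite_master table, so for a one-table schema you should see a leaf table b-tree page (type 13) holding one cell. Every one of those offsets and byte widths is a decision ten independent implementers would have to make identically.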
So I'm happy that SQLite already has a publicly available b-tree implementation, because without that code you don't really have a "standard" at all, QED.
Complex standards need reference implementations. Just calling something an RFC doesn't make it more useful. Also, forcing ivory tower types to write working code is the best anodyne I know against unimplementable and overly complex standards.
If you fail to get the basic context of a thread, maybe the internet isn't for you.
"MIME type application/my-example is defined as a resource conforming to RFCxxxx with the following tables..."
The basic data-set RFC could leave out the internal locking and transaction support, leaving "must be zero" placeholders.
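SQLite's own database header already works this way: bytes 72-91 are reserved for expansion and must be zero, so a minimal reader can verify the placeholder region and ignore everything it doesn't implement. A sketch of that check (offsets from the published file format):

```python
import os
import sqlite3
import tempfile

# Create a conforming database file to inspect.
path = os.path.join(tempfile.mkdtemp(), "r.sqlite")
con = sqlite3.connect(path)
with con:
    con.execute("CREATE TABLE t (x)")
con.close()

with open(path, "rb") as f:
    header = f.read(100)  # the 100-byte database header

# Bytes 72..91 are "reserved for expansion, must be zero" -- exactly
# the kind of placeholder a basic data-set RFC could mandate.
reserved = header[72:92]
print(all(b == 0 for b in reserved))  # True for a conforming file
```

A reader that enforces "must be zero" today stays compatible with files written by future versions that still honor the basic subset.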