
Writing a SQLite clone from scratch in C - ingve
https://cstack.github.io/db_tutorial/
======
wruza
A bit shameless to ask a question under top HN record, but I always wondered
why do not SQL engines give access to their SQL AST to generate queries. Of
course, one can combine a query from strings with placeholders, and it takes
not so much comparing to actual querying process, but it feels so hacky. Why
not provide something like:

    
    
        Query *q = db_select(db);
        db_add_from(q, "persons");
        db_left_join(q, "companies", db_expr(DB_EQ, "per_id", "com_id"));
    
        Query *q = db_create_table(db);
        db_add_column(q, "id", DB_TYPE_INT | DB_COL_NOT_NULL | DB_COL_PRIMARY);
        db_set_table_name(q, "persons");
    

... and so on? That way we could combine and generate queries for specific
engines without that ugly string joining/parsing.

~~~
wcrichton
This is essentially what an ORM does, e.g. SQLAlchemy [1]. Except instead of
providing a slightly nicer API to generate query strings, an ORM tends to be
more tied to the host language and use its object system to abstract out most
of the SQL complexity and make declaring SQL tables/queries look like normal
operations on the data themselves instead of the database. Django's ORM [2]
for example autogenerates joins for you under certain query conditions, which
is super handy.

[1]
[http://docs.sqlalchemy.org/en/latest/orm/tutorial.html](http://docs.sqlalchemy.org/en/latest/orm/tutorial.html)

[2]
[https://docs.djangoproject.com/en/1.11/topics/db/queries/](https://docs.djangoproject.com/en/1.11/topics/db/queries/)

~~~
wruza
I'm aware of SQLAlchemy Core/ORM (and ORMs in general), my question is more
like why do we have to concatenate strings in these project's internals
instead of using some API to pass query structures without intermediate
composing/decomposing. For small queries that ORMs often do, it feels like
needless and annoying overhead.

For now, even '?' placeholder escaping is actually manually-printf'ed under
the driver's hood, afaik, so in the end we send plain text to the server.
SQLite probably avoids that via sqlite3_bind() call, which allows to pass
SQLITE3_STATIC string args if you want, but that's the benefit of in-process
engine.

~~~
anton_gogolev
I highly doubt that placeholders are interpolated into the SQL statement.
Doing so will defeat the purpose of having them in the first place. Rather,
they are tranferred out-of-band to the SQL "engine", as it were, and are used
as straight up variables.

~~~
wruza
I checked it for PostgreSQL right now, and it seems that libpq _can_ send them
OOB via PQexecParams() and networked integer buffers. But that's not the case
with OpenResty's pgmoon driver, which doesn't have such an interface. E.g.
Lapis framework built on top of that simply escape_literal()-izes all
arguments and interpolates right into query text. So, YMMW with specific (or
generic) drivers.

Placeholders are there for those who tries to concatenate queries by hand;
having them properly escaped in protocol should not defeat client-side
security purpose.

~~~
anton_gogolev
You can _never_ properly escape user input, so this escape_literal() is a
ticking timebomb.

~~~
icebraining
How come it's impossible to properly escape user input? I get that it's hard
if the format is complex, but I don't see what makes it categorically
impossible, especially for a simple format like an SQL string. You just double
every single quote and then surround the whole thing with single quotes.

~~~
anton_gogolev
> ...every single quote...

And then you come across \u0027 and you're screwed.

~~~
wruza
Do you know any decent server that interprets 6-char \uxxxx sequences as part
of its top-level query syntax? I mean, how do you safely pass '\u0027'
_literal_ to it anyway? If you can't, then its literals and escaping are
broken by design, so it is good to know from the start of using it.

Edit: removed double backslash from literal to not confuse it with host
language's escaping.

------
stevoski
If you prefer Java to C, I recommend taking a look at the source for H2, a
pure Java SQL database engine. It is surprisingly easy to follow.

I've contributed to the H2 source code a few times, and in the process I got
to learn much about how a database engine is implemented.

H2 home: [http://www.h2database.com/](http://www.h2database.com/)

H2 Github repo:
[https://github.com/h2database/h2database](https://github.com/h2database/h2database)

~~~
mlvljr
Allen Holub did something like that for his patterns book, btw :)

------
somehnreader
Hi, TUM has a lecture for this in which the students build a database in C and
learning fundamentals along the way.

I attended a few years back and learnt an awful lot, this is the course page:

[https://db.in.tum.de/teaching/ws1516/imlab/index.shtml?lang=...](https://db.in.tum.de/teaching/ws1516/imlab/index.shtml?lang=en)

Haven't found the accompanying slides yet, but they must be somewhere on their
website.

~~~
tfaruq
Updated lecture with slides
[https://db.in.tum.de/teaching/ss17/moderndbs/](https://db.in.tum.de/teaching/ss17/moderndbs/)

~~~
fork1
Thanks for sharing this!

------
mopsy
The language to access a relational database is always SQL. So much than a lot
of people doesn't make any difference between SQL and a DBMS. Aren't we in the
same situation as Javascript on the web? I always had wondered why there is no
effort like Webasm for relational database. I would love to see a more vibrant
scene with different syntax, paradigm and ideas.

What would be the rust equivalent of a relational language, the Go equivalent?

As an example, I had a lot of pleasure to read that blog : [http://www.try-
alf.org/blog/2014-12-03-what-would-a-function...](http://www.try-
alf.org/blog/2014-12-03-what-would-a-functional-sql-look-like)

~~~
belst
For rust it would literally be Iterators

------
maxpert
Still waiting for SQLite4 with it's awesome LSM engine :P (shameless plug I
ported the engine over to windows [https://github.com/maxpert/lsm-
windows](https://github.com/maxpert/lsm-windows) and it works :D). I wish
somebody can do a detailed architecture about SQLite4, that kind of knowledge
is scares and valuable.

~~~
SQLite
FWIW, I'm presenting a talk on exactly this topic tomorrow at 12:30 at the
[http://www.db-tech-showcase.com/dbts/tokyo](http://www.db-tech-
showcase.com/dbts/tokyo) \- unfortunately I do not think it will be recorded.

~~~
RainaRelanah
Slightly off topic, but thank you for your work on sqlite and fossil. I've
used both daily for years and am always impressed by how consistently
fantastic they are.

------
piinbinary
Off topic: I appreciate how fast that site renders.

~~~
hollander
To be honest, it always shocks me how long it takes for the average modern
webpage to load. All the build up of things, the time it takes, and how
unexpected it still is that you think you can click somewhere, and just 0.1
second before you click something else renders and moves everything up or
down, making you click something else, which causes another page to load etc.
I have a relatively new Mac with 8GB RAM - it should just work, but no...

------
karmicthreat
Is there anything even close to SQLite in terms of reliability? Its pretty
much the standard for embedded systems.

~~~
tyingq
Interbase is similar in purpose and has a good reputation. Not free or open
source though.

~~~
dfox
There is Firebird which is essentially an opensouce InterBase (IIRC it is
forked from IB5 source). It is probably more widely used than new commercial
versions of IB. One interesting feature is that it is probably only opensource
DBMS that not only works with multiple engines accessing same database stored
on Windows share but it is actually supported usecase.

------
Luker88
This comes right at a time when I started attempting something similar (though
I don't care that much about documenting it)

You might want to read some overviews on:

    
    
      * postgres atomicity: https://brandur.org/postgres-atomicity
    
      * postgres disk format: http://rachbelaid.com/introduction-to-postgres-physical-storage/
    

I have not read much on replication, though. Does anyone have any pointers?

~~~
flavio81
> Does anyone have any pointers?

Sure, here you are:

    
    
        void *x= malloc(16384)
    

(Couldn't resist)

~~~
mcguire
Welcome to systems programming, home of bad jokes and worse problems...

~~~
flavio81
" _I called the janitor the other day to see what he could do about my dingy
linoleum floor. He said he would have been happy to loan me a polisher, but
that he hadn 't the slightest idea what he had done with it. I told him not to
worry about it - that as a programmer it wasn't the first time I had
experienced a buffer allocation failure due to a memory error._"

\---

" _Q: Why did the concurrent chicken cross the road?

A: the side other To to get_"

(These were copied from the S.O. website)

~~~
mcguire
One piece of advice I'll never forget: Never grep a yacc by the i-node.

------
lifeisstillgood
This strikes me as an interesting way to define a syllabus for comp sci /
internship

What other build X from scratch would be needed to cover much of what we think
a grad should know?

\- build a language (compiler) from scratch

\- build a graphics engine from scratch

\- build a network stack (and firewall ?)

Err?

~~~
dfox
I built webserver from scratch as a term project at university. Due that the
fact that it actually did something self-contained and useful I got away with
turning in C99 project for C++ course :) (and probably also because I wrote my
own talloc/apr_pool-style malloc wrapper to base that on, when presenting the
thing that was about as far as I got and then was told, "well, this is C++
course, but you got A and now go away")

For another MSc. level course the whole term project assignment was: "choose
or design some language and try to somehow conjure compiler and if required VM
for that". "somehow conjure" is significant because nobody said you have to do
that from scratch, hacking tcc/gcc/lcc/llvm to accept your language was fine
and basing your language on top of some Smalltalk VM (preferrably ST/X ;)) or
Common Lisp was essentially encouraged.

------
flunhat
Very nice! If you wanted it to save to disk instead of living wholly in
memory, how would you do that in C?

~~~
kevingadd
mmap!

I mean, not really, but it's a surprisingly viable starting point for simple
problems. I've shipped it. Some production-grade software still uses mmap,
albeit with a bunch of additional complexity to make sure it works okay in
weird cases.

See this old comment thread I dug up that might be relevant:
[https://news.ycombinator.com/item?id=3982514](https://news.ycombinator.com/item?id=3982514)

It seems like Redis might currently make use of mmap for some of its data, but
I couldn't find an up-to-date source for that, just some old blog posts by the
developer.

~~~
pjc50
The trouble is that naive mmap doesn't give you the control over write
ordering that you need to implement reliability.

Sadly, about 50% of database design is trying to ensure the thing has a decent
chance of starting up again if it crashes or loses power. This mostly consists
of fighting file systems and disk caches.

~~~
icedchai
I used to work on a system that used mmap for basically _everything_ ,
including a proprietary database that processed financial transactions. The
original production system ran on a commercial Unix. (It's been about 15+
years, but I think it was AIX.)

At one point, we ported it to a different Unix platform which had slightly
different mmap semantics. It looked like everything worked, except for one
minor detail: the data was never actually synced to disk. Ever. Until
shutdown. Unfortunately, since the system ran 24/7, it effectively never
synced. First time the system had a power failure, there was massive data loss
on startup.

Oops.

(We were able to recover through log replay...)

~~~
tomcam
> (We were able to recover through log replay...)

I call that a win! What was the logging mechanism--something bespoke? What was
your solution?

~~~
icedchai
It was bespoke: a proprietary system that essentially used an append-only file
containing each transaction request and response.

~~~
tomcam
Very nice. Good planning.

------
DLA
Following with interest. I would also LOVE to see a SQLite clone in pure Go.
That would be amazing.

~~~
pzduniak
It's not a clone per se, but there's
[https://github.com/cznic/ql](https://github.com/cznic/ql)

~~~
geodel
I think this one is sqlite:
[https://github.com/cznic/sqlite](https://github.com/cznic/sqlite)

~~~
DLA
Thanks geodel. Almost a win here, but if you look in sqlite/internal/bin/
you'll see cznic embeds the _entire_ SQLite c source.

------
14113
On a related note (and this seems a good forum to ask), are there any _typed_
databases available similar to sqlite?

~~~
oblio
What do you mean by typed? :-?

~~~
striking
SQLite will ignore most type information, instead choosing to focus on "type
affinity":
[https://www.sqlite.org/datatype3.html](https://www.sqlite.org/datatype3.html)

~~~
etatoby
I didn't know SQLite was dynamically typed. That's a strange choice for a
database, to say the least.

I would look into Berkeley DB (BDB) which is one of the most widely-used
embedded databases, albeit not a SQL-based one. Otherwise take a look at the
list of embedded databases on Wikipedia:
[https://en.wikipedia.org/wiki/Embedded_database](https://en.wikipedia.org/wiki/Embedded_database)

~~~
cat199
Initially (v < 3), all data fields in sqlite were text internally - type
information was accepted for SQL compatibility, but ignored. This is actually
an improvement.

I'm not sure of the rationale; perhaps it's easier to keep the code / database
files portable amongst different systems this way, and since it was designed
for embedded use, many systems (and thus incompatibilites) were envisioned?

------
gravypod
I'm very interested in seeing DROP implemented in this. It will be cool to see
the author handle fragmentation.

------
MBCook
I don't know why but unfortunately the various parts are effectively
unreadable on an iPhone.

The screen zooms in and out as you scroll, so that when you scroll the text is
a little too tiny and when you stop the text is cut off on the left edge.

I'm going to give it a try on my iPad. Just a heads up.

Edit: works fine on an iPad. Interesting.

------
elymar
SQLite had a very rigorous testing methodology to comply with aeronautics
standards. Cloning the tests would be a decade worth of work. But cool idea
nonetheless.
[https://sqlite.org/testing.html](https://sqlite.org/testing.html)

------
corndoge
But the reference SQLite implementation is a single c file...

~~~
asveikau
It's actually written as several files and the "amalgamation" you download
from the website represents those files concatenated together.

See this explanation:
[http://sqlite.org/amalgamation.html](http://sqlite.org/amalgamation.html)

~~~
devty
> The amalgamation file is more than 180,000 lines long and over 6 megabytes
> in size.

~~~
asveikau
Yes, but the article also says that file is generated by a makefile from more
than a hundred .c and .h files, and that it's done for ease of distribution
and because optimizers do better with a single compilation unit.

It's not like the authors spend a lot of time working on the amalgamation and
that's the way the project is developed.

------
eberkund
Does anyone know what software they used for creating those colored diagrams?

~~~
Someone
They are copied from
[http://sqlite.org/arch.html](http://sqlite.org/arch.html), which is built
from (see [http://sqlite.org/download.html](http://sqlite.org/download.html))
data in Fossil at
[https://www3.sqlite.org/cgi/docsrc/timeline](https://www3.sqlite.org/cgi/docsrc/timeline),
which leads me to e.g.
[https://www3.sqlite.org/cgi/docsrc/artifact/d22a2c9642d584b8](https://www3.sqlite.org/cgi/docsrc/artifact/d22a2c9642d584b8),
which seems to contain the right strings, so presumably is the file describing
[https://www.sqlite.org/images/arch2.gif](https://www.sqlite.org/images/arch2.gif)
and its copy on
[https://cstack.github.io/db_tutorial/assets/images/arch2.gif](https://cstack.github.io/db_tutorial/assets/images/arch2.gif)

That's fig source. So, I guess this was made by some GUI that uses fig as its
native file format. My guess would be xfig
([http://mcj.sourceforge.net](http://mcj.sourceforge.net))

------
nojvek
This is exactly what I wanted. Thanks for sharing.

------
std_throwaway
Isn't SQLite an application calling for a clean implementation in Rust?

~~~
alecco
Why? SQLite is one of the highest quality codebases out there! For fun, sure.
But I think you are underestimating the size of the SQLite project and it's
tests. Just because the library is small it doesn't mean it's simple.

~~~
steveklabnik
It's massive testing means that it would be easier to implement in another
language; you'd be pretty sure of conformance with a port.

That said, it's one of the most reliable projects in existence so why rewrite
it?

~~~
pygy_
To reboot WebSQL?

IIRC Mozilla's main gripe with it was the fact that there was only one
implementation, and no standard beyond "Whatever Sqlite3 does".

~~~
acdha
How would a rewrite change that? Addressing that objection would require some
sort of open standards process and commitment to follow it, which is a rather
orthogonal concern and would be somewhat expensive as well.

~~~
pygy_
You'd also need to turn "whatever sqlitev3.x does" into a spec, indeed which
is a lot of work that will never happen as there's only one implementation
that passes the test suite.

------
zaidf
I wish someone would show how to rewrite Redux for React Native.

~~~
ralusek
I can't tell if this was a joke or not, I'm assuming that it is.

If not: Redux works the same way for React Native as it does for React or
anything else.

