
LittleD – SQL database for IoT, can run queries in 1KB - pfalcon
https://github.com/graemedouglas/LittleD
======
geedy
I am the author of LittleD.

I am more than happy to answer any questions. It has mostly been an academic
project that I worked on during my undergrad, but I am now looking at
continuing the project as part of a Ph.D. Of course, I would love to
coordinate a broader development of this project. :)

You may also be interested in another project I and some lab mates have been
working on over the last couple of years, IonDB:
[https://github.com/iondbproject/iondb](https://github.com/iondbproject/iondb).

EDIT: You might also be interested in the initial paper, which can be found
here:
[https://people.ok.ubc.ca/rlawrenc/research/Papers/LittleD.pd...](https://people.ok.ubc.ca/rlawrenc/research/Papers/LittleD.pdf)

I investigated query precompilation in another paper I am waiting to hear back
on, and once everything has been a little better tested, that code will get
pushed out as well. :)

And seriously, if anybody is interested in contributing, I would love to have
some help. Get in contact with me!

~~~
krashidov
How did you learn to write a database from scratch? What books, courses,
tutorials do you recommend?

~~~
gburt
I took the same course @geedy is speaking of (with him, in fact) and there are
two good comparable resources I'm aware of:

(1) Ramon's (excellent!) lecture notes themselves are currently (but not
always) available on his website:
[https://people.ok.ubc.ca/rlawrenc/teaching/404/notes/index.h...](https://people.ok.ubc.ca/rlawrenc/teaching/404/notes/index.html)

(2) Garcia-Molina, Ullman and Widom's book and courses taught on their work:
[http://www.amazon.ca/Database-System-Implementation-Hector-G...](http://www.amazon.ca/Database-System-Implementation-Hector-Garcia-Molina/dp/0130402648)
[http://infolab.stanford.edu/~ullman/dbsi.html](http://infolab.stanford.edu/~ullman/dbsi.html)

------
listic
When would you actually _need_ a relational database running on such a small
platform, today?

I understand programming in general and microcontrollers; I was paid to
program one. Still, surely nowadays a system with a tiny 8-bit microcontroller
is probably considered auxiliary to some kind of larger machine, which
should be way better equipped to do the heavy lifting such as SQL databases?

~~~
Sanddancer
There are still lots of uses for tiny microcontrollers that don't connect to a
larger machine, or only do so on occasion. By having a database backend, you
make it a lot easier to query the microcontroller out in the field for a
specific subset of data, instead of it needing to dump everything out, wasting
time and battery life.

~~~
niutech
You can easily get just part of your data using a binary tree or a hashtable
and simple C functions like _for_ , _while_ , _if_. In most cases there is no
need for a full-blown SQL engine.

~~~
teraflop
On the other hand, microcontrollers typically have ample Flash storage for
code, but very limited amounts of RAM. Instead of making everyone rewrite
their own memory-efficient data structures, it might well make sense to have a
single, general-purpose implementation -- even if some of the code goes unused
in a particular application.

~~~
niutech
Keep in mind that the Flash storage has a limited number of writes, so keeping
frequently changing data in memory and choosing the right data structure can be
crucial for your data's life span. SQL is not a universal solution for all
use cases.

~~~
geedy
You are not wrong, but consider that the problem may be handled a number of
ways. Once I move to storing everything through IonDB, you will be able to
choose key-value stores that could wear level, should one be implemented. Such
algorithms have been suggested in other works. Furthermore, if the device
supports a storage API that already does wear levelling, then a SQL layer
on top should not hurt.
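
To make "wear levelling" concrete, here is a minimal sketch of one simple scheme: rotate writes of a single record across several slots and tag each write with a sequence number, so the newest copy can be found on read. This is a generic technique, not something IonDB or LittleD is known to implement; `store_t`, `SLOT_COUNT` and the function names are hypothetical, and a RAM array stands in for flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 8u   /* hypothetical number of storage slots */

typedef struct {
    uint32_t seq;       /* 0 means "never written" */
    int32_t  value;
} slot_t;

typedef struct {
    slot_t   slots[SLOT_COUNT];        /* stands in for flash/EEPROM */
    uint32_t write_count[SLOT_COUNT];  /* per-slot wear, for illustration */
} store_t;

/* Index of the slot holding the newest record, or -1 if none was written. */
static int newest_slot(const store_t *s)
{
    int best = -1;
    uint32_t best_seq = 0;
    for (unsigned i = 0; i < SLOT_COUNT; i++) {
        if (s->slots[i].seq > best_seq) {
            best_seq = s->slots[i].seq;
            best = (int)i;
        }
    }
    return best;
}

/* Write to the slot after the newest one, so wear spreads over all slots.
 * Sequence-number overflow is ignored in this sketch. */
void store_write(store_t *s, int32_t value)
{
    assert(s != NULL);
    int cur = newest_slot(s);
    unsigned next = (cur < 0) ? 0u : ((unsigned)cur + 1u) % SLOT_COUNT;
    uint32_t seq = (cur < 0) ? 1u : s->slots[cur].seq + 1u;

    s->slots[next].seq = seq;
    s->slots[next].value = value;
    s->write_count[next]++;
}

/* Read the newest value. Returns 1 on success, 0 if the store is empty. */
int store_read(const store_t *s, int32_t *value_out)
{
    int cur = newest_slot(s);
    if (cur < 0) {
        return 0;
    }
    *value_out = s->slots[cur].value;
    return 1;
}
```

With 8 slots, each slot absorbs roughly one eighth of the writes that a fixed location would take.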

------
jwiley
I haven't created large amounts of C/C++, so I'm certainly not an expert, but
I was impressed by how well the source was commented and structured. Nice to
see well documented, well thought out code, congrats

~~~
hliyan
I _have_ written large amounts of C/C++ and I too am impressed by how well
documented the code is (trust me, most C++ code isn't!)

~~~
geedy
Thank you! I actually have become more anal about code quality as I've gone
along, to the point of annoying people. IonDB represents a much better coding
standard, in my mind, than LittleD. Like I say, every bit of documentation and
cleanliness helps!

------
coleifer
This looks like a very cool project. I cloned it and generated the docs, and
while the code is certainly well-documented, I didn't see any type of high-level
narrative or public APIs. Was hoping to see at least a "littleD.h" or
something, but it seems you have to do some source diving or look at the tests
to figure out how to use this library.

~~~
geedy
I absolutely need to improve that side of it, there is no doubt. Thank you for
the feedback!

~~~
jasonwatkinspdx
Get a link to the paper up front in the readme.

------
SEJeff
I'd be quite interested to see this on an ESP8266 board. I use the Adafruit
huzzah version[1] which comes with a lot of extra fluff for developers, but
the core microcontroller is a 32 bit MCU with 64k of ram. It is about the size
of a US Quarter Dollar coin and can be had for as low as ~$2 USD.

[1]
[https://www.adafruit.com/products/2471](https://www.adafruit.com/products/2471)

~~~
geedy
We've actually been looking at getting some newer devices, I'll look to add
this to our list. Thanks!

------
nitrogen
What is the advantage of using a database vs. rolling your own data structures
on such a small device?

~~~
emcq
Dynamic queries and data serialization. Imagine sending just a text query (in
SQL) around to a sensor network. It would be a fair amount of work to
reproduce that on your own but seems like it can open up some nice features
for distributed IoT
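
As a rough sketch of what "dynamic" means here, the code below parses a tiny text filter such as `moisture < 30` at run time and applies it to a reading. It is far simpler than SQL and is not LittleD's parser; `predicate_t`, `parse_predicate` and `eval_predicate` are names invented for this example, and the grammar (one field, one operator, one integer, separated by spaces) is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { OP_LT, OP_GT, OP_EQ } op_t;

/* Hypothetical run-time filter: <field> <op> <integer>. */
typedef struct {
    char field[16];
    op_t op;
    int  value;
} predicate_t;

/* Parse text such as "moisture < 30" (tokens separated by spaces).
 * Returns 1 on success, 0 on malformed input. */
int parse_predicate(const char *text, predicate_t *p)
{
    assert(text != NULL && p != NULL);

    char op_ch;
    if (sscanf(text, "%15s %c %d", p->field, &op_ch, &p->value) != 3) {
        return 0;
    }
    switch (op_ch) {
    case '<': p->op = OP_LT; break;
    case '>': p->op = OP_GT; break;
    case '=': p->op = OP_EQ; break;
    default:  return 0;   /* unknown operator: reject */
    }
    return 1;
}

/* Returns 1 if the named field's value satisfies the predicate, else 0. */
int eval_predicate(const predicate_t *p, const char *field, int value)
{
    if (strcmp(p->field, field) != 0) {
        return 0;
    }
    switch (p->op) {
    case OP_LT: return value < p->value;
    case OP_GT: return value > p->value;
    case OP_EQ: return value == p->value;
    }
    return 0;
}
```

A node that received `moisture < 30` over the network could call `parse_predicate` once and then `eval_predicate` per reading, with no firmware change. A real SQL engine generalizes this to joins, projections and multiple conditions.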

~~~
nitrogen
It still seems pretty inefficient. With 4KB of RAM I'd just store my data in
an array of structs and send it in raw binary to the central server. The SQL
also leaves potential for running unwanted SQL.
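
A minimal sketch of that alternative, assuming a hypothetical `sample_t` record: the samples live in a plain array and are packed into a byte buffer for sending. The fields are written in an explicit little-endian order rather than by copying the struct directly, so the bytes do not depend on the compiler's padding or the CPU's endianness. All names and the 8-byte wire layout are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample record kept in a plain array in RAM. */
typedef struct {
    uint32_t timestamp;
    uint16_t sensor_id;
    int16_t  value;
} sample_t;

#define SAMPLE_WIRE_SIZE 8u   /* 4 + 2 + 2 bytes per sample on the wire */

/* Pack n samples into buf as little-endian bytes.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t encode_samples(const sample_t *s, size_t n, uint8_t *buf, size_t cap)
{
    assert(s != NULL && buf != NULL);

    if (n * SAMPLE_WIRE_SIZE > cap) {
        return 0;
    }
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t v = (uint16_t)s[i].value;  /* two's-complement bit pattern */
        buf[off++] = (uint8_t)(s[i].timestamp);
        buf[off++] = (uint8_t)(s[i].timestamp >> 8);
        buf[off++] = (uint8_t)(s[i].timestamp >> 16);
        buf[off++] = (uint8_t)(s[i].timestamp >> 24);
        buf[off++] = (uint8_t)(s[i].sensor_id);
        buf[off++] = (uint8_t)(s[i].sensor_id >> 8);
        buf[off++] = (uint8_t)(v);
        buf[off++] = (uint8_t)(v >> 8);
    }
    return off;
}
```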

I'm reminded of a commercial home automation platform that had agonizingly
slow response times because the LCD controllers were sending raw,
unauthenticated Python code to the 70MHz central controller.

------
Carrok
As someone who has spent the past year and change developing an automation
system using the targeted platform (Megas) I can say that this sounds like a
very bad way to do things.

While I applaud the effort, and it will certainly be useful for many people
for many other reasons, I don't think IoT is the best use case here. If your
data is at all valuable/useful, you don't want it sitting out somewhere on
some device that will hopefully/maybe be online when you go to query it. Plus
now just to be able to query it remotely you will still need to develop some
sort of API that lives on the device that can talk to the LittleD database.

Finally if you really are doing 'IoT' you clearly have a bunch of things that
you want to view/control from a centralized platform. When you have the
devices talking to a server, you can do this. When you have to ask each device
individually what is going on with it, this becomes much harder.

~~~
geedy
There isn't really a targeted platform. There just only happens to be one of
me at the moment, and so I've only managed to compile it for a few devices.

As you noted, it's not really an IoT database. It could be used as such, but
that's not the reason I built it.

I have also been working on the data transfer problem, because without the
ability to share the data it is in fact kind of a useless platform in many
applications. I have a job manager being developed that will allow for
scheduled or ad-hoc execution of functions. I've also written some small
library to encode LittleD results for network transmission. Using some basic
networking stack, it would be easy to assemble these pieces into something
that could be viewed/controlled from a centralized platform. LittleD could
even be modified with relative ease to actually query over other LittleD
instances!

EDIT: I would actually like to encourage you to divulge some specific
criticism for the IoT application. Are you specifically speaking to your
automation system, or to IoT at large? It seems difficult to predict exactly
how any one person might apply any given technology.

~~~
jchrisa
We've built a sync-enabled database, and even at 500 kb it's getting a lot of
uptake in IoT applications.

Open source databases:
[http://developer.couchbase.com/mobile](http://developer.couchbase.com/mobile)

Info about how GE is using it:
[http://www.couchbase.com/nosql-resources/presentations/offli...](http://www.couchbase.com/nosql-resources/presentations/offline-first-and-how-ge-integrated-couchbase-mobile-in-less-than-90-days.html)

Long story short: when you get the network stuff figured out, it's gonna be an
interesting product.

Feel free to contact me (info in profile) if you want to chat about how this
can fit into the industry.

~~~
jack12
When you say IoT applications, do you mean phone apps which are dealing with
data which originated from a "thing" in the IoT, but that the data has already
gotten to the phone or cloud through some other, non-Couchbase Mobile channel?

Or is it possible to run or communicate/sync with Couchbase Mobile directly
from a "thing" itself?

~~~
jchrisa
Couchbase Mobile has an optional on device rest API we use for p2p sync. You
can also use it to push data from edge devices to a handset or base station,
but typically phones will ping devices.

------
tptacek
This is really neat. What's a realistic use case for a SQL-like database
running on a microcontroller?

~~~
geedy
Author here.

Any time you want to store historical data, and query that data later.

Very similar to how you might imagine using a database for a website, just at
a smaller scale. The project that inspired this work was a water metering
project in Kelowna, BC. Basically, a friend of mine took a bunch of
microcontrollers with soil moisture sensors, shoved them in the ground, and tried
to come up with better demand-based watering schedules. It took them two
months to develop the data management code, and they knew it could have been days
or hours with a proper database.

~~~
tptacek
Thanks! What would the storage look like for this?

(I have an ulterior motive for asking, which is that I will probably end up
shoplifting this code and embedding it into the next set of levels for our CTF
game, which is serverside-emulated AVR).

~~~
geedy
I did a fair amount of testing with arduino-based SD-card storage, but as long
as you had some sort of way to store bytes to your preferred medium, that
shouldn't be too hard to hack yourself. Check out
[https://github.com/graemedouglas/LittleD/blob/master/src/dbs...](https://github.com/graemedouglas/LittleD/blob/master/src/dbstorage/dbstorage.c)
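
To illustrate the general idea of "some way to store bytes to your preferred medium", here is a sketch of a tiny byte-store interface with a RAM-backed implementation, which is roughly what an emulated device might start from. This is NOT LittleD's actual dbstorage API; `byte_store_t`, `ram_store_t` and the function names are hypothetical, so check the linked source for the real signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-store interface; NOT LittleD's actual dbstorage API.
 * Both callbacks return 0 on success and -1 on an out-of-range access. */
typedef struct {
    void *ctx;
    int (*read)(void *ctx, uint32_t offset, void *dst, size_t len);
    int (*write)(void *ctx, uint32_t offset, const void *src, size_t len);
} byte_store_t;

#define RAM_STORE_SIZE 256u

/* RAM-backed medium; an SD card or EEPROM backend would replace this. */
typedef struct {
    uint8_t bytes[RAM_STORE_SIZE];
} ram_store_t;

static int ram_read(void *ctx, uint32_t offset, void *dst, size_t len)
{
    const ram_store_t *r = ctx;
    if (offset > RAM_STORE_SIZE || len > RAM_STORE_SIZE - offset) {
        return -1;
    }
    memcpy(dst, r->bytes + offset, len);
    return 0;
}

static int ram_write(void *ctx, uint32_t offset, const void *src, size_t len)
{
    ram_store_t *r = ctx;
    if (offset > RAM_STORE_SIZE || len > RAM_STORE_SIZE - offset) {
        return -1;
    }
    memcpy(r->bytes + offset, src, len);
    return 0;
}

/* Bind the interface to a RAM buffer. */
byte_store_t ram_store_open(ram_store_t *r)
{
    assert(r != NULL);
    byte_store_t s = { r, ram_read, ram_write };
    return s;
}
```

Code written against `byte_store_t` would not need to know whether the bytes end up in RAM, on an SD card, or in emulated AVR memory.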

EDIT: Please let me know how/if you end up using it! Or if you run into bugs!
:)

------
nickpsecurity
Interesting. Hope the author gets a usable subset there in a flexible coding
style. Might be ported to other devices and become a general thing.

Love the name, too, haha.

Note: How much of a need is there for a SQL database on 8-bitters, etc? Can't
one do that in a front-end at the client side and just have simple commands
sent over network to device? What I always did for limited or security-
critical devices. No way I'd put a whole 4GL on them lol.

~~~
geedy
My goal has always been to support a wide-variety of devices.

As for the name, I cannot take credit. There is a good story behind it though.
;)

As for the need, I sort of explained the motivation in this comment:
[https://news.ycombinator.com/item?id=10622675](https://news.ycombinator.com/item?id=10622675).
Data management would drastically reduce the development effort associated
with data-intensive applications for smaller devices, in an IoT setting or
otherwise. I've actually recently travelled to the University of Michigan to
talk about this work, and got lots of good feedback. One of the professors
invited me and has asked me to study with him, because he feels there is a
need. One of his graduate students there is working on integrating LittleD
into some of his work already!

~~~
nickpsecurity
Interesting. The main drawback I saw was that your paper is one of those
paywalled behind ACM. Many end up freely available on academics' sites,
Citeseerx, etc. Yet, some portion are stuck where few will ever look at them.
Another example recently on here was OcaPIC: Ocaml on PIC MCU's. Recent paper
locked up where people can't read and improve it.

Is there a public copy of your paper for people here to read? Might increase
interest and contributions.

~~~
geedy
[https://people.ok.ubc.ca/rlawrenc/research/Papers/LittleD.pd...](https://people.ok.ubc.ca/rlawrenc/research/Papers/LittleD.pdf)

~~~
nickpsecurity
Excellent design and tradeoffs. I doubt people would've believed you two
would've taken it this far down. Keeping a copy in case techniques like this
might apply to future 8-bit work. :)

~~~
geedy
Thank you for the feedback!

I actually have a lot of thoughts of ways to improve this too, which is why I
am applying to do a PhD in this area.

~~~
nickpsecurity
Good luck on the PhD!

------
shanemhansen
I discovered my IOT camera in my home runs sqlite.

------
myth17
Do we have transaction (ACID) support? If not, have you thought about it? Is
it feasible with the IOT constraints?

------
pbreit
In the project's defense, I didn't see the size comparison anywhere other than
this submission.

------
donpdonp
What are the actual memory usage/recommended RAM sizes for LittleD?

~~~
geedy
LittleD can be run using as little as 1KB of RAM for simple queries involving
selections (WHERE clauses), projections (not just SELECT *) and joins. For
queries with more than a couple joins, 2KB of RAM should still suffice.
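
As a general illustration (not a description of LittleD's internals), one reason a selection, projection and join can run in very little working memory is that rows can be processed one at a time and handed straight to a consumer instead of being buffered. The sketch below does this with a nested-loop join; the row types, `join_readings` and `sum_emit` are hypothetical, and the input arrays stand in for rows that a real system would stream from storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row types for two small tables. */
typedef struct { uint16_t sensor_id; int16_t value; } reading_row_t;
typedef struct { uint16_t sensor_id; uint8_t zone;  } sensor_row_t;

/* Called once per result row; nothing is buffered by the join itself. */
typedef void (*emit_fn)(void *ctx, uint8_t zone, int16_t value);

/* Roughly: SELECT s.zone, r.value FROM readings r JOIN sensors s
 *          ON r.sensor_id = s.sensor_id WHERE r.value > min_value
 * Returns the number of rows emitted. Extra memory use is constant. */
size_t join_readings(const reading_row_t *r, size_t nr,
                     const sensor_row_t *s, size_t ns,
                     int16_t min_value, emit_fn emit, void *ctx)
{
    assert(emit != NULL);

    size_t emitted = 0;
    for (size_t i = 0; i < nr; i++) {
        if (r[i].value <= min_value) {
            continue;                           /* selection (WHERE) */
        }
        for (size_t j = 0; j < ns; j++) {
            if (r[i].sensor_id == s[j].sensor_id) {
                emit(ctx, s[j].zone, r[i].value);   /* projection */
                emitted++;
            }
        }
    }
    return emitted;
}

/* Example consumer: keeps a running total instead of storing rows. */
typedef struct { size_t rows; long total; } sum_ctx_t;

static void sum_emit(void *ctx, uint8_t zone, int16_t value)
{
    (void)zone;
    sum_ctx_t *c = ctx;
    c->rows++;
    c->total += value;
}
```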

------
mirchiseth
This would be a good learning project in CS classes as well.

------
rattray
Is the size of SQLite really a problem? Is its reliability something that
hardware devs are willing to sacrifice?

(In other words, how big is SQLite and what are the size restrictions for IoT
devices – I haven't built one before)

~~~
trymas
+1

SQLite is probably one of the most well-tested pieces of software on this globe.
I do not know if this space saving is worth it.

~~~
nickpsecurity
It is if the space savings gets your product sales due to lower unit price
with big $$$ resulting. That's why there's a market for 8-16bit processors.
_Most_ developers don't want it so much as need it because getting job done
for $1 a CPU is better than using $10-$100. ;)

EDIT to add two links by Ganssle to explain it better:

[http://www.ganssle.com/rants/8bitsisdead.htm](http://www.ganssle.com/rants/8bitsisdead.htm)

[http://www.ganssle.com/articles/8and16bit.htm](http://www.ganssle.com/articles/8and16bit.htm)

I think Microchip's 8-bitters doing $1+ billion in business with better
profits and dividends vs troubles of many in 32-64 bit markets says a lot.

~~~
sitkack
Those articles are over a decade old. The size of silicon between an 8 or 16
or 32 bit ALU contained in an MCU is minuscule. Most of the space is taken up
by peripherals and this is where Microchip has historically thrived but this
is no longer so. There are a number of 32 bit MCUs that are cost competitive
with 8 bit parts.

~~~
nickpsecurity
How about this month with a market survey? ;)

[http://www.edn.com/electronics-blogs/embedded-insights/44408...](http://www.edn.com/electronics-blogs/embedded-insights/4440832/8-bit-isn-t-dying--it-s-growing)

Interesting details on how 8-bit companies are doing that from John Donovan
the year before:

[https://www.digikey.com/en/articles/techzone/2014/feb/is-the...](https://www.digikey.com/en/articles/techzone/2014/feb/is-there-a-future-for-8-bit-mcus)

We'd need a price list for the other thing. Can you get 32's with decent
memory & low watts for a few bucks a CPU in low volume? Aside from that, you
still think they're dead and useless with the new articles?

I'll add that many chips from old nodes survived for decades due to stability.
The new process nodes have all kinds of issues and break faster. So, there is
that to consider if the application is long-term, safety-critical, or
security-critical. Many in high assurance design stayed with old stuff (esp on
SOI) because manufacturing bugs were less and interference/wear issues are
lower.

~~~
Sanddancer
Yes, the tiny Freescale, Cypress, etc can be obtained for less than a dollar
in a reasonable quantity [1]. The microcontroller market is changing pretty
rapidly of late, with the number of products getting more and more broad. For
your high reliability stuff, you'd probably want to look at the cortex-r
series, which is built for the type of use cases you describe.

[1] [http://bit.ly/1NbLYov](http://bit.ly/1NbLYov) redirects to digikey
search.

~~~
nickpsecurity
Appreciate the confirmation. Outside embedded, one use for these people aren't
appreciating is offloading interrupts or security checks of I/O devices. Along
with sensors/mgmt stuff you can trust more. Cost being so low lets one do
physical partitioning of domains instead of software.

Also, thanks for the mention of Cortex-R as I hadn't heard of it. A quick
glance at their description looks good. About $8 on low end to $50 on high-
end. Getting quite cheap indeed for what 8-bits would be required for. Way
cheaper than the old champ (RCA 1802 MCU) is today ($150 w/ 1k units).

[http://www.intersil.com/content/intersil/en/products/space-a...](http://www.intersil.com/content/intersil/en/products/space-and-harsh-environment/harsh-environment/microprocessors-and-peripherals/CDP1802A.html?part=t&_charset_=UTF-8#overview)

Ok. So, I'll probably not have to go to 8-bit or drop large $$$ if I take-on
certain projects. Good to know. Dirt cheap chips with large ecosystem, lock-
step, real-time, and MPU's... it's like the golden era in tech for embedded
start-ups, eh? :)

