Why use a self-made Lisp here? Why not an existing one? Is it because regular Lisps run on a VM? And if so, what's the tradeoff?
On a side note, I really don't know if the savings will be worth it once the company starts growing and hiring. They'll spend a lot of time maintaining very custom code, running on a custom implementation of a language, alongside an Erlang codebase... unless they open source everything to get a community to grow the user base for their tool.
Probably has something to do with this statement, repeated many times on the page:
We are able to do this, because starting our LISP-VM and doing all these processes on every
request, is still many times faster than having a VM (with a garbage collector) online all the time.
Self-made LISPs usually aren't that complicated to write, and if you're taking shortcuts like preallocating chunks of memory rather than writing a full-blown VM GC system, then things become even easier. My guess is that most of the LISP code doing the data analysis is composed of calls into primitives written in C that do all the heavy lifting so the performance hit of an interpreted or bytecode-interpreted language is minimal.
Having to maintain a small LISP interpreter in C is definitely an extra thing to keep up with, but you have to balance that against the work that would normally go into avoiding long GC pauses once you start tracking a nontrivial amount of objects/data in a VM with garbage collection. Java, for instance, doesn't support separate heaps, which means that a memory-hungry background processing thread that is haphazardly allocating objects can cause "core" threads doing I/O to pause for several seconds for full GCs. Even with Java's fairly sophisticated heap management schemes, it is still very difficult to design a system that completely avoids full GC pauses. Erlang is somewhat better in this respect but introduces its own set of problems to the mix.
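For illustration, the preallocated-chunk approach boils down to something like the sketch below: carve objects out of a block allocated up front and throw the whole block away when the request is done, so there is never a tracking collector around to pause you. (The names are mine, not LDB's.)

    /* Bump ("arena") allocator over a preallocated chunk: no per-object
       free, no tracking GC; the whole request's memory dies at once. */
    #include <stddef.h>
    #include <stdlib.h>

    typedef struct {
        char  *base;   /* start of the preallocated chunk       */
        size_t cap;    /* total size of the chunk in bytes      */
        size_t used;   /* bump pointer: bytes handed out so far */
    } arena_t;

    static int arena_init(arena_t *a, size_t cap) {
        a->base = malloc(cap);
        a->cap  = cap;
        a->used = 0;
        return a->base ? 0 : -1;
    }

    static void *arena_alloc(arena_t *a, size_t n) {
        n = (n + 7) & ~(size_t)7;              /* keep allocations aligned */
        if (a->used + n > a->cap) return NULL; /* chunk exhausted          */
        void *p = a->base + a->used;
        a->used += n;
        return p;
    }

    static void arena_release(arena_t *a) {    /* end of request: one free */
        free(a->base);
        a->base = NULL;
        a->cap = a->used = 0;
    }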
In any case, I agree with your sentiment (probably would not be the angle I would have taken), but I can kind of see why the authors might have decided to go this route.
That's for the querying part, but how did you deal with indexing? If I understand right, you precompute pretty much everything you need on the fly, so I guess that means custom data structures not relying on relational algebra; no pkey/fkey tables and indexes that would let you write new queries after you've stored the data?
Since garbage collection seems to be such a big issue, does anyone know of an effort to bring an Objective-C-like language, with automatic reference counting and retain/release memory management (as with Clang's ARC), to the server side?
PS: I'm mentioning Objective-C here because it's the only language I know that does it that way, not because of its features as a language.
I'm pretty sure CPython reference counts objects. Also Objective-C -- as in GNUStep, if you aren't using a Mac server.
EDIT: There's a downside to reference counting in that either you or the runtime does the retain/release calls very often, whereas a mark-and-sweep collector would just touch each object once per collection. So reference counting might not be faster overall, but typically avoids long pauses. What you probably really want is something like Azul's JVM, which has a kernel extension so they can collect memory concurrently, resulting in shorter pauses than sweeping and faster overall time than reference counting.
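For reference, manual retain/release in the Objective-C spirit boils down to roughly this (a sketch, not any particular runtime's implementation); the cost is the constant counter traffic, the payoff is that reclamation is deterministic and spread out instead of happening in one long pause:

    #include <stdlib.h>

    typedef struct obj {
        int   refcount;
        void *payload;
    } obj_t;

    static obj_t *obj_new(void *payload) {
        obj_t *o = malloc(sizeof *o);
        if (!o) return NULL;
        o->refcount = 1;            /* the creator holds one reference */
        o->payload  = payload;
        return o;
    }

    static obj_t *obj_retain(obj_t *o) {
        if (o) o->refcount++;       /* every new owner pays a small cost */
        return o;
    }

    static void obj_release(obj_t *o) {
        if (o && --o->refcount == 0) {
            free(o->payload);       /* reclaimed immediately, no sweep phase */
            free(o);
        }
    }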
We employ a technique similar to Obj-C's in our memory handler implementation. You attach objects not to a parent object, but to a generation. We have 4 a-priori-specified generations per node (2 of them are stubborn and only die during the de-initialization/clean-up phase of the node as a whole).
And this is another area where a functional paradigm makes sense. Mutations are "hidden" (and the C layer that's responsible for doing them concurrently lives in another "realm" (monad)). So, memory handling is easy and monitoring has shown us that fragmentation is kept to a minimum.
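A rough sketch of what "attach allocations to a generation, retire the generation as a unit" can look like in C; the structure and names here are hypothetical, not LDB's:

    #include <stdlib.h>

    typedef struct alloc_node {
        void              *ptr;
        struct alloc_node *next;
    } alloc_node_t;

    typedef struct {
        alloc_node_t *head;     /* everything allocated against this generation       */
        int           stubborn; /* 1 = only retired at node de-initialization/clean-up */
    } generation_t;

    static void *gen_alloc(generation_t *g, size_t n) {
        alloc_node_t *node = malloc(sizeof *node);
        if (!node) return NULL;
        node->ptr = malloc(n);
        if (!node->ptr) { free(node); return NULL; }
        node->next = g->head;   /* attach to the generation, not to a parent object */
        g->head = node;
        return node->ptr;
    }

    static void gen_retire(generation_t *g) {   /* one pass frees the lot */
        alloc_node_t *n = g->head;
        while (n) {
            alloc_node_t *next = n->next;
            free(n->ptr);
            free(n);
            n = next;
        }
        g->head = NULL;
    }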
Yeap, the problem was the VM.
And we are going to open-source most of it.
We tried to find something that was fast, cheap to scale (in terms of servers) and easy to extend but we couldn't find any other solutions. The problem is having everything in memory and doing correlations really fast.
S-expressions are a nice representation for ASTs in general. At work we needed a query language for an API, so we used the data structures we had (JSON-ish) to express a little Lisp and then wrote a parser from an SQL-ish language to this Lisp.
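As a toy illustration (not the commenter's actual code): an SQL-ish query like SELECT name FROM users WHERE age > 30 might parse into the s-expression (select (fields name) (from users) (where (> age 30))), and the AST behind it is nothing but nested pairs:

    #include <stdio.h>
    #include <stdlib.h>

    /* A cell is either an atom (symbol) or a pair; lists are chains of pairs. */
    typedef struct cell {
        const char  *atom;        /* non-NULL => leaf symbol       */
        struct cell *car, *cdr;   /* used when atom is NULL (pair) */
    } cell_t;

    static cell_t *atom(const char *s) {
        cell_t *c = calloc(1, sizeof *c);
        c->atom = s;
        return c;
    }

    static cell_t *cons(cell_t *car, cell_t *cdr) {
        cell_t *c = calloc(1, sizeof *c);
        c->car = car;
        c->cdr = cdr;
        return c;
    }

    static void print_sexp(const cell_t *c) {
        if (!c) return;
        if (c->atom) { printf("%s", c->atom); return; }
        printf("(");
        for (; c; c = c->cdr) {
            print_sexp(c->car);
            if (c->cdr) printf(" ");
        }
        printf(")");
    }

    int main(void) {
        cell_t *where = cons(atom("where"),
                        cons(cons(atom(">"), cons(atom("age"), cons(atom("30"), NULL))), NULL));
        cell_t *query = cons(atom("select"),
                        cons(cons(atom("fields"), cons(atom("name"), NULL)),
                        cons(cons(atom("from"),   cons(atom("users"), NULL)),
                        cons(where, NULL))));
        print_sexp(query);   /* prints: (select (fields name) (from users) (where (> age 30))) */
        printf("\n");
        return 0;
    }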
It's not just a custom Lisp (which I wouldn't be too worried about), it's a custom DB.
I wonder how much engineering talent is going to get sunk into writing a DB, and whether management is going to get fidgety while the company's best talent is writing a DB rather than the product? (Especially with... two guys.)
Having a small (but very strong) R&D team helps you eliminate most of these "processes". The buzzword here is "iterations". Lots of them.
Our clients were involved since day 0 (which I think is the key to our success, btw). The first prototype (built in two days) was in Erlang using Mnesia.
The drawbacks are:
+ not having lots of engineers who know C and Erlang
+ outsiders having problems with LISP (that's why we implemented PIG)
Mnesia is by far the fastest (and easiest to use) thing we've ever tried. The only problem was that keeping data in memory with it was more expensive (and raw pointer arithmetic is so damn fast). And (the most important thing) there is the 2GB limit Dets has :/
I'm interested in the missing part: what data structures/layouts are used inside LDB? Because for an in-memory DB, CPU cache misses play a key role in query execution. And how much data is touched to execute a single query in the common case?
Cache misses are nothing compared to initializing the Lisp engine, so we don't worry much about them. Also, since the data organization is up to us, we touch only what is necessary (well, almost! We are talking about fine-grained locking here, but we modify (mutate) only the datum that needs to be changed).
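A minimal sketch of what fine-grained locking like that can look like, assuming one lock per record; this is illustrative only, not LDB's actual scheme:

    #include <pthread.h>

    typedef struct record {
        pthread_mutex_t lock;   /* protects just this one datum */
        long            value;
    } record_t;

    static void record_update(record_t *r, long delta) {
        pthread_mutex_lock(&r->lock);   /* contend only on the datum being changed */
        r->value += delta;
        pthread_mutex_unlock(&r->lock);
        /* other records (and their readers) are never blocked by this write */
    }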
It has to do with semantics and the way you view your data. SQL is not a Turing-complete language (without abusing it!) and represents a subset of relational algebra. For us, relations might seem to resemble those of an RDBMS at first sight, but we work with them in a different way.
Also, the data format for input is not tightly-coupled with the LDB system. One can have different processing scripts (lisp) that may organize the data in a different way. Or you can write a C module that extends the underlying db engine to work with your implementation or even your database infrastructure (whether it includes SQL, NoSQL, etc).
I'm not sure if I understand your answer. To quote from the article again:
During our tests, we saw that SQL databases weren’t a good fit due to the fact that our data were unstructured and we needed a lot of complex “joins” (and many indexes).
I guess that you say it's unstructured because the most important parts of bug reports are stack traces and debug messages? My naive assumption is that stack traces can be stored in a relational way, and it seems that you do:
For us, relations might seem to resemble those of an RDBMS at first sight, but we work with them in a different way.
From 'unstructured', 'complex "joins"' and 'work in a different way' I infer that you do a lot of string processing on the fly, and the real problem was about precomputing everything vs. staying flexible?
Self-made Lisp, in-memory database and hacking the Linux kernel: what's not to like about it ; )
I'm wondering: how does one decide to change TCP_TIMEWAIT_LEN from 60 seconds to 20 seconds? 20 seems an arbitrarily picked value: I understand that by doing so they can drastically cut down on the number of connections lingering open after the last FIN is received, but why 20 instead of, say, 17, or 9?
I think the value of 20 seconds is arbitrary, but there's a good reason why they don't lower it any further. With a protocol like TCP that guarantees reliable delivery, sometimes packets get resent to the server. If the server receives this resent packet after the connection has been closed and a new connection has been opened, it is possible that the server will accept this packet. As such, as part of closing a TCP connection, the connection sits in a TIME-WAIT state for 2*MSL seconds (MSL = maximum segment lifetime).
If you decrease the amount of time a connection sits in a TIME-WAIT state, you will increase the probability that a new connection could receive a packet from a previous connection. Given the latency of mobile networks, I would expect the probability of receiving a packet from a previous connection to be even higher.
The question was not "how do you do it once you decide" but "how do you decide on such an arbitrary replacement".
Also: "Do not be confused by the /proc/sys/net/ipv4/tcp_fin_timeout config item. The FIN TIMEOUT is not the same as the TIMEWAIT length." -- http://www.stolk.org/debian/timewait.html
The value was picked heuristically after measuring through Ganglia and a set of CLI tools the average TCP dialog duration between a set of mobile devices and our database and then performing a set of tests with varying TIMEWAIT lengths. The 20 second period is a sweet (and round :)) spot with < 0.01% of packets / day arriving after FIN. It's a good approach all in all, since our expected payload packet set is extremely small (just 1 on broadband connections with large MTUs).
- Panagiotis Papadomitsos, Head of Infrastructure @ BugSense