
A NoSQL Database with ACID Transactions - awjr
http://www.foundationdb.com/
======
krenoten
The Flow language they implemented for this is interesting.
<http://www.foundationdb.com/white-papers/flow/>

I think we are going to see eventual consistency deflate as people relearn to
love transactions. SQL also makes users of a database a lot more productive.
The bottom line is you don't want your devs spending time and effort reasoning
about database behavior, you want them focusing on business logic. In 5 years
hopefully the world will catch up a bit with Google's Megastore and Spanner.

~~~
Dave_Rosenthal
FoundationDB founder here. Flow sounds crazy. What hubris to think that you
need a new programming language for your project? Three years later: Best
decision we ever made.

We knew this was going to be a long project so we invested heavily in tools at
the beginning. The first two weeks of FoundationDB were building this new
programming language to give us the speed of C++ with high level tools for
actor-model concurrency. But, the real magic is how Flow enables us to use our
real code to do deterministic simulations of a cluster in a single thread. We
have a white paper upcoming on this.

We've had quite a bit of interest in Flow over the years and I've given
several talks on it at meetups/conferences. We've always thought about open-
sourcing it... It's not as elegant as some other actor-model languages like
Scala or Erlang (see: C++) but it's nice and fast at run-time and really helps
productivity vs. writing callbacks, etc.

(Fun fact: We've only ever found two bugs in Flow. After the first, we decided
that we never wanted a bug again in our programming language. So, we built a
program in Python that generates random Flow code and independently-executes
it to validate Flow's behavior. This fuzz tester found one more bug, and we've
never found another.)

~~~
gruseom
How does the Python program know what the random Flow code is supposed to do?

~~~
Dave_Rosenthal
Hmm, that's the tricky bit to explain in a quick post. It has an independent
code interpreter that works on the internal representation of the fuzzed code.
The output of that internal representation is fed through the Flow compiler
(then C++). There is basically a "check([random number])" statement on every
other line of code which should be reached in a known order. We compare the
check log of the fuzz tester's interpreter to the check log of the compiled
code.
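
A rough sketch of the check-log comparison described above, not FoundationDB's actual tester. It assumes a random program is a nested Python list (the "internal representation"), that `interpret` walks it directly, and that `run_compiled` stands in for the Flow-then-C++ path by rendering Python source and executing it. Every name here is made up for illustration.

```python
import random

def generate(rng, depth=0):
    """Build a random program as a nested list of statements."""
    program = []
    for _ in range(rng.randint(1, 4)):
        # Every block starts with a check, so no body is ever empty.
        program.append(("check", rng.randrange(10**6)))
        if depth < 2 and rng.random() < 0.5:
            if rng.choice(["if", "loop"]) == "if":
                program.append(("if", rng.choice([True, False]), generate(rng, depth + 1)))
            else:
                program.append(("loop", rng.randint(0, 3), generate(rng, depth + 1)))
    return program

def interpret(program, log):
    """Independent interpreter: records the order in which checks are reached."""
    for stmt in program:
        if stmt[0] == "check":
            log.append(stmt[1])
        elif stmt[0] == "if":
            if stmt[1]:
                interpret(stmt[2], log)
        else:  # loop
            for _ in range(stmt[1]):
                interpret(stmt[2], log)
    return log

def emit(program, indent=0):
    """Stand-in for the compiler path: render the program as source lines."""
    pad = "    " * indent
    lines = []
    for stmt in program:
        if stmt[0] == "check":
            lines.append(f"{pad}check({stmt[1]})")
        elif stmt[0] == "if":
            lines.append(f"{pad}if {stmt[1]}:")
            lines.extend(emit(stmt[2], indent + 1))
        else:
            lines.append(f"{pad}for _ in range({stmt[1]}):")
            lines.extend(emit(stmt[2], indent + 1))
    return lines

def run_compiled(program):
    """Execute the rendered source and collect its check log."""
    log = []
    exec("\n".join(emit(program)), {"check": log.append})
    return log

def fuzz(seed):
    """True when both execution paths reach the checks in the same order."""
    program = generate(random.Random(seed))
    return interpret(program, []) == run_compiled(program)
```

Running `fuzz` over many seeds and stopping at the first `False` would give a reproducible failing program, since the seed fully determines what gets generated.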

------
awjr
It's one of the 'bugbears' I find with new technologies where they try and
use it for something it wasn't really designed for. The most recent being the
idea of doing e-commerce with a no-sql db (mostly MongoDB). You are just using
the wrong type of DB for this.

It's nice to find a NoSQL solution that addresses this issue.

~~~
VexXtreme
A big part of the problem is developers who think that every new technology
is a golden hammer and who try to shoehorn those same technologies into roles
they are not well suited for.

A pretty obvious example is people using Redis and Mongo for applications that
would really benefit the most from relational databases. If you have a highly
relational model that would benefit from things like referential integrity and
normalization then use a relational database. If you need a massive highly
scalable KVP database, use NoSQL. I don't see why people try to use a single
approach as a solution to every problem.

I thought this video about NoSQL fanboys was really hilarious:
[http://highscalability.com/blog/2010/9/5/hilarious-video-
rel...](http://highscalability.com/blog/2010/9/5/hilarious-video-relational-
database-vs-nosql-fanbois.html)

~~~
jt2190

      > If you have a highly relational model that would benefit
      > from things like referential integrity and normalization
      > then use a relational database. If you need a massive 
      > highly scalable KVP database, use NoSQL. I don't see why 
      > people try to use a single approach as a solution 
      > to every problem.
    

Part of the problem is that people present this as a choice between relational
and non-relational models, when the choice is actually about how the database
will scale when the data set gets very large.

~~~
obviouslygreen
You're turning one false dichotomy into... I don't know the word for it, but
while how it scales can be part of choosing a database system, it definitely
is not the main difference between the relational and noSQL models. How your
data storage will scale is an important but tangential topic that should be
addressed as complementary to deciding which class of storage will help you
most effectively and efficiently model your data.

~~~
jt2190

      > ...[database scaling] definitely is not the main
      > difference between the relational and noSQL models.
    

The post I was responding to strongly implies that one simply has to decide
whether they want to use a relational model or not. I contend that framing the
question this way is contributing to the misunderstandings of what the
advantages and disadvantages of the popular noSQL databases are, which become
very apparent when one thinks of what happens when the data set gets very
large. Relational databases don't work nearly as well when you have to split
the schema across multiple databases, and you can no longer use the database
provided transactions or joins.

------
mercurial
I like the emphasis on transactions. However, the fact that it doesn't support
transactions longer than 5s and puts the onus on the client to enforce that is kind
of a turn-off. How is the client supposed to know how long the server will
take for a given operation at a random load?

~~~
itp
Our performance page (<http://foundationdb.com/performance/>) shows how we've
worked very hard to provide predictable and low latencies, even at saturating
workloads [1] (on the order of 1ms for a read and 10ms for a commit). This
means that transactions with 50 _serial_ reads are still going to complete in
two orders of magnitude less time than the maximum transaction duration (and
futures make it easy to parallelize most reads). This is more than enough for
client operations -- regardless of your database, you usually don't want to
have a user wait for 5s, hold a lock for 5s, or be subject to conflicts for
5s.

[1] Under saturating load, FoundationDB will queue transactions before
assigning them a read version (starting the 5 second window), so that
latencies within the transaction stay low. This explicit queuing also makes it
easy to prioritize transactions, so you can mix latency-sensitive and
saturating batch workloads safely.
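
A back-of-the-envelope check of the numbers above, using only the approximate figures quoted in this comment (about 1ms per read, about 10ms per commit, a 5 second limit). The constant and function names are just for illustration.

```python
READ_MS = 1        # approximate read latency quoted above
COMMIT_MS = 10     # approximate commit latency quoted above
LIMIT_MS = 5_000   # 5 second transaction limit

def serial_transaction_ms(reads):
    """Rough duration of a transaction whose reads run strictly one after another."""
    return reads * READ_MS + COMMIT_MS

total = serial_transaction_ms(50)      # 50 serial reads plus one commit
headroom = LIMIT_MS / total            # how many times this fits in the limit
max_serial_reads = (LIMIT_MS - COMMIT_MS) // READ_MS

print(total, round(headroom), max_serial_reads)
```

So 50 serial reads plus a commit come to roughly 60ms, a bit over 80 times shorter than the limit, which is the "two orders of magnitude" mentioned above in round numbers.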

~~~
mercurial
Thanks for the quick answer. Regarding long transactions, it depends on what
you are doing. Say you want to change your schema, you won't be able to do
this in a single transaction (something which, eg, Postgres lets you do).

------
makmanalp
A bit of a tangent but does anyone have any info / experience on the similar
NuoDB (NoSQL + ACID), both in terms of functionality and in general?

<http://nuodb.com/>

Does this compare?

~~~
andrewflnr
What? The page you linked says "100% SQL".

------
SeanKilleen
I see C, node, ruby, and Java support...any plans for a .NET API as well? This
seems like something that would be nice to mess around with there.

Would also be nice to understand what the intended pricing model will be (if
any). If the plans are to keep this free, that would be pretty great.

~~~
egeozcan
"We will offer FoundationDB as both a free community edition and a licensed
version with support and larger cluster capability. The community edition will
include the full capabilities of FoundationDB and will allow production
deployment.

FoundationDB licenses will have reasonable and linear pricing. The license
cost for a cluster, including support, will be similar to the operational cost
of the commodity hardware in the cluster."

From: <http://www.foundationdb.com/faq/>

~~~
EugeneOZ
"You may not deploy or use the Software in a production environment" - from
TOS <http://www.foundationdb.com/BetaLicenseAgreement.pdf>

------
ibejoeb
@Dave_Rosenthal can you address the licensing? I'd consider porting a
nontrivial application if the terms were agreeable. I'm a little confused,
though.

First:

"FoundationDB Beta 1 is ready for production use."

Then:

"The community edition will include the full capabilities of FoundationDB and
will allow production deployment."

Finally:

"FoundationDB grants you a...revocable license...for test purposes only in a
non-production environment"

~~~
Dave_Rosenthal
Yes, thanks. You are correct, the "Beta Evaluation License" only covers
evaluation use, not production. Our "GA" release will include transparent
pricing and a free community license for smaller clusters.

If you are interested in using FoundationDB Beta in production today (as some
of our customers are) you should give us a call. We can get you a license for
production use and support at a very reasonable cost.

------
rcknight
RavenDB is also ACID I believe? The foundation site makes it sound like they
are the only one.

~~~
Dave_Rosenthal
FoundationDB founder here. We think that it's awesome that RavenDB also
supports multi-node ACID transactions. However, on page 9 of
[https://s3.amazonaws.com/daily-
builds/RavenDBMythology-11.pd...](https://s3.amazonaws.com/daily-
builds/RavenDBMythology-11.pdf) they warn against relying on this capability:

"RavenDB supports multi document (and multi node) transactions, but even so,
it isn’t recommended for common use, because of the potential for issues when
using distributed transactions."

So, the differentiator is that FoundationDB is built from the ground up to
support these types of transactions at high performance levels with no
"potential issues".

~~~
Maarten88
That remark seems a bit unfair. That document is two years old, from long
before version 1.0. RavenDB is now at version 2.0. It was also built with
transaction support from the start. The default setting for TransactionMode is
Safe (<http://ravendb.net/docs/server/administration/configuration>) so most
users will use that, if there were any issues with it that would certainly be
known by now.

~~~
fry_the_guy
The current RavenDB documentation still warns against using "system
transactions" because of performance reasons ([http://ravendb.net/docs/client-
api/advanced/transaction-supp...](http://ravendb.net/docs/client-
api/advanced/transaction-support)), so I think it is still fair to say that
RavenDB was not designed for applications requiring high performance cross
node transactions.

~~~
Maarten88
System.Transactions refers to support by RavenDB of the Distributed
Transaction Coordinator service on Windows. This makes it possible to enlist
transactions in multiple systems (MSSQL, MSMQ, RavenDB, NServiceBus) in one
single transaction with the possibility of a rollback. It is not needed for
transactions inside RavenDB, and I think foundationdb does not even support
something like that (it is Windows specific, although Mono supports it on
other platforms).

------
rcb
Very nice, and congratulations. Reading the "Data Modeling" documentation
brings to mind some similarities with the MUMPS storage model (Intersystems
Caché / GlobalsDB / GT.M).. is this a correct observation? Thanks.

~~~
Dave_Rosenthal
Absolutely. Good eye.

------
toolslive
I searched the site a bit and saw no information whatsoever on what is behind
it, so I'm going to guess:

For the local store, probably something like an append-only (aka copy on write
aka persistent) B-tree-ish data structure. This guarantees good behaviour on
SSDs, and cheap/easy transactions. It can also be scaled linearly over
multiple spindles.
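
To make the guess concrete, here is a minimal copy-on-write sketch. It uses a plain binary search tree rather than a B-tree and says nothing about what FoundationDB actually does. Updates copy only the nodes on the search path, so an old root keeps describing the old state. All names are hypothetical.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    key: str
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(node, key, value):
    """Return a new root; only nodes on the search path are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)

def lookup(node, key):
    """Ordinary binary-search lookup starting from whichever root is given."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None

# Build the "committed" state.
committed = None
for k, v in [("m", "1"), ("c", "2"), ("x", "3")]:
    committed = insert(committed, k, v)

# A "transaction" works on its own root; the committed root is untouched.
working = insert(committed, "c", "changed")
working = insert(working, "z", "4")
```

Committing would then amount to publishing `working` as the new root, and aborting to dropping it, which is one way to read "cheap/easy transactions".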

For the distributed aspect, I guess they use some paxos variation (I guess
this from the separation between logging the transaction and durably applying
this)

But again: I am merely guessing.

With regard to Flow: Is it just me, or does it look like a poor man's
concurrency monad?

------
xb95
I really want to be excited by this.

For better or worse, after being in this industry for a long time now all I
can say about a new database launch is: cool, I'd love to talk to you in 3-5
years when you've had time to work out the kinks and people have figured out
what your strengths and weaknesses are and where the real gotchas will getcha.

Until then: good luck. A distributed, linearly scalable, easy-to-admin NoSQL
database with transaction support and full ACID would be delicious indeed.

~~~
Dave_Rosenthal
I totally understand this perspective. FWIW, we're trying to be as open as we
can about those "gotchas". You might be interested in our:

\- Known limitations ([http://foundationdb.com/documentation/beta1/known-
limitation...](http://foundationdb.com/documentation/beta1/known-
limitations.html))

\- Anti-features (<http://foundationdb.com/white-papers/anti-features/>)

\- Performance considerations
([http://foundationdb.com/documentation/beta1/developer-
guide....](http://foundationdb.com/documentation/beta1/developer-
guide.html#developer-guide-peformance-considerations))

Talk to you in a few years :)

~~~
xb95
FWIW, those documents are a fantastic start and contain the sort of things
that I wish I had known about MongoDB and Riak before embarking on quests to
use them.

I hope you do well -- the world really needs a good amalgamation of strong
data integrity (ACID, transactions) and the ease-of-life that distributed
databases generally give you. Since I don't work in an industry where I need
microsecond reads, I'm perfectly happy to pay the cost of "it might take 10s
of milliseconds to commit this" if I can avoid having to go down the "okay,
now I have to shard my project" path for an eighth time. :-)

------
wbl
MVCC does not provide serializability of transactions without a bit more work.
Since when can you get away without giving guarantees about what you support?

~~~
Dave_Rosenthal
I agree that MVCC does not itself guarantee transaction isolation, but
FoundationDB does indeed do the "bit more work" to guarantee all of the ACID
properties.
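
For readers wondering what the gap is, here is a toy illustration (not FoundationDB's implementation; every name is hypothetical). With snapshot reads and only write-write conflict detection, two transactions can each read what the other writes and both commit, the classic write skew. One possible form of the "bit more work" is to also validate the read set at commit time.

```python
class Store:
    def __init__(self, data):
        self.data = dict(data)
        self.version = 0
        # key -> version of the last commit that wrote it
        self.last_write = {key: 0 for key in data}

class Transaction:
    def __init__(self, store, check_reads):
        self.store = store
        self.snapshot = dict(store.data)    # naive full copy stands in for MVCC
        self.read_version = store.version
        self.check_reads = check_reads
        self.reads = set()
        self.writes = {}

    def get(self, key):
        self.reads.add(key)
        return self.writes.get(key, self.snapshot[key])

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Keys that must not have changed since this transaction's snapshot.
        guarded = set(self.writes)
        if self.check_reads:
            guarded |= self.reads
        if any(self.store.last_write.get(k, 0) > self.read_version for k in guarded):
            return False                    # conflict: caller should retry
        self.store.version += 1
        for k, v in self.writes.items():
            self.store.data[k] = v
            self.store.last_write[k] = self.store.version
        return True

def on_call_scenario(check_reads):
    """Invariant we want: at least one of alice/bob stays on call."""
    store = Store({"alice": "on_call", "bob": "on_call"})
    t1 = Transaction(store, check_reads)
    t2 = Transaction(store, check_reads)
    if t1.get("bob") == "on_call":
        t1.set("alice", "off")
    if t2.get("alice") == "on_call":
        t2.set("bob", "off")
    return (t1.commit(), t2.commit()), store.data
```

With `check_reads=False` both commits succeed and nobody is left on call. With `check_reads=True` the second commit is rejected because a key it read was written after its snapshot.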

~~~
wbl
Ah, I found the page where it claims serializability.
[http://www.foundationdb.com/documentation/beta1/developer-
gu...](http://www.foundationdb.com/documentation/beta1/developer-guide.html)

------
JoelJacobson
Any plans on releasing an SQL-layer to allow querying FoundationDB using
normal SQL? Would be nice with both SQL and ACID in the same database.

~~~
Dave_Rosenthal
No plans, but that would be nice. Others like Clustrix and NuoDB are more
directly targeting the distributed SQL database market. Often we describe
FoundationDB's core to people as more of a "storage substrate" than a
database. It is possible to build an efficient SQL database as a layer on top
of that substrate. For example, SQLite4 is choosing a transactional ordered
key-value abstraction for its internal storage engine (which exactly matches
FoundationDB's API). Of course, a SQL database is a big project involving a
lot more than just a storage engine!

------
colinhowe
Took me a while to find how FoundationDB thinks about data... it's an ordered
key-value store with nice scaling properties.
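
For anyone unfamiliar with the model, a tiny in-memory sketch of what "ordered key-value store" buys you. This is not the FoundationDB API: the class and method names are invented, and Python tuples stand in for whatever key encoding a real client would use. Because keys are kept sorted, everything sharing a prefix can be read with one range scan.

```python
import bisect

class OrderedKV:
    def __init__(self):
        self._keys = []      # kept sorted at all times
        self._values = {}

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def get_range(self, begin, end):
        """Return (key, value) pairs with begin <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, begin)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]

db = OrderedKV()
db.set(("user", 42, "name"), "Ada")
db.set(("user", 42, "email"), "ada@example.com")
db.set(("user", 43, "name"), "Grace")
db.set(("order", 7, "total"), 1999)

# All fields of user 42 come back from a single ordered range read.
user_42 = db.get_range(("user", 42), ("user", 43))
```

Here `user_42` holds the email and name entries for user 42 in key order, and nothing belonging to user 43 or to orders.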

------
EugeneOZ
"APIs for C, Python, Ruby, Node.js, and Java"

And where is PHP?

~~~
itp
PHP and .NET are the clear front-runners when we asked this question on our
community site, so while I can't commit to any specific timeline, it's safe to
say we'll be adding them soon.

------
wildchild
Too complicated to just download it. I'll stay with Postgres instead.

------
loeg
NoSQL YesACID LBJ IRT USA LSD LSD LBJ FBI CIA LSD LBJ…

------
goloxc
could not have been more disappointed

