
InfiniSQL - ahassan
http://www.infinisql.org/
======
mtravis
A few things (I'm the author of InfiniSQL)

1) I include keystore-like stored procedures in the source. They do get/set
with integer key and string val. I haven't done thorough benchmarking, but I
expect them to outperform the other benchmark I've published, which is a
considerably more complex workload.

2) (camus2) agreed, nothing ever dies in IT. _But_ roll back the clock a few
years. How much noSQL would have come into existence if there had been a free
xyzSQL that scaled across nodes, was fast, etc.? I believe the answer is that
there'd be very few network-based noSQL systems for operational workloads if
that had been the case.

3) jwatte: Yeah! Jagged edges too!

4) stephen24: Also, I intend to change the license from AGPL to GPL next time
I push out some code. No excuse not to try it out.

5) siliconc0w: There's an architectural write-up at High Scalability:
[http://highscalability.com/blog/2013/11/25/how-to-make-an-infinitely-scalable-relational-database-manag.html](http://highscalability.com/blog/2013/11/25/how-to-make-an-infinitely-scalable-relational-database-manag.html)
-- I believe that the actor model architecture is distinct in InfiniSQL.

6) diwu1989: Yes and no. Yes, MemSQL is more mature. No,

(a) I'm not sure how MemSQL scales horizontally (especially since that was a
feature added after v1 of their code was released), and,

(b) MemSQL isn't free software

7) itsbits: for now InfiniSQL is mainly for hackers and early adopters--the
dependencies are pretty clearly documented but it requires some effort to work
with in its current state

~~~
jsmthrowaway
Please consider the Apache License or some other license instead of the GPL.
There are many organizations that cannot use any flavor of GPL, including
LGPL, for legal reasons. You can debate the wisdom of that amongst yourselves,
but alas, that's how it is in some places.

(And I really want to try this...)

~~~
DannyBee
"There are many organizations that cannot use any flavor of GPL, including
LGPL, for legal reasons"

To be clear, there are no legal reasons I can think of that would ever prevent
internal use of LGPL/GPL software.

You mean these companies (Apple, for example) have policies.

Policies like this often change because someone decides the cost vs risk
tradeoff is worth it.

Changing a license because of bad policies of certain companies is not a great
reason to change a license (in fact, it's, IMHO, an actively bad one).

You really should only change licenses if you find the license you chose does
not suit the _needs_ of your users (and policies are not really needs).

~~~
philwelch
I find that to be a strangely ideological response. Your prospective users'
requirements are up to them to decide, not up to you. They're the ones who are
going to decide whether or not to use your software.

~~~
DannyBee
?? Of course they are up to the users to decide, but policies and needs are
different. I'm curious, how do you think policies like this change?

Most of the developers I've seen will happily sell you a commercial license if
you don't like the software. After paying for it enough, most companies start
to ask "well, actually, how risky is this, really?", and this is how policies
change.

In any case, my other point stands - there are no actual _legal_ reasons to
not use LGPL/GPL software internally. It would have _zero_ legal impact.

~~~
philwelch
If InfiniSQL was an established incumbent where the choice was between living
with GPL and buying a commercial license I would agree with you, but it's a
newcomer where the main choice is whether to use it at all.

------
jacob019
I'm supposed to use the perl api for user and schema management? Perl holds a
special place in my heart, but I'm not too excited about managing my database
with it. How about an interactive console?

I'm currently using MySQL, how similar is the SQL syntax?

~~~
mtravis
On backlog to fix. But InfiniSQL is for hackers and early adopters at this
stage.

The SQL support is documented
([http://www.infinisql.org/docs/index/](http://www.infinisql.org/docs/index/))

~~~
coolsunglasses
Hackers and early adopters are using Perl in 2013? Sure you aren't off by
12-15 years?

~~~
mtravis
This you?
[http://favstar.fm/users/hipsterhacker](http://favstar.fm/users/hipsterhacker)

Also, the main application is in C++. A python script launches the C++
daemons. Perl scripts are quick and dirty tests and deployment scripts. The
main hacking I'm looking for is with C++, and I don't care so much if the
other stuff gets re-implemented in some other language.

~~~
coolsunglasses
Nope, just a guy that fucks with databases.

No API, got it.

------
camus2
I believe the original subtitle is "Extreme Scale Transaction Processing" .
"The NoSQL killer" is kind of childish, nothing is going to kill anything.

~~~
yeukhon
Same thought, and with it being at an early stage, ugh. And there go at least
a dozen competitors out there trying to be different from MongoDB. I am just
sort of happy that in the SQL world we usually look at either MySQL or
PostgreSQL (well, Oracle and SQL Server are probably more relevant to
corporate web services)... but I think people are trying to migrate too.

~~~
tracker1
I think that even in a NoSQL-driven domain, a classic SQL-based RDBMS has a
place. It's that certain types of load have acceptable levels of relaxed
constraints, and that tolerance can increase when your data is searched/read
over 1000 times for every write. Joins are expensive, and even mirroring data
to a NoSQL store has benefits over a pure RDBMS.

I like document stores like MongoDB and RethinkDB and feel they are a great
fit for most scenarios. I also feel that caching layers with Redis or
Memcached can help...

Cassandra is interesting in the primary storage space as well, and imho has
resolved a lot of issues, while others remain. I'm interested to see if this
database can get there faster than Cassandra/CQL can get to more parity with
traditional SQL systems.

While I appreciate the options, there is no one solution for everything... If
you never break 100 simultaneous users, memory-mapped flat files and
map/reduce could be sufficient.

------
wimpycofounder
So...uh...how does it work? Anyone know if there is an architecture overview
somewhere? And why there isn't a link to it on the damn front page?

~~~
jfim
From their documentation:

> InfiniSQL currently is an in memory database. This means that all records
> are stored in system memory, and not written to disk. This provides very
> high performance--but it also means that InfiniSQL currently lacks the
> property of Durability. If the power goes out, all data is gone. This
> limitation is temporary.

They do mention that they'll implement persistence, but that's likely to lower
performance, as you're limited to how fast the write ahead log can be written,
even if updates to on-disk structures are batched.

They also mention:

> No sharding is necessary with InfiniSQL: it partitions data automatically
> across available hardware. Connect to any node, and all of the data is
> accessible.

I haven't looked at how joins are done across large tables that span over
multiple nodes (or if it's even supported), but that's not likely to be fast
either, for obvious reasons.
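
To make the "no sharding is necessary" claim concrete, here is a minimal sketch of hash-based automatic partitioning. It is an assumption about the general technique, not InfiniSQL's actual scheme: the names `partition_for` and `Cluster` are hypothetical, and partitions are plain in-process dicts rather than nodes on a network.

```python
import zlib


def partition_for(key: int, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32 here, as an assumption)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % num_partitions


class Cluster:
    """Hypothetical stand-in for a set of nodes; each partition is a dict."""

    def __init__(self, num_partitions: int):
        self.partitions = [dict() for _ in range(num_partitions)]

    def set(self, key: int, value: str) -> None:
        # The caller never chooses a partition; the hash decides.
        index = partition_for(key, len(self.partitions))
        self.partitions[index][key] = value

    def get(self, key: int):
        # Any entry point can route a read the same way, so all data is reachable.
        index = partition_for(key, len(self.partitions))
        return self.partitions[index].get(key)


cluster = Cluster(4)
for k in range(100):
    cluster.set(k, f"val{k}")
```

A join across two large tables would have to pull rows from several of these partitions, which is where the network cost mentioned above comes from.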

~~~
mtravis
1) persistence: battery-backed UPS and synchronous replication. No WAL
anywhere. I'm thinking about ways to do disk-based storage without synchronous
IO, to provide decent performance with higher storage capacity

2) no joins supported yet. However, the benchmark that I performed (on the
blog) involves 3 updates across random nodes. I designed InfiniSQL
specifically to perform multi-node transactions very well, because that's the
Achilles' heel of every other distributed OLTP system. I plan to implement
joins, but expect them to perform decently for the workload you describe.

~~~
jfim
Gotcha, it's for OLTP, don't know how I missed that.

Should be quite easy to do equijoins especially if you're joining a couple
thousand rows at most at a time; it only gets hairier when you're joining all
records of very large tables that don't necessarily fit in memory, which is
not very OLTP-y.

With regards to persistence, I'm really curious to hear how you're planning to
have durability without writing something to disk on every transaction. It
could work if you're relaxing the definition of durable to mean written to
memory on at least $n$ nodes, though that's likely to be surprising to someone
with a stricter definition of durable.
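
A rough sketch of that relaxed definition, under the assumption that a write counts as committed once at least `min_copies` replicas hold it in memory. `Replica` and `replicated_write` are hypothetical names for illustration, not anything from InfiniSQL.

```python
class Replica:
    """Hypothetical in-memory replica; `up` simulates whether the node is reachable."""

    def __init__(self, name: str, up: bool = True):
        self.name = name
        self.up = up
        self.memory = {}

    def apply(self, key, value) -> bool:
        if not self.up:
            return False  # no acknowledgement from a node that is down
        self.memory[key] = value
        return True


def replicated_write(replicas, key, value, min_copies: int) -> bool:
    """Synchronously send the write to every replica and count acknowledgements.

    Returns True only if at least `min_copies` replicas stored the value.
    A real system would also have to undo the write on the replicas that
    accepted it when the threshold is missed; that step is omitted here.
    """
    acks = sum(1 for replica in replicas if replica.apply(key, value))
    return acks >= min_copies


nodes = [Replica("a"), Replica("b"), Replica("c", up=False)]
committed = replicated_write(nodes, 42, "hello", min_copies=2)
```

Nothing here touches disk, so losing power on every replica at once still loses the data, which is the gap a stricter definition of durable is meant to close.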

Edit: By the way, it's really cool that you have a C++ implementation of
actors, I'll have to look into it. Have you thought about turning that into a
library?

~~~
mtravis
For durability, check out
[http://www.infinisql.org/docs/overview/#idp37053600](http://www.infinisql.org/docs/overview/#idp37053600)

I've thought about having an actor library, or minimally, to have the actor
basis of InfiniSQL independent of specific workload, but haven't thought it
through entirely. I'd be supportive of any efforts to that effect if you want
to work on it!

------
jbellis
Last week's discussion here:
[https://news.ycombinator.com/item?id=6795263](https://news.ycombinator.com/item?id=6795263)

------
siliconc0w
Can you compare InfiniSQL to existing in-memory clustered relational database
solutions like Galera?

------
diwu1989
I see this as fairly similar to MemSQL, but less mature.

------
diger44
I actually thought this was another joke at first...

------
stephen
"Not just a teaser version". Nice!

------
glibgil
It uses 2pc so it won't really scale.

~~~
mtravis
I think you mean 2PL.

It does really scale, check out the benchmark report on the blog.
[http://www.infinisql.org/blog/2013/1112/benchmarking-infinisql](http://www.infinisql.org/blog/2013/1112/benchmarking-infinisql)

For deadlock-prone workloads, it will likely not be as good, admittedly.

I'm considering a variation on MVCC that gets around the single transactionid
bottleneck, but the current implementation is based on 2PL.
[http://www.infinisql.org/docs/overview/#ftn.idp37098256](http://www.infinisql.org/docs/overview/#ftn.idp37098256)

For concurrency management algorithms, there are no good ones. Only those that
are less bad than others in some cases.
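
For readers unfamiliar with why 2PL struggles on deadlock-prone workloads, here is a minimal single-threaded sketch of strict two-phase locking with a wait-for check. It is an illustration of the general technique only; `LockManager` and its methods are hypothetical and do not reflect InfiniSQL's code.

```python
class LockManager:
    """Exclusive row locks held until commit (strict 2PL), with cycle detection."""

    def __init__(self):
        self.owner = {}      # row -> transaction holding the exclusive lock
        self.waits_for = {}  # blocked transaction -> transaction it waits on

    def acquire(self, txn: str, row: str) -> str:
        holder = self.owner.get(row)
        if holder is None or holder == txn:
            self.owner[row] = txn
            self.waits_for.pop(txn, None)  # no longer blocked
            return "granted"
        # Growing phase is blocked: record the wait and look for a cycle.
        self.waits_for[txn] = holder
        if self._has_cycle(txn):
            del self.waits_for[txn]
            return "deadlock"  # caller should abort and retry this transaction
        return "wait"

    def _has_cycle(self, start: str) -> bool:
        seen = set()
        current = self.waits_for.get(start)
        while current is not None and current not in seen:
            if current == start:
                return True
            seen.add(current)
            current = self.waits_for.get(current)
        return False

    def release_all(self, txn: str) -> None:
        # Shrinking phase: every lock is dropped at once, at commit or abort.
        for row in [r for r, t in self.owner.items() if t == txn]:
            del self.owner[row]
        self.waits_for.pop(txn, None)


locks = LockManager()
first = locks.acquire("T1", "row_a")
second = locks.acquire("T2", "row_b")
blocked = locks.acquire("T1", "row_b")   # T1 now waits on T2
cycle = locks.acquire("T2", "row_a")     # T2 waiting on T1 would close a cycle
```

When two transactions touch the same rows in opposite order, one of them has to be aborted and retried, and that wasted work is what hurts throughput on such workloads.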

~~~
MichaelGG
Have you given any more thought to ... not multithreading it? Since you're
scaling across servers, apply the same concept across cores. Presto, no more
bottleneck on atomically incrementing an ID.

~~~
mtravis
Good thinking, but I think that shifts the issue--namely, that each inter-
thread message uses atomic compare and swap to create the message. I assume
there'd be a similar bottleneck on the actor that generates the transactionid
limited by the number of messages it can send & receive.

Instead, a friend and I have been thinking about how to perhaps modify MVCC to
work with distinct transactionid's per partition. Namely, I'm already
generating what I call "subtransactionid"'s for each partition involved in a
transaction. And those must be ordered for synchronous replication, so I think
the way to implement a variation on MVCC may already be mostly there.
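
A loose sketch of the per-partition idea, assuming each partition hands out its own monotonically increasing id so that no single global counter is shared. The names `Partition` and `run_transaction` are hypothetical, and the sketch covers only id assignment and ordering, not MVCC visibility rules or replication.

```python
import itertools


class Partition:
    """Hypothetical partition that orders its own changes with a local counter."""

    def __init__(self, pid: int):
        self.pid = pid
        self._ids = itertools.count(1)  # local sequence, not shared across partitions
        self.log = []                   # (subtransaction id, transaction name, change)

    def apply(self, txn_name: str, change: str) -> int:
        sub_id = next(self._ids)
        self.log.append((sub_id, txn_name, change))
        return sub_id


def run_transaction(txn_name: str, changes):
    """Apply each (partition, change) pair and return the id each partition assigned."""
    return {partition.pid: partition.apply(txn_name, change)
            for partition, change in changes}


p0, p1 = Partition(0), Partition(1)
ids_t1 = run_transaction("t1", [(p0, "x=1"), (p1, "y=2")])
ids_t2 = run_transaction("t2", [(p1, "y=3")])
```

Each partition's log is totally ordered by its own ids, which is the ordering a replica of that partition would need to replay changes in the same sequence.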

I know I still owe you an architectural doc...fixin' ta, ya know.

------
itsbits
so many dependencies to install...

------
jwatte
Oooh! Shiny!

