

MemSQL ships 2.0, scales across hundreds of nodes, thousands of cores - nikita
http://developers.memsql.com/blog/real-time-analytics-platform/

======
SomeCallMeTim
Is this one of those, "If you have to ask how much it costs, you can't afford
it?" situations? Because "Try it now Free!" is the only thing I can see on
their site related to a cost, and "First one's free!" rarely means it will
always be free. :|

~~~
vosper
Their resource estimator starts off at 1640 cores [1] and even the bottommost
tick on the scale represents 500 cores. For people who can dedicate that
amount of hardware to their database, the licensing cost is probably not a
major component.

The numbers look impressive, though.

[1] <http://www.memsql.com/why-memsql#scale-out>

~~~
jufo
Try again - I found the bottom ticks were 8 cores and 256GB.

~~~
vosper
Fair enough - I had considered that to be on the axis since it's the minimum
value, and thus not a tick, but visually it certainly is represented as a
tick.

At any rate, the point I was trying to make is that they clearly expect you to
throw a lot of hardware at these systems.

~~~
SomeCallMeTim
Sure, but a lot of projects LIKE this have an open source project for those
people NOT in the enterprise, and who probably won't be running more than 8-24
cores or so.

On a related note, changing the amount of RAM seems to only change the amount
of RAM, not any of the other numbers. I guess that's their point, but I would
rather they just SAY that than make me try a bunch of options to show that RAM
doesn't matter. :|

------
curiousDog
I'm surprised these guys haven't been sued by Microsoft yet. I think the
founder himself worked on a similar project at MSFT called Hekaton and then
left. At least that is what I heard from a couple of devs he tried to recruit
out of SQL. I'd advise startups to be wary of jumping ship.

------
CurtMonash
I blogged about this in some detail this morning.

<http://www.dbms2.com/2013/04/23/memsql-scales-out/>

------
xtacy
Nice product. I'll try this out.

It seems like the isolation level is Read Committed. Are there plans to
support higher isolation levels (Serialisability)?

~~~
nikita
This is in the books. It's tricky to do it on the cluster and has perf
implications, but it is doable and we have a design for that.

------
RoboTeddy
Looks great!

Quick question: since data is uniformly spread across n leaf nodes, do queries
that require checking a number of rows >> n hit nearly every leaf? If so, does
this create latency problems when n is large? (since it'd only take one slow
request out of n to cause high latency)
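
A back-of-the-envelope sketch of the concern in this question, assuming each leaf is independently "slow" with some small probability (the 1% figure and the function name are illustrative assumptions, not MemSQL measurements):

```python
def p_any_slow(n_leaves: int, p_slow: float) -> float:
    """Probability that at least one of n independent leaf requests is slow.

    A fanned-out query waits for its slowest leaf, so this is also the
    probability that the whole query is slow.
    """
    return 1.0 - (1.0 - p_slow) ** n_leaves


# Assumed per-leaf slow probability of 1%, for illustration only.
for n in (1, 10, 100, 1000):
    print(n, round(p_any_slow(n, 0.01), 3))
```

Under these assumptions the chance of hitting at least one slow leaf grows quickly with n, which is why fan-out width matters for tail latency.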

~~~
ankrgyl
Yes, we will fan out queries when necessary. In a hardcore OLTP workload this
will affect latency across the system by flooding the network (if you send too
many of these queries at once) but we expose knobs that let you limit these
queries' parallelism to keep the rest of the system really fast.
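
The general idea of capping fan-out parallelism can be sketched as follows. This is not MemSQL's actual knob or API; `query_leaf`, `fan_out`, and `max_parallel` are hypothetical names, and a thread pool stands in for whatever mechanism the product really uses:

```python
from concurrent.futures import ThreadPoolExecutor


def query_leaf(leaf_id: int) -> int:
    # Hypothetical stand-in for sending a sub-query to one leaf node.
    return leaf_id * 2


def fan_out(leaf_ids, max_parallel: int):
    """Query every leaf, but keep at most max_parallel requests in flight."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in the same order as leaf_ids.
        return list(pool.map(query_leaf, leaf_ids))


results = fan_out(range(8), max_parallel=2)
print(results)
```

A lower `max_parallel` makes each fanned-out query take longer but leaves more network and CPU headroom for other queries.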

------
brianatdts
the problem with scaling out to multiple cores with a focus on ram is that with
larger datasets you end up trading disk latency for network and protocol
latency. I am not sure that is a great trade even if we are talking about fiber
channel as a medium.

~~~
dennis82
I have to disagree; disk is ancient - it's mechanical egads! - while 10GigE is
pretty commonplace now and infiniband and fiber channel are even faster.

back from my CS 101 takeaways: there are only 3 bottlenecks in a computer
system: CPU, network, and IO.

looks like MemSQL is fixing the CPU and IO bottlenecks, but physics is physics
so network is pure hardware solution haha

~~~
simcop2387
The problem is that you can end up with larger latency over the network
because it still takes a fixed amount of time for nodes to communicate. Even
with a 1TB/s link between nodes you can still have a good 30ms between them
all adding even more latency. That can be mitigated somewhat by a good
protocol that can manage that latency properly (e.g. not blocking while
waiting on ACKs and such), but it can still end up with far more latency than a
few large disks would have (even better now with SSDs). That said, I do imagine
that some datasets will benefit from this kind of topology (I can imagine that
geospatial stuff will do well with that, since you can locate physically close
things on a single machine and reduce the amount of talking needed).

~~~
TylerE
30ms? In anything resembling a modern datacenter? 0.3-0.5ms is more typical
these days.
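
To put the figures from this subthread side by side, here is a rough model in which a request costs one round trip plus the payload's transfer time. The 1 MB payload and the function name are assumptions for illustration. The 1 TB/s link, the 30 ms delay, and the 0.3 ms figure (the low end of the range quoted here) come from the comments above:

```python
def request_time_ms(rtt_ms: float, payload_bytes: float,
                    bandwidth_bytes_per_s: float) -> float:
    """Round-trip delay plus time to push the payload through the link."""
    transfer_ms = payload_bytes / bandwidth_bytes_per_s * 1000.0
    return rtt_ms + transfer_ms


PAYLOAD = 1_000_000  # assumed 1 MB response, illustrative only
LINK = 1e12          # the 1 TB/s link mentioned in the thread

for rtt in (30.0, 0.3):
    print(rtt, request_time_ms(rtt, PAYLOAD, LINK))
```

In this model the transfer term is negligible on such a link, so the total is dominated by the round-trip delay. That is why the disagreement over 30 ms versus sub-millisecond matters more than the bandwidth.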

------
resu
How does this compare to kdb+? This seems like a much less arcane competitor.

~~~
ericfrenkiel
kdb+ is compressed columnar in memory on a single box with a very exotic
language called Q. memsql is row-based in memory across n-machines using SQL.
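
A toy sketch of the row-versus-column distinction, using plain Python containers. This is not how either kdb+ or MemSQL stores data, compression is left out, and the sample table and helper names are made up:

```python
# Row-based layout: each record is stored together.
rows = [
    {"symbol": "AAA", "price": 10.0, "qty": 5},
    {"symbol": "BBB", "price": 20.0, "qty": 3},
    {"symbol": "AAA", "price": 11.0, "qty": 4},
]


def to_columns(rows):
    """Columnar layout: each field is stored as its own contiguous list."""
    return {name: [r[name] for r in rows] for name in rows[0]}


columns = to_columns(rows)


def total_qty_rows(rows):
    # Touches every record even though only one field is needed.
    return sum(r["qty"] for r in rows)


def total_qty_columns(columns):
    # Scans a single list; the other fields are never read.
    return sum(columns["qty"])


def fetch_row(columns, i):
    # Rebuilding one record means touching every column.
    return {name: values[i] for name, values in columns.items()}


print(total_qty_rows(rows), total_qty_columns(columns), fetch_row(columns, 1))
```

Aggregating one field favors the columnar form, while reading or writing whole records favors the row form.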

~~~
resu
kdb+ is also very fast for real-time time series analysis and signal
generation. Seeing Morgan Stanley and Credit Suisse in the customer list made
me wonder if memsql could become a competitor in the niche that kdb+ currently
dominates?

~~~
CurtMonash
Yes, if we note that there are core niches where nobody will replace kdb+ any
time soon.

------
vugu
Great design on the site! Congrats to the memsql team

------
rip747
why is it that every company that has a blog either a) doesn't have a link or
b) buries the link to their main product site on their blog?

~~~
paddy_m
Often because the blog is run on a separate software platform from the main
site.

~~~
CurtMonash
Still no excuse.

I've educated multiple vendors on the subject. E.g., see the last point on
<http://www.strategicmessaging.com/marketing-communications-tips/2012/12/09/> :)

------
gtrubetskoy
Where is the source code?

~~~
hackerboos
It's commercial.

------
dennis82
what are the major feature updates on this release compared to last one?

~~~
nikita
there are a lot of new features! Beyond distributing data across multiple
machines in a cluster, there's more SQL surface area, multiple levels of
redundancy for HA, and a distributed query optimizer. Some cool stuff with
bidirectional lock-free skiplists too w.r.t. indexes.

