
MemSQL Launches Unlimited Community Edition - ericfrenkiel
http://blog.memsql.com/memsql-community-edition/
======
stock_toaster

      > The Community Edition is distributed as an executable 
      > binary and is a free edition of the commercial MemSQL 
      > Enterprise Edition. You are free to download and use 
      > MemSQL Community Edition within your organization.
    

So.. how long until the same thing happens as happened with FoundationDB?

~~~
capkutay
I think the FoundationDB acquisition by a company with no interest in selling
enterprise products was an anomaly. A popular, commercial enterprise storage
system that actually makes money would be an acquisition target for the likes
of Oracle, SAP, EMC, etc. In that scenario, the acquiring company would have a
significant interest in increasing adoption of the product and maintaining the
developer community, versus completely shutting the product down.

~~~
willvarfar
You mean like Oracle with MySQL? At least in that case the 'community' could
move to MariaDB, which is not an option for non-Free databases like MemSQL.

~~~
ericfrenkiel
You can do the very same - MemSQL uses the MySQL wire protocol, so it works
with any MySQL driver and tool.

~~~
lukaslalinsky
If MySQL was a drop-in replacement for MemSQL, you wouldn't need MemSQL in the
first place. The reason you chose MemSQL is probably because it offers
something that MySQL doesn't. If I can't take the source and continue using
the product, it's a very different situation from MySQL/Oracle.

------
phamilton
Cue "Call me maybe, MemSQL"

Aphyr's posts (taken with appropriate amounts of salt) have become the
authority on marketing claims. That said, many solutions are perfectly viable
with their shortcomings, but knowing what those shortcomings are is essential.

~~~
nikita
When MemSQL is configured for synchronous durability and all databases are
configured with synchronous replication, Jepsen confirms that writes to MemSQL
are durable and acknowledged appropriately. As part of testing for the 4.0
release, we ran the Jepsen network partition test and observed data durability
equivalent to Postgres
([https://aphyr.com/posts/282-call-me-maybe-postgres](https://aphyr.com/posts/282-call-me-maybe-postgres)),
i.e. the scalability of a cluster with the durability of a single-node
machine. While running this test, we noticed some opportunities for cool
performance optimizations. Stay tuned for a blog post to follow!

~~~
seanp2k2
^ would love to see the details of this in a blog post, yeah. Please submit
that on HN when it's ready; I'm sure I'm not the only one interested :)

~~~
nikita
will do!

------
ericfrenkiel
Eric, one of the cofounders, here. Happy to answer any questions on MemSQL 4
and the Community Edition. Some new features in MemSQL 4:

\- fully distributed joins

\- native geospatial index and datatypes

\- lots of new SQL surface area

\- concurrency improvements

\- analytic optimizer

\- Spark, HDFS, and S3 connectors

~~~
mbesto
Hi Eric!

Not sure if you remember me, but we spoke several (5?) years ago when you guys
first started. I was the SAP HANA guy and I think we were talking about the
landscape of in-memory solutions back then. First off, congrats on the success
so far. Second, a few questions:

\- How does MemSQL compare to HANA and Vertica? My understanding is that
MemSQL provides the same infrastructure (columnar in-memory storage) as those
solutions but runs on commodity hardware (HANA, for example, is
hardware-vendor locked).

\- One of the interesting topics that has come up in the HANA space is that
it's expensive to maintain and scale. Specifically, provisioning new servers
for data growth and archiving old data out of memory. Are these issues present
at all in MemSQL?

\- Lots of your customers seem to be using it for company-specific strategic
solutions. Are any using it for operations? (like financial close reporting,
or as a transactional DB)

~~~
ericfrenkiel
Of course we remember you. Please stop by our new office!

You are right about the commodity hardware. The other difference from HANA is
that MemSQL rowstores live in memory for high-throughput applications, while
columnstores can be stored on flash or disk. So it's economical to scale
MemSQL to very large datasets.

\- MemSQL is very easy to scale. It comes with an ops dashboard that lets you
add nodes with just a few clicks.

\- There are a lot of different use cases. Some companies use us for
operational reporting, end of day financial reporting, high throughput
counters, real-time risk analysis, etc

~~~
mbesto
I might take you up on that! Shoot me your contact details so I can set
something up (I just did a search of my emails and can't find anything). My
contact info is in my profile.

Thanks!

~~~
nikita
eric and nikita, at memsql.

~~~
mbesto
Cheers!

------
menegattig
Our company (Simbiose) recently ran a heavy stress test of MemSQL with
billions of JSON rows, using complex JOIN queries, and the results are simply
AMAZING.

------
jansc

      > While you are free to use Community for your projects,
      > MemSQL does not support or endorse using it in production.
    

([http://blog.memsql.com/memsql-community-edition/](http://blog.memsql.com/memsql-community-edition/))

Ehhh. Do they mean that the Community Edition is only usable for development?

~~~
ericfrenkiel
For critical deployments, the Enterprise Edition has high availability,
cross-data-center replication, and more. You can use the free edition however
you please.

------
superlogical
The CloudFormation cluster generator tool they have is really cool
([http://cloud.memsql.com/cloudformation](http://cloud.memsql.com/cloudformation)).
The templates it generates are pretty complex; I wonder if they are
hand-written or produced with some kind of tool. Would anyone be able to shed
some light on how they did it? Do you think most of it is hand-coded? I've
been playing around with the .NET SDK in Visual Studio 2013; it includes a
CloudFormation project type, and you can type out the JSON with IntelliSense,
which is pretty cool.

~~~
ankrgyl
Hi @superlogical, thank you for using it and for the kind feedback!
CloudFormation by itself provides a very basic level of conditional/looping
logic, but we found that it was not yet enough to provide a stellar
experience. So, when you fill out the form on cloud.memsql.com, we
auto-generate a template (we wrote the code to do this) that matches the
parameters you filled in, upload it to an S3 bucket on our account, and then
expose it as a download or via GET directly to your AWS account (i.e. you own
the hardware/database/data).
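
For a rough sense of what this kind of programmatic template generation looks
like, here is a hypothetical Python sketch. MemSQL's actual generator is not
public; the resource names, parameters, and AMI ID below are all placeholders:

```python
import json

def make_template(num_nodes, instance_type):
    """Generate a minimal CloudFormation template from form parameters.
    Illustrative only: real templates add networking, security groups, etc."""
    resources = {}
    for i in range(num_nodes):
        resources["MemsqlNode%d" % i] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": instance_type,
                "ImageId": "ami-00000000",  # placeholder AMI
            },
        }
    return json.dumps({
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": resources,
    }, indent=2)

print(make_template(3, "m3.large"))
```

Generating the JSON in a real language sidesteps CloudFormation's limited
built-in looping, which matches the approach described above.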

------
aklarfeld
This is way cool. Already spinning it up on my AWS clusters. Such a quick
setup. Super stoked to use it.

~~~
rjonesx
It is way cool. I've been an enterprise user for some time now and am excited
to get to use it on other projects for which there wasn't cost justification
in the past.

------
BradRuderman
When can we expect window functions, specifically row_number, rank, count,
lead, and lag with PARTITION BY?

~~~
aristus
All I can say is "it's on the roadmap." :) I used to do a _lot_ of analytics
at Facebook and am now a PM at MemSQL, so it's one of my priorities.

------
matt2000
Would love to hear from any existing users on their experiences so far
(assuming that's allowed under previous licenses). Choosing a database is one
of those decisions where I tend to go with the safest, well known option, but
maybe I'm missing out.

~~~
rjonesx
I have been an enterprise user and had the luxury of using the 4.0 beta for
the last month or so. I run a cluster of 18 machines with 192 cores and 540GB
of RAM.

Impressions:

MemSQL is remarkably stable. I actually have one machine running the old
MemSQL 1.0 beta that has not rebooted in months. 4.0 is similarly stable. The
only problems happen when you run too many other processes on the aggregators
(which is really just me being stupid).

Speed is great, and the wire compatibility with MySQL makes it very easy to
develop for. To be honest, the "keeping the data in memory" part isn't the
best part; it is the query compiling. It is incredibly fast. Often a query
that would take 30 sec to 1 min to execute runs in fractions of a second once
compiled. It is very cool to watch and never gets old.

We are looking to move literally all of our internal stuff to MemSQL
Community Edition while keeping our customer tools on Enterprise.

------
olig15
Slightly off-topic: the font weight is too light to read properly on my PC
(Windows 8.1, Chrome). I stopped reading because it was too much effort.

------
peterplaylyfe
Lol. I got what I asked for in the Quora question?

------
DannoHung
Are there any optimizations or explicit support for proximate ordered joins?

~~~
nikita
Could you please elaborate? Do you mean approximate joins as in this talk?
[http://www2.research.att.com/~divesh/papers/ks2005-aj-tutorial-talk.pdf](http://www2.research.att.com/~divesh/papers/ks2005-aj-tutorial-talk.pdf)

~~~
DannoHung
No no, sorry, it's much simpler (at least in how it works, no guarantees on
implementation complexity, of course). It's an issue that comes up in
timeseries databases pretty often.

Say I have a table full of quotes and a table full of trades. I want to know
what the quote price was at the time the trade occurred. In no-frills SQL,
that translates into something like:

    
    
        select *
        from t
        left outer join q
          on q.time = (select max(time) from q
                       where time <= t.time and sym = s)
         and t.sym = q.sym
        where date = d and sym = s
    

If you have some sort of support for time proximate joining, the query engine
only has to perform a binary search (as long as the indices on the symbol and
time columns are appropriate, say symbol partitioned, then time sorted within
symbol) to find the correct row from quote to join against. If it doesn't have
such support, then a scan is required to find the maximum time value from the
quote table that satisfies the constraint on the trade time. Presumably, if
you do have support, this wouldn't be the exact query syntax, because that
would heavily imply that you _want_ to perform a table scan, or that you could
change some aspect of the subquery without affecting performance. Maybe you'd
have it be something like this:

    
    
        select *
        from t
        left outer join q
          on before(t.time, q.time)
         and t.sym = q.sym
        where date = d and sym = s
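
The binary-search lookup this describes can be sketched in a few lines. This
is an illustrative Python model with made-up quote/trade data, not anything
from a real database engine:

```python
import bisect

# For each trade, find the latest quote at or before the trade's timestamp.
# Assumes quote times are already sorted within each symbol (as they would
# be with a symbol-partitioned, time-sorted index).
quotes = {  # sym -> (sorted times, prices)
    "IBM": ([1, 5, 9], [100.0, 101.0, 99.5]),
}

def asof_quote(sym, t):
    times, prices = quotes[sym]
    i = bisect.bisect_right(times, t) - 1  # rightmost time <= t
    return prices[i] if i >= 0 else None

for sym, t in [("IBM", 6), ("IBM", 9)]:
    print(sym, t, asof_quote(sym, t))
```

Each lookup is O(log n), versus a scan over all earlier quotes to evaluate
the max(time) subquery naively.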

~~~
jchen5
If you just use the no-frills SQL query, we're able to optimize it to do a
fast index seek (instead of a scan) on q.time, because we know we only have
to get the max row. This optimization isn't specific to proximate joins; a
simple query like "select max(a) from t where a < 42" will be optimized by
MemSQL.
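
To illustrate the seek-vs-scan difference on a toy sorted index (hypothetical
data and the 42 threshold are just for illustration; a real engine's index
code obviously differs):

```python
import bisect

a_index = [3, 17, 29, 41, 42, 58, 90]  # sorted index on column a

def max_below_scan(values, bound):
    # naive plan: examine every row
    candidates = [v for v in values if v < bound]
    return max(candidates) if candidates else None

def max_below_seek(index, bound):
    # index seek: binary-search for bound, then step back one slot
    i = bisect.bisect_left(index, bound) - 1
    return index[i] if i >= 0 else None

print(max_below_scan(a_index, 42), max_below_seek(a_index, 42))
```

Both return the same answer; the seek just touches O(log n) entries instead
of all of them.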

------
joaojeronimo
#FoundationDB

------
sql2
Goood~

