
SQL vs. NoSQL - dskang
http://www.linuxjournal.com/article/10770?page=0,0
======
badclient
I am presently researching these so-called NoSQL databases and one thing that
I keep wondering is why can't all of these databases still support a _limited_
form of SQL-like language?

Almost any json document could be represented as a db table as far as I can
see. Why can't I query using a common language instead of learning each of
these NoSQL databases' own ways of doing simple stuff like returning documents
of User objects that have a gender field value of 'male'. In SQL, it would be
something like "select * from users where gender='male'". Why can't NoSQL
databases support a query similar to that? Why do they require me to describe
a similar request in their own unique syntax?
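
To make the comparison concrete, here is a minimal sketch in Python. It assumes a made-up list of user documents, flattens them into an in-memory SQLite table and queries that with SQL, then runs the same filter as a document-style predicate written as a plain dict. The `users` data, the `find` helper, and the field names are illustrative assumptions, not any particular NoSQL database's API.

```python
import sqlite3

# Hypothetical user documents, as a document store might hold them.
users = [
    {"name": "alice", "gender": "female"},
    {"name": "bob", "gender": "male"},
    {"name": "carol"},  # documents need not share a schema
    {"name": "dave", "gender": "male"},
]

def query_with_sql(docs):
    """Flatten the documents into a table and filter with plain SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, gender TEXT)")
    conn.executemany(
        "INSERT INTO users (name, gender) VALUES (?, ?)",
        # A missing field becomes NULL in the table.
        [(d.get("name"), d.get("gender")) for d in docs],
    )
    rows = conn.execute(
        "SELECT name FROM users WHERE gender = 'male' ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]

def find(docs, criteria):
    """Document-style equality filter: every key in criteria must match."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

sql_names = query_with_sql(users)
doc_names = sorted(d["name"] for d in find(users, {"gender": "male"}))
print(sql_names, doc_names)
```

Both paths return the same two users here; the difference is only in how the request is spelled.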

I sometimes get the feeling that coining a term like NoSQL is a marketing
gimmick which hurts people actually trying to learn the nuanced differences
but I am only getting started. Why can't we _extend_ SQL to support
"NoSQL"-specific cases instead of replacing it with _nothing_?

I get that a big part of SQL is joins and the philosophy of joins goes against
the idea of NoSQL. The solution to that is to still accept SQL but just throw
an exception when join is used with a link to educate the person on
alternative implementations.
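
A rough sketch of what that could look like, in Python. Everything here is hypothetical: the `check_query` function, the `JoinNotSupported` exception, and the documentation URL are made up for illustration, and the keyword check is deliberately crude rather than a real SQL parser.

```python
import re

class JoinNotSupported(Exception):
    """Raised when a query uses JOIN, which this hypothetical store rejects."""

# Hypothetical documentation link; a real product would point at its own docs.
DOCS_URL = "https://example.com/docs/modeling-without-joins"

def check_query(sql):
    """Return the query unchanged, or raise if it contains a JOIN keyword."""
    # Blank out quoted string literals so 'join' inside a value is not flagged.
    without_literals = re.sub(r"'[^']*'", "''", sql)
    if re.search(r"\bjoin\b", without_literals, flags=re.IGNORECASE):
        raise JoinNotSupported(
            "JOIN is not supported; see " + DOCS_URL + " for alternatives"
        )
    return sql
```

A query such as `select * from users where gender='male'` passes through untouched, while anything containing a JOIN clause fails with a message that points at the alternatives.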

~~~
almost
I believe some do offer SQL-like query languages.

But if all you support is lookup by key, or even just lookup by field, then
SQL really isn't that useful in my opinion. And if your lookups are based on
map-reduces that have to be pre-specified then I can't see any place for an
SQL-like language at all.
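
As an illustration of the pre-specified map-reduce case, here is a minimal Python sketch. The documents, `map_by_gender`, `reduce_count`, and `build_view` are all made-up names, not the API of any real store; the point is only that the "query" is fixed when the view is defined, and what remains at read time is a key lookup.

```python
from collections import defaultdict

# Hypothetical documents; the field names are made up for illustration.
docs = [
    {"name": "alice", "gender": "female"},
    {"name": "bob", "gender": "male"},
    {"name": "dave", "gender": "male"},
]

def map_by_gender(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc.get("gender", "unknown"), 1

def reduce_count(values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)

def build_view(documents, map_fn, reduce_fn):
    """Run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}

# The view is computed ahead of time; queries are then plain key lookups.
gender_counts = build_view(docs, map_by_gender, reduce_count)
print(gender_counts["male"])
```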

SQL is ok for specifying queries to relational databases, but I don't think it
generalises to any type of store apart from maybe the very basics. And at that
point what benefit are you getting apart from a slight, and misleading, sense
of familiarity?

~~~
3amOpsGuy
Yeah, Cassandra has CQL and it works pretty well. It's comfortable to use, where
the cassandra-cli is quite painful for anything other than the very beginning
of a project.

I believe in earlier releases CQL was slower, but now it seems fine. Also, the
most popular Cassandra client, Hector, supports CQL, so adopting CQL should be
possible for a lot of Cassandra use cases.

------
elchief
I don't really get why HBase isn't bigger than it is. I mean, it's BigTable,
right? And Facebook dropped Cassandra for it, IIRC.

Not trying to start a war here, I am not an interested party, I am just
curious.

~~~
dj2stein9
Only a fraction of companies can ever really outgrow the capabilities of a
fully tuned RDBMS. And only a fraction of those will choose HBase over its
competitors. And so only people who have worked for such companies have any
real experience with the product... and only a fraction of those are vocal
about it on the internet.

~~~
3amOpsGuy
I agree with what you say, but I think we also need to consider cost as well
as capability of the underlying product.

Introducing cost massively changes how that equation stacks up.

"How many tps can I run given a budget in the form of these 6 available
servers". Well the boxes are under spec for what you want to achieve with an
RDBMS, but yeah no problem with XXX nosql product.

------
VLM
A NoSQL discussion isn't complete without a link to the inner platform effect:

<http://en.wikipedia.org/wiki/Inner-platform_effect>

Basically if your application domain is inherently relational or inherently
"SQL", not "NoSQL", then building your own RDBMS in your application absolutely
dooms you, unless you're not a "real" application writer but actually an RDBMS
author.

I was having a conversation with a guy: "I wish there was a library for
(whatever nosql DB he was complaining about) I could link in to do
transactions and indexing for me". My reply: "yeah, it's called postgresql".
That's not trendy and buzzword compliant, so he ended up annoyed with me. There's
a Pragmatic Programmers book, "Seven Databases in Seven Weeks", which is a
pretty good book and it describes polyglot database design a little toward the
end... so you "need" a key-value store and indexed transactions and there's
nothing that does both perfectly... Well, there are plenty of good free open
source DBs, so install and use two DBs... it's really not that hard.

~~~
agbell
Found it (I hope it's as good as the 7 languages one):

<http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks>

------
Lasher
Interesting that one of the original "noSQL" databases that is still going
strong today, Pick and its variants, was not mentioned in the article at all.

Universe and JBase are both well supported on Linux and there are several
multi-billion dollar (revenue) companies running their core business on it - I
work for one of them.

~~~
itcmcgrath
Both UniVerse and UniData are equally well supported on Windows, Unix and
Linux. They have some interesting features that are still reasonably unique in
the database world, e.g. built-in transparent record encryption, including
both key & index:

<http://2012.nosql-matters.org/cgn/wp-content/uploads/2012/06/Dan_McGrath_Rocket_U2_DB_Multi_Value_Model.pdf>

Full disclosure: I'm the product manager for the databases.

~~~
Lasher
You're probably familiar with the Eclipse distribution management package then
(not to be confused with the IDE); that's the app I was referring to. I've
been using UniVerse since the Vmark days, ADDS Mentor and Sequoia before that.
Pick-type databases are sadly underrated. Possibly because there's never
really been any kind of marketing aimed at indie developers and/or students?

------
stevencorona
Stopped reading when the article referred to Amazon S3 as a database. Instant
credibility killer. No thanks.

(p.s the amazon product you're looking for is dynamo)

~~~
genwin
What prevents S3 from being used as a highly scalable NoSQL key-value store?

~~~
agilord
latency (and costs)

~~~
jedberg
Neither of those things prevents something from being a database.

S3 is an excellent key/value store for large values. It's also publicly
available, which is nice.

For example, all the thumbnail images on reddit are stored in S3. Essentially
the client is given the key and then they can go look up the value themselves,
and since it is publicly available http, it works right there in the browser.

~~~
agilord
Your example (reddit using S3) is not a database use, it is a content delivery
network use. For that, it is good enough, but I'd choose a different key-value
store for a database, mostly because of the latency (been there, tried
that).

~~~
jedberg
Alright, I can give you a better example. Netflix uses S3 to store movies that
haven't been rendered yet. It's definitely a database. The server says "I need
to render this movie to the iPhone format, what are the movie bits" and the
"database" (S3) returns the entire movie.

Also, I would argue that static content delivery is just another form of
database. It's just a massive key/value store, where the keys are the file
names and the values are the file contents.
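
A minimal sketch of that idea in Python, using a local directory as a stand-in for S3. The `FileStore` class and its `put`/`get` methods are hypothetical names invented for this example and are not the S3 API; the key `thumbs/abc123.png` and the stored bytes are also made up.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote

class FileStore:
    """Hypothetical key/value store: one file per key, contents as value."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Percent-encode the key so a '/' in it does not create subdirectories.
        return self.root / quote(key, safe="")

    def put(self, key, value: bytes):
        self._path(key).write_bytes(value)

    def get(self, key):
        path = self._path(key)
        if not path.exists():
            raise KeyError(key)
        return path.read_bytes()

# Usage: store a value under a key, then look it up by that key.
with tempfile.TemporaryDirectory() as tmp:
    store = FileStore(tmp)
    store.put("thumbs/abc123.png", b"\x89PNG...")
    value = store.get("thumbs/abc123.png")
```

Whether something this thin deserves the word "database" is exactly the question; the sketch only shows that the key/value interface itself is trivial to satisfy.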

Let me ask you this: What is _your_ definition of a database?

~~~
rm999
Interesting. Would you say a file system is a database?

~~~
plorkyeran
I don't think I could come up with a coherent definition of a database which
includes low-durability NoSQL key/value stores yet excludes file systems. This
doesn't necessarily mean that file systems are a database, however; it could
be that in order for the word "database" to be meaningful it must be defined
in a way that excludes some things commonly thought of as databases, similar
to why Pluto was deplaneted.

