

NoSQL in SQL - BKCandace
http://www.hakkalabs.co/articles/nosql-sql-build-schema-free-scalable-data-storage-inside-traditional-rdbms

======
programminggeek
I recently built what I think is a pretty neat pattern for reporting on a
project that needs time-series calculations.

Essentially, I already have a giant precalculation service for all the needed
calculations on all the data underneath a parent entity. So I serialize that
with Ruby's Marshal and then compress it with LZ4 before storing it in the db.

It's actually faster and smaller than using JSON strings in Ruby. The whole
tree structure for each entity was 200-400k as raw JSON strings, and it took
something like 300ms to serialize to JSON. I was able to do the Ruby Marshal
and the LZ4 HC compression in something like 20-40ms, and it drops the size
down to more like 15-30k.

JSON is a pretty cool format, but it's a lot slower in Ruby than you might
realize and it takes up a lot of space.
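
The serialize-then-compress approach described above can be sketched roughly
as follows. This is a Python illustration, not the Ruby code from the project:
pickle stands in for Ruby's Marshal, zlib stands in for LZ4 HC (which would
need a third-party package), SQLite stands in for the real database, and the
table name `report_cache` and the sample data are made up.

```python
import json
import pickle
import sqlite3
import zlib

# Hypothetical precalculated tree for one parent entity (made-up data).
tree = {
    "entity_id": 42,
    "series": [
        {"day": d, "total": d * 1.5, "count": d % 7}
        for d in range(2000)
    ],
}

# Plain JSON text versus a binary serialization that is then compressed.
as_json = json.dumps(tree).encode("utf-8")
as_blob = zlib.compress(pickle.dumps(tree, protocol=pickle.HIGHEST_PROTOCOL))

# Store the compressed blob in a single BLOB column keyed by the entity.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report_cache (entity_id INTEGER PRIMARY KEY, payload BLOB)"
)
conn.execute(
    "INSERT INTO report_cache VALUES (?, ?)", (tree["entity_id"], as_blob)
)

# Reading it back is the reverse: fetch, decompress, deserialize.
(stored,) = conn.execute(
    "SELECT payload FROM report_cache WHERE entity_id = ?", (42,)
).fetchone()
restored = pickle.loads(zlib.decompress(stored))

print("json bytes:", len(as_json), "compressed blob bytes:", len(as_blob))
```

The actual sizes and timings depend heavily on the data and the language, so
the numbers quoted above should not be expected from this sketch. Like
Marshal, pickle should only be used to load data you wrote yourself.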

~~~
yeukhon
Interesting. I just deleted two tables from my project: instead of spreading
the data across two tables, I thought storing a "data" JSON column (I am a
PSQL user) would be better (fewer queries to make and more space-efficient).

I didn't think about the size before compression.
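
A minimal sketch of the single "data" column idea, assuming a made-up `items`
table and payload. SQLite with a TEXT column is used here only so the example
runs on its own; in PostgreSQL the column could be declared with the `json`
type instead.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One table; fields that would otherwise live in extra tables go into "data".
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, data TEXT NOT NULL)"
)

# Made-up nested payload that might otherwise be spread over two tables.
extras = {"tags": ["a", "b"], "settings": {"notify": True}}
conn.execute(
    "INSERT INTO items (id, name, data) VALUES (?, ?, ?)",
    (1, "example", json.dumps(extras)),
)

# A single query returns the row together with its nested data.
name, raw = conn.execute(
    "SELECT name, data FROM items WHERE id = ?", (1,)
).fetchone()
loaded = json.loads(raw)
print(name, loaded["tags"])
```

The trade-off is that anything inside `data` is opaque to ordinary SQL
filters and joins unless the database offers JSON operators for it.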

Meanwhile, this serves well for Python:
[https://groups.google.com/forum/#!topic/google-appengine/WPf...](https://groups.google.com/forum/#!topic/google-appengine/WPfAvHDGNjQ)

I shall go ahead and write some tests to see how much space will be taken up
:)

------
geweke
Thanks for posting this. For what it's worth, I'm the author of this talk and
the one pictured in the video.

I'm flattered it's getting so much attention (I like having my ideas spread!),
although there's one thing that confuses me. I've never heard of Hakka Labs,
nor did I post the talk, video, or my bio on their site (although their
treatment sure looks like I did) -- as far as I can tell, they grabbed it from
the site of the SFRails Meetup at which I gave it and posted it online. I'm
grateful for the exposure, but some notification or clear notice that I'm not
affiliated with Hakka Labs (whatever/whoever they are) would have been nice.

~~~
petesoder
Hi Andrew, I'm the founder of Hakka Labs. We had permission from the SFRails
meetup orgs to record/post the talk, but I honestly apologize if this was
unknown to you. We don't consider this 'our' content in any way, and we're
happy to make it freely available for the benefit of engineers everywhere.

~~~
geweke
Thanks so much for replying. My concern wasn't about permission -- I'm far
from protective of my talk; hey, more distribution is great! -- so much as it
was the appearance of the site. (And even the video -- you overlaid your logo
on every frame of the video.)

To me, [http://www.hakkalabs.co/](http://www.hakkalabs.co/) looks a great deal
like a blog. When I see a blog -- particularly with a name and photo of the
author on an article -- I naturally assume that author either created that
content specifically for that blog, or authorized/contributed that content
specifically to that blog. If that's not the case, I think the blog needs to
make it very clear that they are republishing content taken from elsewhere,
without the author's knowledge. There's nothing _wrong_ with that (assuming
you have permission); it's just about making it clear that that is what's
actually happening.

(Underneath, it's about the perception that I am somehow "contributing to" or
"endorsing" Hakka Labs by posting content I created there. I'm not saying
anything bad about Hakka Labs at all -- I simply don't know enough to judge
either way! -- but IMHO it's not cool to create that perception without the
author's knowledge. It'd be like a startup GitHub clone suddenly hosting my
open-source code under an 'ageweke' account with my name and photo: while they
absolutely have every right to do that according to the licenses involved, it
gives the perception that I'm a user of their site and uploaded my code
there...when, in fact, I've never heard of them before.)

Anyway, don't want to derail this technical discussion any further; feel free
to reach out to me directly over email if you want to chat about anything
else. You certainly have my email address. ;)

------
alphadevx
The memcached plugin in MySQL 5.6 provides a neat way to get NoSQL-style
access on top of RDBMS storage:
[http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached.html](http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached.html)

------
RVijay007
For anyone interested, Couchbase Mobile is essentially, in its current form,
a NoSQL database in SQLite. They might port it to ForestDB in the future, but
that's an implementation detail.

------
mantrax5
It's always interesting to observe my bad developer practices (such as
stuffing JSON in SQL table columns) become flexible "architectural patterns"
for building "schema-free, scalable data storage".

~~~
rebelidealist
If you watched the talk, he explains the specific kinds of data that should be
stuffed into a JSON text column. It makes a difference.

This practice has been used by many production sites over the years:
[http://backchannel.org/blog/friendfeed-schemaless-mysql](http://backchannel.org/blog/friendfeed-schemaless-mysql)
It might not be considered bad practice now.
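
As I understand the linked FriendFeed post, the pattern is to keep each entity
as an opaque serialized body in one table and to maintain separate index
tables for the few fields that need querying. A rough sketch under those
assumptions follows; SQLite and JSON are stand-ins chosen so it runs on its
own, and the helper names `save_entity` and `find_by_user` are hypothetical.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Entities are schemaless: the database only sees an id and an opaque body.
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
# One narrow index table per field that must be queryable.
conn.execute(
    "CREATE TABLE index_user_id ("
    "user_id TEXT NOT NULL, entity_id TEXT NOT NULL, "
    "PRIMARY KEY (user_id, entity_id))"
)


def save_entity(entity):
    """Store the whole entity as one body and index only its user_id.

    Stale index rows left behind when user_id changes are not cleaned up here.
    """
    entity_id = entity.setdefault("id", uuid.uuid4().hex)
    with conn:  # commit both writes together
        conn.execute(
            "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
            (entity_id, json.dumps(entity)),
        )
        conn.execute(
            "INSERT OR REPLACE INTO index_user_id (user_id, entity_id) "
            "VALUES (?, ?)",
            (entity["user_id"], entity_id),
        )
    return entity_id


def find_by_user(user_id):
    """Look up entity ids through the index table, then decode the bodies."""
    rows = conn.execute(
        "SELECT e.body FROM index_user_id i "
        "JOIN entities e ON e.id = i.entity_id "
        "WHERE i.user_id = ? ORDER BY e.id",
        (user_id,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


# Entities with different shapes can share the same table.
save_entity({"id": "a1", "user_id": "u1", "title": "first"})
save_entity({"id": "a2", "user_id": "u1", "title": "second", "link": "http://example.com"})
save_entity({"id": "b1", "user_id": "u2", "title": "other"})
print([e["title"] for e in find_by_user("u1")])
```

Adding a new field to an entity needs no schema change; only a field that has
to be searched gets its own index table.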

~~~
mantrax5
Well, I'm saying it tongue in cheek. For a long time I and many others have
stuffed JSON in SQL table columns, and I will continue to do so (heck,
databases have started supporting JSON as a result).

But every time a developer sees an interesting twist on a piece of technology
and goes for it, peers call it a bad practice.

I've been through many cycles like this, and inevitably some time passes, and
one day you wake up to see yesterday's bad practices have turned into exciting
advancements.

Moral of the story is, ignore the wisdom of the day and go for it, tiger.
Stuff that JSON in an SQL table.

~~~
mtdewcmu
I'm not sure which authority gets to pronounce which code practices are good
and which ones are bad (the Vatican?), but one thing that I've observed
plainly is that this authority can't make up its mind and contradicts itself
with regularity. My thoughts are that the whole project of trying to decide
for each possible snippet of code whether it is good or bad is foolish and
will never succeed. The fact is that everything in programming is a trade-off.
Being able to make decisions that don't lead to disaster comes down to
experience and wisdom.

