
Tiedot - Your NoSQL document database engine powered by Go - howardg
https://github.com/HouzuoGuo/tiedot
======
vladev
I'm hoping to start seeing projects in Go because Go gives them an advantage,
not for the sake of it being in Go.

~~~
hox
Does the project have to be open source?

My company currently is using Go to monitor and respond to events on hundreds
of thousands of RabbitMQ message queues. We create a goroutine for handling
each queue, and Go handles all of the concurrency and threading in the runtime
while avoiding the resource overhead of standard threads. All of this is done
in an application that took about 6 hours to write.

It really made a difference in our experience.
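
For readers curious what "a goroutine per queue" looks like, here is a minimal sketch. It is not our production code: the queues are modelled as plain Go channels rather than real RabbitMQ consumers, and `handleQueues` is a made-up name for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// handleQueues starts one goroutine per queue and returns how many
// messages each queue delivered. Queues are plain channels here
// (an assumption for the sketch), not RabbitMQ consumers.
func handleQueues(queues []chan string) []int {
	counts := make([]int, len(queues))
	var wg sync.WaitGroup
	for i, q := range queues {
		wg.Add(1)
		go func(i int, q <-chan string) {
			defer wg.Done()
			for range q { // loops until the queue is closed
				counts[i]++ // each goroutine writes only its own slot
			}
		}(i, q)
	}
	wg.Wait() // block until every per-queue goroutine has finished
	return counts
}

func main() {
	const numQueues = 1000
	queues := make([]chan string, numQueues)
	for i := range queues {
		queues[i] = make(chan string, 2)
		queues[i] <- "event-a"
		queues[i] <- "event-b"
		close(queues[i])
	}
	total := 0
	for _, c := range handleQueues(queues) {
		total += c
	}
	fmt.Println("messages handled:", total)
}
```

The runtime multiplexes these goroutines onto a small number of OS threads, which is where the low per-queue overhead comes from.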

~~~
sbarre
Can you share any details about how many resources this program requires to
run?

~~~
hox
For the vast majority of cases that we deal with, we don't really need much
for resources to run this application.

One specific test uses approximately 20k goroutines and averages 15-20MB RAM
depending on test load. As for CPU utilization, the impact is minimal;
RabbitMQ is the biggest bottleneck, as our peak message throughput for a
single RabbitMQ broker is about 50k messages per second, which our Go process
is able to handle without much issue. The worst-case scenario that I've been
able to test for is one where those 50k messages are evenly spread across
different queues; even then, our CPU utilization wasn't any higher than
15%/core on a 12-core server.

------
AYBABTME
I'm reading over the source with my coffee this morning. I'll pseudo-CR it if
I see mistakes (commented on a typo already). Hope you don't mind! I'm
interested to see how you are doing this because I'm playing with my own toy
key-value store on the weekends [1].

In [2], wondering why you make GOMAXPROCS=2*Cpu by default?

[1]:
[https://github.com/aybabtme/dskvs/blob/proto/](https://github.com/aybabtme/dskvs/blob/proto/)

[2]:
[https://github.com/HouzuoGuo/tiedot/blob/master/src/loveonea...](https://github.com/HouzuoGuo/tiedot/blob/master/src/loveoneanother.at/tiedot/main.go#L17)

~~~
pkulak
With GOMAXPROCS=n*CPU, n is roughly the amount of pre-emptive (vs the built-in
cooperative) multitasking that you want going on, with 1 being none. Handled
by the OS, of course. Interesting that you noticed a speed up > 1.

~~~
AYBABTME
I didn't think about that, I'll write that down in my checklist of things to
do when testing/benchmarking my projects... like another dimension to take
care of when testing. Aside from domain-range, good data, bad data, edge
cases... and other parameters - now add to that concurrency scenarios.

------
jzs
Nice to see a NoSQL database written in Go. By the way, your repository
structure is incompatible with "go get".

~~~
howardg
Much appreciated! I will soon configure my web server to be compatible with
`go get`.
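
For context, one piece of serving a custom import path is a `go-import` meta tag, which the go tool looks for when it requests the import path with `?go-get=1`. The sketch below shows the general mechanism only and is not necessarily what tiedot's server ended up doing: the import path `example.org/tiedot` and the function names are assumptions, and the repository layout would still need to match the import path.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// goImportMeta builds the tag the go tool reads:
// <meta name="go-import" content="import-prefix vcs repo-root">
func goImportMeta(prefix, vcs, repoRoot string) string {
	return fmt.Sprintf(`<meta name="go-import" content="%s %s %s">`, prefix, vcs, repoRoot)
}

// handler serves a minimal HTML page carrying the tag.
// "example.org/tiedot" is a hypothetical import path.
func handler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprintf(w, "<html><head>%s</head></html>\n",
		goImportMeta("example.org/tiedot", "git", "https://github.com/HouzuoGuo/tiedot"))
}

func main() {
	// Exercise the handler in-process instead of opening a port.
	rec := httptest.NewRecorder()
	handler(rec, httptest.NewRequest("GET", "/tiedot?go-get=1", nil))
	fmt.Print(rec.Body.String())
}
```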

------
VeejayRampay
Is there anything Golang can't do fairly well with a small code base? Damn, I
wish I were still 20 and had shitloads of time on my hands to invest in
learning the ins and outs of the language.

~~~
throwit1979
What does being 20 have to do with anything? I'm 34 and I've been deeply
acquainting myself with Go over the course of the last month.

~~~
hnriot
20 - as in before life starts to throw serious time sinks at you, wives,
children, mortgages, high stress jobs etc etc.

At 20, I just did whatever the hell I wanted, bummed around Europe before
figuring out where to do my masters. Responsibility was not paramount on my
mind. Maybe 20-somethings today are different, but not the ones I know, it's
still all about having fun, learning new stuff and exploring the
possibilities.

I fail to see how anyone could not have understood what the comment meant.

------
kingmanaz
A re-write of "starbase" in Go would be a fun project for someone with the
time.

[http://hopper.si.edu/wiki/mmti/Starbase](http://hopper.si.edu/wiki/mmti/Starbase)

Starbase is patterned after "/rdb", a flat-file relational database adhering to
the Unix philosophy, i.e., piping together small, single-purpose tools. The
approach is covered in "Unix Relational Database Management" (
[http://www.amazon.com/Relational-Database-Management-Prentic...](http://www.amazon.com/Relational-Database-Management-Prentice-Hall-Software/dp/013938622X)
), a book which anticipated the "suckless" movement by a couple of decades (
[http://suckless.org/philosophy](http://suckless.org/philosophy) ).

It would be nice to see something like /rdb, except with:

1. Better transparent support for optional indexes when querying.
2. Automatic updating of indexes when deleting/updating data.
3. Scripts included in the package written in "rc" rather than "sh".
4. BSD license.

Perhaps something like tiedot could be built on top of the above: a single,
statically-compiled binary to expose the flat file database through a
JSON/REST interface and to honor the Unix user/group table-file permissions
through standard HTTP authentication. Forms could be designed against the web
service while system administration is handled with as much of the Unix system
as possible.

Such a stack would be great for smaller startups and where *nix experience is
available.

------
jedc
Does anyone know how this does/doesn't compare to Camlistore?
[http://camlistore.org/](http://camlistore.org/) (which is also written in Go)

~~~
howardg
According to my (limited) understanding, Camlistore is a generic BLOB store,
which is not something tiedot addresses. tiedot is a generic unstructured
data store, more like CouchDB/Cassandra; it stores serialized JSON data
rather than BLOBs.

~~~
throwit1979
Except Cassandra is a columnar data store and has absolutely nothing in
common with CouchDB (or tiedot) other than that it stores and retrieves data.

------
farmdawgnation
I'd like to see how this compares performance-wise with MongoDB and other
JSON-based document stores, especially with data sets that are at a larger
scale. I know Mongo tends to start crumbling if it can't fit an entire index
in the available memory (which happens when you have a 4GB data set,
unfortunately). Have you done any of those comparisons?

That said, this looks really interesting. Though I can't imagine for the life
of me why you'd indent such a wonderful project with tabs. ;)

~~~
howardg
Thank you for the feedback! I noted down your recommendation here:

[https://github.com/HouzuoGuo/tiedot/issues/3](https://github.com/HouzuoGuo/tiedot/issues/3)

And actually `go fmt` prefers to use tabs over spaces ;)

------
willvarfar
How does it stand on ACID, joins, redundancy, scaling etc?

It's a hobby project so I'm going to presume the worst, but I want to know what
the author intends to do for each of these.

~~~
howardg
Indeed, its wiki pages should have mentioned ACID properties; I noted the
issue down here:

[https://github.com/HouzuoGuo/tiedot/issues/4](https://github.com/HouzuoGuo/tiedot/issues/4)

Currently, tiedot's stance on ACID is similar to MongoDB's.

I have not spent enough time on the project to support redundancy in it,
sorry. I totally agree that redundancy is a must-have if someone wants to use
tiedot in serious scenarios, so I will definitely spend time on making this
feature available.

Scalability on symmetric multiprocessing architectures has been seriously
considered and implemented - basically, tiedot can demonstrate that more CPUs
= more performance. However, scaling by replication has not been considered
yet.

~~~
jerf
You should decide if you want to make a serious run at being an option for
true production-quality deployment, or if this is a fun project. If it's just
a fun project, you may want to consider not worrying about
replication/redundancy; it's tricky, quirky, and if you haven't been
considering it from day one, likely to require a near-complete rewrite, which
may be an awful lot of work for a fun project. Of course, if you are going to
be serious, it is a must.

I am completely neutral on which direction you go; my point here is just that
if you are just having some fun, you may find replication will turn out to be,
well, potentially rather unfun. Educational as can be, though. It's a far, far
more subtle problem than initially meets the eye.

~~~
arethuza
Or it could be an embedded database (like SQLite) that keeps replication and
redundancy out of scope?

------
phasevar
Cool to see something like this in Go. :-)

------
nickpresta
Looks pretty interesting. I like the embedded use. I submitted a PR to enhance
your software:
[https://github.com/HouzuoGuo/tiedot/pull/6](https://github.com/HouzuoGuo/tiedot/pull/6)

------
trailfox
Might help to give it a well-known open source license, e.g. BSD, MIT, Apache.

~~~
howardg
The license is the 2-clause BSD license.

