Tiedot - Your NoSQL document database engine powered by Go (github.com)
85 points by howardg 1459 days ago | 34 comments



I'm hoping to start seeing projects in Go because Go gives them an advantage, not for the sake of it being in Go.


Does the project have to be open source?

My company currently is using Go to monitor and respond to events on hundreds of thousands of RabbitMQ message queues. We create a goroutine for handling each queue, and Go handles all of the concurrency and threading in the runtime while avoiding the resource overhead of standard threads. All of this is done in an application that took about 6 hours to write.

It really made a difference in our experience.
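A minimal sketch of the goroutine-per-queue pattern described above. The commenter's real code is not shown, so everything here is an assumption: plain Go channels stand in for RabbitMQ queues, and `consumeAll` and `makeQueues` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// consumeAll starts one goroutine per queue and, once every queue has been
// drained, returns how many messages were handled on each one.
// Each "queue" is a Go channel standing in for a broker queue (assumption).
func consumeAll(queues map[string]<-chan string) map[string]int {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		counts = make(map[string]int, len(queues))
	)
	for name, q := range queues {
		wg.Add(1)
		go func(name string, q <-chan string) {
			defer wg.Done()
			n := 0
			// The goroutine parks here until a message arrives or q is closed.
			for range q {
				n++
			}
			mu.Lock()
			counts[name] = n
			mu.Unlock()
		}(name, q)
	}
	wg.Wait()
	return counts
}

// makeQueues builds n pre-filled, closed channels to simulate queues.
func makeQueues(n, perQueue int) map[string]<-chan string {
	queues := make(map[string]<-chan string, n)
	for i := 0; i < n; i++ {
		ch := make(chan string, perQueue)
		for j := 0; j < perQueue; j++ {
			ch <- fmt.Sprintf("msg-%d", j)
		}
		close(ch)
		queues[fmt.Sprintf("queue-%d", i)] = ch
	}
	return queues
}

func main() {
	counts := consumeAll(makeQueues(1000, 3))
	fmt.Println("queues handled:", len(counts))
}
```

A real consumer would replace the channels with deliveries from a broker client and would probably run until cancelled rather than until the queue closes.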


Can you share any details about how many resources this program requires to run?


For the vast majority of cases that we deal with, we don't need much in the way of resources to run this application.

One specific test uses approximately 20k goroutines and averages 15-20MB RAM depending on test load. As for CPU utilization, the impact is minimal; RabbitMQ is the biggest bottleneck, as our peak message throughput for a single RabbitMQ broker is about 50k messages per second, which our Go process is able to handle without much issue. The worst-case scenario that I've been able to test for is one where those 50k messages are evenly spread across different queues; even then, our CPU utilization wasn't any higher than 15%/core on a 12-core server.
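One rough way to get a feel for numbers like these is to start a batch of idle goroutines and ask the runtime what it has allocated. This is only a sketch under assumptions: `spawnIdle` is a hypothetical helper, the goroutines do no real work, and the figures will vary with Go version and platform, so they are not comparable to the measurements quoted above.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnIdle starts n goroutines that block until stop is closed.
// The returned WaitGroup is done once all of them have exited.
func spawnIdle(n int, stop <-chan struct{}) *sync.WaitGroup {
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			<-stop
		}()
	}
	return &wg
}

func main() {
	stop := make(chan struct{})
	wg := spawnIdle(20000, stop)

	// MemStats.Sys is the total memory obtained from the OS by the runtime.
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Println("goroutines:", runtime.NumGoroutine())
	fmt.Printf("memory obtained from OS: %.1f MB\n", float64(m.Sys)/(1<<20))

	close(stop)
	wg.Wait()
}
```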


We'll never know where Go excels unless we build lots of things with it first :)


Go seems like a reasonable choice for this type of thing though, doesn't it?


I'm reading over the source with my coffee this morning. I'll pseudo-CR it if I see mistakes (commented on a typo already). Hope you don't mind! I'm interested to see how you are doing this because I'm playing with my own toy key-value store on the weekends [1].

In [2], I'm wondering why you set GOMAXPROCS=2*CPU by default.

[1]: https://github.com/aybabtme/dskvs/blob/proto/

[2]: https://github.com/HouzuoGuo/tiedot/blob/master/src/loveonea...


Thanks for noticing the typo; it's been fixed and I will commit the fix shortly.

Regarding point 2, I made a note here:

https://github.com/HouzuoGuo/tiedot/wiki/Embedded-Usage

I experimented with different GOMAXPROCS settings on three machines and noticed that 1*CPU does not run tiedot to its full potential, while 3*CPU seems to be slower than 2*CPU.


With GOMAXPROCS=n*CPU, n is roughly the amount of pre-emptive (vs the built-in cooperative) multitasking that you want going on, with 1 being none. The pre-emption is handled by the OS, of course. Interesting that you noticed a speedup with n > 1.
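For readers who want to repeat the experiment, the setting discussed here can be applied from code with the standard `runtime` package. This is a sketch, not tiedot's actual startup code; `setProcs` is a hypothetical helper. Note that since Go 1.5 the default GOMAXPROCS already equals the number of CPUs, whereas it was 1 when this thread was written.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs sets GOMAXPROCS to factor * NumCPU (at least 1) and returns
// the value now in effect.
func setProcs(factor int) int {
	n := factor * runtime.NumCPU()
	if n < 1 {
		n = 1
	}
	runtime.GOMAXPROCS(n)        // returns the previous setting, ignored here
	return runtime.GOMAXPROCS(0) // an argument of 0 only queries the value
}

func main() {
	fmt.Println("CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", setProcs(2))
}
```

The same value can be set without recompiling through the `GOMAXPROCS` environment variable, which makes it easy to benchmark 1x, 2x and 3x side by side.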


I didn't think about that, I'll write that down in my checklist of things to do when testing/benchmarking my projects... like another dimension to take care of when testing. Aside from domain-range, good data, bad data, edge cases... and other parameters - now add to that concurrency scenarios.


Nice to see a NoSQL database written in Go. By the way, your repository structure is incompatible with "go get".


Much appreciated! I will soon configure my web server to be compatible with `go get`.


We've done a few of them here. Notably:

* http://cbgb.io/

* http://dustin.github.io/2012/09/09/seriesly.html

* https://github.com/couchbaselabs/sync_gateway

cbgb is an API-compatible Couchbase implementation in Go. We use it in place of Couchbase when we need something tiny to play around with.

seriesly is a time series database for storing and aggregating sample data and doing things like this: http://bleu.west.spy.net/~dustin/seriesly/

sync_gateway is how our mobile team synchronizes data across all your phones and tablets and your central DB.


Is there anything Golang can't do fairly well with a small code base? Damn, I wish I were still 20 and had shitloads of time on my hands to invest in learning the ins and outs of the language.


It does take Google ~100k lines to load balance their MySQL servers, so the size of your program is still highly dependent on how simple you want to make it. I'm 21 and spending most of my time writing Go -- there aren't many "ins and outs" required to learn. I find that it is very idiomatic to write simple solutions and simply take advantage of interfaces if someone wants to implement a more specific piece of your code base. If you want to learn Go well all you have to do is read the spec[1] and the source code of at least some portion of the standard library.

As an aside, this project is interesting. I've been curious about experimenting with a project like this on my own. However, I wish the author's documentation opened with which ideas from which papers inspired the project.

[1] http://golang.org/ref/spec


You should really make the effort to take a look at it; it's well worth it! If you have a C background, it should be relatively easy to pick up.

This should be an easy weekend read http://www.golang-book.com/


What does being 20 have to do with anything? I'm 34 and I've been deeply acquainting myself with Go over the course of the last month.


20 - as in before life starts to throw serious time sinks at you, wives, children, mortgages, high stress jobs etc etc.

At 20, I just did whatever the hell I wanted, bummed around Europe before figuring out where to do my masters. Responsibility was not paramount on my mind. Maybe 20-somethings today are different, but not the ones I know; for them it's still all about having fun, learning new stuff and exploring the possibilities.

I fail to see how anyone could not have understood what the comment meant.


I think he's referring to his 20-year-old self, not being 20 in general. I definitely had more time to look at stuff when I was 20.


A re-write of "starbase" in Go would be a fun project for someone with the time.

http://hopper.si.edu/wiki/mmti/Starbase

Starbase is patterned after "/rdb", a flat-file relational database adhering to the Unix philosophy, i.e., piping together small, single-purpose tools. The approach is covered in "Unix Relational Database Management" ( http://www.amazon.com/Relational-Database-Management-Prentic... ), a book which anticipated the "suckless" movement by a couple of decades ( http://suckless.org/philosophy ).

It would be nice to see something like /rdb, except with: 1. Better transparent support for optional indexes when querying. 2. Automatic updating of indexes when deleting/updating data. 3. Scripts included in the package written in "rc" rather than "sh". 4. BSD license.

Perhaps something like tiedot could be built on top of the above: a single, statically-compiled binary to expose the flat file database through a JSON/REST interface and to honor the Unix user/group table-file permissions through standard HTTP authentication. Forms could be designed against the web service while system administration is handled with as much of the unix system as possible.

Such a stack would be great for smaller start ups and where *nix experience is available.


Does anyone know how this does/doesn't compare to Camlistore? http://camlistore.org/ (which is also written in Go)


According to my (limited) understanding, Camlistore is a generic BLOB store, which is not something tiedot addresses. tiedot is a generic unstructured data store, more like CouchDB/Cassandra; it stores serialized JSON data rather than BLOBs.


Except Cassandra is a columnar data store and has absolutely nothing in common with CouchDB (or tiedot) other than that it stores and retrieves data.


I'd like to see how this compares performance-wise with MongoDB and other JSON-based document stores, especially with data sets that are at a larger scale. I know Mongo tends to start crumbling if it can't fit an entire index in the available memory (which happens when you have a 4GB data set, unfortunately). Have you done any of those comparisons?

That said, this looks really interesting. Though I can't imagine for the life of me why you'd indent such a wonderful project with tabs. ;)


Thank you for the feedback! I noted down your recommendation here:

https://github.com/HouzuoGuo/tiedot/issues/3

And actually `go fmt` prefers tabs over spaces ;)


The formatting style is standardized by Go (enforced by gofmt), and they made the choice of tabs over spaces :)

http://golang.org/doc/effective_go.html#formatting
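The tab convention is easy to confirm with the standard library's `go/format` package, which applies gofmt-style formatting to source code. A small sketch, with `formatGo` as a hypothetical wrapper name: it feeds in a snippet indented with four spaces and checks that the result is indented with a tab.

```go
package main

import (
	"fmt"
	"go/format"
	"strings"
)

// formatGo runs gofmt-style formatting over a complete Go source file.
func formatGo(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// The function body below is indented with four spaces on purpose.
	src := "package main\n\nfunc main() {\n    println(\"hi\")\n}\n"
	out, err := formatGo(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	fmt.Println("tab-indented:", strings.Contains(out, "\n\tprintln"))
}
```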


How does it stand on ACID, joins, redundancy, scaling etc?

It's a hobby project so I'm going to presume the worst, but I want to know what the author intends to do for each of these.


Indeed, its wiki pages should have mentioned ACID properties; I noted the issue down here:

https://github.com/HouzuoGuo/tiedot/issues/4

Currently, tiedot's stance on ACID is similar to MongoDB's.

I have not spent enough time on the project to support redundancy in it, sorry. I totally agree that redundancy is a must-have if someone wants to use tiedot in serious scenarios, so I will definitely spend time on making this feature available.

Scalability on symmetric multiprocessing architectures has been seriously considered and implemented; basically, tiedot can demonstrate that more CPUs = more performance. However, scaling by replication has not been considered yet.


You should decide if you want to make a serious run at being an option for true production-quality deployment, or if this is a fun project. If it's just a fun project, you may want to consider not worrying about replication/redundancy; it's tricky, quirky, and if you haven't been considering it from day one, likely to require a near-complete rewrite, which may be an awful lot of work for a fun project. Of course, if you are going to be serious, it is a must.

I am completely neutral on which direction you go; my point here is just that if you are just having some fun, you may find replication will turn out to be, well, potentially rather unfun. Educational as can be, though. It's a far, far more subtle problem than initially meets the eye.


Or it could be an embedded database (like SQLite) that keeps replication and redundancy out of scope?


Cool to see something like this in Go. :-)


Looks pretty interesting. I like the embedded use. I submitted a PR to enhance your software: https://github.com/HouzuoGuo/tiedot/pull/6


Might help to give it a well-known open source license, e.g. BSD, MIT, Apache.


The license is the 2-clause BSD license.



