

ThriftDB: a new service from the Octopart Guys - staunch
http://www.thriftdb.com/

======
cgbystrom
Sounds an awful lot like the old <http://code.google.com/p/thrudb/>?

"Thrudb is a set of simple services built on top of the Apache Thrift
framework that provides indexing and document storage services for building
and scaling websites. Its purpose is to offer web developers flexible, fast
and easy-to-use services that can enhance or replace traditional data storage
and access layers."

No longer under development, though.

~~~
andres
I actually stumbled onto thrudb a few months ago, after we had already
written v1. We talked to some people at Disrupt who are doing similar things
with search and Thrift as well.

------
daviddoran
It seems like a strange thing to provide as a SaaS. Unless you're hosted in
the same datacenter, network latency would surely make its speed irrelevant.
They mention they're "working on a way for developers to run ThriftDB
locally", which might make it worth looking into. I could see it being useful
for some things, certainly, but it wouldn't provide enough benefit as a SaaS
to make calls over the web worthwhile.

~~~
andres
Good point. We decided to make it available as a (free) cloud service just so
we could get hacker feedback as quickly as possible. If we had waited until we
had a version that was easy to install, it would have taken us a lot more
time. If people like it, the plan is to open source it.

~~~
daviddoran
I agree that getting it ready for "easy" install takes a lot more work than
completing the coding. I wonder what feature, or combination of features, is
the differentiator here? Maybe it's the speed of search, the loose document
schemas combined with freetext search, or the REST API? Some more examples and
benchmarks would be interesting, when you find the time :)

~~~
andres
The thing that excites us the most is the flexible schema because that can cut
down on development time dramatically. The REST API is also optimized for
developer happiness.
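Not ThriftDB's actual API (the names below are made up), but a toy sketch of
why a flexible schema cuts development time: documents in the same store don't
need identical fields, so adding a field later requires no migration:

```python
# Toy in-memory "flexible schema" store. Documents are plain dicts, so a
# new field can appear on one document without touching the others.
# (Illustrative only; this is not ThriftDB's real API.)

class ToyStore:
    def __init__(self):
        self.docs = {}

    def put(self, key, doc):
        self.docs[key] = doc

    def search(self, **filters):
        # Match documents containing every given field/value pair;
        # documents missing a field simply never match that filter.
        return [
            d for d in self.docs.values()
            if all(d.get(k) == v for k, v in filters.items())
        ]

store = ToyStore()
store.put("r1", {"mpn": "ATMEGA328P", "manufacturer": "Atmel"})
# A later document adds a field no earlier document had: no schema change.
store.put("r2", {"mpn": "LM317", "manufacturer": "TI", "package": "TO-220"})

print(store.search(package="TO-220"))
```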

We're working on some examples and should have them ready soon.

Benchmarking is a good idea but tricky because speed depends on the complexity
of the data and the query itself. If you have any recommendations for
benchmarks please let me know.
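One low-effort starting point, sketched with Python's stdlib `timeit` (the
workload here is a JSON round-trip stand-in, not a real ThriftDB query):
run a fixed operation many times and report latency percentiles rather than
a single average, since tail latency is what users feel.

```python
import json
import timeit

# Sketch of a micro-benchmark harness: repeat one operation and report
# percentiles. Swap op() for the query you actually want to measure.
doc = {"mpn": "LM317", "specs": {"package": "TO-220", "pins": 3}}

def op():
    # Stand-in workload; a real benchmark would issue an actual query.
    json.loads(json.dumps(doc))

samples = sorted(timeit.repeat(op, repeat=100, number=1))
p50 = samples[len(samples) // 2]
p95 = samples[int(len(samples) * 0.95)]
print(f"p50={p50:.6f}s p95={p95:.6f}s")
```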

In the future we have plans to add machine learning features to optimize
relevancy algorithms automatically but that's still a ways off.

------
zbowling
Déjà vu :-)

I did almost exactly this last fall, except I used JSON and JSON Schema
instead of Thrift. Called it hummingbird db. I submitted it to YC but all I
got was an email saying it wasn't that interesting.
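For readers unfamiliar with the approach: a JSON Schema can pin down a few
required fields while leaving the rest of the document open, which gives a
similar kind of schema flexibility. A generic sketch (not hummingbird db's
actual schema):

```json
{
  "type": "object",
  "required": ["mpn"],
  "properties": {
    "mpn": {"type": "string"},
    "manufacturer": {"type": "string"}
  },
  "additionalProperties": true
}
```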

~~~
andres
Would love to hear more about hummingbird db. Email me at andres@octopart.com.
We chose Thrift because the schema was flexible and that was the biggest
problem for us at Octopart.

~~~
nl
Did you consider Solr, in "schema free mode" using <dynamicField />?

Edit: I see you are using Solr internally for implementation.
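For context, `<dynamicField />` in Solr's `schema.xml` matches field names by
pattern, so documents can introduce new suffixed fields without a schema
change. A typical snippet (assuming the usual `string`/`int` field types are
defined elsewhere in the schema; this is not ThriftDB's actual configuration):

```xml
<!-- Any field ending in _s is indexed as a string, _i as an integer. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
```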

------
jasonkolb
Doesn't this do the same thing as Solandra?

~~~
andres
ThriftDB uses Facebook's Thrift serialization internally so the schema is
completely flexible.
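For context, Thrift's IDL numbers every field and lets most of them be
`optional`, so fields can be added over time without breaking readers built
against the older schema. A generic sketch (not ThriftDB's real schema):

```thrift
// Generic example of an evolvable Thrift struct.
struct Part {
  1: required string mpn,
  2: optional string manufacturer,
  3: optional map<string, string> specs,  // added later; old readers skip it
}
```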

------
Meai
1. You say your solution is extremely fast.

2. You don't provide benchmarks.

If your solution is really so fast, then you must be running benchmarks
continuously. How else would you know whether you are improving and whether
you are actually fast, or just faster than [a tree | clouds | a grain of
rice]? So either you're lying about your performance, or you're choosing to
purposely hide your incredibly well-performing benchmarks.

Which explanation do you prefer?

~~~
Yzupnick
I know what you were trying to say, and I agree, but this comment would have
been a lot better without its mean and condescending tone.

------
blago
Sounds a lot like Solr, but can't imagine the search is nearly as powerful.

~~~
smock
We actually use Solr on our backend - we love its open source community and
rich feature set.

~~~
zbailey
Sheer curiosity - why did you decide to go with Solr instead of Elasticsearch,
which seems easier to scale out with much the same feature set?

~~~
smock
That is a great question - we actually didn't consider using elastic search,
we went with Solr because we use it for Octopart and are experienced with it
so it made developing ThriftDB easier. We're evaluating other options now and
will have a look.

------
fleaflicker
Did you consider Google protocol buffers? If so, why Thrift?

~~~
andres
We use Python server-side and the Python implementation of Google protocol
buffers is extremely slow.

~~~
sigil
Here are two alternative Python protobuf implementations. They're each about
15x faster than the pure Python implementation from Google.

_fast-python-pb_ - codegen wrapping Google's C++ protobuf implementation.
<https://github.com/Greplin/fast-python-pb>

_lwpb_ - non-codegen, using a protobuf implementation in C. (disclaimer: I'm
an author) <https://github.com/acg/lwpb>

------
tedjdziuba
Oh cool, so now I can host my app in one data center and have it make DB calls
across the open internet to another DB server! But wait, there's more! It's
over a stateless protocol: HTTP, with really poor multiplexing/pipelining
support.

Latency is a feature, right? Like "slow your roll, cowboy, let's not have a
heart attack here".

