Fabric – A simple triplestore written in Go (github.com)
92 points by mooreds 48 days ago | 26 comments



Nice to see more triple store implementations. What’s the industry uptake of data stores like this? They always struck me as more academic than widely used in industry. I saw a SaaS triple store a while ago, but it seemed to disappear or never really take off.


Nubank recently raised $400M at a $10B valuation and they depend on Datomic [1] heavily for their core systems [2].

The RDF database marketplace is very established [3], and the likes of MarkLogic, DB2, and Oracle have clearly encountered profitable reasons to add RDF support. I believe RDF has good traction in knowledge-intensive industry domains such as clinical research and life sciences.

Disclosure: I work on Crux [4] which adds bitemporal versioning and eviction to a document->triplestore model running on top of Kafka.

[1] https://www.datomic.com/customers.html

[2] https://www.datomic.com/nubanks-story.html

[3] https://en.wikipedia.org/wiki/Comparison_of_triplestores

[4] https://github.com/juxt/crux


at topsy almost a decade ago, we had a triple store written in perl and backed with innodb for storing our copy of twitter.

that 80 node cluster of dl380s was a beast to operate, but damn was it spiffy when it was working well

rip ux8


+1 I got a lot of mileage out of the triple model when working with social media data. You just don’t know what data patterns you will find when you start looking, and need to support generic queries.


Happy Datomic user here – simple, flexible, powerful. I've recently heard good things about https://www.stardog.com/ which is a real triple store (Datomic adds a time dimension)


These data stores are the ONLY way to create decent chat bots (ones that know what they are talking about and can engage in conversations).


eBay actually has an open source distributed triple store they released recently: https://github.com/eBay/akutan

Seems to me they'd be using it internally.


Are there any differences between a KV-store in the form of "Bob:Knows" — "John" and a triple store in the form of "Bob" — "Knows" — "John"? Redis, for example, can query the first one easily by scanning.

Bonus question: What are some real-life use cases for triple stores?


For one, you could easily ask "what is the relationship between Bob and John?" --> Knows, which you can't answer in the KV format without text manipulation.

Sounds contrived, but can be handy in many use cases.

Triple relationships like the above are the kind of queries a Prolog engine can answer well (and far more).

Semantic Web's RDF is also like this.


A slightly more plausible example is “Who knows John?”. I think about turning to triple stores when I’m still exploring an application domain and don’t know what data access patterns will look like yet. Something like a hexastore that maintains full indexes for all query orders seems like a reasonable compromise for read-heavy applications in the prototype stage.


It is not unusual to implement a triple store with multiple indexes, so you could build k-v stores with

s-p -> o
p-o -> s
s-o -> p

and then you have indexes which are good for those triple patterns.
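
A minimal Go sketch of that three-index layout, for illustration only. The `Store`, `pair`, and method names are hypothetical and not taken from Fabric or any other library; the sketch keeps everything in memory and does not deduplicate repeated inserts.

```go
package main

import "fmt"

// Triple is a single subject-predicate-object fact.
type Triple struct{ S, P, O string }

// pair is a two-field key; arrays of strings are valid Go map keys.
type pair [2]string

// Store is a hypothetical in-memory store with one key-value index
// per pair of bound fields, as described in the comment above.
type Store struct {
	spToO map[pair][]string // (subject, predicate) -> objects
	poToS map[pair][]string // (predicate, object)  -> subjects
	soToP map[pair][]string // (subject, object)    -> predicates
}

func NewStore() *Store {
	return &Store{
		spToO: make(map[pair][]string),
		poToS: make(map[pair][]string),
		soToP: make(map[pair][]string),
	}
}

// Add writes the triple into all three indexes (no deduplication).
func (st *Store) Add(t Triple) {
	st.spToO[pair{t.S, t.P}] = append(st.spToO[pair{t.S, t.P}], t.O)
	st.poToS[pair{t.P, t.O}] = append(st.poToS[pair{t.P, t.O}], t.S)
	st.soToP[pair{t.S, t.O}] = append(st.soToP[pair{t.S, t.O}], t.P)
}

// Objects answers the pattern (s, p, ?).
func (st *Store) Objects(s, p string) []string { return st.spToO[pair{s, p}] }

// Subjects answers the pattern (?, p, o), e.g. "Who knows John?".
func (st *Store) Subjects(p, o string) []string { return st.poToS[pair{p, o}] }

// Predicates answers the pattern (s, ?, o),
// e.g. "What is the relationship between Bob and John?".
func (st *Store) Predicates(s, o string) []string { return st.soToP[pair{s, o}] }

func main() {
	st := NewStore()
	st.Add(Triple{"Bob", "Knows", "John"})
	st.Add(Triple{"Ann", "Knows", "John"})

	fmt.Println(st.Objects("Bob", "Knows"))
	fmt.Println(st.Subjects("Knows", "John"))
	fmt.Println(st.Predicates("Bob", "John"))
}
```

Each write costs three map updates, which is the usual trade of extra write work and memory for direct lookups on every two-field pattern.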

Let's see.

The core table in the salesforce.com system consists of triples, but salesforce.com will materialize whatever indexes and views are necessary to make things fast based on automatic run-time profiling. Their patent on this should run out just about now, so this feature may turn up in real-life triple stores where it would make a big difference in practicality.

The NSA has been shopping around for a triple store which could ingest around 1 trillion triples per day.

The BBC made a nice web site for the world cup which used forward chaining inference in a triple store to determine the consequences of each goal, so the tables would all adjust whenever anything happened.


Do you have links for any of these lying around? Seems like some pretty cool examples.


How you store your triples affects performance, but, conceptually, is only an implementation detail.

But then, why stop at a KV-store? A set with entries “Bob:Knows:John” will work just as well, if you ignore performance.

But then, why stop at a set? A string “Bob:Knows:John;Bob:Loves:John;John:Is:vegetarian” works just as well (conceptually!)

IMO, a major real-life use case is as a means to produce PhDs :-). The concept is enticing and easily grasped, but there are zillions of papers to write on query planning, automatic storage optimization, discovering heuristics, etc. It’s just like the early days of SQL: you don’t have to read decades of papers to move to the front of development.


Unless I’m missing something, the in-memory backend here appears to actually use the set solution: all of the triple fields are concatenated together and used as a dictionary key. Queries iterate through the dictionary entries until a sufficient number of results have been located.

On the other hand, I’m not really familiar with Go, so I may be reading it wrong.
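
For illustration, here is a rough Go sketch of the "set" approach as described above: the triple fields are concatenated into one map key, and a query scans the entries until it has enough results. This is an assumption-laden reconstruction of the idea, not Fabric's actual code; `SetStore`, `key`, and `Query` are hypothetical names, and an empty string in the pattern is treated as a wildcard.

```go
package main

import "fmt"

// Triple is a single subject-predicate-object fact.
type Triple struct{ S, P, O string }

// SetStore is a hypothetical store that keeps every triple under one
// concatenated key, so the map behaves like a set of triples.
type SetStore struct {
	entries map[string]Triple
}

func NewSetStore() *SetStore {
	return &SetStore{entries: make(map[string]Triple)}
}

// key joins the fields with a NUL separator so that fields containing
// ordinary punctuation cannot collide.
func key(t Triple) string { return t.S + "\x00" + t.P + "\x00" + t.O }

// Insert adds the triple; inserting the same triple twice is a no-op.
func (s *SetStore) Insert(t Triple) { s.entries[key(t)] = t }

// matches treats an empty pattern field as a wildcard.
func matches(want, got string) bool { return want == "" || want == got }

// Query scans all entries and stops once limit results are found.
// A limit of zero or less means "no limit". Go map iteration order is
// unspecified, so the order of results is not stable between runs.
func (s *SetStore) Query(pattern Triple, limit int) []Triple {
	var out []Triple
	for _, t := range s.entries {
		if limit > 0 && len(out) >= limit {
			break
		}
		if matches(pattern.S, t.S) && matches(pattern.P, t.P) && matches(pattern.O, t.O) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	s := NewSetStore()
	s.Insert(Triple{"Bob", "Knows", "John"})
	s.Insert(Triple{"Bob", "Loves", "John"})
	s.Insert(Triple{"John", "Is", "vegetarian"})

	// Everything known about Bob, capped at 10 results.
	fmt.Println(s.Query(Triple{S: "Bob"}, 10))
}
```

Every query is a full scan in the worst case, which is fine for small data sets but is the performance cost the parent comment alludes to.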


you might want to search based on the predicate, or you might have versioned triples where the predicate changes

in one version, you have these two ideas represented:

[bob,likes,cake]

[ann,suspects,[bob, likes, cake]]

this may change in another version:

[bob,likes,cake]

[ann,knows,[bob,likes,cake]]

a decent triple store will allow you to version these ideas and explore how they change over time, or maybe query by predicate
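
One way to picture this in Go, as a hedged sketch rather than how any particular store does it: a term is either a plain name or a nested triple, and each version holds its own list of triples. `Term`, `Atom`, `Versions`, and `ByPredicate` are hypothetical names invented for this example.

```go
package main

import "fmt"

// Term is either an Atom (a plain name) or a nested Triple.
type Term interface{ String() string }

// Atom is a simple named term such as "bob" or "likes".
type Atom string

func (a Atom) String() string { return string(a) }

// Triple can itself be used as a Term, which allows statements
// about statements such as [ann,suspects,[bob,likes,cake]].
type Triple struct{ S, P, O Term }

func (t Triple) String() string {
	return "[" + t.S.String() + "," + t.P.String() + "," + t.O.String() + "]"
}

// Versions maps a version number to the triples asserted in it.
type Versions map[int][]Triple

// ByPredicate returns the triples in one version whose predicate is p.
func (v Versions) ByPredicate(version int, p Atom) []Triple {
	var out []Triple
	for _, t := range v[version] {
		if t.P == Term(p) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	fact := Triple{Atom("bob"), Atom("likes"), Atom("cake")}
	v := Versions{
		1: {fact, {Atom("ann"), Atom("suspects"), fact}},
		2: {fact, {Atom("ann"), Atom("knows"), fact}},
	}

	// Compare what is asserted about the nested fact in each version.
	fmt.Println(v.ByPredicate(1, "suspects"))
	fmt.Println(v.ByPredicate(2, "knows"))
}
```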


>Are there any differences between a KV-store in the form of "Bob:Knows" — "John" and a triple store in the form of "Bob" — "Knows" — "John"? Redis, for example, can query the first one easily by scanning.

A triple store can more quickly answer queries about triples. The reason to use triples is that it is what you naturally get when you try to store structured relational data where the schema changes quickly.


There's a bunch of software products with the name fabric.


Hardest problem in computer science.


I mean, I'm working in UI Fabric right now (MS). Clearly there need to be some sanity checks when naming software these days.


I agree. Come on.

The Python fabric package is really popular.


Could you compare this to eBay's distributed triple store https://github.com/eBay/akutan ?

Looks like it is written in Go too. I can see yours being much simpler to get up and running initially though. Looks like akutan isn't as simple since it's built on docker and is a daemon.


What's the use-case for a triple store vs a graph DB? Social networks?


People also call triple stores graph DBs.

Triple stores support a disciplined set of primitive types that come from XML Schema, so you have "xsd:integer", "xsd:dateTime", "xsd:decimal", really the critical things that are missing in JSON. That is, there is a kind of fact where the object is a literal.

Triple stores also support facts where the object is an identifier for another object. That could be a URI which names it, or it could be an internal "blank node" identifier.

Other kinds of "graph database" have different semantics, for instance they might not have support for literals, or have a different set of literal data types, or they might let you attach facts to the edges (hypergraph, property graph, ...)
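
To make the literal vs. identifier distinction concrete, here is a small Go sketch of an RDF-style object term. The type and field names (`Kind`, `Object`, `AsInt`) are hypothetical, the `ex:` prefix is a made-up namespace, and `AsInt` only handles values that fit in a Go `int`, whereas `xsd:integer` itself is unbounded.

```go
package main

import (
	"fmt"
	"strconv"
)

// Kind says what sort of term sits in the object position.
type Kind int

const (
	IRI       Kind = iota // a URI/IRI naming another resource
	BlankNode             // an internal identifier with no global name
	Literal               // a typed value such as an integer or date
)

// Object is a hypothetical object term. Datatype is only meaningful
// for literals, e.g. "xsd:integer" or "xsd:dateTime".
type Object struct {
	Kind     Kind
	Value    string // IRI, blank node label, or the literal's lexical form
	Datatype string
}

// Triple is a fact whose object is either an identifier or a literal.
type Triple struct {
	Subject, Predicate string
	Object             Object
}

// AsInt returns the value of an xsd:integer literal that fits in an int.
// It reports false for non-literals, other datatypes, or bad input.
func (o Object) AsInt() (int, bool) {
	if o.Kind != Literal || o.Datatype != "xsd:integer" {
		return 0, false
	}
	n, err := strconv.Atoi(o.Value)
	return n, err == nil
}

func main() {
	age := Triple{"ex:bob", "ex:age", Object{Literal, "42", "xsd:integer"}}
	friend := Triple{"ex:bob", "ex:knows", Object{Kind: IRI, Value: "ex:john"}}

	n, ok := age.Object.AsInt()
	fmt.Println(n, ok)

	_, ok = friend.Object.AsInt()
	fmt.Println(ok)
}
```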


I thought this was going to be about the Python SSH library - a bit confusing.


Its title specifying that it’s a “simple triple store written in Go” wasn’t enough to tell you that it wasn’t?


Nice job! Perhaps consider adding it to awesome-go?

https://github.com/avelino/awesome-go

I looked at the report card already and it seems like you've done a great job, so you'll have no trouble getting added!

https://goreportcard.com/report/github.com/spy16/fabric



