
Thanks for writing this up! I worked with the Knowledge Graph as a contractor at Google in 2013. My manager had a neat idea for adding our own Schema and triples (actually quads) for a specific application.

It surprises me how many large companies do not have a ‘knowledge graph strategy’ while everyone is on board with machine learning (which is what I currently do, managing a machine learning team). I would argue that a high-capacity, low-query-latency knowledge graph should be core infrastructure for most large companies, and that knowledge graphs and machine learning are complementary.

I agree. And I saw how averse companies were to graph databases because of the perception that they are "not reliable." So, we built Dgraph with the same concepts as Bigtable and Google Spanner, i.e. horizontal scalability, synchronous replication, ACID transactions, etc.

Once built, we engaged Kyle and got Jepsen testing done on Dgraph. In fact, Dgraph is the first graph database to be Jepsen tested. http://jepsen.io/analyses/dgraph-1-0-2 (all pending issues are now resolved).

Dgraph is now being used as a primary DB in many companies in production (including Fortune 500), which to me is an incredible milestone.

Can you share more about the kinds of use cases and QPS your customers have for Dgraph?

Dgraph usage is quite widespread. It is being used for typical graph use cases: recommendation engines, real-time fraud detection, adtech, fintech, etc.

The design is also well suited to just building apps (scalable, flexible schema), given the early choices to use a modified GraphQL as the query language and JSON as the response format. So we see even Fortune 500 companies using Dgraph to build mobile apps.
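To make the "GraphQL-style query in, JSON out" point concrete, here is a minimal sketch. The query shape follows Dgraph's GraphQL-inspired syntax; the names ("Alice", the `friend` predicate) and the sample response are illustrative, not taken from any real deployment.

```python
import json

# A Dgraph-style query: a named block with a root function, selecting
# predicates and traversing the `friend` edge. Shown as a string only.
query = """
{
  people(func: eq(name, "Alice")) {
    name
    friend { name }
  }
}
"""

# A response of the shape Dgraph returns: JSON keyed by the query block name.
sample_response = '{"data": {"people": [{"name": "Alice", "friend": [{"name": "Bob"}]}]}}'
data = json.loads(sample_response)

# Pull the friends of every matched person straight out of the JSON.
friends = [f["name"] for p in data["data"]["people"] for f in p.get("friend", [])]
print(friends)  # ['Bob']
```

The appeal for app builders is that the response mirrors the query's shape, so the client can consume it directly without an ORM layer.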

Most open source users run the simplest 2-node cluster, but we routinely see enterprise customers run a 6-node cluster (high availability) or a 12-node cluster (HA + sharding). Given synchronous replication, query throughput scales out linearly as you add more replicas/machines: each replica can reply without the staleness issues of typical eventual-consistency models, because Dgraph provides linearizable reads.

Write throughput wise, Dgraph can sustain XXX,XXX records/sec in the live path (and millions in the offline path). See my recent commit: https://github.com/dgraph-io/dgraph/commit/b7189935e6ec93aec...

Some recent public usage mentions of Dgraph: https://github.com/intuit/katlas https://twitter.com/pg_kansas/status/1096260809171353600


The first wave would be people building products directly on a knowledge graph; the second would be knowledge graphs as part of support systems that augment your work; and a third wave would probably capture inside knowledge. Governments would be a key client, I think.

Sounds like a good enterprise SaaS product idea.

The beauty of RDF is that it supports schema promiscuity, i.e., you can have many schemas for the same data; you can do that with OWL. In typical graph databases you are fixed on nodes, properties, and edges, but in RDF you can choose arbitrarily what should be a node and what should be a property. My issue with RDF has been performance: it works great for categorical, relationship-heavy data, but not so much for numerical data.
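The schema-promiscuity point can be sketched without any RDF tooling: store statements as quads of (graph, subject, predicate, object) and let two vocabularies describe the same entity side by side. The graph names, entities, and predicates below are made up for illustration.

```python
# Quads as plain tuples: (named_graph, subject, predicate, object).
# Two "schemas" (vocabularies) describe the same person; neither is privileged.
quads = [
    ("schema_a", "alice", "worksFor",     "acme"),
    ("schema_b", "alice", "org:employer", "acme"),
    ("schema_a", "alice", "age",          34),
]

def match(quads, graph=None, s=None, p=None, o=None):
    """Return quads matching the pattern; None acts as a wildcard."""
    return [q for q in quads
            if all(v is None or q[i] == v
                   for i, v in enumerate((graph, s, p, o)))]

# Ask "who is alice connected to acme as?" across ALL schemas at once:
hits = match(quads, s="alice", o="acme")
print(hits)  # statements from both schema_a and schema_b come back
```

This is the flip side of the performance complaint above: pattern matching over triples/quads is a natural fit for relationship-heavy queries, while numerical aggregation over `age`-style literals means scanning and filtering objects one statement at a time.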

I am also a fan of RDF and OWL. My two ‘practical semantic web’ books (one in Common Lisp and one for Java/Clojure/Scala) are now 8 years old, but you can get free PDFs from my web site. Those books badly need second editions; the material is interesting but outdated.

Mark, are you up for helping with solid security for cl-solid? (The library now works with both Neptune and Allegrograph) https://github.com/gibsonf1/cl-solid - live use of cl-solid library here https://graphmetrix.net , sometimes very lively discussion of Solid here: https://gitter.im/solid/chat (including Tim BL)

I have no time right now. Contact me in 5 weeks and we can talk then.


What's a modern, easy to get started with RDF-compatible database?

I have thought about this also. I am retiring from my job in a month but I would like to keep writing books and also have one commercial software product. I used to sell my NLP toolkit written in Ruby but open source NLP libraries, and deep learning solutions, are so good that I don’t want to be in that business.

If you're interested in challenges, could you consider temporal graphs? Both from the perspective of tracking graph evolution (audit trail), and of using a graph to model historical events, where relations hold for periods of time. It always sounds doable, and then I sit down to do it :)
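A minimal way to frame both asks is to stamp each edge with a validity interval, so "what did the graph look like in year Y" and "what relations held in year Y" become the same query. A sketch, with all names and dates invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Edge:
    """A relation with a validity interval: holds for start <= year < end."""
    src: str
    rel: str
    dst: str
    start: int  # year the relation began
    end: int    # year it ended (large sentinel = still true)

edges = [
    Edge("alice", "works_at", "acme",   1998, 2004),
    Edge("alice", "works_at", "globex", 2004, 9999),
]

def active(edges, year):
    """Edges whose validity interval contains the given year."""
    return [e for e in edges if e.start <= year < e.end]

print([e.dst for e in active(edges, 2001)])  # ['acme']
```

The hard parts the comment hints at start once intervals overlap, nest, or are themselves uncertain ("sometime in the 1950s"), at which point the interval endpoints become distributions rather than integers.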

Throw in probability and vagueness (history examples: this happened sometime in the 1950s / we're only 50% certain that Henry VII was the father of this child) and it becomes a whole lot more complicated; yet what can be inferred increases in usefulness.

My inspiration is http://commons.pelagios.org/ and the digital humanities field.
