
Graph database reinvented: Dgraph gets $11.5M to pursue unique, opinionated path - bryanrasmussen
https://www.zdnet.com/article/you-can-go-your-own-graph-database-way-dgraph-secures-115m-to-pursue-its-opinionated-path/
======
motohagiography
Congrats on funding round, we need more options in this space. I'm a Neo user
today, and I've said before that while I don't think most people will switch
to graph databases, at some point I do think most new projects will use them.
I chose Neo precisely because of py2neo.

The decision to adopt GraphQL instead of supporting Cypher: what would you
say were the big trade-offs?

As a startup, your price point is in the $20k/year ballpark, but
GrapheneDB/Neo comes in a lot lower. What would I get for paying that much
more?

I find that the audience for something that operates at the level of
abstraction a graph does tends to be different from the user who makes
decisions based on lower-level features - e.g. design-level problems vs.
optimization problems.

Would you say more that you can get Mongo and Redis (+redisgraph) users to
switch to Dgraph, or instead more that the main customers for graph databases
are still in school right now and deciding what tech skills will underpin
their careers?

Fascinating space and looking forward to trying the product.

~~~
mrjn
(Dgraph founder here)

> The decision to adopt GraphQL instead of supporting Cypher: what would you
> say were the big trade-offs?

Cypher was built around SQL, with the idea that people are used to SQL and
would therefore find something similar easier to adopt. Many years later,
I'd say it still hasn't gained as much popularity beyond Neo4j as was
sometimes projected.

We bet on GraphQL in 2015 because it was easy to understand, worked with
JSON, and returned sub-graphs, while Cypher and Gremlin return lists of
things. One can go from a subgraph to lists, but not the other way around,
because the relationships are lost.

Within three years of its draft spec, GraphQL has taken the developer world by storm.
Developers find it easy to wrap their heads around and enjoy working with it.
In fact, GraphQL+- (our language) has become one of the USPs of Dgraph.

> Would you say more that you can get Mongo and Redis (+redisgraph) users to
> switch to Dgraph, or instead more that the main customers for graph
> databases are still in school right now and deciding what tech skills will
> underpin their careers?

We have heard from many users who would have used MongoDB or SQL if not for
Dgraph. With Mongo, you get scalability, but at the cost of query complexity
and transactionality (/correctness). Keys and docs are nice, but they don't
let you use relationships to interact with a wider dataset. With Dgraph, you
get scalability while also gaining a much richer, more sophisticated
querying ability, transactionality (distributed txns), and correctness,
without losing query performance. My blog post says more about this.
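As a toy illustration of that point (hypothetical keys and documents, not
MongoDB's or Dgraph's actual APIs): in a key-value/document store, following
a reference means one lookup per hop, done by the application itself - the
application-side join that a graph query expresses as a single traversal.

```go
package main

import "fmt"

// friendNames follows the "friends" references by hand: one lookup per hop,
// performed in application code rather than inside the database.
func friendNames(docs map[string]map[string]any, key string) []string {
	var names []string
	refs, _ := docs[key]["friends"].([]string)
	for _, ref := range refs {
		names = append(names, docs[ref]["name"].(string))
	}
	return names
}

func main() {
	// A toy store: documents addressed by key, relationships as raw keys.
	docs := map[string]map[string]any{
		"user:1": {"name": "alice", "friends": []string{"user:2", "user:3"}},
		"user:2": {"name": "bob"},
		"user:3": {"name": "carol"},
	}
	fmt.Println(friendNames(docs, "user:1")) // [bob carol]
}
```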

Redis is a cache, so people are not directly comparing it against Dgraph, a
persistence-first database. One could use Redis to perhaps store Dgraph
results.

~~~
TimFogarty
Hey Manish! Congratulations to you and your team on the round! It's a
phenomenal achievement.

You mentioned transactions... I work for MongoDB, so I just wanted to add
that MongoDB has had ACID transactions since version 4.0. Distributed
transactions will be included in the upcoming 4.2 release. There's more info
on distributed transactions in the 4.2 release here if you're interested:
[https://docs.mongodb.com/master/core/transactions/](https://docs.mongodb.com/master/core/transactions/)

By the way, it's awesome to see you're using Go. I just got into it recently
and it's such a nice language to work with.

~~~
mrjn
Good to hear Mongo will have distributed transactions. Go is awesome. I
picked up a t-shirt from Mongo's booth at GopherCon - nice stuff!

~~~
TimFogarty
Oh I was just at GopherCon! I was on the Mongo booth.

------
campoy
Hi there, I lead product at Dgraph and I'll be happy to answer any questions
you might have.

We're very happy we're finally able to share our fundraise, and this is just
the beginning of many more features and improvements!

PS: we're hiring ;) [https://dgraph.io/careers](https://dgraph.io/careers)

~~~
richardkmichael
I work with hypergraphs, where edges have an arbitrary number of nodes. I'm
quickly looking through your documentation, Discuss, and GitHub issues/roadmap
to try and find out if edges may have more than two nodes.

This has been asked about already: [https://github.com/dgraph-
io/dgraph/issues/1#issuecomment-40...](https://github.com/dgraph-
io/dgraph/issues/1#issuecomment-404801622)

While we have your attention on HN, could you comment? (Sorry if I missed this
information elsewhere.)

Thank you!

~~~
moab
Hi Richard,

What applications/problems are you solving using hypergraphs? There is
definitely a dearth of high performance hypergraph processing engines/systems.

------
ben_jones
Serious question: what are some practical examples of graph-database-backed
features providing significant value to "common" software applications
(e-commerce, CRUD, CMS, CRM, etc.)? I lack a strong understanding of graph
databases and their use cases, and I tend to learn best by seeing them in
use in domains that I do understand.

~~~
rficcaglia
Using neo4j in a commercial app for healthcare data as a primary store, using
cypher and graphql

~~~
johnymontana
Do you use the Neo4j GraphQL integration? [1]

Or some other approach for building your GraphQL layer on top of Neo4j?

[1] [https://grandstack.io/docs/neo4j-graphql-js-
quickstart.html](https://grandstack.io/docs/neo4j-graphql-js-quickstart.html)

------
electriclove
The blog post announcing this is quite informative:
[https://blog.dgraph.io/post/how-dgraph-labs-raised-
series-a/](https://blog.dgraph.io/post/how-dgraph-labs-raised-series-a/)

~~~
electriclove
The post states "Most other graph databases or layers are built as sidekicks."

What specifically falls into that category? As someone new to the space, I'd
like to understand whether he is referring to things like Neo4j, ArangoDB,
etc.

~~~
campoy
Most graph databases are used as secondary systems, often as indexes over
other databases. We believe this is due to a lack of features or performance
at scale.

Our goal at Dgraph is to be used as the main storage, similarly to how
people use Postgres or Mongo.

Happy to ask Manish (the author of the post and CEO of Dgraph) for more
details!

~~~
aforwardslash
Congrats on the funding! I've been looking into Dgraph (and also played a
bit with Badger), as well as other graph databases, as a way to store
chronological event data while enabling rich relationships between the
observed artifacts belonging to each event. The problem is that the
solutions I find seem like an ugly hack compared to a relational solution.
Can you point me to any specific Dgraph documentation or case studies for
these kinds of workloads?

~~~
campoy
Let's chat! We're working on improving our docs specifically regarding the
data modeling aspects.

If your use case is open source we could even use it as one of our case
studies :)

francesc@dgraph.io

------
treelovinhippie
Their non-standard GraphQL query language is a huge barrier to adoption:
[https://github.com/dgraph-io/dgraph/issues/933](https://github.com/dgraph-
io/dgraph/issues/933)

Cloud managed Dgraph instances would also be nice.

~~~
mrjn
We're working on that for sure. ETA end-Q3 2019.

~~~
mleonard
Working on standardising dgraph's graphql? Or working on a managed cloud
version?

~~~
campoy
GraphQL+- will still be a thing, since GraphQL doesn't provide all of the
things we need to manage a database.

That said, we're working on pure GraphQL support, which will help with any
kind of integration with other GraphQL projects. This is really exciting and
will hopefully help the many people who already know GraphQL and don't want
to spend their time learning Cypher or Gremlin.

Another project we're working on is offering Dgraph as a service. We're
already doing this with some of our customers, so we have the expertise and
the confidence in our product necessary to run this service at scale. The
ETA for this is not as clear, but I do expect to have a private alpha by the
end of the year.

I hope that helps, if you have any other questions feel free to join us on
[https://slack.dgraph.io](https://slack.dgraph.io) or
[https://discuss.dgraph.io](https://discuss.dgraph.io), or reach me directly
on francesc@dgraph.io

------
bulldoa
Can anyone recommend readings on the trade-offs between graph databases,
key-value stores (whether column-based like Cassandra or key-value based
like Aerospike, etc.), and traditional ACID-compliant databases
(PostgreSQL)?

~~~
Graphguy
Depends how much time you have...

Fifteen minutes: “How to Choose a Database” by Ben Anderson
([https://www.ibm.com/cloud/blog/how-to-choose-a-database-
on-i...](https://www.ibm.com/cloud/blog/how-to-choose-a-database-on-ibm-
cloud))

Three hours: Jepsen analyses of distributed systems safety. Kyle tests
software ranging across the database spectrum.

One week: Designing Data-Intensive Applications by Martin Kleppmann.

Disclaimer: I work with Ben and think he takes a really nice tack on this
subject, though it may be orthogonal to your immediate question regarding
trade-offs.

~~~
campoy
Thanks for sharing these, cool handle name btw :)

------
marknadal

      > ...they showed Dgraph to be 10 times or more faster than other options (naming names, which we won't), there is nothing to show for this at this point. Jain said Dgraph may release some benchmarks in the future...
    

It is pretty dubious to make claims like that without benchmarks.

I'm a competitor, also open source and VC-backed, and there is a long line
of databases publishing benchmarks for at least baseline tests:

\- [https://redis.io/topics/benchmarks](https://redis.io/topics/benchmarks)

\- [https://gun.eco/docs/Performance](https://gun.eco/docs/Performance) (mine.
Also see
[https://www.youtube.com/watch?v=x_WqBuEA7s8](https://www.youtube.com/watch?v=x_WqBuEA7s8)
for benchmark against 100M+ records/day, 100GB+ data)

\- [https://www.datastax.com/nosql-databases/benchmarks-
cassandr...](https://www.datastax.com/nosql-databases/benchmarks-cassandra-vs-
mongodb-vs-Hbase)

\-
[https://www.arangodb.com/performance/](https://www.arangodb.com/performance/)

etc.

Please please please don't hype, just publish results, even if they're basic
tests.

~~~
campoy
Hi there,

We are not hyping; we're blogging about our fundraising success while
working on our new release, which will include a series of benchmarks that
we'll be happy to share with all the details, obviously including source
code and hardware setup.

Until then, feel free to get in touch with us if you have any specific
questions or need help with performance analysis!

------
sandGorgon
Thanks for having a Docker Swarm HA deployment recipe!!
[https://docs.dgraph.io/deploy/#using-docker-
swarm](https://docs.dgraph.io/deploy/#using-docker-swarm)

Quick question: I'm concerned about the underlying storage here. You don't
seem to be using any exotic filesystem, so what's really happening? What's
the fault-tolerance story in case a node dies? If I add a new node to the
swarm, will it automatically recover? Is there any way of debugging, etc.?

What's the rule of thumb for the number of nodes, etc.?

Also, I'm not sure if you noticed, but the name of one of your components,
"Dgraph Alpha", is poorly chosen. In case you were not aware, warning bells
start ringing in devops teams when they pattern-match anything starting with
"alpha".

------
dfischer
I’ve looked into graph databases a few times. I feel like there’s a use case
I’d love to apply it towards more seriously but I always stop after not
fundamentally “getting” the “gains” out of it.

I’m doing quite a bit of NLP and similar conceptual structures of edges with
unstructured/structured relationships to content, and I’m always wondering
if there is some boon to utilizing a graph.

I guess I lack some success stories where I can see the wins.

I get the sense that graph databases had a moment like NoSQL back in the day
- they have their specific cases, but they’re not massively used for general
use cases. And I have a newer sense that there’s emerging thought leadership
on doing graph DBs in a new way that is more general.

I love new tech! I just want to know how to use it for the unknown unknowns
:).

------
simonebrunozzi
Congrats on the round, and well done - I am quite impressed by Dgraph so far.

Big question for the founders: how are you going to fend off AWS when you will
get bigger? (see what's happening with Elastic, MongoDB, etc)

Also, what prevents you from changing the license down the road?

~~~
mrjn
> Big question for the founders: how are you going to fend off AWS when you
> will get bigger? (see what's happening with Elastic, MongoDB, etc)

Just judging by the stock prices, at roughly $100 and $150 today, Elastic
and Mongo are doing well. I don't see them doing as badly as the media
projects.

I think our current license is pretty good. I can't predict the far future,
but at least in the near future I don't see a reason to change it. It's not
something we're contemplating right now.

~~~
simonebrunozzi
I don't want to sound arrogant, but I think you are underestimating how
important this topic will be in the coming years, especially if you will be
successful and your company will grow significantly.

I know the space really really well, and have worked at AWS from 2008 to 2014.
Most large enterprise customers are worried about this, especially because it
might mean that the supporting company (MongoDB, Elastic, etc) will not have
an easy way to survive in the future.

Right now you are capturing the long tail of the market, and you can grow a
lot just with that. But eventually you will need to get enterprise customers,
and that's when this topic will surface.

I don't want to be an alarmist, just saying that thinking about this ahead of
time might be important for your company.

Tangentially, I am bullish on Elastic, less on MongoDB. My 2 cents.

------
peternicky
I am curious about the language choice of Go. Why not C or C++? Apologies if
this was already answered.

~~~
mrjn
I'd love to talk about this at length. Having built distributed systems in
Google Web Search with C++ for over 6 years, I still prefer the manual
memory management model to a GC-based model. Identifying and fixing memory
leaks is easier, because that's in your control as a language user; making a
GC work better is beyond it. In fact, that'd be the biggest gripe I have
with Go. I wish Go had a manual memory control mode; I'd take it in a
heartbeat.

Alright, now that I've criticized Go, time for why I chose it. Go code is
simply more manageable and readable over time. Tools like gofmt are
habit-forming. Go does memory management, but doesn't put everything on the
heap. You still get a lot of control over pointers, something that other
GC-based languages hide from the user.

Go tooling is the best I've seen and closely replicates what we had at
Google. Go profiles are almost copies of what Google had, and the fact that
they're part of the language is incredible.

Go allows normal devs to run thousands of goroutines, utilizing all the
cores of the server well. That is just not possible with most other
languages, and not as simple with C++.
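A minimal illustration of that claim (nothing Dgraph-specific): spawning ten
thousand goroutines is cheap enough to be unremarkable, something that would
be prohibitive with one OS thread per task.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// sumConcurrently spawns one goroutine per term. Goroutines are cheap
// enough that even tens of thousands of them are routine; the runtime
// multiplexes them onto all available cores.
func sumConcurrently(n int64) int64 {
	var total int64
	var wg sync.WaitGroup
	for i := int64(0); i < n; i++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			atomic.AddInt64(&total, v) // atomic add avoids a data race
		}(i)
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(sumConcurrently(10000)) // 49995000
}
```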

Go is a DELIGHT to work with as a programming language. If Go weren't around,
I'd be writing Dgraph in C++. Go is perhaps slightly slower on a per-thread
basis, but it more than makes up for it with easy to build concurrent systems.

In fact, both Dgraph and Badger outperform many other systems written in C++
and Java. So, you only gain performance with Go. What you do lose is the
ability to tightly control memory deallocation, but that's just GC for you.

P.S. If you want to shout Rust: Sorry, I just don't know much about it. And
don't plan to switch.

~~~
peternicky
Thank you for this reply, very informative and personally, I love hearing
about design decisions and tool choices.

------
cartlidge
The FAQ says "If your data doesn’t have graph structure, i.e., there’s only
one predicate, then any graph database might not be a good fit for you."

I know what a predicate is, but I don't understand how it is being used here.
Can someone explain how I can determine if my data has a graph structure? What
sort of predicate do we have here?

~~~
mrjn
Think of a JSON map as a document or an entity. Then the keys would be
predicates, and the values would be either references to another JSON map
(document/entity) or a value (like a string, int or something).

{"uid": "0xabcd", "friend": [{...}, {...}], "name": "HN user", "age": 21 }

This is valid JSON for Dgraph. It means the overall JSON map is an entity
with UID 0xabcd. It has friends (other maps), the name "HN user", and age
21. Here, "friend", "name", and "age" are predicates.
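Sticking with that example (and dropping the friend edges for brevity),
here's a toy Go sketch of how such a map expands into (subject, predicate,
object) triples - this just illustrates the framing above, not Dgraph's
actual storage code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// triples expands a JSON entity into "(subject, predicate, object)" lines:
// the uid names the entity, and every other key is a predicate.
func triples(doc []byte) []string {
	var m map[string]any
	json.Unmarshal(doc, &m)
	subject := m["uid"].(string)
	var out []string
	for k, v := range m {
		if k == "uid" {
			continue
		}
		out = append(out, fmt.Sprintf("<%s> <%s> %v", subject, k, v))
	}
	sort.Strings(out) // map iteration order is randomized
	return out
}

func main() {
	for _, t := range triples([]byte(`{"uid": "0xabcd", "name": "HN user", "age": 21}`)) {
		fmt.Println(t)
	}
}
```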

~~~
cartlidge
Thanks for the reply mrjn. So a predicate includes a relation, but we take it
from its logical point of view. (Rather than in an RDBMS, where we contort the
relation so that we can see it from the perspective where it has a cardinality
of 1.)

And the notion of one-ness of the predicate comes not from the fact that
there's only one relation, but the fact that there's only one _logical value
per predicate_ - here, we have two values of `friend` which cannot be
conveniently coded in an RDBMS without the use of a join table.

So do I correctly interpret your faq "when not to use Dgraph" as saying
"Dgraph is probably overkill if predicates naturally have a single value -
that is, join tables are rare and you can naturally put foreign keys in the
object they logically belong with, rather than in the table where they have a
cardinality of 1".

This makes me think the other comment by @thundergolfer is probably wrong
(sorry) - actually, an e-commerce site would benefit from an efficient graph
DB, since you have an order and now you want to find all the items in it, so
the link from an order to an item should be associated with the order. Yet
in the standard model, as with his, once you've found the order you have to
filter through the items to find the ones which reference the relevant
order; indeed, this is almost the only way that relation will ever be
traversed - logically backwards.

I appreciate your time, @mrjn. I like to hope you can benefit from answering
my silly questions, because someone can use them to improve that FAQ. I'm
trying to pick the right DB for a personal project, but I never expect to
make money from it, so I don't feel like I can give you any direct benefit
for your time.

------
person_of_color
Why did you stop supporting remote employees?

~~~
mrjn
We have generally coalesced into remote offices, which have enough folks in
them to help each other. We've also started encouraging pair programming, so
engineers can question each other more and review code together.

We still have remote employees - four of them, in fact - but it's considered
on a case-by-case basis. The biggest issue with remote work comes down to
individual personality, i.e. how communicative and independent someone is.
Some people make great remote workers, but I'd say many don't.

When an engineer is at the office, they are team-motivated, more likely to
get help without asking, and less likely to get stuck.

~~~
rficcaglia
Management by walking around?

------
degyves
How does the "most advanced graph database" not already support the
standardized SPARQL query language? Have you found anything wrong with
standard SPARQL?

~~~
campoy
We are happy to get new feature requests on github.com/dgraph-io/dgraph!

SPARQL is something we're considering but we haven't heard much interest from
any of our customers so it's not very high up on our roadmap.

I'd love to chat with you to figure out how adding this could help your use
case though. You can reach me at francesc@dgraph.io.

Cheers!

------
anbop
Can someone explain to a complete n00b what you need these graph databases
like Neo4j and Dgraph for? I've only ever built small-business SaaS on
MySQL.

------
uzero
Did I understand the feature table correctly that backups are an enterprise
feature?

------
znpy
Somebody please look at Actian's Versant Object Database (now called "Actian
NoSQL") and just clone it. You'll make very big money.

------
kazz0302
Cool!

------
luizfelberti
Congratulations on the funding, I've been following the project for a while
(your tech blog is great) and am especially fond of the spin-off project
Badger[1]. I'd like to ask a few questions and make a few "content requests"
for the tech blog if I may, since I see @campoy is here and @mrjn will
probably stop by later =)

 _Some things I'd love to read about on your blog_

1 - I'd love to read more on the local information retrieval strategies you
use for Badger, especially if you have use cases that are comparable to
sequential reads (something that Kafka does, but with a structure optimized
for magnetic drives) or ISAM style indexation (traditional SQL databases), and
how you leverage the metadata byte to speed things up (this has been touched
on only superficially on the blog afaik) along with other specialized
features/operations Badger supports;

2 - Some insights on how Dgraph's data model influenced the design of Badger,
and what parts you decided to explicitly generalize for other use cases, and
what use case constraints have you purposefully decided on (aside from the
obvious ones like SSD optimized, etc);

3 - Any progress you might have made regarding why some queries in Badger take
longer than in RocksDB. That so far has been the biggest cliffhanger in the
tech blog, I've been waiting for the sequel to that one for some time now :)

It's difficult to find good resources where people go in depth on these
database engineering topics, but your blog is a very good one and a joy to
read. Thanks for not simplifying things and keeping it very technical!

 _Some questions relevant to the post and Dgraph itself_

4 - When I (and I assume many others) see "Enterprise Features" on
databases, that is kind of a turn-off. It usually correlates with keeping
separate codebases that are manually kept in sync (which has problems of its
own), and it sometimes feels like it's gatekeeping some pretty crucial
features (e.g. better backups), which makes people either accept a more
vulnerable architectural situation or makes the product unsuitable for a few
use cases for lack of such features. I do understand and respect the need to
make money to keep Dgraph awesome, but have you considered something like
the CockroachDB approach (selling support + mandating a license to sell the
DB as a managed service)? If you did, why did you turn it down, and what
would make you reconsider it? On a second note, from a maintenance
perspective, how are you planning to handle the "feature flagging" and "repo
syncing" of the community and enterprise editions?

5 - As someone who's considered using Dgraph in the past, something I found
interesting is that graph databases in some use cases alleviate some of the
same data modelling woes that Datomic does[2][3]. How would you tell them
apart, and what would you say they each excel at?

6 - Graph databases sit on a spectrum of use cases: some people want a
system that works like a traditional RDBMS but with more flexible/powerful
data modelling capabilities, and some people might want to load up big
graph-heavy datasets for analytics purposes (i.e. loading a giant product
catalog to map product relationships, and either running a bunch of batch
queries to feed a secondary dataset or keeping it live for realtime queries
in a recommender system). How would you place Dgraph and its
strengths/weaknesses on this spectrum? What alternative solutions would you
recommend for the places where Dgraph would be unsuited?

Sorry for writing so many questions (which will probably take a lot of time
to answer), but many of these things (especially in the second half) are not
explicitly stated anywhere, and they are valuable knowledge for making an
informed decision about a database product.

[1] [https://github.com/dgraph-io/badger](https://github.com/dgraph-io/badger)

[2]
[https://augustl.com/blog/2018/datomic_look_at_all_the_things...](https://augustl.com/blog/2018/datomic_look_at_all_the_things_i_am_not_doing/)

[3]
[https://augustl.com/blog/2018/datomic_look_at_all_the_things...](https://augustl.com/blog/2018/datomic_look_at_all_the_things_i_am_not_doing_cont/)

~~~
lucas-wang
I'm Lucas, a backend engineer at Dgraph Labs. A quick comment regarding 4:
CockroachDB also employs the "Enterprise Features" model:
[https://www.cockroachlabs.com/docs/v19.1/enterprise-licensing.html](https://www.cockroachlabs.com/docs/v19.1/enterprise-licensing.html)

~~~
rficcaglia
Security features should NOT be limited to enterprise only. Performance,
monitoring, advanced query capabilities are much better options. Security
should be table stakes for ALL software today. Please reconsider.

------
newen
Has the word opinionated become a meme recently? It seems to be everywhere
these days.

