Retrospection and Learnings from Dgraph Labs (manishrjain.com)
106 points by mrjn on Sept 16, 2022 | 38 comments



Good reading, I've been curious!

Some observations as a team active here.

We're adjacent / complementary: we make graph DBs (and regular SQL/Databricks/etc.) more actionable via rich & scaled graph viz workflows (analyst-facing) + graph AutoML (automation-facing). So we sometimes end up working alongside graph DB companies at enterprise/gov/tech customers, where it's not just a siloed DB. I think we only saw ~1 production user of Dgraph though, so this is entirely based on their competitors:

* Graph DB TAM: Neo4j revenue is probably around $200M/yr now, and probably another $100-200M across the other graph DB vendors. (Market analysts say the graph DB market today is $1B+, but that seems rosy.) Importantly, because of analyst + AI use case growth in areas like recommendations, anti-fraud, cyber, etc., and streamlining of infra via cloud/Docker/etc., everyone good is growing quite well, and I expect good YoY growth for everyone for at least 3 more years. The AI market is likely a much bigger leap for graph, though it's less clear for these graph DBs, especially the CPU-based ones.

* GraphQL TAM: Agreed. But it may be a bigger culture shock for a pivot. Not ready for Series B levels of expectations. Leading to...

* Revenue: Super risky, lacking big & growing revenue, to assume a Series A, and then a Series B (!), unless you have a special trick like having ties to the Chinese government or being a successful serial founder people just trust.

* ... Tip for people looking at jobs: ask revenue / spending ratio + how many years in the bank when not profitable. If a b2b team can't make revenue work after $xM raised, they're on the path for stressful grinding, dilutive bridges & shutdown. VC treadmill grows expectations, so coming in from behind is asking for PE to take over or an acquihire where only the founders win.
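To make the job-seeker tip above concrete, here is a rough sketch of the arithmetic. The function names and every figure are made-up placeholders for illustration, not numbers from Dgraph or any real company:

```python
def revenue_to_spend_ratio(annual_revenue, annual_spend):
    """Fraction of spending covered by revenue (1.0 means break-even)."""
    if annual_spend <= 0:
        raise ValueError("annual_spend must be positive")
    return annual_revenue / annual_spend


def runway_years(cash_in_bank, annual_revenue, annual_spend):
    """Years until the cash runs out at the current net burn.

    Returns float('inf') when the company is break-even or profitable.
    """
    net_burn = annual_spend - annual_revenue
    if net_burn <= 0:
        return float("inf")
    return cash_in_bank / net_burn


# Hypothetical company: $1M revenue, $5M spend, $8M in the bank.
ratio = revenue_to_spend_ratio(1_000_000, 5_000_000)
years = runway_years(8_000_000, 1_000_000, 5_000_000)
print(f"revenue/spend = {ratio:.2f}, runway = {years:.1f} years")
```

This assumes flat revenue and spend, which is rarely true, so treat the result as a conversation starter rather than a forecast.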


An important aspect of graph database TAM is that the current market size is somewhat defined by the poor scalability and performance of existing graph databases. Many interesting applications simply don't fit within the limitations of current graph database platforms, and those limitations haven't changed much in the last decade.

I would agree that the "database" part is not that important. It is the notion of scalable graph analysis that sells a platform. However, for sufficiently large graphs (most of the interesting apps are in this class), GPUs often don't help much versus CPUs in my experience.


agreed, and we've heard heart-attack-level pricing pushback for the few db's that do scale well

likewise, for analytics, we're seeing a bit of a split:

- DB side: Traditional real-time graph analytics (pattern search, ...) can often be done on extracts from KV-store level queries, so just use those and post-process the compute ("extract 1-2 hop neighborhood, then cross-product in lang xyz"). Like the old Titan -> Cassandra days. GPU can be nice for accelerating a simple inferencer here (e.g., a T4), but not necessarily the actual fetch. Gets a bit more blended in 'real' knowledge graphs, but that's more R&D. (Edit: I believe Facebook's at-scale graph engine successfully runs on top of SQL for related reasons.)

- AI side: Massive interest in GNNs (we're active here!), as they're starting to beat the result quality that traditional graph analytics can give. Basically the pendulum is swinging back to graph for areas like recommenders & classifiers: shopping carts, fraud, cyber, etc. These had gone the way of ML + AI systems for a while now, but with GNNs becoming practical, you get the best of both. And... GPUs matter over CPUs again.

Funny enough, we're getting into a bunch of vector search scenarios, and because of our particular scale & query richness needs... looking at OSS graph DBs and pairing with GPU nodes. /insert "why not both" meme here
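As a minimal sketch of the "extract 1-2 hop neighborhood, then post-process" pattern from the DB-side point: the dict below stands in for a KV store (key = node id, value = neighbor ids), and the function names, the shared-neighbor check, and the toy data are all assumptions made up for illustration, not any vendor's API.

```python
from collections import deque


def k_hop_neighborhood(adjacency, start, max_hops):
    """Breadth-first extract of every node within max_hops of start.

    Each adjacency lookup models one KV fetch.
    Returns {node: hop distance from start}.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def shared_neighbor_pairs(adjacency, nodes):
    """Post-processing step: pairs of extracted nodes that share a neighbor."""
    ordered = sorted(nodes)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if set(adjacency.get(a, [])) & set(adjacency.get(b, [])):
                pairs.append((a, b))
    return pairs


# Toy data, invented for illustration.
kv_store = {
    "alice": ["card1", "device1"],
    "bob": ["card1"],
    "carol": ["device1", "card2"],
    "card1": ["alice", "bob"],
    "card2": ["carol"],
    "device1": ["alice", "carol"],
}
hood = k_hop_neighborhood(kv_store, "alice", 2)
pairs = shared_neighbor_pairs(kv_store, hood)
```

The fetch stays a handful of key lookups, and the pairwise comparison runs in application code on the small extract, which is why the storage layer does not need to be a graph engine for this kind of query.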


"the few db's that do scale well"

Which ones?


+1 - curious


>Dgraph took a hit suddenly due to a critically wrong hire — which made us go from a “things are looking great” to “sorry, you’re out” within a week.

I don't understand what this could mean? How can a single hire be so impactful and so quickly? I guess the team was small but even then I'd be very interested to know how that could happen!


Manish allowed a new CEO, Gary Hagmueller, to take over on advice from his investors at Redpoint.

The two didn't get along and Manish was largely sidelined.

And then depending on who you talk to the company failed because of Gary (poor fundraising, strategy etc) or Manish (poor product, management).

The investors Dgraph had (e.g. Redpoint, Airtree, Grok) have an excellent reputation, so there is definitely more going on than has been let on.


Tangential… Airtree historically has a pretty poor reputation amongst founders who have taken money off them and amongst angel investors. I don’t have first hand knowledge but at least half a dozen unique second or third hand data points => dodgy tactics even at early stage and can micromanage founders.


Just want to set the record straight here, AirTree folks were incredibly helpful. If offered, I'd take money from them again in a heartbeat.


Really appreciate you commenting.

The worst thing about not being in the US is that there aren't enough public data points about which VCs are good or not.

And founders like Manish aren't stupid/drunk enough to ever spill the beans.


> Airtree historically has a pretty poor reputation amongst founders who have taken money off them and amongst angel investors.

can you share some sources?


Maybe they hired a CFO who said on his first day: "I am sorry, but you are fucked".


Who are you, who are so wise in the ways of hiring?


I tried Dgraph a year or two before it went bust.

I can confirm the query language mistake. The GraphQL-alike was different enough to be unfamiliar, and still felt somewhat awkward to use. The other query language was pretty badly documented and also wasn't great. Cypher isn't perfect, but it's pretty decent.

I also ran into a trivial bug around a comparison in a query not returning correct results and pretty much immediately gave up. (I don't remember the specifics; it might as well have been API misuse.)

But the biggest issue with all these graph databases for traditional application development is schema. Most of them are schemaless or have some half-assed, basic schema support. Nebula is the only one I can think of with proper schemas.

This is not what you want for your primary data store! Even more so if logic is driven by JavaScript, or even TypeScript, since there are still plenty of opportunities to mess up.

I wish there was a proper multi-modal DB that combines the best of both the relational and the property-graph models. The recently announced SurrealDB might fit the bill.

Graph stores are plenty popular for secondary workloads with special requirements, which is not dissimilar to Elasticsearch for the search domain.

I don't see that changing anytime soon.


Is Dgraph out of business? I still see their product website up?


They seem to have hired a number of people over the last few months. Could be a recapitalization + new leadership.


They have a new CEO, Akon Dey: https://www.linkedin.com/in/akon-dey-34a757/

So still in business although he seems lacking in senior management experience.


Check out EdgeDB.


The article is a nice read, and I also agree that they are a contributor to the Go ecosystem. That being said, a key issue didn't get mentioned in the article: the team was not fully technically competent for the products they were building.

A few years ago, I spent weeks evaluating their products, including the graph db and some of their open source libraries. I have to say that you are a brave man if you use some of their stuff in your production system. Good luck to you and your whole team is the only thing I can say.

To give you some examples, when I had obvious safety issues raised, quite often I got pushed back with all sorts of excuses, including from Manish himself. That is something more than technical - it is a culture issue from the top.

From memory, they even had an angry ex-employee exposing all sorts of bugs, including safety ones, in their products after leaving the company. At some point, they even had their issue section of one of their github repos closed as a response!

If you don't know what I am talking about, just check their code committed back between 2017-2019.


I think it was a bug bounty program. Manish claimed it was an ex-employee, and he deleted all GitHub issues related to bugs and safety.


I may be in the minority opinion here, but I think the biggest issue was that the product constantly changed and tried to be too many things. The product-market fit seemed to be "The GraphQL" db. But they really just did all this other complex stuff, and by the time they were going in that direction they ran out of steam. I don't know Manish personally, but there seems to be a lot of negative sentiment about his management style. He has been quoted as saying (to the effect) that his engineers were not as good as him and he would have to go in and "fix" stuff. That seems wrong to me. Either he is the best programmer in the world or he didn't hire well. Not really sure tbh. I wanted to love the product, but it had too many poorly working features; it could have de-scoped a bit and made them good.

FWIW I am really excited by surrealdb. I think it is the sweet spot that dGraph should have been.


I believe what he said was that he can move a lot faster on things than his engineers. As the founder of a company I'm sympathetic. No one could ever understand the product as well as me, at this point, because I built almost all of the critical code (this is less and less the case, thankfully, and I'm very happy to say that there are now solid chunks of code where I am NOT the authority on code). That's not "they're worse engineers", they definitely aren't lol. But if you've spent years of your time on a codebase it's going to take years for anyone to be as quick to fix a bug.

I think this is sort of obvious. Imagine fixing a bug in your own personal project, compare that to fixing a bug in someone else's project. You can probably see the error and guess what the bug is, accurately, if it's your project. With someone else's code you're going to have to reverse engineer the system first.


Manish's style is clearly overbearing from a cursory glance at public interactions.


> He has been quoted as say (to the effect) that his engineers were not as good as him and he would have to go in and "fix" stuff.

I wonder if this sentence has similar basis:

From the blog post: "Dgraph took a hit suddenly due to a critically wrong hire — which made us go from a “things are looking great” to “sorry, you’re out” within a week."


Thanks for the perspective.

Sustainable, open software development feels like a problem that's still not solved. How do cloud offerings, consulting, extra features ("open core") and donations compare in terms of keeping development moving forward? Have there been studies around that?


It's a developing (at a slow pace) model. I'd be curious if there are any studies, but perhaps some are in progress.

One model that seems to be working is when companies have shared infrastructure and they collaborate on it (e.g. Linux).

Another one is when they use it as a standardization layer (e.g. Kubernetes, and also note the much different OpenStack that nobody remembers).

Then you have open-core where you have a single entity (and the drama that comes with AWS picking it up) like ElasticSearch, Kafka, etc.

The other one, open source + consulting (the Red Hat model), has varied results because whoever consults gets incentivised to work against the community in order to make money, and the consultants will be in conflict too (Cloudera/Hortonworks, Mesosphere).

Not an exhaustive list - and not an expert.

It's likely worth mapping these "patterns". For any research I'd start from incentives with a game-theory causal approach.


To be honest, the customer service was super poor and the CEO was not interested in having conversations with a potential customer. Also, they missed the graph analytics train, which was very surprising.


> The problem was that it’s hard to convince a DevOps person to add a relatively unpopular database to the tech stack. However, the same person won’t bat an eyelid (slight exaggeration), adding a new cloud service to the stack.

Actually I hate the guts of cloud services. I hate adding another API key, account to my 1password, credit card, etc. I just want a Helm chart and be done with it. If anything, Kubernetes & GitOps is a way more ergonomic way for the vendor to get into your stack than stupid accounts are.


The cloud service is probably the biggest one. But it's not just that. JAMstack is getting more and more popular, so you'll need to support "serverless" as well as a way to query via HTTP. Which means you'll need some kind of "Data Gateway" proxy on top of robust authentication and authorization. There are more and more "scènes à faire" to be competitive in the DBaaS market and I don't think dbaas startup founders are aware of it.


> "scènes à faire"

I was unfamiliar w this French phrase; here's a definition:

obligatory scene : a plot element that is standard for a particular genre


I used Dgraph as the primary data store in a production app for over a year. I really enjoyed thinking of our model in terms of a graph, and finding creative ways to query what we needed.

The biggest pain point for me was the query language schizophrenia: Incomplete support for GraphQL, or their custom Dgraph Query Language (DQL). As Manish said, they missed the GraphQL train. They really should have gone all in on GraphQL, and only GraphQL.


GraphQL is lacking a lot of expressiveness. Does it even have group by / having? DQL is more powerful.
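For anyone unfamiliar with the term, "group by ... having" groups rows by a key and then keeps only the groups whose aggregate passes a filter, as in SQL's `SELECT customer, COUNT(*) FROM orders GROUP BY customer HAVING COUNT(*) >= 2`. A plain-Python sketch of those semantics, with an invented helper name and invented sample rows (this is not GraphQL or DQL syntax):

```python
from collections import defaultdict


def group_by_having(rows, key, min_count):
    """Count rows per `key` value, keeping groups with count >= min_count."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
    return {value: n for value, n in counts.items() if n >= min_count}


# Invented sample rows.
orders = [
    {"customer": "a", "total": 10},
    {"customer": "b", "total": 5},
    {"customer": "a", "total": 7},
]
print(group_by_having(orders, "customer", 2))
```

When the query language cannot express this, the client has to fetch the raw rows and aggregate them itself, as above.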


They should have used Cypher.


> We took the product from 0 to 1M ARR, proving a strong signal of the product-market fit.

There may have been a slow-&-steady path to get to a mature state. But given the expectations of tech & vc scene these days, I can't blame the path you took. Good luck with the next venture.

I am glad that DBs like Postgres, MySQL, SQLite, etc. got their freedom to evolve and slowly mature before the madness of fast-growing companies caught up to them.


Dgraph is immensely powerful and flexible, especially if coupled with Apollo Federation. It could use more attention from the community.


I have never found a use for a graph DB, but I am still a happy user of Badger. It's one of the best KV DBs written in Go. I think plenty of folks moved to it from Bolt, due to speed. But nowadays I think maybe Pebble is gaining traction.


GraphQL kind of has the reputation of not supporting super deep expressive queries. It is meant as a frontend query language after all.

If I saw a graph DB with that as its primary language, I would probably assume that it's targeting a very different segment of the market than most graph DBs and doesn't support deep recursive queries, and I'd probably move on without a second look. I wonder if this sort of attitude hurt Dgraph.


Really nice read, and great insight into the startup world.



