Show HN: EdgeDB 1.0 (edgedb.com)
947 points by colinmcd on Feb 10, 2022 | 323 comments



I’ll add to the positivity: this is the first time I’ve ever found a “we can do better than SQL” pitch compelling. It’s easy to understand what it’s doing with a variety of known quantities. It solves a difficult problem elegantly. It has 100% overlap with the goals I want it to have. It’s designed to be usable with minimal fuss. It does a really good job of portraying itself as magic (it will be awesome to use) without portraying itself as magic (it explains itself in terms of already knowable tools and how it leverages their strengths).

I do have a couple minor quibbles:

- Serializable transactions are expensive, and that deserves to be an explicit caveat. Not everyone knows this, and it’s an important thing to put up front.

- Some of the language in this post is in CAP theorem territory but neglects to directly address that. I’d like to see how client usage compares with direct Postgres usage (idiomatic for each insofar as such a beast exists) in a Call Me Maybe. I know that’s a lot to ask in a 1.0 announcement four years in the making, but I hope it’s a priority to get this in front of Aphyr.

Edit: oh and I definitely look forward to this being further distinguished from an ORM, because even though I can see the blue and black dress my mind keeps switching it back to gold and white.


> I’ll add to the positivity

Thank you!

> Serializable transactions are expensive, and that deserves to be an explicit caveat. Not everyone knows this, and it’s an important thing to put up front.

We've not seen a major difference in our benchmarks (though maybe our benchmarks are wrong :-)). EdgeDB tends to produce very short transactions, so that helps. EdgeDB also knows if your statements are read-only or not, so we have the ability to steer these into a read-only transaction, though this isn't implemented yet.


It’s been a few years so this is pretty fuzzy (though I just jogged my memory re-reading the docs, which are consistent with my recollection)… there are two expenses:

- The overhead discussed in the docs, which is ~negligible for lots of use cases and a perfectly reasonable tradeoff for those.

- The overhead of retries, which with appropriate defensiveness can effectively become an indefinite lock in, but undetected by, the client. When automated by an abstraction layer, this can become pathological pretty easily depending on usage patterns.

The most realistic alternatives are to provide a lower level abstraction (eg “I don’t want your guarantees, I want your errors”), or to provide other isolation options (eg “I don’t want your guarantees, I want my errors”). But there may well be opportunities here because EdgeDB knows as much as it does about the schema, and positions itself as a SQL replacement rather than a companion so it can potentially optimize for at least some of those cases at query time.

That sounds complex enough to boggle my mind, but if y’all are up to it I’ll be excited to see how it goes!


The retry logic in clients is fully configurable, you can disable retries and get your TransactionSerializationError if you want that.
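
To make the retry discussion concrete, here is a minimal sketch of a client-side retry loop around a serializable transaction. It is not the EdgeDB client API: `SerializationError`, `run_with_retries`, and `make_flaky` are hypothetical names invented for illustration, and the backoff policy is an assumption.

```python
import random
import time


class SerializationError(Exception):
    """Hypothetical stand-in for a driver's serialization-failure error."""


def run_with_retries(tx_body, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Run tx_body(); retry on SerializationError up to max_attempts times.

    max_attempts=1 disables retries, so the error reaches the caller directly.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return tx_body()
        except SerializationError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with jitter to reduce repeated collisions.
            sleep(base_delay * (2 ** (attempt - 1)) * random.random())


def make_flaky(failures):
    """Build a fake transaction body that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def body():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SerializationError("could not serialize access")
        return state["calls"]  # number of calls it took to succeed

    return body


# Succeeds on the third call when retries are allowed.
attempts_needed = run_with_retries(make_flaky(2), max_attempts=3)
```

With `max_attempts=1` the same call raises immediately, which corresponds to the "I want your errors" mode described above. Under heavy contention every retry is extra work the caller never sees, which is the hidden cost raised earlier in the thread.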


Awesome!


Oof I meant to specifically address

> We've not seen a major difference in our benchmarks (though maybe our benchmarks are wrong :-)).

You’ll likely not see anything noteworthy without specifically creating concurrency contention which specifically causes the kinds of pathological retry scenarios I mentioned. I’d be shocked if there isn’t at least a good starting point in either the Postgres test suite or Call Me Maybe. (Seriously though, I want to read Aphyr’s take on this project.)


Can I use this ability to direct the query to a read only replica or will EdgeDB handle this common scaling use case some other way?


Yes, the plan is to allow sending read-only queries to read replicas automatically. This isn’t implemented yet, but all the requisite pieces are there.


I would suggest framing distributed systems analysis in terms of PACELC instead of CAP: https://en.wikipedia.org/wiki/PACELC_theorem


I'm so happy for you guys, I'm wishing you great success. Since the first version of uvloop I've been excited about MagicStack and I couldn't be more excited about EdgeDB since it has the same magical minds behind it.

Here's what I think makes EdgeDB special: it's a DB that replaces the tediousness of ORMs with a better core that can be cross-language / cross-platform. I've implemented tons of APIs, first REST, then GraphQL, all of them on top of ORMs (Django, Peewee, SQLAlchemy, Mongoose and more). Prototyping with them was great, but scaling them became quite challenging, especially if you want good performance when retrieving data.

EdgeQL is an incredibly useful abstraction that will prove itself in a few years. Long live EdgeQL. Keep up the good work!


> it's a DB that replaces the tediousness of ORMs with a better core that can be cross-language / cross-platform

Does it compare to RethinkDB in this regard? The "fluent" native query language, ReQL, was one of its best parts.


Aww, thanks!


These guys are from magic stack, they wrote the definitive async python postgres library, asyncpg. Very high quality library.

Been keeping a close eye on Edge, had even considered it as a primary database, and probably will in the future!!

As much as I adore the ergonomics improvements I really am more interested in the performance, replication, scalability story, with the likes of cockroach db reaching maturity in 2022.

But as a postgres replacement in general, I would highly consider using edge.


they also wrote uvloop [0] which is fantastic and advances the cutting edge of what can be done with modern asyncio-based Python. I saw a ~3x improvement in the throughput of a microservice I wrote when I first tried it out years ago. currently at $dayjob we just use it by default in every Python service, whether or not we expect that service to be performance-critical.

it's as close as you can get to having actual magic sprinkles that make your code go faster.

0: https://github.com/MagicStack/uvloop


Been searching for something just like this for my next moonshot project. Very excited!!!


Thank you :)


I feel like a graph database is a solution to an issue I've faced (and continue to face), and it may just be because I haven't spun one up and tried, or because the documentation/examples don't stick out. But could someone confirm my feeling? If my feeling is correct, I'd enjoy verifying it with EdgeDB or the like.

My example/requirement: I have a user wanting to find best-matching blog posts. Every post is tagged with a given category. There could be 100+ categories in the blog system and a blog post could be tagged with any number of these system categories. A user wants to see all posts tagged with "angular", "nestjs", "cypress" and "nx". The resulting list should be sorted from best match to least relevant. So, posts that include all four tags should be up top, and as the user browses down the results, there are posts with fewer matching tags.

What I've seen with SQL looks expensive, especially if you search with more and more tags. I may just not know what to search for though, re. SQL. Is there a query against a graph database that could accomplish this?


EdgeDB employee here. I couldn't have asked for a better question to demonstrate the power of subqueries! Here's how I'd do this in EdgeQL:

  with tag_names := {"angular", "nestjs", "cypress", "nx"},
  select BlogPost {
    title,
    tag_names := .tags.name,
    match_count := count((select .tags filter .name in tag_names))
  }
  order by .match_count desc;

Which would give you a result like this:

  [
    {
      title: 'All the frameworks!',
      tag_names: ['angular', 'nestjs', 'cypress', 'nx'],
      match_count: 4,
    },
    {
      title: 'Nest + Cypress',
      tag_names: ['nestjs', 'cypress'],
      match_count: 2,
    },
    {
      title: 'NX is cool',
      tag_names: ['nx'],
      match_count: 1,
    },
  ];


Do I understand this correctly in that if the list goes on, it will also show posts with unrelated tags in tag_names? As the filter is only applied to match_count?

So to only get blog posts with matching tags we would need to add a filter "match_count > 0", right?

Update: I am very excited about EdgeDB :)


Yep, you understand correctly!

  with tag_names := {"angular", "nestjs", "cypress", "nx"},
  select BlogPost {
    title,
    tag_names := .tags.name,
    match_count := count((select .tags filter .name in tag_names))
  }
  filter .match_count > 0
  order by .match_count desc;


So as I read TheSpiciestDev's comment, he's complaining that making his query in PostgreSQL is slow. It looks like EdgeDB is a frontend to PostgreSQL; how will it help with TheSpiciestDev's problem?


The problem sounds like something that could be solved with a GIST index. EdgeDB doesn't yet have a way to specify the index type, though, mostly because we aren't sure what would be the best way to do it without things becoming too Postgres-specific in schemas.


What do you mean by "too Postgres-specific"? Will you be supporting other DBs behind the EdgeDB interface in the future?


This is not something we plan to do in the near future, but it’s also not outside the realm of possibility. We picked Postgres because of its power, quality and unparalleled extensibility, but we are also very careful to not leak any implementation details into our interfaces.


I'm curious how this squares up with what someone linked elsewhere: https://github.com/edgedb/edgedb/discussions/3403

> EdgeDB does not treat Postgres as a simple standard SQL store. The opposite is true. To realize the full potential of the graph-relational model and EdgeQL efficiently, we must squeeze every last bit of functionality out of PostgreSQL's implementation of SQL and its schema.

I don't see how this and what you're saying can both be true at the same time. Is EdgeDB tightly coupled to the implementation of PostgreSQL, or isn't it? Is there really a chance that EdgeDB could support other databases, or not really? I don't think there's anything wrong with the answers being "yes" and "no", respectively; that's actually what I'd expect. It would be more unusual to try to do this in an implementation-agnostic way.


I think what they mean by this is that EdgeDB's query language should not be coupled to Postgres, but EdgeDB itself should use Postgres specific SQL features to maximise performance - so you couldn't drop in MariaDB without changing code in EdgeDB, but in theory you could write another backend that takes the same queries and uses MariaDB or MongoDB or something custom under the hood.


Exactly, implementation coupling vs interface coupling.


I'm feeling very disappointed to see that you're using two space-separated words for "order by" in your language. Do "order" and "by" have meanings on their own in independence such that the meaning of "order by" arises naturally via their composition/conjunction? If not then surely it should be "orderby" in your language.

SQL definitely must be replaced, but the silly pseudo English syntax is one of the things we want to get rid of, not retain.


Do you have any timeline on when I can use this with Rust?

Found this: https://github.com/edgedb/edgedb-rust


  select posts.name as post, count(post_tags.id) as matches
  from posts, post_tags
  where post_tags.post = posts.id
    and post_tags.tag in ('angular', 'nestjs', 'cypress', 'nx')
  group by post_tags.post
  order by matches desc;

Test data @ http://pratyeka.org/hn.sqlite3


Is it though? Simple IN with an ORDER BY on the same match will return the correct ranking. More info on ranking here https://www.postgresql.org/docs/current/textsearch-controls....


Am I right in reading this as the parent comment envisioning a “post” table and a “tag” table, and you’re suggesting the “post” table just have a “tag” column?


I see just one table

> Every post is tagged with a given category. There could be 100+ categories in the blog system and a blog post could be tagged with any number of these system categories.

My point is, I don’t see the SQL query as expensive for this kind of use case. There are easy and native ways to do it.

If you want top-notch performance, Redis might be a way to do it. Even a reverse index would achieve great performance.


That's a standard many-to-many relationship that would normally be implemented by three tables:

    +---------+-------+---------+-------+-----------+
    | post_id | title | content | other | fields... |

    +--------+------+
    | tag_id | name |

    +------------+---------+--------+
    | tagging_id | post_id | tag_id |

But it seems like the core of the request is still something like:

    SELECT post_id, count(1) AS count
    FROM taggings
    WHERE tag_id IN (3, 8, 255)
    GROUP BY post_id
    ORDER BY count DESC

(off the top of my head; I haven't checked this for any kind of correctness)

And I don't see why that query suffers as you add tags...?
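
For anyone who wants to try the three-table version, here is a small runnable sketch using Python's built-in `sqlite3`. The table layout follows the diagram above, but the sample rows, the `rank_posts` helper, and the choice to filter by tag name (rather than by `tag_id`) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts    (post_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags     (tag_id  INTEGER PRIMARY KEY, name  TEXT UNIQUE);
    CREATE TABLE taggings (tagging_id INTEGER PRIMARY KEY,
                           post_id INTEGER REFERENCES posts(post_id),
                           tag_id  INTEGER REFERENCES tags(tag_id));
    INSERT INTO posts VALUES (1, 'All the frameworks!'), (2, 'Nest + Cypress'),
                             (3, 'NX is cool'), (4, 'Unrelated');
    INSERT INTO tags  VALUES (1, 'angular'), (2, 'nestjs'), (3, 'cypress'),
                             (4, 'nx'), (5, 'rust');
    INSERT INTO taggings (post_id, tag_id)
    VALUES (1,1),(1,2),(1,3),(1,4),(2,2),(2,3),(3,4),(4,5);
""")


def rank_posts(conn, tag_names):
    """Return (title, match_count) rows, best match first."""
    # One "?" placeholder per requested tag keeps the query parameterized.
    placeholders = ", ".join("?" for _ in tag_names)
    sql = f"""
        SELECT p.title, count(*) AS match_count
        FROM taggings t
        JOIN tags  g ON g.tag_id  = t.tag_id
        JOIN posts p ON p.post_id = t.post_id
        WHERE g.name IN ({placeholders})
        GROUP BY p.post_id
        ORDER BY match_count DESC, p.title
    """
    return conn.execute(sql, list(tag_names)).fetchall()


ranked = rank_posts(conn, ["angular", "nestjs", "cypress", "nx"])
```

Posts with no matching tag never appear, because the `WHERE` clause drops their tagging rows before grouping. Adding more tags only lengthens the `IN` list; the shape of the query stays the same.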

------------

EDIT responding to below [HN believes I am a problem user who should only be allowed to make so many comments per day]:

> that is pretty much what I meant by “I see just one table” as you don’t need any joins

Well, assuming you're doing this because a user is interacting with your site via some kind of web interface, you can set the interface up to deliver you tag_id values directly, but you'll still need to do a join with the posts table so you can present a list of posts back to the user instead of a list of internal post_id values.

So I guess

    SELECT t.post_id, count(1) AS count, p.title, p.url
    FROM taggings t JOIN posts p ON t.post_id = p.post_id
    ...


Confusing, but that is pretty much what I meant by “I see just one table”, as you don’t need any joins (at least with the same design you outline).


I do these often with standard GraphQL queries, often over Postgres. Now I’m curious about the performance difference compared to an SQL ORDER BY or a similar EdgeDB implementation!


Thanks for posting this. Kind of comment that adds value to the discussion by illustrating how a piece of tech can or cannot be useful.

I just happen to have a very similar requirement to yours and was also wondering.


Maybe I misunderstand what you are saying, but it sounds pretty straightforward in SQL.


I'm currently investigating whether Redis Bloom [1] could be a good tool for a similar requirement.

[1] https://github.com/RedisBloom/RedisBloom


What's a good way to develop a mental model about what's happening under the hood in EdgeDB?

With SQL, I have a mental model of how things work under the hood. For instance, I think of each table as being stored separately on disk, containing "rows". And the rows are really just equally-sized data blocks that are laid out back to back. B+ trees, with leaf nodes that point to (or just are) the rows, are used for indexes.

When I'm designing SQL schemas, I use this mental model to make guesses about performance. And when my queries are slow, I look at the execution plan.

My question is, how can I develop a similar intuition about EdgeDB? Under the hood, how are types and links stored in Postgres? And if I'm having performance issues, can I see an execution plan?


> What's a good way to develop a mental model about what's happening under the hood in EdgeDB?

At the physical schema level [0] or at the conceptual schema level [1]?

This answer from edgedb CTO might clear the latter up; https://news.ycombinator.com/item?id=30291538

As for the former, I guess it is the same as however Postgres (pg) chooses to represent the edge-db tables. EdgeDB (graph on pg) sounds like Timescale (timeseries on pg [2]).

[0] https://en.wikipedia.org/wiki/Physical_schema

[1] https://en.wikipedia.org/wiki/Conceptual_schema

[2] https://blog.timescale.com/blog/timescaledb-vs-influxdb-for-...


I'm wondering about the physical level—or at least how the EdgeDB conceptual level is translated to the Postgres conceptual level. The docs, and the comment you linked to, have helped me get pretty clear about the EdgeDB conceptual level.


I talked a bit about this in my release day talk (https://www.youtube.com/watch?v=WRZ3o-NsU_4&t=8151s), but:

* Every edgedb type has a postgres table

* "single" properties and links are stored as columns in that table (links as the uuid of the target)

* "multi" properties/links are stored as a link table

So it's basically just translated to a relational database in normal form
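
A rough sketch of what that mapping could look like, using Python's `sqlite3` as a stand-in for Postgres. This is not EdgeDB's actual generated DDL: the `Movie`/`Person` types, the table and column names, and the sample rows are all assumptions for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per object type, keyed by a generated id.
    CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE movie (
        id       TEXT PRIMARY KEY,
        title    TEXT,                       -- "single" property: a column
        director TEXT REFERENCES person(id)  -- "single" link: target id in a column
    );
    -- "multi" link: a separate link table of (source, target) pairs.
    CREATE TABLE movie_actors (
        source TEXT REFERENCES movie(id),
        target TEXT REFERENCES person(id),
        PRIMARY KEY (source, target)
    );
""")


def new_id():
    """Generate a uuid string to use as an object id."""
    return str(uuid.uuid4())


villeneuve, zendaya, chalamet, dune = new_id(), new_id(), new_id(), new_id()
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(villeneuve, "Denis Villeneuve"), (zendaya, "Zendaya"),
     (chalamet, "Timothée Chalamet")],
)
conn.execute("INSERT INTO movie VALUES (?, ?, ?)", (dune, "Dune", villeneuve))
conn.executemany("INSERT INTO movie_actors VALUES (?, ?)",
                 [(dune, zendaya), (dune, chalamet)])

# Following the single link is one join on a column of the movie table.
director = conn.execute(
    "SELECT p.name FROM movie m JOIN person p ON p.id = m.director "
    "WHERE m.id = ?", (dune,)
).fetchone()[0]

# Following the multi link goes through the link table.
actors = sorted(name for (name,) in conn.execute(
    "SELECT p.name FROM movie_actors ma JOIN person p ON p.id = ma.target "
    "WHERE ma.source = ?", (dune,)
))
```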


Thanks, this is helpful!

I think what I'm trying to understand is this: if I use EdgeDB in production, how often will I end up dropping down to the SQL level to debug things? If I'm trying to debug a slow query, can I do it at the EdgeDB level? Or will I have to open a PostgreSQL terminal, see how things are laid out there, run EXPLAINs, check the slow query log, and so on?

When I use ORMs, the answer to this is "pretty often". The ORM makes my application code cleaner, but I still need to have a complete understanding of the underlying SQL representation in order to ensure good performance and debug errors. I'm curious how that compares to using EdgeDB.


The goal is to never drop down to SQL. If there's something you can express in SQL but can't express in EdgeQL, it's a bug.

We're working on proper query profiling now, but in the meantime the "slow query" problem you see with ORMs happens quite rarely with EdgeQL. For starters, a lot of ORM performance issues are caused by the fact that they secretly do a bunch of roundtrips under the hood. EdgeQL queries always compile to a single SQL query. Also, since we target Postgres exclusively, we can produce queries that take full advantage of its (rather preposterous) power and performance. We extensively test the performance of things like extremely deep/wide fetching, lots of nested & complex filter expressions, computed properties, subqueries, polymorphics, etc. Obviously nothing is 100% but we're pretty confident in saying that EdgeQL performance is good.
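
The roundtrip point can be illustrated with a small sketch that counts queries. It uses Python's `sqlite3` rather than EdgeDB or a real ORM, and the `CountingConn` wrapper, the simplified one-to-many `actor` table, and the sample data are all hypothetical.

```python
import sqlite3


class CountingConn:
    """Wrap a connection and count how many queries are issued through it."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT,
                            movie_id INTEGER REFERENCES movie(id));
        INSERT INTO movie VALUES (1, 'A'), (2, 'B'), (3, 'C');
        INSERT INTO actor (name, movie_id)
        VALUES ('x', 1), ('y', 1), ('z', 2);
    """)
    return CountingConn(conn)


def fetch_n_plus_one(db):
    """One query for the movies, then one more per movie for its actors."""
    result = {}
    for movie_id, title in db.query("SELECT id, title FROM movie ORDER BY id"):
        rows = db.query(
            "SELECT name FROM actor WHERE movie_id = ? ORDER BY name", (movie_id,))
        result[title] = [name for (name,) in rows]
    return result


def fetch_single_query(db):
    """Fetch the same nested shape with a single joined query."""
    result = {}
    rows = db.query("""
        SELECT m.title, a.name
        FROM movie m LEFT JOIN actor a ON a.movie_id = m.id
        ORDER BY m.id, a.name
    """)
    for title, name in rows:
        result.setdefault(title, [])
        if name is not None:  # LEFT JOIN yields NULL for movies without actors
            result[title].append(name)
    return result


db_a, db_b = make_db(), make_db()
nested_a = fetch_n_plus_one(db_a)    # issues 1 + N queries
nested_b = fetch_single_query(db_b)  # issues 1 query
```

Both functions return the same nested data; only the number of roundtrips differs, and that number grows with the row count in the first version.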


> So it's basically just translated to a relational database in normal form

So... just an ORM ;)


In the sense that "object-relational mapping" is happening, then sure. The problem with the "ORM" term is that it comes loaded with a bunch of preconceptions that don't apply to EdgeDB. EdgeQL is a full query language with a standard library, grammar, and feature parity with SQL (almost). We've got a binary protocol. You use EdgeDB without ever needing to think about the layer beneath it—totally non-leaky. It's also not a library, which most people assume when you say "ORM".


Having worked on databases, I am not the one to discount the effort that has gone into building EdgeDB. I was but jokingly referring to this: https://www.edgedb.com/_images/_blog/40fc2a2a81483bb0979c96d...

Btw, if you folks have time, then EdgeDB should consider penning posts like the ones timescale has been doing for 3 years or so, in its march to industry leadership.


Could someone explain what a graph-relational database is? I'm not able to extract a technical definition from the paragraph below:

"What is a graph-relational database? EdgeDB is built on an extension of the relational data model that we call the graph-relational model. This model completely eliminates the object-relational impedance mismatch while retaining the solid basis of and performance of the classic relational model. This makes EdgeDB an ideal database for application development."


(EdgeDB CTO here)

In a classic relational model everything is a tuple containing scalar values. Graph-relational extends the relational data model in three ways:

- every relation always has a global immutable key independent of data (explicit autoincrement keys aren't needed)

- this enables us to add a "reference type", which is essentially a pointer to some other record (i.e. a foreign key)

- attributes can be set-valued, so you can have nested collections in queries and in your data model.

This is what lets us do `Movie.actors.name` instead of a bunch of `JOINs`, because `actors` is declared as a set-valued reference type in the `Movie` relation.
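
As an informal illustration of those three extensions, here is a toy in-memory model in Python. It is not how EdgeDB is implemented: the `Person` and `Movie` classes and the `path` helper are hypothetical, and the helper deduplicates every step, which may not match EdgeQL's exact semantics for scalar values.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    name: str
    # Identity that is independent of the data fields.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass(frozen=True)
class Movie:
    title: str
    # Set-valued reference: a set of pointers to Person objects.
    actors: frozenset = frozenset()
    id: uuid.UUID = field(default_factory=uuid.uuid4)


def path(objects, *steps):
    """Follow attribute steps from a set of objects, flattening sets at each hop."""
    current = set(objects)
    for step in steps:
        following = set()
        for obj in current:
            value = getattr(obj, step)
            if isinstance(value, (set, frozenset)):
                following |= value  # set-valued attribute: merge its members
            else:
                following.add(value)  # single-valued attribute
        current = following
    return current


zendaya = Person("Zendaya")
chalamet = Person("Timothée Chalamet")
movies = {
    Movie("Dune", frozenset({zendaya, chalamet})),
    Movie("Spider-Man: Homecoming", frozenset({zendaya})),
}

# Rough analogue of the path expression Movie.actors.name
actor_names = path(movies, "actors", "name")
```

Two `Person` objects with the same name still get different ids, which is the "key independent of data" point from the first bullet.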


That's awesome. I think you've hit the nail on the head by trying to fix the SQL part of relational databases and not the relational part. It's been a pet peeve of mine for ages that relational databases have been described as inadequate for modelling relationships and graph databases have been described as the solution. You CAN'T fix the problem just by going from n-ary to binary relationships.

How deeply is EdgeDB integrated into PostgreSQL? Any chance it could be used to query other databases eventually?


put this straight onto your marketing page please!


Will do.


Hi! Thanks for taking the time to engage on HN.

I have a couple of questions around this.

Firstly, what happens to the performance when I have a sizeable resultset of set-valued data?

I've seen similar ideas implemented in the past that look fine for the Movies and Actors or Books and Authors examples but fall apart badly when you query a number of fields (20+) that have sets within them, which can happen on say, a sizeable reference database of marketing information.

Another question: How deep in the graph can I go, and how much circular reference protection is there? E.g. if I query Movie.actors.movies.actors?

I'm interested in graph databases and data modeling and while it offers some convenience I'm always skeptical but hopeful (mostly from having lost a lot of hours) that these problems have been solved sufficiently to keep performance good in practical use cases.


EdgeDB is graph-relational, not a pure graph database, and so the performance characteristics of traversing links are that of a relational JOIN. Which, of course, depends wholly on the size of each relation being joined. So, if you want to select the list of actors for _every_ movie in your database and there are lots of movies, it'd be a pretty expensive operation. If, on the other hand, you want to select some relationships on a handful of objects (or even just one), then it doesn't really matter that much how deep your link traversal is, because all of the steps would be fast index scans.

> How deep in the graph can I go,

As much as you want, though the path must be explicit, EdgeQL currently doesn't have any way to say "traverse link foo recursively".


I’ve always wished for MongoDB to have a “deepFind”, so that when I fetch a document, it will fetch the nested relations also instead of doing an aggregation to do the lookup. Feel like if their objectID only included a collection name reference then somehow it should be possible. Perhaps a depth parameter would be useful for more relational data. Congrats on the milestone! Will definitely have a look at EdgeDB.


MongoDB $graphLookup might do what you want.

From the docs:

Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.

https://docs.mongodb.com/manual/reference/operator/aggregati...


There used to be a DBLink data type containing both collection and id. But without any additional features using it.

It was considered bad practice and eventually got deprecated. Since 99% of the time the collection link in a given attribute is fixed and known in advance so it’s just duplicate information.


This is a nice example. How is the data stored physically? Does the model work for large datasets, and when could it break down? What are optimal workloads? Do we still need to fiddle with indexing and such?


Data is stored relationally in Postgres in 3NF. References are indexed automatically, but you still need to index type properties if you use them in `filter`.
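
To show why an index on a filtered property matters at the relational level, here is a small sketch using Python's `sqlite3` and `EXPLAIN QUERY PLAN`. SQLite stands in for Postgres here, EdgeDB's own index declaration syntax is not shown, and the `movie` table, `idx_movie_title` index, and `plan` helper are made-up names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movie (title, year) VALUES (?, ?)",
    [(f"Movie {i}", 1950 + i % 70) for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return SQLite's query-plan description lines for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column holds the readable detail


query = "SELECT id FROM movie WHERE title = ?"

# Without an index on title, the planner has to scan the table.
before = plan(conn, query, ("Movie 42",))

conn.execute("CREATE INDEX idx_movie_title ON movie (title)")

# With the index in place, the plan refers to idx_movie_title instead.
after = plan(conn, query, ("Movie 42",))
```

The exact wording of the plan lines varies between SQLite versions, so compare `before` and `after` by eye rather than relying on a fixed string.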


Wait, it stores the data in Postgres? So this is essentially a data model on Postgres?

FWIW, I do think there's a space in the market for a thin wrapper over Postgres (or MySQL) which would automate certain optimisations such as whether to index a particular table. It always struck me as perverse that that optimisation was delegated to the developer, when it's no more subjective or application-specific than a thousand other automated optimisations the engine makes. I'd be really interested if your project covered that.


It's built on Postgres, but it isn't a _thin_ wrapper. We lean hard into Postgres query machinery and type system in order to pull off EdgeQL and graph-relational efficiently.


Ah, OK, interesting! I don't have an immediate use case personally, but I wish you guys the very best. Honestly, database space needs way more competition than it has at present.

There are countless permutations of the choices that database designers face, so it's a shame there aren't mature products for more of them. I hope this particular permutation turns out to be a good one for lots of people :)


Thank you!


argh - the database space*


Does this mean that at the end it submits SQL queries to postgres? Or is the integration deeper?


We compile EdgeQL queries into SQL currently, because it makes the architecture simpler and lets us run on unmodified Postgres, but conceptually nothing stops us from targeting the query planner directly via an extension or an alternative frontend that consumes EdgeQL IR directly.


Ah, this is interesting: so Postgres is effectively a 'backend' for you, in much the same way that e.g. InnoDB is a backend for MySQL[0]?

And you - or hypothetically the end user - could change the backend, e.g. to Cockroach for better horizontal scalability, while trusting that EdgeDB will only rely on Postgres's public API at least in meeting its own public API/contract?

[0] It's hard to make that analogy with Postgres b/c it only has one storage engine, but of course the separation still exists.


That’s exactly right.


So it's basically an ORM over Postgres (and only Postgres)?


We're working on a more comprehensive explanation of why EdgeDB isn't an ORM. Does EdgeDB do "object-relational mapping" under the hood — absolutely. The reason we try to distance ourselves from the category of ORMs is that the term "ORM" comes with a big bag of preconceptions that don't apply here.

EdgeDB has:

- Full schema model with indexing, constraints, defaults, computed properties, stored procedures

- A query language that replaces SQL. If there's something you can do in SQL that isn't possible in EdgeQL, it's a bug.

- The query language is backed by a full type system, grammar, set of functions and operators, etc.

- A set of drivers for different languages that implement our binary protocol.

By any definition, EdgeDB is a database. It's a new abstraction built on a lower-level abstraction: Postgres's query engine. Both abstractions indubitably fit any reasonable definition of "database".

Basically: just because there's a declarative object-oriented schema doesn't mean this "is just an ORM" (unless your definition is quite pedantic).


How exactly is EdgeDB run? Is it a separate process from Postgres, or some kind of plugin? Can I run it over an existing Postgres instance?

If I build a DB Schema in EdgeDB, can I interact with the underlying Postgres instance using regular SQL?


It runs as a separate (stateless) process between the client and the PostgreSQL server. There was a talk about the details of the architecture on the live stream today: https://youtu.be/WRZ3o-NsU_4?t=5294


This sounds very similar to Hasura, which compiles graphQL down to SQL. Have you considered adding the subscription feature like they have?


It's something we want to do at some point, but unlike Hasura, which operates on GraphQL which is conceptually much simpler and limited, EdgeQL would be much harder to fit onto a "pipelined polling" model that Hasura utilizes to implement subs.


So, there is basically no way to work with Postgres directly, or to install EdgeDB on top of an existing Postgres installation and its data, right? Feels like Postgres is a prisoner of EdgeDB :)


If you look at the code, they used to have --postgres-dsn URL, now they changed it to --backend-dsn. If you go beyond the marketing, edgedb is a postgres connection pool+orm combined together with a query language (it’s a huge piece of work in itself and commendable, given it takes away many dependencies and provides a consistent developer experience). It’s leveraging the postgresql server, which is the true database providing ACID compliance.

Edgedb is more akin to hasura than to a database in traditional sense.


Hi there! Yes, I do understand that EdgeDB is "a postgres connection pool+orm combined together" and the work the guys did is incredible. And I do believe it will find its niche. But for me, personally, the idea of "postgres is now a black box for you, don't even try to go deeper into it" stops me from digging into EdgeDB right away (the tooling is amazing! migrations!) and even trying it in production for our enterprise clients. Probably, with time, when edgeql can fully replace sql and plpgsql, we could say that EdgeDB is ready for everything. BTW, @RedCrowbar, how can I call a udf or procedure (written in sql/plpgsql) from edgeql? I didn't find it in the docs.


There is no way to call UDF SQL (or PL/pgSQL) functions from EdgeDB, because there is no way to _define_ or manage them. The only place that is allowed to do this is the standard library (and, in the near future, extensions).

We realize that the database must be extensible and flexible, so non-EdgeQL UDF will become a reality (and if things work out the way we hope they will, they'll be amazing and far beyond what you can do with plpgsql).


Re "just an ORM", it seems like the word "mapper" is the hint that it isn't, since there's no "mapping" from the object-oriented view to the big-pile-of-scalars in a traditional relational db.


It's the first time I hear about graph-relational DBs. I remember back in college learning about graph databases, but since I never touched one I don't remember much TBH.

Is a graph-relational database something completely disjointed from a graph database? Or do they share some performance improvements to some use cases? Also does EdgeDB keep the advantages of a true graph database even being based on Postgres?


> It's the first time I hear about graph-relational DBs.

This is unsurprising, because we just invented the term :-)

> Is a graph-relational database something completely disjointed from a graph database?

Graph-relational is still relational, i.e. it's a relational model with extensions that make modeling and querying graph-like data easier. And in apps everything is graph-like (hence GraphQL etc). An important point is that graph-relational, like relational is storage-agnostic, i.e. it makes no assumptions on how data is actually arranged on disk.

Pure graph databases, on the other hand, encode the assumption that data is actually _physically_ organized as a graph into their model and query languages.

I guess the word "graph" is simply too overloaded in computing.


There’s at least one German-Bulgarian company called Plan-Vision that implemented such a graph-relational approach like 15 years ago. Their VSQL similarly works on the E/R conceptual level and gets translated (or compiled) to the underlying PostgreSQL or Oracle. You also get a neat EcmaScript-like language that works with the collections in a graph-like manner.

Long before Arango, Orient etc.

The company is absolutely nowhere near to you guys in terms of marketing, but their thing works with more than 40 enterprise clients so far.

So you definitely did not ‘just’ invent the concept. A lot of companies approach the problem one way or another…


He said they invented the term, not the concept. I don't know if that is accurate either, but your misquote makes for a huge difference.


>Pure graph databases, on the other hand, encode the assumption that data is actually _physically_ organized as a graph into their model and query languages.

What would be the benefit / disadvantage in each case?


The live presentation is happening right now: https://www.youtube.com/watch?v=WRZ3o-NsU_4

Very interested in what data engineers think about this project!

I am not a developer, but the founders (including Yury Selivanov, Python Core Developer, see also https://github.com/MagicStack) and the fact that these people have been investing in the project for four years already, make me think that EdgeDB can be an important project for the database world!


Sounds pretty cool!

Questions:

1. What is the story for replication currently? Can I use EdgeDB with Postgres replication tools like Stolon or Patroni, running EdgeDB against the proxy they expose?

Or does EdgeDB plan/need to have its own replication?

Googling this, I found this previous HN post (https://news.ycombinator.com/item?id=19640689) saying:

> Tooling for that will be coming in the next few alpha releases.

2. "A builtin migration system that can reason and diff schemas automatically or interactively"

How do you deal with the fact that Postgres does not offer transactional DDL (e.g. ALTER TABLE)?

In our Postgres, we had to use advisory locks around migrations to avoid concurrent schema changes invoked by concurrently starting servers which run migrations.
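
A minimal sketch of the advisory-lock pattern described above, in Python. It assumes a DB-API cursor that uses the `%s` parameter style (as psycopg does); `advisory_lock_key`, `migration_lock` and the lock name are hypothetical names for illustration. `pg_advisory_lock` and `pg_advisory_unlock` are the actual Postgres functions and take a 64-bit integer key.

```python
import hashlib
from contextlib import contextmanager


def advisory_lock_key(name: str) -> int:
    """Derive a stable signed 64-bit key from a lock name (hypothetical helper)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


@contextmanager
def migration_lock(cursor, name: str = "schema-migrations"):
    """Hold a session-level Postgres advisory lock while the body runs."""
    key = advisory_lock_key(name)
    # Blocks until no other session holds a lock on the same key.
    cursor.execute("SELECT pg_advisory_lock(%s)", (key,))
    try:
        yield key
    finally:
        # Always release, even if the migration raised.
        cursor.execute("SELECT pg_advisory_unlock(%s)", (key,))
```

Each server would wrap its migration run in `with migration_lock(cursor): ...`, so only one of the concurrently starting servers changes the schema at a time.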


Won't the functional variant read better than the SQL dialect?

  movie_reviews
   .filter(_.actor.name.lowercase() == "Zendaya")
   .groupBy(_.title, _.credit_order, avg(_.ratings))
   .sortBy(_.credit_order)
   .take(5)

vs.

  select
    Movie {
      title,
      rating := math::mean(.ratings.score),
      actors: {
        name
      } order by @credits_order
        limit 5,
    }
  filter
    "Zendaya" in .actors.name


It's a matter of taste. We decided to do a "looks like SQL + GraphQL" style because that's what most people are familiar with when they think about a query language.

That said, the functional variant is a likely way to represent EdgeQL in programming languages.


Keep in mind that EdgeQL is a query language that can be used from any programming language (either over the official libraries, or over HTTP). Functional JS-inspired dialects aren't appealing to everybody.


I like the flow of yours, but it doesn't capture the GraphQL piece. Where do you specify the nested limit and desired fields for the inner "actors"?


Your query returns 5 rows total. The EdgeDB query returns all movies that Zendaya is in [1] and, along with each movie, the first 5 credited actors.

Also, it looks like your version also does implicit joins (like EdgeDB), but I'm not sure how they would work in that style.

[1] Or maybe only movies where Zendaya is in the first 5 credited actors; I'm not sure.
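
To make the "5 rows total" vs. "5 actors per movie" difference concrete, here is a small sketch in plain Python. The sample rows and the helper names `global_limit` and `nested_limit` are made up for illustration; neither is how EdgeDB evaluates anything.

```python
from collections import defaultdict

# Hypothetical sample data: (movie title, actor name, credits order).
CREDITS = [
    ("Movie A", "Ann", 1), ("Movie A", "Bob", 2), ("Movie A", "Cy", 3),
    ("Movie B", "Dee", 1), ("Movie B", "Eve", 2), ("Movie B", "Fay", 3),
]


def global_limit(rows, n):
    """Flat pipeline: sort all rows, then keep n rows in total."""
    return sorted(rows, key=lambda row: row[2])[:n]


def nested_limit(rows, n):
    """Nested shape: keep every movie, each with its first n credited actors."""
    by_movie = defaultdict(list)
    for title, actor, order in rows:
        by_movie[title].append((order, actor))
    return {
        title: [actor for _, actor in sorted(cast)[:n]]
        for title, cast in by_movie.items()
    }


flat = global_limit(CREDITS, 2)    # 2 rows in total, across all movies
nested = nested_limit(CREDITS, 2)  # 2 actors for each of the 2 movies
```

With a limit of 2, the flat version yields two rows overall, while the nested version keeps both movies and trims each cast list separately.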


This is so exciting. We can definitely do better than SQL in 2022 and EdgeDB is a step in the right direction. Been using Prisma[0] in production in the past year and a half (which takes the same approach as EdgeDB but currently works for TypeScript and Golang) and I'm so happy. [0]: https://www.prisma.io/


Prisma did a lot of things right, and we have a ton of respect for the team over there! At the end of the day, building a good, idiomatic API for doing CRUD operations is a hard problem. But building an entirely new query language that fundamentally solves some underlying design flaws and usability issues with SQL is a whole other level of hard.

With EdgeQL, we conclusively solved a lot of these fundamental issues. Now, we're able to use EdgeQL as the foundation of our query builders, which is a huge advantage. We built the first version of the TypeScript QB in ~4 months, and it immediately leapfrogs all the major ORMs in power & expressiveness. But that's only possible because the hard work of designing EdgeQL was already done.

We'll be working on communicating how our query builder works and how it compares to ORMs in some upcoming posts, stay tuned.


Prisma + Typescript is such a productivity boon, auto completing all query options and database fields. And the results fully typed.


EdgeDB co-founder Yury here. Ask me anything :)

Live launch stream: https://www.youtube.com/watch?v=WRZ3o-NsU_4


I have a few questions I couldn't find answers to after skimming the documentation.

Can someone use Postgres extensions with EdgeDB, like TimescaleDB, PostGIS, Zombodb and Postgres_fdw?

How are sum types (called enums in Rust) modelled in EdgeDB? (like Rust's Result). Do I need to define it with inheritance, where each variant inherits from it? What about adding specific syntax for sum types?

edit: also, I see there's a WIP custom #[derive] for Rust in https://github.com/edgedb/edgedb-rust - can it store Rust types on EdgeDB? Something like http://diesel.rs/ or even https://lib.rs/crates/turbosql


Regarding extensions: we plan to wrap the most popular ones like PostGIS to give them idiomatic EdgeQL flavor. We need to do a bit of work to make that the right way.

Sum types can be modeled with inheritance. You can create an abstract base type and use that as target for the links. You can then derive types from it and write polymorphic EdgeQL queries to select/match data.
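
As an analogy only (this is Python, not EdgeDB schema syntax), the "abstract base type plus derived types" shape looks roughly like this. `Result`, `Ok`, `Err` and `describe` are hypothetical names mirroring the Rust example from the question.

```python
from abc import ABC
from dataclasses import dataclass


class Result(ABC):
    """Abstract base type: the thing a link would point at."""


@dataclass
class Ok(Result):
    value: int


@dataclass
class Err(Result):
    message: str


def describe(result: Result) -> str:
    """Polymorphic handling: branch on the concrete variant."""
    if isinstance(result, Ok):
        return f"ok: {result.value}"
    if isinstance(result, Err):
        return f"error: {result.message}"
    raise TypeError(f"unknown variant: {type(result).__name__}")
```

The trade-off versus a true sum type is that the set of variants is open: nothing stops a third subtype from being added later, so exhaustiveness is not checked for you.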

The Rust client is still a work in progress, not really open to tinkering unless you want to experiment.


Hey Yury, Cool stuff!

How do you folks solve the large amount of joins that are the result of graph queries? Any worst-case-optimal multi-way-join secret sauce :D?

Also, with DBs like Datomic competing in the same area, do you have an immutability/versioning story?


Thanks! The secret sauce is to pack nested shape queries into array_agg-ed SQL subquery. So we never select unnecessarily wide rows.
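
A rough illustration of why aggregating nested data in a subquery avoids wide rows, using Python's built-in `sqlite3`. SQLite has no `array_agg`, so `group_concat` stands in for it here; the tables and data are made up, and this is not the SQL that EdgeDB generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE actor (movie_id INTEGER, name TEXT);
    INSERT INTO movie VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO actor VALUES (1, 'Ann'), (1, 'Bob'), (1, 'Cy'), (2, 'Dee');
""")

# Plain join: the movie columns are repeated once per actor.
wide_rows = conn.execute(
    "SELECT m.title, a.name FROM movie m JOIN actor a ON a.movie_id = m.id"
).fetchall()

# Aggregated subquery: one row per movie, nested names packed into one value.
packed_rows = conn.execute("""
    SELECT m.title,
           (SELECT group_concat(a.name, ',')
              FROM actor a
             WHERE a.movie_id = m.id)
      FROM movie m
     ORDER BY m.id
""").fetchall()
```

Here the join returns four rows for two movies, while the aggregated form returns exactly two, and the gap grows with every additional nested level.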


Sounds like this is not dissimilar from the GraphQL-to-SQL compiler in Hasura, which also brings out surprising performance for wildly nested frontend queries.


Yeah, indeed, Hasura uses json_agg, which is similar but returns you a JSON string.

We use array_agg, so we often avoid data serialization altogether. Our binary protocol just lets the data messages pass through with the original binary encoding. And because we fully control the schema, we can make all sorts of interesting optimizations, like implementing high-perf data codecs on the client side to unpack data fast.


Yeah, that‘s an appealing idea, if there never even has to be any json.

Hasura has the frontend safe API and strong authz going for it. Is that something you might also do, or are you focused on serving the backend? End-user row and column level authz gives me a lot of peace of mind when writing bigger queries.


How do we migrate from existing Postgres database to EdgeDB? Can traditional app still access all the relational as before and new app access via EdgeDB? We're making use of existing environment and extending it with our own frontend. I would like to understand how I can mix legacy and new code.


We plan to introduce a tool for assisting migration from SQL databases, but at this very moment, the only way to migrate an existing app is to re-create your DB schema in EdgeDB and then write a script to import the data. EdgeDB right now is definitely more geared towards starting new projects with it.


Maybe you could move all data to be managed by EdgeDB, then create a separate "legacy" schema that just contains views & triggers that forward queries and mutations to the EdgeDB database.


I looked through the documentation and all I could find was how to spin this up locally. How do I run this on the cloud?

Do you offer fully managed service to host this or do I need to spin up some compute instances of my own ?


Found it: https://www.edgedb.com/docs/guides/deployment/index

Can't wait for edgeDB cloud!


What advantages/disadvantages does EdgeDB have over Dgraph.io?


Unfortunately DGraph is having some serious troubles as a company: https://discuss.dgraph.io/t/quitting-dgraph-labs/16702

Technically speaking though, using GraphQL as a data model is quite limiting: just basic scalar types and a handful of custom types implemented by DGraph (Polygon, etc). By contrast EdgeDB schemas support a much wider range of granular types, constraints, computed properties, link properties, etc.

Perhaps more importantly, EdgeQL is a full query language with composable syntax, a standard library of functions and operators, for loops, primitive literal manipulations (e.g. string indexing and slicing), the ability to cast values to different types, subqueries, etc. Basically it's a complete query language; by contrast, GraphQL is closer to an ORM in that it really only supports CRUD.

I think DQL is marginally more powerful though admittedly I'm not very familiar with it: https://dgraph.io/docs/get-started/


do you have plans to integrate with prisma? it's almost a DIRECT translation of how the query is expressed in Prisma compared to edgedb. Googling quickly, it seems there is a ticket already. I wonder if this could be a low hanging fruit and a great growth channel since prisma is very popular.

https://github.com/prisma/prisma/issues/10210


Not really, there's no point in using Prisma with EdgeDB as our own query builder is more idiomatic for our product, faster, and way more capable.

See also this reply by Colin: https://news.ycombinator.com/item?id=30293544


Is there an eta or public wip of a python query builder?


Our topmost priority now is launching a hosted version of the product, all hands on deck. Having a proper (and potentially fully type-safe) query builder for Python is literally the next to-do.


Dumb question, how do you exit the shell? I feel like I've scoured the docs and can't find it.


  \q or \quit
If you get stuck, run \help for a list of commands.


What colin said, or (in every terminal emulator I've used), ctrl-d to send end of file, which will close it.


That's interesting, I've always thought ctrl-c would close things. I tried so many things like "quit" "exit" ctrl-c "help"


Ctrl-c cancels running processes, but not shells. When a process becomes a shell I cannot say, but if you make an infinite loop in the Python REPL you can Ctrl-C that loop.

  $ python
  >>> while True:
  ...   pass
  ... 
  ^CTraceback (most recent call last):
    File "<stdin>", line 1, in <module>
  KeyboardInterrupt


Congrats Sully and thanks for your work on mypy!


Do you support columnar storage? i.e. to make analytics queries fast.


Not yet, but theoretically we might support something like Timescale in the future.


Any plans to have a gui?


Yes, it's in the works!


Hell yeah. This is what I've wanted ever since I first learned about Apple's CoreData.

This seems great. I have a highly interrelated data model (think parsed natural language text with annotations), and writing SQL to keep all of that data aligned and synced is a pain with an ORM.

Am I right in imagining this works similarly to how Apple's CoreData does? It lets you build objects linked to each other but handles all of the joining and syncing of the data for you to keep the object model in mind.

In the same vein will you be creating a Swift client?


I'm not too familiar with the CoreData API, but from a quick googling it seems like this is some sort of an ORM on top of SQLite.

We position EdgeDB as a database server, not a library, partly because you can interact with it from different programming languages. But we design our client library with focus on API composability, check out our edgedb-js library for example: https://www.edgedb.com/docs/clients/01_js/index

As for Swift, it would be great to have it one day. I have a counter question: how many of you use Swift to write server-side logic?


Right, but conceptually the way you interact with it is the same. I'm not trying to belittle what you've done. I'm in fact very excited about it.

And EdgeDB is very needed because it's just an Apple library currently.

Conceptually, my impression is that EdgeDb is relational tables (highly-typed) queried, combined, and modeled as nodes in a graph/tree. Is that conceptually correct?

I don't, but I would like to if I could deploy Swift code to Azure. I love Swift as a language.


That's correct! Though that's true for most ORMs as well, and we try not to get lumped into the ORM bucket...too much bad blood.

I've dabbled with CoreData myself and it's a phenomenal API. Apple comes up with a lot of great stuff. In fact, given how much server-side Swift there is these days, we'll have to look into building a Swift client library at some point.


I completely understand, but your product and CoreData go so much further than what ORMs can. It's fulfilling the promise of ORMs and relational databases that's only existed in people's minds up until recently.

I hope EdgeDB is immediately recognized for the value it will bring to developer productivity everywhere!

Now we just need an SQLite version of EdgeQL (perhaps EQLite) for local data and it can completely replace SQL in my life.


Would also love to try this in Swift!


Swift has a family of Docker containers and an (admittedly modest) ecosystem of backend frameworks. You can definitely run it on Azure :)


I know you can, but I'm talking about it being a 1st class citizen.


Why position it as a database server if you could position it as something else, like an interface server? Just curious.


> This is why we wrote (and will continue writing) full-featured first-party database client implementations for common programming languages (currently available for Python, JavaScript/TypeScript/Deno, and Go).

Why no C API? Pretty much every language has a C FFI, PostgreSQL's libpq is in C too, and then it's easy to make bindings from other languages. But with a few individual implementations in other languages it's not as easy to use from more languages. It always makes me wonder about these decisions when I see projects heading that way.

Unrelated to the subject, but the text on the website is #B3B3B3 on #FFF, which is hard to read as is, and violates WCAG recommendations. When overwriting it with a custom CSS, it's visible that headings have weird margins, covering parts of text before them.
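
For anyone who wants to check the contrast claim, here is a small Python sketch of the WCAG 2.x contrast-ratio formula (relative luminance of each color, then (L1 + 0.05) / (L2 + 0.05)). The function names are my own. By this formula #B3B3B3 on white comes out at roughly 2.1:1, well under the 4.5:1 that WCAG AA asks for with normal-size text.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """WCAG relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, always computed as lighter over darker."""
    lum = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lum[0] + 0.05) / (lum[1] + 0.05)


ratio = contrast_ratio("#B3B3B3", "#FFFFFF")
meets_aa_normal_text = ratio >= 4.5
```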

While looking for more, I noticed that the website suggests to `curl | sh` things in the introduction, which is quite awkward too (and a subject of light flamewars), adding another barrier.

The brief project description made me wonder how well it abstracts out PostgreSQL (and how to work with the databases it creates via PostgreSQL itself, if that's possible at all, and how to debug it when things go wrong), but after a brief skim there are no PostgreSQL bits in sight: just a special shell, a dedicated language, its own drivers. Which is a bit scary.

Neither have I found a description of its graph-relational model, how it's built on top of PostgreSQL, how one can be sure that it'll work more or less smoothly, how things like profiling are done (or is it designed to never need explicit profiling/optimization?), is there more to it than PostgREST-like interface abstracting out the SQL bits for DDL too.

Looks like an interesting project overall though.


> We should not continue wasting our productivity with a database API architecture from the last century.

Sorry, but dissing proven technology like this just makes me vomit. Show me _how_ you can beat SQL in performance and features on the frontpage, or I'm just happy to go along with what I already have.


What a terrible attitude to have, that's truly the worst possible interpretation.

How's that dissing anything? They are saying they have a good abstraction that will make you more productive. While it might not be, this is literally what makes software the amazing tool it is, building layers of abstraction that make you more productive.

You seem to have gotten it all wrong, how could building something on top of SQL be dissing SQL? It's as much of a diss as C is a diss to assembly and so on.

> Show me _how_ you can beat SQL in performance

It is literally a postgres instance underneath, if edgedb is malleable enough, you don't need to beat anything, you'll have the query you want (sans edge cases and some minor overheads).

> I'm just happy to go along with what I already have

Go ahead bud, I'm pretty sure it won't be legally enforced to use any time soon so you're good to go.

The fact that these low-effort comments get upvoted on HN drives me mad; a contrarian attitude alone will get you points no matter the depth or effort. You don't seem to have even read the landing page or tried it (maybe you did, but your comment doesn't reflect that). This is the product of 4 years of effort by some devs who are trying to innovate; dissing it without even seeming to fully understand it makes me want to vomit.


It's very frustrating that whenever a team is showing something new they've built to HN, so many of the top comments are focused on some incredibly nitpicky, personal issue that somehow negates everything else about the project.

It's an amazingly lazy and inconsiderate way to respond to people who've put a bunch of effort into building something.


It's at least partially a programmer thing, basically bikeshedding. Just imagine doing a code review with some of these commenters!


Agreed ! Usually when you look at these "haters" there is nothing of value in their "submission" sections under their profiles, which usually tells me exactly how to "read" the comment(insult).


>> Show me _how_ you can beat SQL in performance

> It is literally a postgres instance underneath,

Exactly. So. Why? Someone created a database abstraction? God damn, that has almost never happened before! ;)


Strong reaction. In general we shouldn't discourage exploration and moving the needle forward. I actually agree with you that the language is somewhat stronger and the onus is on them to live up to it, but at the same time there is a lot of competition in this space (newer DBs) so nothing wrong with setting the bar high enough. :) They are specific with their use case (APIs), so that's excellent, as the core itself will not relinquish all the learnings of the "last century". ;)


> Show me _how_ you can beat SQL

Scroll down. They do exactly that.


Yeah, we also blogged about this extensively.

Here are some links:

Pointed critique at SQL: [1]

Benchmarks: [2] and [3]

We'll be adding a dedicated benchmarks page to our website soon.

[1] https://www.edgedb.com/blog/we-can-do-better-than-sql

[2] https://www.edgedb.com/blog/edgedb-1-0-alpha-1

[3] https://www.edgedb.com/blog/edgedb-1-0-alpha-2


In [3] it says "we are assessing the code complexity and performance of a simple IMDb-like website built with Loopback, TypeORM, Sequelize, Prisma, Hasura, Postgraphile, raw SQL, and EdgeDB" but then it goes on to only explain the results of the classic ORMs but not hasura, postgraphile and prisma. Are the full results available somewhere?

That classic ORMs are kinda slow is probably not a surprise to anyone, the others which either get to compile the full query or have a hand in controlling the schema are more interesting.

They also seem more similar to your product, running as a server and managing the schema, so most worth comparing.

Edit: The fact that "Raw SQL" ends up being a suboptimal query because of the node driver limitations, which then gives you "Way faster than even raw SQL!!!" graphs also leaves a weird taste. I guess if you are comparing programming language level solutions fair enough.


We'll post some benchmarks against Hasura et al soon.

"Faster than SQL" is, of course, relative and depends on "what SQL"? EdgeQL compiles into a single query that uses PostgreSQL-specific features. This is a guarantee. No matter how large or complex your query is, if it compiles, it compiles into a single SQL query. Manually written or ORM-generated SQL tends to be "multi-query" due to the whole "standard SQL composes badly" story. And this matters, because if the roundtrip network latency between the client and the server is 10ms, EdgeQL will get you a response in ~10ms, whereas a multi-query approach will take (~10ms X <number-of-queries>) even if every individual query is super-quick to compute.


Would love to see benchmarks! :)

But just for clarity in the discussion thread here, Hasura also compiles to a single query when only Postgres is being hit and I'd expect performance to be quite similar...

Of course, if the GraphQL query requires federation across multiple Postgres databases or multiple databases or databases + other GraphQL / REST APIs, then Hasura breaks them up into multiple queries with a worst-case performance of a data-loader type setup.
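
The round-trip arithmetic behind the single-query vs. multi-query point in this subthread, as a tiny Python sketch. It assumes the queries are issued one after another (no pipelining or parallelism); the 10 ms figure is the example latency from upthread, and `total_latency_ms` is a made-up helper.

```python
def total_latency_ms(round_trip_ms: float, num_queries: int,
                     compute_ms: float = 0.0) -> float:
    """Wall-clock time when queries are issued strictly one after another."""
    return num_queries * (round_trip_ms + compute_ms)


single_query = total_latency_ms(10, 1)   # one compiled query, one round trip
five_queries = total_latency_ms(10, 5)   # five sequential round trips
```

Even with zero compute time per query, the sequential version pays the network latency once per query, which is why collapsing to one query matters more as the client gets farther from the database.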


Yeah, compiling a data access DSL down to SQL, potentially through some intermediate AST, is not as novel as it's being made to sound around here. I feel like I'm being gaslighted.


>> Show me _how_ you can beat SQL

> Scroll down. They do exactly that.

Where?


Makes you want to vomit? That's a bit over the top.


Not want to vomit, actually vomit. Apparently.


This seems kind of cool, but honestly the benefits described don't really solve actual problems I've run into in development. The overall article reads like marketing talk and doesn't actually describe concretely why it's better than existing solutions. For something as core as a database I would expect more rigorous descriptions and benchmarks.

> We shall do better than SQL

The EdgeQL language looks cool, and I'm sure querying via a graph structure makes certain problems easier in some use cases. However as much as people have complained about SQL, it's just so ubiquitous there needs to be a very good reason to switch away from it. Not having to write joins isn't really a good enough reason, in my opinion.

> The true source of truth

I'm not sure why this means EdgeDB is better. Tons of applications use a traditional or cloud SQL database as the source of truth right now. This section seems to imply that with microservices you no longer have a single source of truth. But if they're trying to say a microservice system should instead use a single common database, that breaks separation of concerns and moves us into an annoying situation where you have a bunch of services communicating via a shared database.

> Not just a database server

It sounds like they have a solid client, which is awesome.

> Cloud-ready database APIs

> The vast scale of modern application deployments requires that inelastic computing resources are managed very carefully. Until cloud-native databases reach complete functional and performance parity with traditional databases, we will have to contend with the fact that the database is a scarce resource.

This used to be true, but is definitely no longer true. Cloud-native databases are everywhere and incredibly common. See any major cloud, https://www.cockroachlabs.com/, or any of the tons of other database solutions.

It's great to see a new database coming out - innovation in the space is super important. However this announcement reads like marketing speak, and is light on the details. When I see a new product I want to hear things like:

- how it scales
- what the architecture is
- why it is stable and trustworthy enough to put my data on
- is it multi-node? How did they make it serializable?
- how fast is it? Performance is super important.

Based on their website it seems like a thin skin over postgresql. If that's the case I'll just use postgresql. If it's a clustered new and advanced database, then I'll be wary about trusting it for anything real.


Yeah, to me it also sounded like another graph plugin for a popular RDBMS. I also didn't find the examples convincing. Not having a driver for .NET/Java?

To me the problems with existing DBMSs are still the same as 15 years ago: complexities in setup/clustering/backups/rollbacks/schema updates; even setting up DB clients is a PITA in many environments.

SQL is not really a feature to focus on (imho); it is simple enough even for non-tech people. We tried to get out of SQL a long time ago anyway (ORM).

Anyways wishing luck, the team seems awesome!


>> We shall do better than SQL

> The EdgeQL language looks cool, and I'm sure querying via a graph structure makes certain problems easier in some use cases. However as much as people have complained about SQL, it's just so ubiquitous there needs to be a very good reason to switch away from it. Not having to write joins isn't really a good enough reason, in my opinion.

Oh, it goes much deeper than not writing joins. There's no single ORM out there that can implement a TypeScript query builder like ours, see the example in [1]. This is only possible because of EdgeQL composability, but that composability required us to rethink the entire relational foundation.

> > The true source of truth

> I'm not sure why this means EdgeDB is better. <..>

This section implies that EdgeDB's schema allows you to specify a lot of meta / dynamically computed information in it. And soon your access control policies. Take a look at the work-in-progress RFC [2] [3] to see how this is more powerful than, say, Postgres' row level security.

> > Not just a database server

> It sounds like they have a solid client, which is awesome.

Also lightweight connections to the DB so that you can have thousands of concurrent ones without load balancers, a built-in schema migrations engine, and many other things. In fact we have so much that it's challenging to decide what to even highlight in a blog post like the 1.0 announcement.

> Cloud-ready database APIs

> This used to be true, but is definitely no longer true. Cloud-native databases are everywhere and incredibly common. See any major cloud, https://www.cockroachlabs.com/, or any of the tons of other database solutions.

Not to pick on CockroachDB (they have an amazing product and company, we love them), but you should benchmark local installs of Postgres and Cockroach to see for yourself that scalability still has a significant cost in performance.

[1] https://www.edgedb.com/blog/edgedb-1-0#not-just-a-database-s...

[2] https://github.com/edgedb/rfcs/pull/49

[3] https://github.com/edgedb/rfcs/pull/50/files


> Oh, it goes much deeper than not writing joins. There's no single ORM out there that can implement a TypeScript query builder like ours, see the example in [1]. This is only possible because of EdgeQL composability, but that composability required us to rethink the entire relational foundation.

I don't understand this claim. Can or does? This all just compiles down to SQL right?


EdgeQL looks really nice. I do think this will be somewhat hard to sell because at small scales SQL and basic ORMs mostly work and have less lock-in. At large scales folks also want a lot more operational features like scaling out, control of the underlying schema for performance reasons, relaxing some constraints for horizontal sharding, and so on.

I think EdgeQL (and the data model) has potential for being a lot more than accessing the database. For example, you could have a distributed system use this data model across various typically replicated data stores - include all caches and "materialized views" of various kinds. Extend EdgeQL to define various caches and computed values from the core schema. Then you could use it to represent querying the same data but with different freshness. Now this becomes a compelling idea. Essentially it gives you a well defined way to declare and manage various computed values, laggy caches without having to manually implement all that coordination.


To add a specific example. A common pattern is 'put some objects in memcache' in front of the DB. Imagine if you can:

1. define a view representing the object fields you want to cache
2. define eviction and properties of the cache
3. use queries to query either the cache or the db

Specifically, do all of the above using edgeql and not have to write the cache consistency or serialization/deserialization logic.
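
For reference, this is roughly the kind of hand-written coordination I mean, as a minimal Python sketch: a read-through cache with a TTL in front of a loader function. All names here (`ReadThroughCache`, `load`, `ttl_seconds`) are hypothetical, an in-process dict stands in for memcache, and none of this is an EdgeDB feature.

```python
import time


class ReadThroughCache:
    """Serve from the cache while fresh; otherwise load from the DB and store."""

    def __init__(self, load, ttl_seconds: float, clock=time.monotonic):
        self._load = load          # function: key -> value (the "DB query")
        self._ttl = ttl_seconds    # eviction policy: fixed time-to-live
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit, still fresh
        value = self._load(key)    # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry after a write so the next read reloads it."""
        self._entries.pop(key, None)
```

Every piece of this (what to cache, when it expires, when to invalidate) is something the idea above would move into the schema and query language instead of application code.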


Can someone from EdgeDB explain why the SQL isn't as simple as what I have below? What am I missing? Why is that cross join lateral necessary:

  SELECT
      title,
      ARRAY_SLICE(ARRAY_AGG(movie_actors.name WITHIN GROUP (order by movie_actors.credits_order asc)),0,5),
      avg(movie_reviews.score)
  FROM movie
  JOIN movie_actors on (movie.id = movie_actors.movie_id)
  JOIN person on (movie_actors.person_id = person.id)
  JOIN movie_reviews on (movie.id = movie_reviews.id)
  WHERE person.name like '%Zendaya%'
  group by title


Because that only gives you actor names, not records, and also because arrays aren't a universal SQL feature.


EdgeDB is already postgres specific though.


EdgeDB currently is. Graph-relational and EdgeQL are not.


Until another db is graph-relational and can be queried via edgeql those are just as postgres-specific as arrays though, right?


You can JSON_AGG to get whole records.


Unfortunately, JSON aggregation destroys type information, so you can't reason about what your SQL query actually returns anymore.
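
A client-side analogy in Python (not what happens inside Postgres): once typed values go through JSON, the consumer only gets back strings, numbers, lists and objects. The sample row is made up.

```python
import json
from datetime import date
from decimal import Decimal

# A typed row as a driver might hand it to you.
row = {
    "title": "Movie A",
    "released": date(2021, 10, 22),
    "score": Decimal("8.0"),
}

# JSON has no date or decimal type, so both are stringified on the way out.
encoded = json.dumps(row, default=str)
decoded = json.loads(encoded)

released_type = type(decoded["released"]).__name__
score_type = type(decoded["score"]).__name__
```

After the round trip both `released` and `score` are plain strings, and nothing in the payload says what they used to be.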


Also, this query is wrong because we want movies where Zendaya played, but _also_ other actors in order, possibly without Zendaya, so you really need to do the actors join twice.
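
A small sketch of that point using Python's built-in `sqlite3` (so `group_concat` instead of `array_agg`, and made-up tables and data). Filtering on the same join that feeds the aggregate drops the other cast members; using the actors relation a second time for the filter keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movie_actors (movie_id INTEGER, person_id INTEGER,
                               credits_order INTEGER);
    INSERT INTO movie VALUES (1, 'Movie A');
    INSERT INTO person VALUES (1, 'Someone Else'), (2, 'Zendaya');
    INSERT INTO movie_actors VALUES (1, 1, 1), (1, 2, 2);
""")

# One join: WHERE removes every non-Zendaya row before aggregation.
single_join = conn.execute("""
    SELECT m.title, group_concat(p.name, ', ')
      FROM movie m
      JOIN movie_actors ma ON ma.movie_id = m.id
      JOIN person p ON p.id = ma.person_id
     WHERE p.name LIKE '%Zendaya%'
     GROUP BY m.title
""").fetchall()

# Actors used twice: once (in EXISTS) to pick movies, once to list the cast.
double_join = conn.execute("""
    SELECT m.title, p.name
      FROM movie m
      JOIN movie_actors ma ON ma.movie_id = m.id
      JOIN person p ON p.id = ma.person_id
     WHERE EXISTS (
               SELECT 1
                 FROM movie_actors z
                 JOIN person zp ON zp.id = z.person_id
                WHERE z.movie_id = m.id AND zp.name LIKE '%Zendaya%')
     ORDER BY ma.credits_order
""").fetchall()
```

The first query reports a cast consisting only of Zendaya; the second returns the full cast in credit order for every movie she appears in.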


EdgeDB employee here. We're beyond excited to finally publish a stable release of EdgeDB after 4 years of active development and 15 pre-releases. Happy to answer any questions on here!


4 years!? Wow, I can't imagine how you guys are feeling today, congrats and all the best to the project!


For the full story read another blog post: https://www.edgedb.com/blog/building-a-production-database-i... :)


Why is it called Edge though?

I immediately assumed it was some kind of distributed db running at the edge but it seems this is not the case.


Probably in the sense of a graph edge.


Yes, spot on.


It's a good sales pitch, but it's not immediately clear what the terms are. I see that there is a company behind this, that there's also a Github repo, and it appears you can install something without paying. But is it entirely open source? If not, what are they selling? What kind of business is this?


Everything is fully OSS: the database, our client libraries, our CLI. There's a company behind it (I'm an employee) which will make money with a cloud hosting platform, similar to Mongo. We're calling that EdgeDB Cloud and it's still under development. Though you can self-host too on any major cloud. [0]

[0] https://www.edgedb.com/docs/guides/deployment/index


We will run and support EdgeDB for you. Here's an expanded answer: https://github.com/edgedb/edgedb/discussions/3377


Where does full text search fall into the picture / current query language? Do I need to drop down to writing SQL for that?

When do you drop down to SQL / where does the boundary end for edgeQL?


Hmm, no JVM or .NET support ... seems problematic for any widescale adoption at this point. Are there fundamental reasons why this is problematic or just not got around to it?

(And, is it likely to be possible to build it against standard interfaces like JDBC - or is it too different?)


I wonder this too. The examples I see seem to work best with a dynamically typed language. Is there an example on how this API would work for statically typed languages? Or even languages that don’t like Nulls like Rust or Swift (making everything an optional is not great)


They have a golang client. And there was an attempt to write a Java client[1]. Someone could help ShaileshSurya (the java client maintainer)

[1] https://github.com/ShaileshSurya/edgedb-java


On https://www.edgedb.com/tutorial/basic-queries/objects, if I change "SELECT User.name;" to "SELECT User.Name;" the page crashes with, "Application error: a client-side exception has occurred." It crashes on Chrome and Edge.


We'll take a look, should be a quick fix.


Aaaaand... it's fixed!


Nope


I get a response.

> InvalidReferenceError: object type 'default::User' has no link or property 'Name'

> Hint: did you mean 'name'?

Which is the expected response (same as the edgedb cli)


Congrats on 1.0. A couple of things I’m curious about:

1. How does EdgeDB compare to Supabase? Thinking both of realtime functionality and row-level security.

2. If I were to use EdgeDB instead of Django, how would I go about it? In other words, how can I set up a batteries-included, server-side web app?


I can’t find anything in the docs on security. PostgreSQL has a pretty solid authorization story with row level security etc. Does EdgeDB provide any similar authorization mechanisms to ensure only data the user is entitled to is returned in queries?


Security policy will be part of the next release. See draft RFC [1], although note it's likely not going to be the final syntax.

[1] https://github.com/edgedb/rfcs/blob/865bc48f4050ced99447bd77...


Congrats on the launch! The code seems to be Apache 2.0 which opens the door to AWS or other big clouds just hosting the service instead of people using your cloud service. Are you guys planning to change the license to prevent that?

On a tangential note, I honestly find it annoying to have to maintain a separate account/authorization/VPC connection just because I want to use a database. I would much rather startups work with cloud providers to offer it natively, or that we invent a better way to interoperate with clouds. The explosion of small "clouds" that offer one service each isn't pleasant to work with, especially when they don't have a Terraform provider.


A lot to unpack here.

1. If other clouds want to offer a hosted EdgeDB offering, they can do so. We'll have a tight integration with the `edgedb` CLI and we're confident we can beat them on the developer experience which is our #1 priority.

2. We'll likely use GitHub for auth though this isn't set in stone yet. We're still planning out the workflows surrounding EdgeDB Cloud.

3. As for the explosion of cloud silos: that's a very real phenomenon with a boring reason. Hosting is one of the few ways to make money building OSS software. There are preposterously valuable OSS frameworks + libraries that haven't made anyone a cent. We implemented "one-click" deploy buttons for the major clouds to the extent it was possible to do, but for fully open-source companies like EdgeDB there aren't many routes to sustainability outside of paid hosting.


Fair, thanks for the answers! Honestly hope it works out for you, but I am very cautious about these things. Elasticsearch, Mongo, Timescale all ended up changing license at some point.

Yeah, that's ok for startups I guess.

Yeah, sad state of affairs really, though fully solvable by tech IMO if we had some protocol for integrating services in clouds. In any case, please don't underestimate the Terraform integration. I really could not care less about the one-click deploy. What I care about is maintenance and integration with my existing infrastructure as code.


Thanks! Point taken re: Terraform, we'll give that some thought.


Re license we have an answer for you here: https://github.com/edgedb/edgedb/discussions/3377

Re second point you can deploy edgedb to your cloud of choice. As for working with cloud providers to make integration better for the user, it's a bit too early for us to comment.


Thanks!

Cloud of choice is great, also need region and AZ of choice too for any serious production database. What is killing performance is round trip time to the DB. The language seems to help on the number of round-trips so that's good.


> The language seems to help on the number of round-trips so that's good.

Yep, you got it! :)


First commit 2009, so we're roughly 12 years later. What's the saying? It takes about 10 years to create a reliable DB? Good on you guys. Looks super interesting.



I always appreciate the historical journey why someone made X the way it is.

Thank you for sharing.


This looks phenomenal guys. I've read a bunch of these Show HN posts over the years and I struggle to stay engaged on most of them. They always lean either towards the highly technical, which makes my small brain smooth over, or towards so much marketing speak that I have no idea what's going on. Your post was short enough to comprehend in one sitting, you have an example near the beginning and you make your case clearly. I love that you used Postgres as a layer and I think that set you up for success. You have me excited and I'm pretty pumped to use edgedb in one of my idea projects just to get a feel for how it works.

Congrats!


Thanks. Please report back your experience!


Does it include (or have planned) any features relating to full text search, more specifically for mixing ranking/search with graph queries?

One use case I have (solved by some current graph DBs) is performing a complex full text search (boolean operators, stopword removal, fuzzy matching etc.), which alters search ranking based on graph properties (e.g. a node with more edges might get a reduction in score, the root of a tree might get a boost, etc.).

I'm still going through the tutorial, so haven't got to grips with what the DB is capable of yet.


One thing that can facilitate adoption is a smooth path from working software based on PostgreSQL to EdgeDB:

We have software that's a GraphQL interface to a database that's populated by a project (let's call it the indexer project) we do not control. It would be great if we could check the database schema for problems that might prevent it from being used as an EdgeDB database. Then we would migrate our application to use EdgeDB, while the indexer keeps loading our database through direct interaction with PostgreSQL.


This query language looks really nice!

I wish GraphQL were more like this and considered built-in features like where clauses and cursors, instead of having to add those over the top with loose conventions.


GraphQL is great for what it was designed for. It just was never designed to be a querying language with analytical capabilities, like EdgeQL.


Forget analytics, it can't even do pagination. That really limits the benefit of being introspectable.


Yeah, true. FWIW with EdgeQL you can also introspect all aspects of your schema.


> EdgeDB is built on an extension of the relational data model that we call the graph-relational model. This model completely eliminates the object-relational impedance mismatch while retaining the solid basis of and performance of the classic relational model.

I don't see how they can retain the performance of the relational model when using graphs on top of PG unless they are using PG as a k/v storage layer only.


The key insight here is that we still have schema. We have high-level object types and a first-class notion of a "reference" - we call it "link". Our other extension of the relational model is that every "row" must have a unique ID. Both things combined implement a graph of your types and a graph of your data as it's stored in the underlying tables. Everything is strictly typed and efficient.

But EdgeDB isn't ideal for storing loosely typed graph data; neo4j is built for that.
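
To make the "object types + links + unique IDs" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration of the concept and not a description of EdgeDB's actual storage layout: the `Person` and `Movie` types, the `actors` link, and the `to_rows` helper are all hypothetical names made up for this example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Person:
    name: str
    # Every object carries its own unique ID (assumption: a random UUID).
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class Movie:
    title: str
    # A typed "link": a first-class reference to other objects.
    actors: List[Person] = field(default_factory=list)
    id: uuid.UUID = field(default_factory=uuid.uuid4)


def to_rows(movies: List[Movie]):
    """Flatten the object graph into relational-style rows.

    Returns (movie_rows, person_rows, link_rows), where each link row is a
    (movie_id, person_id) pair, i.e. one edge of the data graph.
    """
    movie_rows: List[Tuple[uuid.UUID, str]] = []
    person_rows: Dict[uuid.UUID, Tuple[uuid.UUID, str]] = {}
    link_rows: List[Tuple[uuid.UUID, uuid.UUID]] = []
    for movie in movies:
        movie_rows.append((movie.id, movie.title))
        for person in movie.actors:
            # The same person linked from two movies is stored once.
            person_rows[person.id] = (person.id, person.name)
            link_rows.append((movie.id, person.id))
    return movie_rows, list(person_rows.values()), link_rows
```

Because every object has an ID and every link is typed, the same data can be read either as a graph (follow `actors`) or as plain tables (the three row lists).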


https://www.alibabacloud.com/blog/postgresql-graph-search-pr...

You can use these techniques on top of typical, existing relational model tables as well. Consider a join table, or even a polymorphic join table, that gives you the edges. But you can use unions and stuff to join up data across various tables. Can get really creative with it.
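
A rough sketch of the join-table-as-edges idea, for anyone who wants to try it. To keep it runnable without a server it uses SQLite through Python's standard `sqlite3` module rather than Postgres (Postgres also supports `WITH RECURSIVE`). The `person` and `follows` tables and the sample rows are invented for the example.

```python
import sqlite3

# Hypothetical schema: an ordinary table of nodes plus a join table of edges.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE follows (
        src INTEGER NOT NULL REFERENCES person(id),
        dst INTEGER NOT NULL REFERENCES person(id),
        PRIMARY KEY (src, dst)
    );
    INSERT INTO person VALUES (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan');
    INSERT INTO follows VALUES (1, 2), (2, 3), (3, 1), (3, 4);
""")


def reachable(conn, start_id):
    """Return the names reachable from start_id by following edges, sorted."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(id) AS (
            SELECT dst FROM follows WHERE src = ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT f.dst FROM follows f JOIN walk w ON f.src = w.id
        )
        SELECT p.name FROM walk w JOIN person p ON p.id = w.id
        ORDER BY p.name
        """,
        (start_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

With the sample data, node 1 sits on a cycle (1 -> 2 -> 3 -> 1), so the walk from it also reaches node 1 itself, while node 4 has no outgoing edges and reaches nothing.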



Why's that?


Poking a bit at the documentation, I see like/ilike, but are there plans for text search/trigram capabilities? Recently I’ve been working with lots of different entities that have searchable properties and searching across different Elasticsearch indexes to do “JOIN”-like operations, but I have been exploring Postgres (and related) solutions for better “JOIN” support out of the box.


Yep, we have plans for that. We are exploring whether it would be feasible to plug in external engines like Elastic and make that integration totally automatic, enabled with a simple annotation in the schema.


Even better than I would have thought, thanks! I’ve primarily dealt with Elasticsearch/Solr in my career, so jumping into the way searches work in Postgres with various native support and plugins is interesting, to say the least. Maybe I just need to break my Lucene roots.


I don't understand the benefit. It's just a query language on top of Postgres? It doesn't seem to have the performance characteristics of a graph database, while acting like it does. JOINS will still be expensive. You guys shouldn't use the word graph, misleading.


my thoughts exactly...

a graph database is not useful because of its query language, it's useful because of its performance characteristics when bringing linked data — it could never be performant on a Postgres backend.


to be fair, I don't know EdgeDB's architecture but to the degree that a system can bring its graph structures into memory, using PostgreSQL as the durable storage but not for every on-demand query, it can certainly provide fast results. I would assume EdgeDB likely has concepts like this integrated into its design.


I think the idea is to prove the query language more than deliver a database that has graph DB advantages. If they prove the language maybe they can implement a different backend. I'm just guessing here.


> If they prove the language maybe they can implement a different backend.

This is very true and might happen in the far future.


Why a new language? I can see and understand why and where graphs beat out SQL, but what does EdgeQL have over existing graph languages?

Eg. vs Cypher or the likely-Cypher-compatible forthcoming GQL standard? https://www.gqlstandards.org/


EdgeQL is designed to replace SQL, not graph query languages. Think of it as SQL getting a proper type system and GraphQL capabilities of reaching into deep relationships in an ergonomic way.


Forthcoming? Given that the last sign of life was 3 years ago, in a space with already little movement, GQL seems most likely dead.


Well, the last update on the official ISO page https://www.iso.org/standard/76120.html is from November 2021. Does not look that dead to me.

Standardization projects typically take a while, especially for something as complicated as a query language spec.


The international committee (ISO/IEC JTC1 SC32 WG3 Database Languages) that is responsible for the database language standards SQL and GQL is pretty good at writing standards and not as good at talking about the work in progress. SC32 WG3 is developing two related standards to support property graphs:

1. SQL/PGQ (ISO/IEC JTC1 9075 part 16) -- This adds language to create property graph views on top of existing SQL tables and write property graph queries in a GRAPH_TABLE function in an SQL FROM statement.

2. GQL (ISO/IEC JTC1 39075 Database Language GQL) -- This is a full declarative property graph database language to create, maintain, and query graphs. This includes support for both descriptive and prescriptive schemas.

The Graph Pattern Matching language is identical between the two standards. For more details about GPM, see https://arxiv.org/abs/2112.06217

The ISO process has a defined series of milestones. I will spare you the details at the moment.

SQL/PGQ will start a Draft International Standard (DIS) ballot in July 2022 and so will be a published standard next year - 2023.

GQL will finish a Committee Draft (CD) ballot this month (February 2022) and should be ready for a DIS ballot in early 2023. However because GPM is shared between SQL/PGQ and GQL, the GQL Graph Pattern Matching will be stable when SQL/PGQ goes to DIS ballot.

For a little more detail on the GQL standards process and content, take a look at the talks from the LDBC TUC meeting August 2021: https://ldbcouncil.org/event/fourteenth-tuc-meeting/


Thank you for the monumental work you guys put into this! Looks very interesting, and something I would like to try out.

One quick question - I see that you have a Rust client, however it's marked WIP, how usable is it in its current state? Any idea on a timeline of it becoming an official binding?


It's usable and functional, because we use it in our CLI. It's WIP, because we haven't yet committed to an API, especially in async. Rust is a bit hard in that department :-)


I wonder how EdgeDB talks to postgres. Does it speak the Postgres protocol and translate the queries, or is it more of a deeper integration in to the storage layer?

Very interested in this! I've been wanting to build a generic HATEOAS server and this feels like this might be the right database!


I know benchmarking DBs is very hard and pretty much nonsense, but do you have any idea of / production use cases of EdgeDB at large scale? Did you see a performance drawback from the higher abstraction, and from the stateless EdgeDB client between the app and Postgres?


We've got some benchmarks in an earlier blog post [1].

EdgeDB is designed to do its job validating and compiling your schema and queries and then get out of the way. In other words, once a query was first parsed and compiled, the cost of the next trip via EdgeDB would be similar to that of pgbouncer, i.e. we'll simply send the compiled SQL to Postgres and proxy the results back to the client. This is why our data protocol uses Postgres framing and encoding.
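
The "compile once, then get out of the way" behaviour described above can be sketched as a cache keyed on query text. This is a toy illustration under stated assumptions, not EdgeDB's implementation: `CompiledQueryCache` and `fake_compile` are hypothetical names, and the "compiled SQL" is just a placeholder string.

```python
from typing import Callable, Dict


class CompiledQueryCache:
    """Toy cache: compile a query text once, reuse the result afterwards."""

    def __init__(self, compile_fn: Callable[[str], str]) -> None:
        self._compile_fn = compile_fn
        self._cache: Dict[str, str] = {}
        self.compile_count = 0  # how many times the slow path actually ran

    def get_sql(self, query: str) -> str:
        sql = self._cache.get(query)
        if sql is None:
            # Slow path: parse + compile, paid only on first sight of a query.
            sql = self._compile_fn(query)
            self._cache[query] = sql
            self.compile_count += 1
        # Fast path: a dictionary lookup, after which the SQL can be
        # forwarded to the backend as-is.
        return sql


def fake_compile(query: str) -> str:
    # Hypothetical stand-in for real compilation; it does NOT produce SQL.
    return "/* compiled */ " + query


cache = CompiledQueryCache(fake_compile)
first = cache.get_sql("select User { name }")
second = cache.get_sql("select User { name }")
```

Here the second call returns the cached result without invoking `fake_compile` again, which is the part that would make repeat trips cost little more than proxying.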

[1] https://www.edgedb.com/blog/edgedb-1-0-alpha-1


At the time of writing, docs have a large typo in them: https://www.edgedb.com/docs/clients/01_js/driver

impoart * as edgedb from "edgedb";


Thank you!


EdgeQL seems like a custom graphQL. Any reasons you did not follow dgraph's DQL or graphQL-compatible language? https://dgraph.io/docs/get-started/


GraphQL doesn't have any syntax for expressions, like 1+1 is inexpressible with GraphQL. Once you add support for arbitrary expressions, functions, etc you start departing from GraphQL to something that looks surprisingly similar to EdgeQL :)


If anyone makes this work with heroku Postgres, let me know!

I started a container and connected it, but instantly got an email from Heroku telling me I had 19500 of my 10000 rows, and 215 tables. There aren't really any hosting options in their documentation priced viably for a hobby project.


This looks pretty cool. I'd be interested to hear when the Rust bindings will be production ready: https://github.com/edgedb/edgedb-rust


I'm disappointed in you open sourcing it.

Now, I'm not against the OSS movement, and in fact I'm working at a company doing exactly this, but releasing your bread-and-butter as an open source project leaves you vulnerable to peer plagiarism.

Instead, you should just release the core version so others will build an open ecosystem around your mainline product. This will secure your market fundamentals by making sure no one can overshadow you, especially in the fiercely competitive database market.

So please stop open sourcing too much. I don't want to see the same ill fate again and again for great products like RethinkDB (and its downfall) that could change the world.


> Instead, you should just release the core version so others will build an open ecosystem around your mainline product.

To be clear EdgeDB is the core version, albeit a large core. We have many ideas about value-adds. But the goal for open sourcing it is indeed to foster an open ecosystem.


Hi, looks really great! Thanks for this great piece of tech!

I have a question regarding migrations. On the website you say it's safe to run them in an automated flow. We know it uses Postgres under the hood, and we know that migrations on large datasets can sometimes cause downtime in database clusters. How do you handle them? I guess since you don't have a backing "users" table for a "User" model, adding / removing fields doesn't actually happen the same way it does in a normal relational database, so it's not a blocking or resource-consuming operation for you?


How does this compare against neo4j? When should I choose one instead of the other?


I replied to a similar comment here: https://news.ycombinator.com/item?id=30291290


Looks great, but the example for SQL looks kind of bad. Is that how pg does things? I thought "modern SQL" used CTEs and other fun features, or is that not a thing for pg as much as for, say, Snowflake?


Yea, the SQL example looks like they intentionally made it way worse than it needed to be...


It was taken from Elvis' talk. Here's the video explaining how we get to this query and why: https://youtu.be/WRZ3o-NsU_4?t=1892


> EdgeDB is built on top of Postgres

How is it a new database? Or is it an advanced ORM?


We're working on a more comprehensive explanation of why EdgeDB isn't an ORM. Does EdgeDB do "object-relational mapping" under the hood — absolutely. The reason we try to distance ourselves from the category of ORMs is that the term "ORM" comes with a big bag of preconceptions that don't apply here. EdgeDB has:

- Full schema model with indexing, constraints, defaults, computed properties, stored procedures

- A query language that replaces SQL. If there's something you can do in SQL that isn't possible in EdgeQL, it's a bug. Most ORMs provide some sort of language-specific API for writing queries and generating SQL under the hood; that's dramatically different from providing a new query language.

- A full type system, grammar, set of functions and operators, etc. A set theoretic basis for all expressions in EdgeQL. https://www.edgedb.com/docs/edgeql/sets

- A set of drivers for different languages that implement our binary protocol.

EdgeDB is a new abstraction built on a lower-level abstraction: Postgres. Both indubitably fit any reasonable definition of "database".


What's the sort of timeline on GIS extensions? Great work btw, I've been following since the alphas and I'm excited to try it out :)


We need to figure out the formal API and packaging format for extensions. We're working on it.


I'm skeptical of any fancy new database, because they have always been disappointing in one way or another, but this looks really promising!

ORMs work fine for relational data, until there's a lot of edge data on the joins. I looked at the docs on mobile and couldn't find an answer, how does EdgeDB handle data on joins? E.g. GraphQL "connection" types with edges.


I only glanced through this, but would you say there’s some vague similarities (ignoring the distributed/caching aspects) to Facebook’s TAO here? For example, as a user of this, would I have to care about the Postgres schema, or is that abstracted away from me?

(Tao uses mysql and stores graph data as pretty much key/value pairs, and then a layer on top to query it)


We fully manage the underlying SQL schema for you, it's abstracted away (which will allow us to pull some tricks when we eventually introduce live schema migrations).

I haven't worked at Facebook so I'm not super familiar with TAO, only heard about it from my friends. But sometimes I pitch EdgeDB as a database with which you don't need to build your own TAO at your company. We give you many more capabilities than a typical database.


There are similarities with TAO and it's not a coincidence. Facebook engineers recognized that a better data abstraction and API was needed for productivity. EdgeDB follows the same logic.

> would I have to care about the Postgres schema, or is that abstracted away from me?

EdgeDB takes care of everything for you. You wouldn't know it's Postgres underneath unless we told you.


I love the feature set. Great work! Hard to believe this is the first major release. Are you guys planning to add C# client support?


Thank you. The feature set is indeed pretty deep. We are going to have a C# client for sure, but I don't have any ETA yet. FWIW you can interact with EdgeDB over HTTP!


it seems like it would be a great fit for creating a LINQ IQueryProvider :D Which would be amazing

