SpiceDB Is Open Source (authzed.com)
197 points by jzelinskie on Sept 30, 2021 | 43 comments



I take it this is usable directly as a service: we only bring our own authentication layer (password + 2FA) and use SpiceDB to check the permissions?

However, the docs mention that an Authzed account is required. Is that not the case when we host it ourselves?

Would be nice to have a step-by-step guide for self-hosting so that it's apparent how the flow goes.

PS: Would also like to use it in Cloudflare Workers, but they don't support gRPC yet (Only REST and HTTP/1.1).


Yes you can use it directly as a service with your own authentication. There is no Authzed account required.

The docs are built around the idea of using Authzed.com, the hosted SpiceDB solution for protecting your first app. We will try to put together a guide around self-hosting SpiceDB.

For usage from Cloudflare Workers and other environments that aren't HTTP/2-friendly, we're considering adding a gateway which can speak regular HTTP natively. Bolt-on solutions like grpc-web or more invasive solutions like adding Twirp support will be considered. Do you have a preference?


If you're talking about: https://pkg.go.dev/github.com/improbable-eng/grpc-web/go/grp...

Then it looks good to me. As long as it's HTTP/1.1 or WebSockets, it should work everywhere.

Is this something you can add soon? We're in a refactoring stage and would love SpiceDB to be a part of it all. (I also created an issue in the repo)


Looks good.

1. How does this compare with Ory Keto?

   https://www.ory.sh/keto/docs/

2. Can it be natively integrated (i.e., from within Postgres SQL) with Row Level Security in Postgres?

3. Any interest in supporting TiDB as a backend?

Edit: Number questions.


Disclosure: Another founder of Authzed, here.

Also, in case anyone was wondering, yes, SpiceDB is a reference to both Zanzibar and a popular sci-fi novel whose film adaptation will be in cinemas shortly.

>1. How does this compare with Ory Keto?

The blog post has a section dedicated to how SpiceDB improves on the Zanzibar paper[0]. Keto was originally a different project that has been rewritten to be Zanzibar-like. It is missing lots of the core functionality that I'd personally consider requirements to really be faithful to the paper: horizontal scalability, bounded staleness (Zookies), and userset rewrites, for example. ORY also develops a whole identity suite, while we're attempting to stay laser-focused on permissions and maintain vendor-neutrality.

>2. Can it be natively integrated (i.e., from within Postgres SQL) with Row Level Security in Postgres?

We have been exploring deep integration with Postgres from both entrypoints (SpiceDB->Postgres and Postgres->SpiceDB). For the former, we're playing with representing applications' Postgres databases as read-only SpiceDB datastores. For the latter, we've checked out Postgres Foreign Data Wrappers, but they don't seem portable to the cloud hosted services like RDS. We're continuing to look for clever solutions, if anyone reading this bumps into any.

>3. Any interest in supporting TiDB as a backend?

I've created an issue for this[1].

[0]: https://authzed.com/blog/spicedb-is-open-source/#everybody-i...

[1]: https://github.com/authzed/spicedb/issues/154


RDS supports `postgres_fdw` on pretty much all versions of vanilla PG and Aurora PG. This should be sufficient to implement what you described.

Though if you wanted to go one step further you could use `postgres_fdw` to connect to a bunch of stateless PG boxes running OSS PG and have those load non-supported FDWs in order to support all sorts of backends like `mysql_fdw` and friends. Adds a proxy hop but makes it possible to do all sorts of very cool things. Hit me up if you want to talk more PG/Zanzibar things.

PS: I wrote this to get an idea of Zanzibar but I haven't used it in anger yet: https://github.com/josephglanville/zanzibar-pg


> For the latter, we've checked out Postgres Foreign Data Wrappers, but they don't seem portable to the cloud hosted services like RDS.

My experience is that teams who want the full power of postgresql run their own compute nodes because of limitations like this.

It's a trade-off, as almost everything is, but I suspect the sort of company that's buying in to things like RLS is also the sort of company that's reasonably likely to have already migrated off RDS in disgust.

I could easily be wrong, of course, but at the very least I think it's worth asking your users before assuming that excluding RDS would be a problem for the people who would want the feature in the first place.


> For the latter, we've checked out Postgres Foreign Data Wrappers, but they don't seem portable to the cloud hosted services like RDS. We're continuing to look for clever solutions, if anyone reading this bumps into any.

Probably off the mark here but...

View -> Function -> Table (Atomic Permissions) - On Miss -> Rest Call to SpiceDB

RLS:

    CREATE POLICY "Resources are updatable by certain groups of users."
      ON public.resources
      FOR UPDATE
      USING (
        EXISTS (
          SELECT 1
          FROM atomic_permissions_view
          WHERE user_id = auth.uid()
            AND action_enum = 'modify'
            AND resource_id = id
        )
      );

Where resources inherit from resources tables.


Thanks! :)


Could this product also do row level security? For instance by finding all user roles and then generating WHERE clauses to inject into the query automatically?


How can one keep the application database in sync with the permissions database? Suppose there is a project which uses a Postgres database and SpiceDB (backed by a separate database).

This project is a "GitHub clone" and a user has decided to delete a repository with all of its related objects. In Postgres these related objects are deleted automatically via cascade. How can I do the same in SpiceDB to avoid leaving garbage tuples behind?


There are a variety of synchronization methods, and the choice of which is very application specific.

The simplest choice to get started with is to add it directly to your application logic. The main pro of this solution is that you can directly handle the semantics of when and how to add/remove permissions. For example, it's sometimes safe and idempotent to write permissions for data that may fail to commit, because permissions for the data will never be queried in the future. You can also directly control the management of ZedTokens and effectively eliminate the new enemy problem[0]. As far as I know, this is how Zanzibar is used within Google.

Solutions which tail a primary source of truth, such as the database logs, a Kafka topic, an event queue, etc., may perform better in some applications where eventual consistency on permissions is OK. With these systems you are adding some delay between writing the data and the permissions, which can open you up to the new enemy problem or require UX compensation like adding data on the client side and hoping the permissions replicate before they are ever queried for real.

We will work to write a guide that talks about some of these architectures and their pros and cons.

[0] https://authzed.com/blog/new-enemies/
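
To make the application-logic option concrete, here is a minimal Python sketch. It assumes two in-memory stand-ins (`PermissionStore` and `AppDatabase`, both hypothetical and not the SpiceDB API) and only shows the ordering of writes: permissions before data on create, data before permissions on delete. ZedToken handling is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class PermissionStore:
    """Hypothetical in-memory stand-in for a permissions service such as SpiceDB."""
    tuples: set = field(default_factory=set)

    def write(self, resource: str, relation: str, subject: str) -> None:
        # Writing the same relationship twice is a no-op, so retries are safe.
        self.tuples.add((resource, relation, subject))

    def delete_by_resource(self, resource: str) -> int:
        # Filter-style delete: drop every relationship attached to the resource.
        doomed = {t for t in self.tuples if t[0] == resource}
        self.tuples -= doomed
        return len(doomed)


@dataclass
class AppDatabase:
    """Hypothetical stand-in for the application's primary database."""
    repos: dict = field(default_factory=dict)
    fail_next_commit: bool = False  # test hook to simulate a failed commit

    def insert_repo(self, repo_id: str, owner: str) -> None:
        if self.fail_next_commit:
            self.fail_next_commit = False
            raise RuntimeError("commit failed")
        self.repos[repo_id] = owner

    def delete_repo(self, repo_id: str) -> None:
        self.repos.pop(repo_id, None)


def create_repo(db: AppDatabase, perms: PermissionStore, repo_id: str, owner: str) -> None:
    # Permissions first: if the data commit fails, the leftover relationship
    # points at a repository that does not exist, so nothing queries it.
    perms.write(f"repository:{repo_id}", "owner", f"user:{owner}")
    db.insert_repo(repo_id, owner)


def delete_repo(db: AppDatabase, perms: PermissionStore, repo_id: str) -> int:
    # Data first: once the row is gone, the relationships can be cleaned up
    # afterwards, and the cleanup can be retried if it fails.
    db.delete_repo(repo_id)
    return perms.delete_by_resource(f"repository:{repo_id}")
```

Whether this ordering is actually safe depends on the application, for example on whether resource IDs can ever be reused.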


You are effectively "registering" the permissions on a separate service, so you need to handle updating those accordingly.

In the case of deletions, for example, you would need to track what entities you are deleting and call the service to update/delete the related permissions.

You could do this with some worker subscribing to postgres NOTIFY if you wanted to keep it closer to your DB layer (but that is async, and you may end up losing changes in the case of worker outages), or you could do it in your app where you would have more control over the transaction.

EDIT: I'm not associated with SpiceDB/Authzed in any way.


Sounds like a great use case for log-based change data capture actually. That way, you won't miss any updates even in case of outages, network issues, etc. Once everything is up and running again, such a pipeline would resume from where it left off before. Plus (unlike anything based on application-side triggers), there is no need to modify your app and no risk of missing changes, e.g. those done directly on the DB via SQL.

Disclaimer: I work on Debezium, an open-source CDC solution
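
A rough Python sketch of the "resume from where it left off" idea, for illustration only. `ChangeLog` and `PermissionSync` are hypothetical in-memory classes, not Debezium's API or event format; a real pipeline would persist the offset durably.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Hypothetical append-only change log (think: a database replication log)."""
    events: list = field(default_factory=list)

    def append(self, op: str, table: str, row_id: str) -> None:
        self.events.append((op, table, row_id))

    def read_from(self, offset: int):
        # Yield (offset, event) pairs starting at the requested position.
        for i in range(offset, len(self.events)):
            yield i, self.events[i]


@dataclass
class PermissionSync:
    """Applies log events to a set of permission tuples and tracks its offset."""
    tuples: set = field(default_factory=set)
    next_offset: int = 0  # in-memory here; stored durably in a real pipeline

    def run_once(self, log: ChangeLog) -> int:
        applied = 0
        for offset, (op, table, row_id) in log.read_from(self.next_offset):
            resource = f"{table}:{row_id}"
            if op == "delete":
                # Drop every relationship that hangs off the deleted row.
                self.tuples = {t for t in self.tuples if t[0] != resource}
            # Advance only after the event is applied, so a crash replays it.
            self.next_offset = offset + 1
            applied += 1
        return applied
```

If the consumer is down while deletes accumulate in the log, the next `run_once` call picks them all up, because it starts reading at `next_offset` instead of at "now".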


So Debezium offers a nice way to listen to the replication logs? Awesome. Will have to check that out :)


Yes, exactly. Debezium (debezium.io) provides log-based CDC for a variety of databases, exposing a unified event format as much as possible. It can be used with Kafka (Connect), but also stand-alone or as a library embedded into your app.


What is your business model?

I'm really excited about this: building scalable access control is a foundational challenge of cloud-scale systems, and I'm happy to see a new contender.

Is this like an "Open-Source Core" model, where the basic core platform is open-source but all the extra features to make it usable (to put it bluntly) in a given org are what you're selling?

(in my org it takes ~2s to lookup group membership (& thus permission) of a user on a cache miss, which is just shocking. I'd love it if we migrated to SpiceDB, but of course at our scale that won't happen anytime soon)


Disclaimer: We're a startup so things are obviously subject to change!

Right now, the business model is around running a single SpiceDB service globally so that you as a user don't have to figure out how to deploy and manage the service at scale.

This service is especially tricky to run because of the need for low latency, high availability, and a single globally replicated service. For an individual company to run HA clusters all over the world would be cost prohibitive, but when all Authzed.com customers share the cost it becomes more reasonable.


That's pretty clear -- and best wishes for a successful outcome! However, given the history of other companies, the fundamental follow-up question would be:

What are your plans for responding to large companies/cloud providers/... offering the same service based on your software?


Apache 2.0 license.

Call me pessimistic, but I wonder how soon we will see a blog post in the style of "It was a wonderful journey" that announces AGPL v3 or some other style of "shared source" license.


For a project like this I would say pretty unlikely because the API is already free for anyone to implement and competing implementations already exist.

The main difficulty with a system like this is the subtlety of operating it at scale with consistent performance and global availability.

The software itself only accounts for a small portion of that.

If AWS and friends were to pick this up and offer it then they would be putting in all of that work too so there would be a lot less they get for "free" comparatively (unless they offer an inferior product). i.e it's a substantially different situation from MongoDB, etc where the software is 99% of the product.


This is a pretty good analysis of the situation I think. The only thing that I would like to point out is that only a description of the API is available via the Zanzibar paper. We, and other clean-room implementations, have had to fill in the gaps around what the APIs actually look like. We've even made some of our own improvements such as resource lookup by subject[0] and a filter-based delete that mirrors the read API[1].

[0] https://authzed.com/blog/acl-filtering-in-authzed/

[1] https://buf.build/authzed/api/docs/main/authzed.api.v1#authz...


AGPL v3 is proper open source.


It might be free software but for many people it'd conflict in practice with the OSD definition.

If you like Free Software in a Stallman sense, sure, but in these cases we shouldn't kid ourselves that they're doing it for altruistic reasons at all.

https://opensource.org/osd

9. License Must Not Restrict Other Software

The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.


I don't see the conflict there, in theory or in practise.

AGPLv3 and other GPL variants do not “place restrictions on other software that is distributed along with [them]”, there is no requirement or expectation that “all other programs distributed on the same medium must be open-source”. They at most insist that derived works (software that uses part of the code, by inclusion or linking) are covered by the license.


Awesome to see an open source project in this space! However, the docs say the service is production ready and v1, yet there seem to be no docs on how to run the open source version (except for a brief homebrew example in the README). So how do I run this? For example with a DB?

I also noticed that the v0 API is deprecated and discouraged but the v1 API is "work in progress". To me, that doesn’t inspire confidence that the product is not going to have some breaking changes in API and design?

Is there something I am missing?


We have it on our backlog[0] to add documentation on how to run it with Kubernetes, at least.

Very sorry about the "work in progress" moniker on our v1 APIs. We developed the v1 APIs based on our real-world experience running the Authzed.com service for ourselves and others, and we just finished their implementation and porting everything over. We will remove that warning ASAP!

One thing to note though, is that we thought about calling our v1 APIs v5 or something, because we didn't want to give the impression that they will never change. We intend to continue to improve the APIs, sometimes in backward incompatible ways, but will just keep the existing APIs around for a very long™ period of time, similar to how Stripe handles their API versioning.[1]

[0] https://github.com/authzed/spicedb/issues/147

[1] https://stripe.com/docs/api/versioning


Oops; I posted on this earlier (by a few hours) submission [1] of the SpiceDB story.

Reposting (with small edits) here for visibility by founders:

Super cool. I’ve been looking at other groups trying to implement such systems, many of which appear very nascent or otherwise missing key features.

This looks like it solves a lot of problems for me, a solo developer, trying to build an enterprise-targeted product as a side project (whether that's a fool’s errand is another discussion). In particular, correct and efficient implementation of PER OBJECT permission seems like a hard problem, and many other (external) solutions merely control by object type. Building per object control into the product (integrated in the code itself, with no external gateway/proxy/layer) requires really detailed thought and planning related to ACL, group membership, etc., and any change in plans later means changes to potentially deeply integrated code.

QUESTION: Do you see greater value for (a) large teams with huge and complex products involving many moving pieces, that need a consistent AuthZ layer, or (b) small teams that need robust AuthZ and don’t have the time and human power to develop it themselves? (Or c, false dilemma, equally great for both )

[1] https://news.ycombinator.com/item?id=28707072


How is permission introspection on something like this? So not just "does user x have this permission on object y" but "why does user x have this permission on object y?". For something like insight into cascading folder permissions, etc.


Disclosure: I'm also a founder of Authzed

The information you are describing can be retrieved via the Expand API [0], which returns a tree containing all of the relationships that are reachable from a permission, as well as how they were reached.

For example, if you have a schema with a permission like so:

    definition resource {
      relation parent: resource

      relation writer: user
      relation reader: user

      permission view = reader + writer + parent->view
    }

An ExpandPermissionTree call for the permission `view` on a resource will return a tree that contains the users with view access to that resource, each set of users placed under `reader` or `writer` with a reference to the containing resource, so you know how a user was granted the `view` permission.

[0]: https://buf.build/authzed/api/docs/main/authzed.api.v1#Expan...
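
As a toy illustration of that idea (not the actual ExpandPermissionTree response shape), the following Python sketch evaluates `view = reader + writer + parent->view` over a few made-up relationship tuples. The dict layout and the `explain` helper are assumptions for this example only.

```python
# Relationship tuples: (resource, relation, subject). All data here is made up.
RELATIONSHIPS = [
    ("resource:folder", "writer", "user:alice"),
    ("resource:doc", "reader", "user:bob"),
    ("resource:doc", "parent", "resource:folder"),
]


def subjects(resource, relation):
    """Return the subjects directly related to `resource` via `relation`."""
    return [s for (r, rel, s) in RELATIONSHIPS if r == resource and rel == relation]


def expand_view(resource, seen=None):
    """Build a tree for `view = reader + writer + parent->view` on `resource`."""
    seen = (seen or set()) | {resource}
    return {
        "resource": resource,
        "reader": subjects(resource, "reader"),
        "writer": subjects(resource, "writer"),
        # parent->view: expand `view` on each parent, skipping cycles.
        "parent->view": [
            expand_view(p, seen) for p in subjects(resource, "parent") if p not in seen
        ],
    }


def explain(tree, user):
    """List the (resource, relation) pairs through which `user` gets `view`."""
    reasons = [(tree["resource"], rel) for rel in ("reader", "writer") if user in tree[rel]]
    for child in tree["parent->view"]:
        reasons.extend(explain(child, user))
    return reasons
```

With this data, `explain(expand_view("resource:doc"), "user:alice")` reports that alice reaches `view` on the doc through `writer` on its parent folder.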


I've done some thinking in the ACL space: https://github.com/theronic/eacl

(totally beta software - don't use in production)


ELI5: What is Zanzibar?


Jake, Authzed co-founder here.

Zanzibar is a global, highly available distributed permissions system used within Google to power application permissions for things like Maps, YouTube, Calendar, Docs/Drive, etc. They wrote about it in a paper[0] that was widely discussed on HN at the time[1].

The service stores relationships between people, other people, and data, in a giant directed graph. There are primitives for querying and processing that graph to make permissions decisions. The majority of the rest of the engineering effort is spent on replicating the data globally and caching permissions decisions regionally and locally, since permissions don't lend themselves very well to sharding or siloing along service boundaries.

For the 5+ explanation, I wrote a little bit about my digestion of the paper and what the important parts are here[2].

[0] https://research.google/pubs/pub48190/

[1] https://news.ycombinator.com/item?id=20132520

[2] https://authzed.com/blog/what-is-zanzibar/
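
For a feel of the "giant directed graph" part, here is a small Python sketch of a permission check as plain graph reachability. The edge data and the `object#relation` naming are made up for the example, and it ignores everything that makes the real system hard (replication, caching, consistency).

```python
from collections import deque

# Directed edges: object#relation -> subjects. Example data only.
EDGES = {
    "doc:readme#viewer": ["group:eng#member", "user:carol"],
    "group:eng#member": ["user:alice", "group:interns#member"],
    "group:interns#member": ["user:bob"],
}


def check(start: str, user: str) -> bool:
    """Return True if `user` is reachable from the `start` node of the graph."""
    queue, visited = deque([start]), {start}
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt == user:
                return True
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False
```

Here bob can view the readme because he is an intern, interns are members of eng, and eng members are viewers of the doc.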


going from link #2 it sounds like it is a highly scalable engine that does the following:

1. Stores arbitrary state related to permissions

2. Customizable rules that may refer to any state in #1

3. A service which allows clients to query if User U should be given permission P on artifact A, based upon #2

[edit]

Actually it sounds like #1 is a directed graph, not arbitrary state.



I found this to be a good, concise description of the problem space, and Zanzibar's approach: https://www.youtube.com/watch?v=1nbSbe3kw2U


An easy to digest article: https://authzed.com/blog/what-is-zanzibar/

tl;dr: Highly scalable RBAC/ABAC


That ACL filtered list seems like it could be super useful. It's extremely horrid if a UI is full of controls you don't have permission to use.


Congratulations! I have looked in depth at Ory Keto some time ago. Will be interesting to take this for a spin and see how it compares.


How did it all get started? Did you hack at this in your free time and eventually get an MVP out to share with investors?


Sort of, except instead of free time it was full time. After reading the Zanzibar paper, we had a strong conviction that this solution could address many of the authorization challenges we've had in our past products and roles.


Why Golang?


With our roots at CoreOS and in the OpenShift org of Red Hat, we've got a solid amount of experience building distributed infrastructure in Go. We toyed around with the idea of using Rust, as the reliable software building darling du jour, but ultimately we felt like with this team, at this time, we would create a better product in Go.



