Well this was fun to see! I'm the CTO of Oso, where we're building Polar (the second of the links mentioned https://docs.osohq.com/).
I have a few really minor nitpicks, so will try and make up for it by adding to the discussion :)
First of all, it doesn't really make sense to talk about Datalog as a good language for authorization, because much like with Prolog there doesn't really exist a single implementation of it. OPA's language Rego is a datalog variant, and Polar started out as a Prolog variant (although it's not really recognisable as one any more).
And that's an important point because otherwise it would be pretty reasonable to decide that: logic programming is good for authorization => you should go find the most battle-tested language out there and use that. For example, there's SWI Prolog [1] and Scryer Prolog [2] as two of my favourites.
To me, the thing that is mind-blowing about logic programming is (a) how powerful the paradigm is, and (b) how concisely you can implement a logic programming language. Take miniKanren [3], which is a full-blown logic language in a few hundred lines of code.
In my mind, the original article makes a decent case that logic programming is a good fit for authorization. And just generally I love anyone bringing attention to that :)
But in my opinion the reason logic programming is such a solid foundation for authorization logic is the pieces you can build on top of it. For Polar, we've added:
- Types! So you can write authorization logic over your data types and help structure your logic. We've implemented this by simply adding an operator to the language that checks types.
- Authorization primitives like roles and relationships (RBAC and ReBAC). These are all backed by the logical rules, so they can be easily extended with custom logic.
- And last but not least... the ability to convert authorization logic into SQL [4]. This is done by having the language return constraints over any unbound (free) variables.
To me this is what makes logic programming exciting for authorization. It gives you this small kernel of declarative programming, and gives you a ton of freedom to build on top.
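To make the "constraints over unbound variables" idea concrete, here is a minimal sketch in Python. It is not Polar's implementation: the `Var` and `Eq` classes, the `allow_read` rule, and the `documents` table with its `is_public` and `owner_id` columns are all hypothetical names invented for illustration. The point is only that the same rule can answer yes/no for a concrete resource, or hand back constraints when the resource is left unbound, and those constraints can then be rendered as a WHERE clause.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A logic variable; left unbound it stands for 'any resource'."""
    name: str


@dataclass(frozen=True)
class Eq:
    """Constraint: a field of the unbound resource must equal a value."""
    field: str
    value: object


def allow_read(user, resource):
    """Hypothetical rule: a document is readable if public or owned by the user.

    For a concrete resource (a dict) this returns True or False.
    For an unbound Var it returns the list of alternative constraints instead.
    """
    alternatives = [Eq("is_public", True), Eq("owner_id", user["id"])]
    if isinstance(resource, Var):
        return alternatives
    return any(resource.get(c.field) == c.value for c in alternatives)


def to_sql(table, constraints):
    """Render the alternatives as a parameterised WHERE clause (OR of equalities)."""
    where = " OR ".join(f"{c.field} = ?" for c in constraints)
    params = [c.value for c in constraints]
    return f"SELECT * FROM {table} WHERE {where}", params


user = {"id": 7}
sql, params = to_sql("documents", allow_read(user, Var("doc")))
print(sql, params)
```

A real engine would have to handle conjunctions, negation, and joins across relationships, which is where most of the actual work lies; this sketch only covers a flat OR of equalities.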
As a practical matter in an enterprise use case, you really want to be able to tell the user why their request is denied and what they can do about it. One thing I really appreciate at my employer is that any "access denied" message provides an on-ramp to the process of either amending my request to something that's within my permissions, or else getting the requisite approvals.
How do you get this kind of explainability when the authorization decision involves such advanced magic as logic programming languages?
Although that is probably not very idiomatic datalog. That is, I don't know if you can "call" a literal bound to a variable like I do in the last line of the program by declaring "Reason" after binding Reason to "owner(Resource, User)". If you do that in Prolog, the Prolog engine will prove the literal bound to "Reason" and execute the "owner" program. I don't know if that works in datalog. You can of course do all this without call-magic:
Historically, mechanisms of explanation very similar to the one above were common in expert systems of old, at least the ones implemented in Prolog. There used to be a nice free e-book describing how to build expert systems in Prolog in a quick-and-dirty fashion, but I can't find it online anymore. There's this book by Amzi Prolog though:
And this has a section on "Explanation" that describes a tracing mechanism based on the classic box-model debugger of Prolog engines. The technique can be applied to datalog and the program doesn't have to be an expert system, just to make it clear.
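As a rough illustration of the explanation idea discussed above, here is a small sketch in Python rather than Prolog or datalog. Everything in it is an assumption made for the example: the `explain_access` helper, the rule names, and the shape of the user and resource dicts. Each rule carries a human-readable reason such as "owner(Resource, User)", and the evaluator reports which reason succeeded or which ones were tried and failed.

```python
def explain_access(user, resource, rules):
    """Try each (reason, predicate) rule in order.

    Returns (True, [reason that succeeded]) on the first match, otherwise
    (False, [every reason that was tried and failed]).
    """
    failed = []
    for reason, predicate in rules:
        if predicate(user, resource):
            return True, [reason]
        failed.append(reason)
    return False, failed


# Hypothetical rules: each pairs a readable literal with the check it stands for.
RULES = [
    ("owner(Resource, User)", lambda u, r: r["owner"] == u["name"]),
    ("member(User, Folder)", lambda u, r: r["folder"] in u["folders"]),
]

alice = {"name": "alice", "folders": {"finance"}}
doc = {"owner": "bob", "folder": "legal"}

allowed, reasons = explain_access(alice, doc, RULES)
print(allowed, reasons)
```

The list of failed reasons is what an "access denied" page could turn into an on-ramp: each one names a condition the user could try to satisfy.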
I think security systems tend to tell as little as possible in these cases because to do otherwise reveals something of the internal structure of a system, potentially indicating vulnerable points.
IIRC there was a post on naming servers where someone said that if they knew one server was called Bilbo then they had a pretty good idea what the other servers might be named after. Seemed a good point.
This seems misguided. If you are taking principle-of-least-privilege seriously (and it seems that's what people are trying to do with very sophisticated authz systems) then legitimate users will bounce off the edges of their privileges very frequently. If they can't figure out what those edges are or how to expand the frontier when necessary, they're not going to be able to accomplish necessary tasks.
A policy of "stay the hell away, don't even ask" only works if you're drawing the boundary very far from the user's legitimate sphere of influence.
Dunno, just reporting what I understood to be the case. It may be that a distinction can be made between company-internal and company-external attempts at access, but that is pretty risky.
Any solution to the valid point you have made would have to be organisational.
But the saying goes something like "security is annoying, good security is very annoying" - anyone know the original?
This is definitely a consideration for authorization. To do it well you need to be able to distinguish when you want to reveal information.
For example, our APIs push people towards returning a 404 if the user doesn't have read access (you shouldn't even know it exists!), but a 403 if you can read but not edit.
You would probably want to do something similar here -- only return a reason if the user is allowed to know the resource exists but doesn't have access to it. (e.g. it might be "read_exist" vs "read_details" -> "you cannot read the details of this document because you are not a member of this folder".)
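The 404-versus-403 split, together with the rule about when to reveal a reason, can be sketched in a few lines of Python. This is only an illustration under assumed names: `deny_or_allow` and the permission strings "read" and "edit" are hypothetical, not part of any Oso API.

```python
def deny_or_allow(user_permissions, action):
    """Return (HTTP status, reason or None) for a requested action.

    Without read access the caller learns nothing, not even that the
    resource exists. With read access but without the requested permission,
    the caller gets a 403 plus a reason it is allowed to see.
    """
    if "read" not in user_permissions:
        return 404, None
    if action not in user_permissions:
        return 403, f"you can read this resource but lack the '{action}' permission"
    return 200, None


print(deny_or_allow(set(), "edit"))
print(deny_or_allow({"read"}, "edit"))
print(deny_or_allow({"read", "edit"}, "edit"))
```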
Puzzled: why is the conversion to SQL either interesting (SQL being a logic language so if disregarding recursion[1] they should map straightforwardly[2]) or of business value?
[1] yeah, CTEs, I know
[2] yeah, words are cheap from my end, I know
Edit: additionally, and for curiosity: the article covers use of datalog but no underlying security model (eg. consistency of access to resources so you don't have one way blocked to you but can go around via another route, as I know of one homebrew system where you could do that). How do you deal with that? I understand lattices can be used for this (TBH I have only done a little reading on these).
- potential to push policy execution down to the db, such as for authing bulk-oriented queries vs record-level ones
- potential for use with RLS
- potential for analysis tools
In practice it seems boring, but in principle, interesting. I'd love to see actual staged compilation generation & integration (so no runtime service/runtime) and, for analysis, conversion to z3/SMT. We went for Casbin bc of the former, and seems a simpler leap for the latter.
[1] https://www.swi-prolog.org/
[2] https://github.com/mthom/scryer-prolog
[3] http://minikanren.org/
[4] https://www.osohq.com/post/authorization-logic-into-sql