

Elasticsearch: dealing with complex permissions - bdauton
https://www.tuleap.org/elasticsearch-dealing-complex-permissions

======
buro9
I've delayed the move away from PostgreSQL search on an open source project.
Of course it's important to do the text matching quickly, but beyond that we
have three concepts that on their own could be done simply enough but together
they create a degree of complexity.

Thoughts on how to proceed are welcome. In essence we have:

1\. Textual documents indexed.

2\. Permissions which are essentially group based (but lots of groups,
potentially hundreds), and then owners || admins || moderators get special
permissions.

3\. "Followed"/"Watched" items.

4\. "Ignored"/"Hidden" items.

Elasticsearch remains our target, and we want to be able to reduce the
results on the basis of permissions and the ignore list, whilst allowing a
further restriction to be "only the stuff I'm watching".
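
One way to read these requirements as a single query is sketched below. It is only an illustration under assumptions: the field names `body` and `allowed_groups` and the helper `build_search_query` are hypothetical, and the special rights of owners, admins and moderators are assumed to be modelled as extra group names. The function just builds a query DSL dictionary (a `bool` query with `match`, `terms` and `ids` clauses) and does not talk to a cluster.

```python
def build_search_query(text, user_groups, ignored_ids, watched_ids=None):
    """Build an Elasticsearch-style bool query as a plain dict.

    Field names ("body", "allowed_groups") are hypothetical.
    """
    # Permission filter: the document must list at least one of the user's groups.
    filters = [{"terms": {"allowed_groups": sorted(user_groups)}}]

    # Optional restriction to "only the stuff I'm watching".
    if watched_ids is not None:
        filters.append({"ids": {"values": sorted(watched_ids)}})

    bool_query = {
        "must": [{"match": {"body": text}}],  # scored full-text match
        "filter": filters,                    # unscored restrictions
    }

    # Ignore list: drop anything the user has hidden.
    if ignored_ids:
        bool_query["must_not"] = [{"ids": {"values": sorted(ignored_ids)}}]

    return {"query": {"bool": bool_query}}


if __name__ == "__main__":
    import json
    query = build_search_query(
        "release notes", {"developers", "project-42"}, ["doc-9"], watched_ids=["doc-1"]
    )
    print(json.dumps(query, indent=2))
```

With hundreds of groups per user the `terms` list gets long, and large ignore or watch lists have the same problem, so whether this shape is fast enough would need measuring.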

~~~
jmakeig
[Full disclosure: I’m a product manager at MarkLogic.]

MarkLogic can handle all of these requirements with aplomb. You can think of
MarkLogic as a database built with search engine technology. It uses a
document data model (text documents in XML or JSON). Each term (word, phrase,
parent-child relationship, etc.) is indexed on ingest. There are index knobs
and levers for things like diacritics, wildcards, and scalars, like you'd
expect in a real search engine.

As for document permissions, they're indexed just like other terms. However,
they’re automatically ANDed on to each query in the database engine, not
application code. MarkLogic supports role-based permissions (read, write, and
execute for stored procedures) with optional Kerberos and/or LDAP
auth*n. “Ignored/hidden items” are those that a user doesn’t have permissions
to access.

"Followed/watched items" is a pretty common requirement. MarkLogic uses a
special "reverse index" to index queries along with text, values, and
structures. With regular "forward" queries, queries find documents. With
reverse queries, documents find queries. Thus anything that can be expressed
in a query can be turned into an alert. This provides some pretty powerful
match-making where a document can express its own attributes as well as those
it’s interested in matching. Hook that up to a trigger (pre- or post-commit)
and you have alerting that scales to billions of documents and millions of
queries. One of the world’s biggest news sites uses this infrastructure on a
MarkLogic cluster to handle saved searches and alerts.
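
The "documents find queries" idea can be illustrated with a toy in-memory sketch. This is not MarkLogic's API or implementation: the class `ReverseIndex` and its methods are hypothetical, and a saved query is assumed to be nothing more than a set of terms that must all appear in the document.

```python
from collections import defaultdict


class ReverseIndex:
    """Toy reverse index: stores queries, then matches incoming documents."""

    def __init__(self):
        self._required = {}                # query_id -> frozenset of required terms
        self._by_term = defaultdict(set)   # term -> ids of queries mentioning it

    def register(self, query_id, terms):
        """Save a query that matches documents containing all given terms."""
        required = frozenset(t.lower() for t in terms)
        self._required[query_id] = required
        for term in required:
            self._by_term[term].add(query_id)

    def match(self, text):
        """Return the ids of saved queries that the document text satisfies."""
        doc_terms = set(text.lower().split())
        # Only queries sharing at least one term with the document are candidates.
        candidates = set()
        for term in doc_terms:
            candidates |= self._by_term.get(term, set())
        return sorted(q for q in candidates if self._required[q] <= doc_terms)


if __name__ == "__main__":
    index = ReverseIndex()
    index.register("alice-watch", ["elasticsearch", "permissions"])
    index.register("bob-watch", ["postgresql"])
    print(index.match("Elasticsearch permissions are hard"))
```

Running `match` from a trigger on each new document is what turns saved queries into alerts in this toy version.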

------
molf
Makes no sense to me. Why not have a document for a file and a document for a
folder? Or if you must have a single document, why not a field for folder
permissions and one for file permissions? The corresponding query would be
trivial.
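
A minimal sketch of the two-field variant, assuming hypothetical field names `folder_groups` and `file_groups` and assuming a user needs a matching group at both levels to see a file:

```python
def file_visible_filter(user_groups):
    """Bool filter requiring a group match on both the folder and the file.

    Field names are hypothetical; the result is a plain query DSL dict.
    """
    groups = sorted(user_groups)
    return {
        "bool": {
            "filter": [
                {"terms": {"folder_groups": groups}},  # allowed on the folder
                {"terms": {"file_groups": groups}},    # allowed on the file
            ]
        }
    }


if __name__ == "__main__":
    print(file_visible_filter({"staff", "project-42"}))
```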

This seems overly complicated and I can't find any justification for it in the
article.

~~~
dylanbow
The article was maybe a bit too simplistic. What if you have more than one
folder? What if a file can be located within a whole tree of folders, just
like on a file system? Would the corresponding query still be trivial? Maybe
there's a trivial way of doing recursion in elasticsearch?
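
One possible way to avoid recursion at query time is to flatten the tree when indexing. The sketch below is not the article's method. It assumes that a user must be allowed on every ancestor folder as well as on the file itself, so the effective groups are the intersection along the path; `effective_groups`, `to_indexed_doc` and the field `allowed_groups` are hypothetical names.

```python
def effective_groups(path_groups):
    """Intersect the group sets of every node on the path root -> ... -> file."""
    nodes = [set(groups) for groups in path_groups]
    if not nodes:
        return set()
    result = nodes[0]
    for groups in nodes[1:]:
        result &= groups
    return result


def to_indexed_doc(doc_id, body, path_groups):
    """Build the document to index, with permissions already flattened."""
    return {
        "id": doc_id,
        "body": body,
        "allowed_groups": sorted(effective_groups(path_groups)),
    }


if __name__ == "__main__":
    path = [{"staff", "devs"}, {"devs", "qa"}, {"devs"}]  # root, subfolder, file
    print(to_indexed_doc("file-1", "design notes", path))
```

The query then stays a single filter on one field, at the cost of reindexing every descendant whenever a folder's permissions change.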

------
jnaour
elasticsearch.com works on security problems and will offer a solution: Shield
[1]

It seems that you have to pay for support to enable Shield on elasticsearch.
Probably the same business model as for Marvel: free as in free beer for
development, but you have to pay for production.

[1]
[http://www.elasticsearch.org/overview/shield/](http://www.elasticsearch.org/overview/shield/)

~~~
vaceletm
I'm not sure it fits the needs.

Based on what I read here, it seems to be the equivalent of "Mysql
users/Postgres roles" applied to ES.

I don't know how they will propose the feature but if we compare to mysql, it
wouldn't be possible to map application permissions to mysql users/privileges.
Mysql privileges are for "administration" accounts but not for the user
management of the application itself.

Anyhow, Shield isn't likely to be an open source product. Tuleap, on
its side, is 100% open source, so it's a no-go!

