
How Sandstorm Works: Containerize data, not services - paulproteus
https://sandstorm.io/how-it-works
======
nickpsecurity
This is very interesting. It combines a number of older ideas. Even the core
idea behind their service, IIRC, existed in commercial products and academic
research at various times. The security model looks like how MILS Architecture
systems were described for servers combined with capability work. I also like
that they've heard of and use Powerboxes. :)

Worth watching or following up on later maybe.

~~~
kentonv
Thanks!

Indeed, there have been a bunch of promising capability research projects that
never quite made it to production. Sandstorm's Cap'n Proto is based on Mark
Miller's E language and CapTP protocol, while the Powerbox concept derives
from Marc Stiegler's CapDesk (although of course many production systems
contain narrow-purpose powerboxes "by accident"). Both MarkM and MarcS are
friends of the project and have provided review and advice.

Sandstorm, though, is not a research system. I like to think we have been able
to make capabilities practical by being willing to step away from the purist
philosophy when it makes sense. E.g. CapDesk required software to be written
in E (IIRC), meaning the world had to be rewritten from scratch, which simply
wasn't going to happen. Sandstorm compromises by allowing legacy native code
to run in fine-grained containers. As a result we are able to deliver real,
useful applications to thousands of users today. :)

~~~
gohrt
Hmm, it's hard to put faith into a system that spells its own name wrong on
its home page:
[http://erights.org/elib/distrib/captp/index.html](http://erights.org/elib/distrib/captp/index.html)

~~~
nickpsecurity
Then put faith in the fact his prior deliverable, the DARPAbrowser, got
favorable reviews during security evaluation sponsored by DARPA. Just a few
minor fails with major wins throughout.

[http://combex.com/](http://combex.com/)

------
xg15
The general idea is very interesting, but the drawback I see is that this
architecture makes it impossible for apps to do work that accesses multiple
grains.

Search would be the most obvious example. This was solved pragmatically by
implementing it in the framework and not in the apps, but that approach
doesn't seem to scale for me. What if certain types of grains require
application-specific indexing? What if there are other tasks that cross grain
boundaries but only make sense for a specific app?

Additionally, this limitation makes it critical to get the definition of what
is a grain right from the very start, when you design your app - once you
realize you got the granularity wrong, I figure it would be very hard to
split or merge existing grains to change it.

If I remember correctly, the Sandstorm documentation itself had examples for a
word processor and for a photo editor app. However, while a grain for the word
processor represents a single document, a grain for the photo editor is a
photo gallery. So choosing granularity is not always trivial.

~~~
kentonv
Yes, there are certainly some patterns that become challenging under the grain
model (and some patterns that become easier).

Note that nothing is impossible. You can always connect grains to each other
using the powerbox (when it's ready, which will be very soon). Of course, if
nearly every grain of some app needs to talk to all the others, that will get
tedious. So the next thing you can do is fall back to coarse-grained apps, or
write an app that creates its own grains internally and therefore can talk to
all of them if it needs to. In this case, you're giving up a lot of the
advantages of the granular model, but you gotta do what you gotta do. What you
end up with is no worse than the status quo, at least.
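
To illustrate the idea of connecting grains through a powerbox, here is a toy sketch in Python. It is not Sandstorm's actual API: the `Grain`, `Capability`, and `Powerbox` classes and the `request` method are hypothetical names invented for this example, and the `user_choice` argument stands in for the user picking a grain in a UI. The assumption being modeled is only that a grain starts with no access to other grains and gains a limited reference when the user explicitly grants one.

```python
class Grain:
    """A toy grain: isolated data plus whatever capabilities it was granted."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.caps = {}  # label -> Capability; empty until the user grants one


class Capability:
    """A reference to one target grain with limited rights (hypothetical)."""

    def __init__(self, target, can_write=False):
        self._target = target
        self._can_write = can_write

    def read(self, key):
        return self._target.data.get(key)

    def write(self, key, value):
        if not self._can_write:
            raise PermissionError("read-only capability")
        self._target.data[key] = value


class Powerbox:
    """Mediates grants: the user, not the requesting app, picks the target."""

    def __init__(self, grains):
        self._grains = {g.name: g for g in grains}

    def request(self, requester, user_choice, label, can_write=False):
        # user_choice stands in for the user's selection in a picker UI.
        target = self._grains[user_choice]
        cap = Capability(target, can_write)
        requester.caps[label] = cap
        return cap


doc = Grain("doc")
gallery = Grain("gallery")
gallery.data["photo"] = "cat.png"

powerbox = Powerbox([doc, gallery])
cap = powerbox.request(doc, user_choice="gallery", label="photos")
print(cap.read("photo"))
```

In this sketch the document grain can read from the gallery only through the capability it was handed, and a write attempt fails because the grant was read-only.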

With that said, in practice we have found that aside from a small set of
common features -- e.g. search and backup -- these kinds of problems really
don't come up much. Most kinds of inter-grain communications do in fact fit
nicely into the powerbox model. By adding platform features to cover the
things that don't fit, we can cover, say, 90% of use cases without
compromising the model, and that's a pretty big win.

And frankly, these features usually make far more sense as platform features
than as app features anyway. A search index that covers _all_ your apps is a
lot more useful than having to search each app separately. A backup system
that backs up _all_ your apps is one you'll be much more likely to actually
configure. Etc.

Also note that these kinds of systems that need access to "everything" are a
security liability, and so probably not the kind of thing that you want every
app implementing in their own special way. By moving them into the platform,
we can make sure they are designed with restrictions that make them secure.
For example, the backup system should only get access to encrypted copies of
data. The search index should be prohibited from exfiltrating data in any way
except as search results displayed to the user. Etc.

Anyway, the point is, yes, there are challenges, but with a pragmatic
approach, they can be minimized, and the gains far outweigh the losses.

~~~
augb
Do you see sandstorm as ever hosting medium-to-large scale, externally-facing
websites (as opposed to personal or intranet-type sites)?

~~~
kentonv
Some day, yes. However, for now we are focused on the use case where the
infrastructure reports to the user rather than the developer. Developers ship
packages, users choose where to run them (whether on their own machines or a
cloud host). We feel we have a lot more value to provide in this use case than
we would have in the SaaS infrastructure market.

Also, if we can make it just as easy to use apps on a user-controlled server
as it is to use SaaS, then SaaS no longer makes sense -- its biggest selling
point is that it's easy. I believe a shift back towards more decentralized
infrastructure would be a very good thing, so that's what we're aiming to
create.

~~~
augb
Thank you for your reply.

> I believe a shift back towards more decentralized infrastructure would be a
> very good thing, so that's what we're aiming to create.

I definitely agree. Thank you for your leadership in this direction.

------
middleclick
I love Sandstorm, but IMO, the requirement of a wildcard certificate is a
small drawback in setting it up on my server. I know I can use sandcats.io but
if I am using something like Sandstorm, I want complete control over my data,
including domains. (I am now using sandcats though so there's that but I wish
I could get a wildcard cert for free or from Let's Encrypt :)

~~~
kentonv
I've talked to the Let's Encrypt people on several occasions about this. I
think they will support wildcards eventually. The details are surprisingly
hairy, though. In the meantime, we'll keep providing free certs under
Sandcats.

I too wish we didn't have the wildcard requirement. Unfortunately, same-origin
policy being what it is, there's really no way for us to get away from the
wildcard requirement without losing most of our security gains.

You've probably seen this already but for others wondering about the details:

[https://docs.sandstorm.io/en/latest/administering/wildcard/](https://docs.sandstorm.io/en/latest/administering/wildcard/)

And a sample of security problems that our security model (of which the
wildcard is an essential part, since it enables fine-grained isolation) has
helped protect against:

[https://docs.sandstorm.io/en/latest/using/security-non-events/](https://docs.sandstorm.io/en/latest/using/security-non-events/)

Thanks for using Sandstorm!

~~~
tokenizerrr
Once the rate-limits for LE relax, what about on-demand renewal of a SAN
certificate?

~~~
kentonv
Sorry, that won't work. Sandstorm needs a new hostname every time you open a
document (that's a lot of hostnames), and to provide any CSRF mitigation it
needs to be a secret (where anything you list on the certificate immediately
becomes public knowledge).
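
As a rough illustration of why a fixed list of names on a certificate can't cover this, here is a minimal Python sketch of minting a fresh, unguessable hostname per session under a wildcard domain. It is not Sandstorm's actual code: the function name `new_session_hostname`, the 16-byte label length, and the base domain `example.sandcats.io` are all assumptions made for the example.

```python
import secrets


def new_session_hostname(wildcard_base: str, nbytes: int = 16) -> str:
    """Return a fresh random hostname under wildcard_base (hypothetical helper).

    The label is generated at request time, so it cannot be listed on a
    certificate in advance; only a *.wildcard_base certificate covers it.
    """
    label = secrets.token_hex(nbytes)  # 2 * nbytes hex characters
    return f"{label}.{wildcard_base}"


host_a = new_session_hostname("example.sandcats.io")
host_b = new_session_hostname("example.sandcats.io")
print(host_a)
print(host_b)
```

Each call yields a different name, which is the property described above: the hostname is both new per session and not known to anyone who only sees the certificate.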

Be sure to read the FAQ in the doc:

[https://docs.sandstorm.io/en/latest/administering/wildcard/](https://docs.sandstorm.io/en/latest/administering/wildcard/)

