
Openrsync imported into the tree - protomyth
https://undeadly.org/cgi?action=article;sid=20190211081518
======
geofft
> _The actual work of porting, however, is matching the security features
> provided by OpenBSD's pledge(2) and unveil(2). These are critical elements
> to the functionality of the system. Without them, your system accepts
> arbitrary data from the public network. ... rsync has specific running modes
> for the super-user. It also pumps arbitrary data from the network onto your
> file-system. Do you want that running without specific mitigation in place?_

This is a confusing claim. What exactly does "accepts arbitrary data from the
public network" mean? (Most servers do that, they just choose not to _process_
the data without additional validation.) And in what way is it critical to the
_functionality_ of the system?

Is the claim that, after calling pledge() and unveil(), the openrsync process
is happy to satisfy arbitrary read/write requests from the other side of the
connection, and so without them it is insecure?

Does openrsync view peer-induced memory corruption after pledge() or unveil()
as a vulnerability? Or is the idea that the attacker can already "pump
arbitrary data from the network onto your filesystem" and that the attacker
gaining control flow is not a meaningful escalation of privileges?

My impression is that pledge() and unveil() are _hardening_ tools, intended to
limit the damage from a process that has already gotten out of control (in the
same way that e.g. running Apache as non-root does not mean that you're
actively fine letting attackers run code as www-data). Is that impression
wrong? Is openrsync using them for the basic functionality of making sure that
a file is only being rsynced to the filename given on the command line?

~~~
anjbe
I’m trying to understand your question.

Typically, a process once compromised can do all sorts of things: touch files,
access the network, execute programs, and so on. Among other things, OpenBSD’s
security culture focuses on mitigating the damage done by compromised code
through development practices such as privilege separation.

Traditionally this was done by splitting functionality into multiple
processes, each serving a specific purpose such as doing network communication
or parsing configuration, and dropping privileges in any way possible such as
chrooting and switching to a dedicated user. Thus the attack surface is
reduced, and the potential damage done by a compromised (sub‐)process is
reduced as well.

pledge() and unveil() are the latest evolution in OpenBSD’s technique.
pledge() whitelists syscalls, and unveil whitelists files that can be
accessed.
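
As a rough illustration of that allowlisting, here is a minimal sketch that calls the two syscalls from Python through ctypes. The helper name `restrict_self` and its default arguments are hypothetical; the C prototypes in the comments are the ones from the OpenBSD manual pages. On any other platform the helper does nothing, and whether a given promise set is enough for a real program is something you would have to test.

```python
import ctypes
import os
import sys


def restrict_self(path, perms="r", promises="stdio rpath"):
    """Unveil one path, then pledge. Returns True if restrictions were applied.

    Hypothetical helper for illustration only: pledge(2) and unveil(2)
    exist on OpenBSD, so on every other platform this is a no-op.
    """
    if not sys.platform.startswith("openbsd"):
        return False

    libc = ctypes.CDLL(None, use_errno=True)

    # int unveil(const char *path, const char *permissions);
    # After this call, only `path` (with `perms`) is visible to the process.
    if libc.unveil(os.fsencode(path), perms.encode()) == -1:
        raise OSError(ctypes.get_errno(), "unveil failed")

    # int pledge(const char *promises, const char *execpromises);
    # The promise set here omits "unveil", so no further paths can be added.
    if libc.pledge(promises.encode(), None) == -1:
        raise OSError(ctypes.get_errno(), "pledge failed")

    return True
```

The order matters in this sketch: paths are unveiled first, and the pledge that follows locks the list because it does not include the "unveil" promise.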

So your process reads this arbitrary data from the public network. You
validate it through some function and pass the data on to the next stage of
your program. But what if there’s a bug in your validator, and your process
gets compromised?

If your process hasn’t had its capabilities reduced, the attacker can do
practically anything, especially if the process has superuser privileges.

But if the program uses a multi‐process privilege‐separated architecture, your
validation process can’t access the filesystem or the network and isn’t
running as root. If it tries, the kernel will kill it for violating its
pledge. All the compromised process can do is pass malicious data through
whatever interface you’ve provided between your validator and filesystem
processes, hopefully an interface that is simple, well‐defined, and
well‐audited.
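
The shape of that split can be sketched in a few lines. This is not how openrsync or acme-client are written; the names `validate`, `run_validator`, and `MAX_LEN`, the validation rule, and the five-byte framing are all assumptions made for the example. The point is only that the child can hand back nothing except one length-prefixed record, and the parent checks that framing before using anything.

```python
import os
import struct

MAX_LEN = 1024  # hypothetical cap on one validated record


def validate(raw):
    """Hypothetical rule: accept only short, printable ASCII records."""
    return (len(raw) <= MAX_LEN and raw.isascii()
            and raw.decode("ascii").isprintable())


def run_validator(untrusted):
    """Validate `untrusted` in a child process; return the bytes or None."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: handles the untrusted bytes. A real privsep child would
        # also drop privileges (and pledge on OpenBSD) before touching them.
        try:
            os.close(rfd)
            ok = validate(untrusted)
            payload = untrusted if ok else b""
            os.write(wfd, struct.pack("!BI", ok, len(payload)) + payload)
        finally:
            os._exit(0)

    # Parent: trusts only the framing of the reply, not the child itself.
    os.close(wfd)
    with os.fdopen(rfd, "rb") as reply:
        header = reply.read(5)
        body = reply.read()
    os.waitpid(pid, 0)

    if len(header) != 5:
        return None
    ok, length = struct.unpack("!BI", header)
    if ok != 1 or length != len(body) or length > MAX_LEN:
        return None
    return body
```

Even if the child were compromised while parsing, the most it could do in this sketch is send a malformed or misleading record over the pipe, which is why the parent re-checks the flag and the length.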

What if your filesystem process gets compromised? With pledge() it can’t
access the network or execute external code. With unveil(), even its file
accesses are limited to the files whitelisted earlier in the program. It can’t
read your SSH keys or delete your photos.

Certainly, if the process can be compromised that’s a bug that needs to be
fixed. But we see new bugs constantly in the software we use every day. It’s a
safe bet to say we will encounter more. By using a secure architecture, the
damage these bugs can cause is drastically reduced.

There’s a really good description and demonstration of privilege separation in
another project by Kristaps, acme-client (a Let’s Encrypt/certbot
alternative): [https://kristaps.bsd.lv/acme-client/](https://kristaps.bsd.lv/acme-client/)

Another such project is Google Chrome, which uses pledge() and unveil() on
OpenBSD.

~~~
geofft
My question is that the README implies that pledge() and unveil() are
_required_ for _functionality_, to the point that porting to an OS without
support for that is an inherently questionable idea. That certainly isn't true
of Chrome (I run it on non-OpenBSD and there are no functionality / security
issues in doing so). That isn't true of OpenSSH, which also supports pledge()
on OpenBSD and still is pretty secure on other OSes.

I do expect this is structured as you describe - that it has a validator, and
that it uses these kernel features as additional hardening if the validator
has a bug. But I would not describe that as _requiring_ pledge() / unveil()
and certainly not requiring it _for functionality_. So I don't know what the
author means.

And I am worried, in particular, that the author means that the validator is
not very strong, and the bulk of the validation is that it unveils the
filesystem to the files it's supposed to write to and then blindly trusts the
input, on the grounds that the worst the remote side could do is corrupt files
but it could just have sent different contents for the files in the first
place. This seems unlikely to me, but I'm having trouble figuring out an
alternate interpretation.

~~~
anjbe
> My question is that the README implies that pledge() and unveil() are
> _required for functionality_, to the point that porting to an OS without
> support for that is an inherently questionable idea.

Knowing Kristaps, he probably considers strong privsep and privdrop basic
functionality. That is after all why he developed acme-client in the first
place; he acknowledged at the time the plethora of “lightweight” certbot
alternatives but was more concerned with security architecture.

> That certainly isn't true of Chrome (I run it on non-OpenBSD and there are
> no functionality / security issues in doing so). That isn't true of OpenSSH,
> which also supports pledge() on OpenBSD and still is pretty secure on other
> OSes.

Chrome uses different techniques depending on the platform. On OpenBSD it uses
pledge() and unveil(), while on Linux it uses seccomp. Kristaps isn’t a fan of
seccomp’s complexity, as he mentions in the readme: “Linux's security
facilities are a mess, and will take an expert hand to properly secure.” He’s
not suggesting it can’t be done, and the Google Chrome team in particular has
the kind of expertise he’s talking about.

For projects of less‐than‐Chrome scale, though, Kristaps feels that seccomp is
too difficult:
[https://github.com/kristapsdz/acme-client-portable/blob/mast...](https://github.com/kristapsdz/acme-client-portable/blob/master/Linux-seccomp.md)

> And I am worried, in particular, that the author means that the validator is
> not very strong, and the bulk of the validation is that it unveils the
> filesystem to the files it's supposed to write to and then blindly trusts
> the input, on the grounds that the remote side could just have sent
> different files.

I don’t understand this interpretation. It’s not what I got from the readme at
all. What kind of validation do you expect Kristaps to be overlooking?

------
hawski
From a comment on the site: "(...) its (original rsync's) compressed manual
page is almost as big as the compressed openrsync sources (...)"

Its license (ISC, of course) and size make it a great resource for studying
rsync. I would like to have Dropbox on my phone as the legendary combination
of rsync and cron. It might be nice to have a Java port so it would work
without JNI, but maybe that's only my fetish.

~~~
ComputerGuru
I just want to point out that rsync is, in fact, no longer ISC licensed but
rather GPL (v3, at that), which is likely a big part of the reason this new
implementation even exists.

~~~
meruru
rsync was never ISC licensed afaik. The parent is referring to openrsync's
license.

~~~
tinus_hn
Rsync was developed by the Samba people; it is under the same license (GPL).

------
accrual
Very cool news. rsync(1) is one of the first things I install on a new OpenBSD
instance.

Tangentially related, I've been using a Time Machine-like wrapper [0] around
rsync(1) for a few years. It's very helpful for maintaining snapshots of my
home directory.

[0]
[https://blog.interlinked.org/tutorials/rsync_time_machine.ht...](https://blog.interlinked.org/tutorials/rsync_time_machine.html)
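
For anyone curious how such wrappers usually work: one common approach relies on rsync's `--link-dest` option, which hard-links files that are unchanged since an earlier snapshot instead of copying them again. The sketch below only builds the command line. The function name `snapshot_command`, the directory layout, and the choice of `-a --delete` are assumptions for illustration, not taken from the linked tutorial.

```python
import os


def snapshot_command(source, backup_root, stamp, previous=None):
    """Build an rsync command for one dated snapshot (hypothetical layout).

    Snapshots live in backup_root/<stamp>. If `previous` names an earlier
    snapshot, unchanged files are hard-linked against it via --link-dest.
    """
    dest = os.path.join(backup_root, stamp)
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # An absolute path avoids --link-dest being resolved relative
        # to the destination directory.
        cmd.append("--link-dest=" + os.path.join(backup_root, previous))
    # The trailing slash copies the contents of `source`, not the directory.
    cmd += [source.rstrip("/") + "/", dest]
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)` from a cron job; each snapshot then looks like a full copy while only changed files take up new space.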

~~~
davewongillies
I use rsnapshot [0] for the same thing.

[0] [https://rsnapshot.org/](https://rsnapshot.org/)

------
amaccuish
For those wondering what this is, see
[https://github.com/kristapsdz/openrsync](https://github.com/kristapsdz/openrsync)

~~~
benatkin
I'll try explaining it. It's a new implementation, from scratch (clean room),
of rsync, which will become the new rsync in OpenBSD. The tree that it's been
imported into is the OpenBSD CVS tree that contains OpenBSD, OpenSSH, OpenCVS,
and other major projects.

------
CaliforniaKarl
I would not be surprised if, in a few years, this becomes one of the CLI tools
installed on macOS, either as part of the default install or as part of the
Xcode CLI tools.

~~~
riffraff
Why? MacOS has a bunch of GPL stuff, such as bash, IIRC.

~~~
__david__
And it already has actual rsync.

~~~
avar
It has a 12-year-old rsync due to Apple not wanting to ship anything that's
GPLv3:
[https://bayton.org/2018/07/how-to-update-rsync-on-mac-os-hig...](https://bayton.org/2018/07/how-to-update-rsync-on-mac-os-high-sierra/)

------
AdmiralAsshat
Interesting. This is the first project I can think of where a clean-room
implementation was done so that a project could use a _less_ free license
("free" as defined by the FSF).

Does anyone else know of instances where a company did a clean-room
implementation of a previously FOSS tool so that they could make a
paid/proprietary version? Usually it goes the other way.

~~~
protomyth
How is the ISC (version of BSD license used by OpenBSD) less free than the
GPL3? This is very far from a "paid/proprietary" version.

~~~
meruru
Using "less free" or "more free" in this context just leads to pointless
semantic debates. What happened is that someone made a clean-room
implementation of a copyleft program in order to have it available under a
copyfree license. Both licenses are Free.

[http://copyfree.org/policy/copyleft](http://copyfree.org/policy/copyleft)

~~~
TeMPOraL
First time I've seen this, thanks. The website isn't explicit about this point,
but from what I gather, "copyfree" isn't viral in the way GPL is. It seems to
provide "Free as in Freedom", but unlike GPL, doesn't _protect_ that freedom
from being immediately taken away.

------
meruru
I hope this ends up being a lot simpler and easier to understand than the
original rsync. The rsync manpage is way too long.

~~~
gmueckl
Rsync solves a complex problem that comes in many nuanced variants. It may
seem trivial at the outset, but it is actually not. So I don't think that
rsync has many features that are somehow unnecessary or bloat.

~~~
meruru
Well, the manpage for this is looking really good and it already has almost
everything that I care about. The -a option isn't in yet, but it's in the
TODO.

[https://github.com/kristapsdz/openrsync/blob/master/openrsyn...](https://github.com/kristapsdz/openrsync/blob/master/openrsync.1)
[https://github.com/kristapsdz/openrsync/blob/master/TODO.md](https://github.com/kristapsdz/openrsync/blob/master/TODO.md)

I hope the -c and maybe -X option make it.

------
joppy
What does a "clean-room implementation" mean?

~~~
Tor3
The first (well-known) 'clean-room' implementation was when Phoenix
implemented an IBM PC-compatible BIOS by having one team studying the IBM
source (which was available), then writing up a specification for how it
worked, handing that specification over to somebody else (they were Phoenix'
legal team, IIRC), which then handed the specs over to another team that had
never seen the IBM source. They sat down in their "clean room" (b/c it wasn't
tainted by actual IBM source) and implemented a BIOS from specs only. In that
way Phoenix was protected from any claims of copyright infringement: Nothing
was copied, and the people writing the code had never seen the original
source.

In that particular case the specs were reverse-engineered from actual source,
but that's not a necessary part of the process. It's more common to have one
team study the protocol, data going over the wire, disassembling, etc, then
use the knowledge gained to write specs, and then another team implements the
equivalent functionality from specifications only.

------
gerdesj
Is it any better than rsync?

~~~
rstuart4133
All openrsync implements is the equivalent of a fast "cp -a" across the
network, plus it can also remove files if they don't exist. rsync does much
more and over the years I've used most of it, so there is no way I would use
openrsync. The upside is the manpage of openrsync isn't that much more complex
than cp, which is a definite bonus if that's all you are doing.

The only thing I would change about rsync is its default, which IMO should be
to copy all metadata supported by both sides. I.e., the default should be to
make the destination as similar to the source as possible. Its default is to
only copy the data, and you must add options to say what else you want copied.
To make matters worse, you can't just add every option, because if you say you
want to copy something not supported by one side or the other it errors out. I
may have missed it as I am reading the man page source, but openrsync didn't
seem to change that.

~~~
kristapsdz
No. openrsync implements the rsync protocol. It doesn't have all of its
options, but the protocol is what it is. Do you have any idea what you're
talking about?

------
theamk
It is interesting that the "open" part of openrsync refers to the license --
BSD, vs the original rsync's GPL.

It's not often I see "open" to mean "non-GPL" in software :)

~~~
protomyth
I get the feeling the "open" part was because they were hoping to get it
included in OpenBSD like OpenSMTPD, etc.

~~~
meruru
Yeah, that was my assumption too. It's coming from the OpenBSD community, so
openrsync it is.

