Hacker News new | past | comments | ask | show | jobs | submit login
Rsync client-side arbitrary file write vulnerability (openwall.com)
135 points by jwilk on Aug 2, 2022 | hide | past | favorite | 47 comments



https://github.com/WayneD/rsync/commit/b7231c7d02cfb65d291af...

This newly added function, add_implied_include, which was committed as a part of the patch gives me the creeps. I do not claim to be competent enough to work with a utility as important as rsync, but looking at C code written like that makes me grateful for that fact.


I'm glad I'm not the only one who thinks this.

Clever parsing code, multiple commits to get it right (https://github.com/WayneD/rsync/commit/2f7c583143bc6e8090213..., https://github.com/WayneD/rsync/commit/3d7015afa223494e33184...), no tests, seemingly no code review. Not an attack on the maintainer of rsync - this pattern is typical for security patches.

Within a few days, tools like Twistlock are going to alert on the CVE, and companies across the world will blindly upgrade without reviewing the change. Rinse and repeat thousands of times per year.

Our industry continues to not take security seriously while pretending that it does.


Our industry will only change when security exploits come with hard penalties for the companies that don't take security seriously, just like any food joint gets closed down when visited by the sanitary authorities and doesn't upheld the very minimum health conditions.

Until then, rinse and repeat as you say.


You can’t impose penalties for free and open source projects though. Well, you can, but that results in much fewer open source projects being ever developed.


Sure you can, just like people selling street food have to take care no one gets food poison.


OTOH this also stops people from giving away food they don't need anymore. Something like good samaritan laws should apply here as well.


That is why we have charities that take care of handling such issues.


If you care about security, you should not rely on a perfect code. Consider compartmentalization instead: https://qubes-os.org. Works for me (or so I hope).


Also consider policy [1].

It's become a dirty word because it's been somewhat hijacked by management-class values and wrapped up in an unthinking "compliance and audits" mind-set.

Thinking about what you want to allow, where to draw lines, planning in advance and sticking to it (including using tricks that force you to stick to it) is mature security thinking. You only have to be an organisation of one person to have a security policy.

https://news.ycombinator.com/context?id=32248506


I don't like the architecture because it looks like a set of ugly hacks. You don't need virtual machines to isolate applications because CPU already has a sandbox provided by protected mode. The problem is that legacy OS like Linux/Mac/Windows are unable to use this sandbox effectively.

For example, here [1] they show a separate VM for the USB stack. But why do you need it if you can simply run USB stack in protected mode?

Also, I assume that running so many VMs requires a lot of drive space and hurts performance.

[1] https://www.qubes-os.org/intro/


> I don't like the architecture because it looks like a set of ugly hacks.

It's not a set of hacks. You simply isolate your workflows using a hardware virtualization, transparently to the user.

> You don't need virtual machines to isolate applications because CPU already has a sandbox provided by protected mode.

What exactly do you mean? Qubes uses VT-d hardware virtualization. AFAIK it's the most secure compartmentalization method, and you can't use it without VMs.

> But why do you need it if you can simply run USB stack in protected mode?

On how many lines of code do you rely in "protected mode"? On Qubes, you assign the devices to VMs, hide them from the dom0, so they can't attack it. It relies on Xen, one of the most tested piece of software with a small Trusted Computing Base. See this: https://www.qubes-os.org/faq/#why-does-qubes-use-xen-instead....

> Also, I assume that running so many VMs requires a lot of drive space and hurts performance.

You need a lot of RAM, if you want to run a lot of VMs simultaneously. But you don't have to. CPU performance should not be affected much. Drive space is saved by using TemplateVMs (which provide root partition to other VMs): https://www.qubes-os.org/doc/templates/.


That's of course on the side of due diligence on the user of any software, but most software should have tests, and certainly one of the importance of rsync, doubly so because it's written in C.


Tests aren’t particularly useful for detecting this kind of bugs though.


You add tests for this sort of thing after the fact to make sure the fix works as you think it does and to make sure the issue isn't re-introduced later. Tests can't find them, but they can help keep them away.


I am competent enough to read c and note that none of those changes you highlighted have functional impacts. They were largely cosmetic changes (e.g. removing an else condition by assigning the inclusive flags to a variable). I saw one change that looked like a performance improvement (an early loop exit break added).

I will also note that rsync has a test suite and running tests as "integration" style testing is more common of shell utilities than writing gobs of unit tests. I would have liked to see an additional targeted test written for the CVE in this patch though.


My guess (hope) is that orgs like Red Hat will vet it a bit before putting it out, since they'll have to dive into the code and apply or reimplements it for multiple past versions of rsync.

That actually covers a lot of companies using it. Paying (or at least relying on) another company to take care of that for you is what most places do. Honestly, that's better than tens of thousands of people wasting time on the same thing IMO.


My review of that PR would be : "where's the test for this parsing code?".

I'd expect to see many test inputs that are both valid and invalid.

Note: I'm very much not a testing freak, but for this kind of code I think it's essential to have unit tests, if only because they provide necessary documentation about the intended functionality of the code.


If you wrote a code that contains some bug, usually your your tests won’t catch it either. You’d need a fuzzer instead. But then again, it just doesn’t work this way - you can’t write perfect code, you need to follow the principle of least privilege, ie use compartmentalisation.


Well, using a fuzzer would probably be a good idea? See eg:

> Fuzzing 100+ open source projects with OSS-Fuzz - lessons learned. (2021)

https://adalogics.com/blog/fuzzing-100-open-source-projects-...

Short hn discussion of post (but project has been featured many times):

https://news.ycombinator.com/item?id=28391620


What’s the point of compartmentalization, if all your airlocks are broken or unsecured?


Its not compartmentalization if your airlocks are broken or unsecured.


So many if, else, case, while, whatever nested control flow, all in one procedure, that one can almost be sure, that one of those branches of execution will have a bug somewhere. Very short variable names, sometimes 1 character or 2 and pointer magic.

Looks like a prime example of badly readable code.


What if the input contains multiple `/./` ? It seems that `cp = strstr(arg, "/./");` only checks once. Maybe wrap in a loop?


No, this is for handling the --relative (-R) flag where the part after the first /./ in each source path is a relative destination for that path so e.g.

  rsync -aR foo/./bar/baz/ dest/
copies data from foo/bar/baz/ to dest/bar/baz/ while any other /./ is effectively meaningless.


This repo's commit messages are not good:

"A few more minor tweaks."

"A few more minor changes."

"Some extra file-list safety checks."

https://github.com/WayneD/rsync/commits/master

Those are the full commit messages. By way of comparison, here's what commit messages to git itself look like:

https://github.com/git/git/commit/198551ca54f6ff1c95c9357ccc...

https://github.com/git/git/commit/dee8a1455c8ad443ef59e0d5b7...

Each commit contains paragraphs of explanatory material. Jeff King's commit messages are "gold standards" to me, often well exceeding the size of the diff itself:

https://github.com/git/git/commit/aa0834a04e1d9d3ab81ecd4a4a...

https://github.com/git/git/commit/38f865c27d1f2560afb48efd2b...

https://github.com/git/git/search?q=author%3Apeff&type=commi...

Please folks, I beg of you, spend time writing your commit messages. You're not writing them for you today. You're writing them for others, including your future self. They should explain the commit: What is its goal? Why does it exist? Why was it written in this way? What other context does the developer need to understand this commit?

A PR description is not a substitute for a good commit message for multiple reasons:

1. If the PR is a single commit, then it's just the commit message and your job is probably done.

2. If there are multiple commits, then the PR description should summarize the goal of the aggregate of its commits.

3. The PR description is typically written hours or days after the commit(s). What was fresh in your head when you wrote that code is now stale and you will struggle to recall why you made a change in a particular way more than if you wrote it down fresh when you commit the change.

4. The PR description is not part of the repo's history. It requires access to a (typically propriety) platform to read.

So please, https://cbea.ms/git-commit/


I wonder whether the UI of many git clients contributes to this. Even if very long messages are allowed, from what I've seen typically the dimensions of the fields are fairly small. This could subtly give the impression that you are being too verbose and not using the tool correctly.

Even for git itself it's funny that the article you linked encourages people to learn to love git CLI, but recommends a third party tool for writing multi line commit messages for ergonomics.

Clearly it's not a massive technical challenge, as the good examples show, but the tooling seems to inadvertently encourage shorter messages.


> but recommends a third party tool for writing multi line commit messages for ergonomics.

Suggesting a text editor to edit text rather than trying to type it on the command-line is hardly "funny". If you already live on the command line you don't even need to set anything up as git will launch your $EDITOR for you when you don't specify a commit message using -m, i.e. this is the DEFAULT behavior.


And if you work by making many rapid small commits, rebase and squash when you're "done" and then add the large, longer commit message to the "new big commit".


> 3. The PR description is typically written hours or days after the commit(s). What was fresh in your head when you wrote that code is now stale and you will struggle to recall why you made a change in a particular way more than if you wrote it down fresh when you commit the change.

The PR contains all the back and forth and comments from other people and the backstory on the decision that was made. I'll also periodically add comments to the end of closed and merged PRs after I receive feedback and question days later just to record it all for posterity. You can't do that with git commits because they aren't mutable/fixable documentation.

> 4. The PR description is not part of the repo's history. It requires access to a (typically propriety) platform to read.

It should be "table stakes" to carry your PR history if you migrate between any of the vendors or something self-hosted and they all have import tools which do it for you. This isn't a practical objection in 2022.

I don't see the author there doing any PRs, either though other than for external contributors.

But they're probably an unpaid open source maintainer, so I can't blame them for cutting corners on process to make their own life easier. How much are you paying this person so that you think you can criticize the way they're doing anything?


It looks like this was reported 8 days ago and was patched 2 days ago[0]. It was then packaged in v3.2.5pre1 12 hours ago[1]

It's always conforming to see such a fast response time :)

[0] - https://github.com/WayneD/rsync/commit/b7231c7d02cfb65d291af...

[1] - https://github.com/WayneD/rsync/releases/tag/v3.2.5pre1


Funny, I reported something similar on the rsync mailing list a couple of months ago: https://www.mail-archive.com/rsync@lists.samba.org/msg33452....

Good to see that it will be fixed. Still, rsync is not the right tool for the job if you do not trust the server.


I take it that "server" in this context includes the remote party in a "serverless" transfer. I mean, I take it this isn't particular to the rsync daemon.

It sounds like a very serious defect, very easy to exploit. It needed to be addressed quickly. I'm not surprised they skipped the code review.


>rsync is not the right tool for the job if you do not trust the server

Why would you ever trust the server not to do bad things?


I do trust my own server not to misbehave, but I probably would not trust some random server on the internet.


I've had personal servers compromised. I don't unconditionally trust them either.


Exactly. I'm usually worried about the opposite scenario (the client infecting the server). If you're copying files from a personal computer it might have all sorts of random software running on it that's accumulated over the years. Whereas on a remote server you tend to have a better security posture, and aren't installing random software and apps. Admittedly, that might just be my personal use case though.



I'm struck by how big the code fix is; 200 line diff, at least 100 lines of code with new logic.


Seems like a good time to plug https://github.com/kristapsdz/openrsync , especially on OpenBSD. You wouldn't get complete arbitrary file write vulnerability, only where unveil(2)ed.


Wasn't this kind of bug part of the reasoning to make scp use stfp in the background, instead of using its own scp protocol?


Scp does not use sftp in the background. Scp is generally faster than sftp but lacks many capabilities like remote dir listing


It does since recently, check the -O option in the manpage.


No it doesn’t SCP has no support for directory listing. The -o flag only controls the security layer which is openSSL just like SFTP. So you established the S’s are the same, But the P’s (protocol) are different. A protocol is like get post put… Totally separate from security


https://www.openssh.com/txt/release-9.0 :

> This release switches scp(1) from using the legacy scp/rcp protocol to using the SFTP protocol by default.

From the scp(1) manpage:

> -O: Use the original SCP protocol for file transfers instead of the SFTP protocol. Forcing the use of the SCP protocol may be necessary for servers that do not implement SFTP, for backwards-compatibility for particular filename wildcard patterns and for expanding paths with a ‘~’ prefix for older SFTP servers. This mode is the default.


Maybe my sleuthing skills are lacking this morning, but is there a reported minimum affected version?


It looks like versions prior to 3.2.5 are affected [1]. If you're using the version of rsync bundled with MacOS, then it's affected.

If you're not connecting to random malicious servers then this might not be a part of your threat model though. As long as your private keys are secure, and you've verified the public keys in your known_hosts file, then this exploit would be hard to take advantage of by a man in the middle attack. Basically, all of the normal security rules of the underlying SSH protocol apply here.

[1] https://nvd.nist.gov/vuln/detail/CVE-2022-29154


> a malicious rysnc server (or Man-in-The-Middle attacker) can overwrite arbitrary files in the rsync client target directory and subdirectories

Do I read it right? You rsync to a malicious server and bother you get bogus files from it?




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: