"Proof" suggests a level of absolute confidence that this example certainly does not give.
> The exploit needs 10 minutes to crack ASLR on x86, but 70 minutes on amd64.
Is there any realistic threat model under which the difference between 10 minutes and 70 minutes is the difference between "insecure" and "secure"?
> Using mitigation techniques and sandbox-based isolation are the only two ways to limit the damage.
I'm not at all convinced that mitigation techniques represent a real improvement in security, because by definition a mitigation technique is not backed by a solid model. If you're letting an attacker control the modification of memory that your security model assumes isn't modifiable, how confident can you be that ad-hoc mitigations for all the ways you could think of to exploit that cover all the possible ways to exploit that? E.g. I can remember a time when ASLR was touted as a solution to C's endemic security vulnerabilities; now cracking ASLR as part of vulnerability exploitation is routine, as seen here. Mitigations appear to give a security improvement because an app with mitigations is no longer the low-hanging fruit, but I suspect this is a case of "you don't have to outrun the bear": as long as there are C programs without mitigations, attackers will go after those first. That's different from saying that mitigations provide substantial protection.
The hands-on-keyboard SLA for a lot of on-calls is 30 minutes.
So in an “attack was detected, break all the glass” scenario, the difference between 10 and 70 minutes is sufficient to allow human operators to render the attack moot by offlining its target, while the attackers are still trying to break through API servers.
At both big corps I’ve been at, the incident response plan for an exfiltration attack on customer data was to invalidate DB creds and take the system down ourselves.
Better to be out of service than lose custody of customer data.
>Is there any realistic threat model under which the difference between 10 minutes and 70 minutes is the difference between "insecure" and "secure"?
How about an intrusion detection system that flags up a human response? 10 minutes is hardly any time at all to respond, an hour gives you a chance to roll out of bed.
PaX offers anti-bruteforce protection: if the kernel discovers a crash, the `fork()` syscall of the parent process is blocked for 30 seconds for each failed attempt, so the attacker is going to have a hard time beating 32 bits of entropy. Meanwhile, it also writes a critical-level message to the kernel log buffer to notify sysadmins, and possibly uncover the 0-day exploit the attacker has used.
> if the kernel discovers a crash, the `fork()` syscall of the parent process is blocked for 30 seconds for each failed attempt
That seems like something good to be able to turn on in a stock kernel, but not with that high a timeout. Imagine your shell failing to start another process for 30 seconds while you're debugging a segfault, or your browser failing to open another tab for 30 seconds after one crash.
500ms would drastically slow brute-force attacks without noticeably inconveniencing the user (and then they can always turn it off manually when doing something like fuzz testing).
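Some back-of-envelope math makes the point (a rough sketch of my own; it assumes one crash per guess and uses the 20-bit/32-bit entropy figures quoted elsewhere in this thread):

```c
/* bruteforce_cost.c -- rough sketch: expected time to brute-force ASLR given
 * N bits of entropy and a fixed delay imposed after each crash. Assumes one
 * crash per guess; entropy figures are the ones quoted in this thread.
 */
#include <stdio.h>

static void report(int bits, double delay_s) {
    double expected_tries = (double)(1ULL << bits) / 2.0;  /* on average, half the search space */
    double hours = expected_tries * delay_s / 3600.0;
    printf("%2d bits, %5.1fs delay: ~%.0f hours (~%.1f years)\n",
           bits, delay_s, hours, hours / (24.0 * 365.0));
}

int main(void) {
    report(20, 0.5);   /* 32-bit ASLR, 500 ms delay         */
    report(20, 30.0);  /* 32-bit ASLR, PaX-style 30 s delay */
    report(32, 0.5);   /* 64-bit PIE,  500 ms delay         */
    report(32, 30.0);  /* 64-bit PIE,  30 s delay           */
    return 0;
}
```

Even the 500ms figure pushes the 20-bit case from minutes into days, which is what makes the detection-and-response window discussed above plausible.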
I guess, as long as the IDS senses the attack in progress quickly -- my gut is this type of attack would be hard to detect until the outcome was achieved. More likely the initial entry would be the detected event(s) -- in which case yeah the extra time gives some safety net.
In either case, it still feels like pulling all things into systemd creates a much harder-to-protect surface area on systems. Why should init care if your logger crashes, let alone take down init with it? I am not an anti-systemd person, but I honestly do see the tradeoffs of the "let me do it all" architecture as a huge penalty.
It cares in the same way it cares about all the other processes. There's nothing systemd-specific here. The journald service is configured to restart on crash, same as many other services.
It's not taking down init when journald crashes either.
> In either case, it still feels like pulling all things into systemd creates a much harder-to-protect surface area on systems. Why should init care if your logger crashes, let alone take down init with it? I am not an anti-systemd person, but I honestly do see the tradeoffs of the "let me do it all" architecture as a huge penalty.
100% this. Also, as I understand it, the exploit would not exist if it were literally just outputting log lines to a file in /var/log/systemd/?
EDIT: Also as I understand it, appending directly to a file is just as stable as the journald approach, given that many, many disk controllers and kernels are known to lie about whether they have actually flushed their cache to disk (actually more so, because the binary format of journald is arguably more difficult to recover into proper form than a timestamped plaintext -- please correct me if I'm wrong, though!!)
> the binary format of journald is arguably more difficult to recover into proper form than a timestamped plaintext -- please correct me if I'm wrong, though!!
It depends what you mean by recover. To get the basic plaintext, you can pretty much run "strings" on the journal file and grep for "MESSAGE=". It's append-only so the entries are in order. Just because it's a binary file doesn't mean the text itself is mangled. (Unless you enable compression)
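If anyone wants to see it for themselves without journalctl, that recovery is nothing more than scanning for printable runs; a minimal sketch of my own (the file name is just an example, and it only works on uncompressed journals):

```c
/* journal_strings.c -- sketch of "strings file | grep MESSAGE=" in C: print
 * printable runs that contain "MESSAGE=". The path below is just an example;
 * persistent journals live under /var/log/journal/<machine-id>/.
 */
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "system.journal";  /* example file name */
    FILE *f = fopen(path, "rb");
    if (!f) { perror(path); return 1; }

    char run[4096];
    size_t len = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (isprint(c) && len < sizeof(run) - 1) {
            run[len++] = (char)c;              /* extend the current printable run */
        } else {
            run[len] = '\0';
            if (len >= 8 && strstr(run, "MESSAGE="))
                puts(run);                     /* entries appear in append order */
            len = 0;
        }
    }
    fclose(f);
    return 0;
}
```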
Also, weekend and Christmas attacks. In the field we are seeing more attacks with valid usernames and passwords occurring at times when a sysadmin may not be on call.
> Is there any realistic threat model under which the difference between 10 minutes and 70 minutes is the difference between "insecure" and "secure"?
The time is given here just as an example. Cracking systemd only takes 70 minutes, but in general, bruteforcing ASLR on 64-bit systems can take as little as 1.3 hours or as long as 34.1 hours, depending on the nature of the bug. On the other hand, the ~20 bits of entropy on 32-bit systems is trivial to crack in 10 minutes in nearly all cases, and does not provide an adequate security margin.
On a 64-bit system there are ~32-40 bits of ASLR entropy available for a PIE program, which forces an attacker to brute-force it. Unlike other protections, no matter how cleverly the system is analyzed beforehand, it taxes the exploit by forcing it to solve a computational puzzle. This fact alone is enough to stop many "Morris Worm"-type remote exploitations (which have suddenly become a serious consideration, given the future of IoT), since an exploit would take months or years to crack a single machine.
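A quick way to see that entropy for yourself (a toy of my own, not from the exploit write-up): build a trivial program as a PIE and watch the addresses an exploit would have to guess change on every run.

```c
/* aslr_peek.c -- toy: observe PIE/ASLR randomization across runs.
 * Build (flags illustrative):  gcc -O2 -fPIE -pie aslr_peek.c -o aslr_peek
 * Run it several times; the code, stack, and heap addresses change each time,
 * and the varying bits are what a brute-force attack has to guess.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int on_stack = 0;
    void *on_heap = malloc(16);

    printf("code  : %p\n", (void *)&main);      /* executable base, randomized for a PIE */
    printf("stack : %p\n", (void *)&on_stack);  /* stack randomization */
    printf("heap  : %p\n", on_heap);            /* heap randomization  */

    free(on_heap);
    return 0;
}
```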
If that's not enough (it is not; I acknowledge ASLR by itself cannot be enough), an intrusion detection system should be used, and one already is used by many. For example, PaX offers an optional, simple yet effective anti-bruteforce protection: if the kernel discovers a crash, the `fork()` attempt of the parent process is blocked for 30 seconds. It takes years before an attacker is able to overcome the randomization (so the attacker is likely to try something else). In addition, it also writes a critical-level message to the kernel log buffer, so the sysadmin can be notified and possibly uncover the 0-day exploit the attacker has used. I'd call it a realistic threat model.
Finally, information leaks are a great concern here. Kernels and programs are leaking memory addresses like a sieve, effectively making ASLR useless. The Linux kernel is already actively plugging these holes (but with limited effectiveness; HardenedBSD should be the future case study), and so should other programs.
> e.g. I can remember a time when ASLR was touted as a solution to C's endemic security vulnerabilities; now cracking ASLR as part of vulnerability exploitation is routine, as seen here.
You can make the same comment on NX bit, or W^X/PaX, or BSD jail, or SMAP/SMEP (in recent Intel CPUs), or AppArmor, or SELinux, or seccomp(), or OpenBSD's pledge(), or Control Flow Integrity, or process-based sandboxing in web browsers, or virtual machine-based isolation.
Better defense leads to better attacks, which in turn leads to better defense. By playing the game, it may not be possible to win, but by not playing it, losing is guaranteed. In this case, systemd is exploitable despite ASLR, due to a relatively new exploit technique called "Stack Clash", and for that matter, GCC had already added the new -fstack-clash-protection (going beyond the older -fstack-check) long before the systemd exploit was discovered. If this mitigation had been used (as it is by Fedora and openSUSE), the bug would simply cause a crash and would not be exploitable. At least until the attacker finds another way around.
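To make that concrete, here is a toy of the dangerous pattern (my own example, not the actual systemd bug; names and sizes are made up): a stack allocation sized by untrusted input can jump past the guard region, and building with `-fstack-clash-protection` makes GCC probe the stack page by page so the jump faults instead.

```c
/* stack_clash_toy.c -- toy illustration of the stack-clash pattern, not the
 * systemd bug itself.
 *   gcc -O2 stack_clash_toy.c -o toy                          # unprotected
 *   gcc -O2 -fstack-clash-protection stack_clash_toy.c -o toy
 * With the protection, the oversized stack allocation is probed one page at a
 * time, so it hits the guard page and crashes instead of possibly landing in
 * an adjacent mapping.
 */
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

static char handle_message(size_t attacker_controlled_len) {
    /* The dangerous pattern: a stack allocation sized by untrusted input. */
    char *buf = alloca(attacker_controlled_len);
    memset(buf, 'A', attacker_controlled_len);
    return buf[0];  /* keep the buffer live so the writes aren't optimized away */
}

int main(int argc, char **argv) {
    size_t len = argc > 1 ? strtoull(argv[1], NULL, 0) : 64;
    return handle_message(len) == 'A' ? 0 : 1;
}
```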
Early kernels and web browsers had no memory or exploit protections whatsoever: a single wrong pointer dereference or buffer overflow was enough to completely take over the system. Nowadays, an attack needs to overcome at least NX, ASLR, sandboxing, and compiler-level mitigations, and we still see exploits. So is the conclusion that all mitigations are completely useless? If that's your opinion, I'm fine with agreeing to disagree; many sensitive C programs need to be rewritten in a memory-safe language anyway. But as I see it, as long as there are still C programs running with undiscovered vulnerabilities, and as long as attackers have to add more and more up-to-date workarounds and cracking techniques (ROP, anyone? though the most sophisticated attackers are now moving to data-only attacks) to their exploit checklist, we are not losing the race by increasing the cost of attacks.
On the other hand, if an attacker doesn't have to use up-to-date cracking techniques, then we have serious problems. For example, broken and incomplete mitigations are often seen in the real world, and that is the real trouble. Recently it was discovered that the ASLR implementation in the MinGW toolchain is broken, allowing attackers to exploit VLC using shellcode tricks from the 2000s (https://insights.sei.cmu.edu/cert/2018/08/when-aslr-is-not-r...). And we still see broken NX-bit protection and the total absence of ASLR or -fstack-protector in essentially ALL home routers (https://cyber-itl.org/2018/12/07/a-look-at-home-routers-and-...).
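For anyone who hasn't seen what -fstack-protector actually buys you, a toy overflow of my own (unrelated to the router firmware in that link):

```c
/* canary_toy.c -- toy overflow showing the effect of a stack canary.
 *   gcc -O0 -fno-stack-protector canary_toy.c -o toy        # unprotected
 *   gcc -O0 -fstack-protector-strong canary_toy.c -o toy    # protected
 * (Many distro compilers enable the protector by default, hence the explicit
 * -fno-stack-protector for comparison.) The protected build aborts with
 * "stack smashing detected" when the copy overruns the buffer, instead of
 * returning through a corrupted return address.
 */
#include <stdio.h>
#include <string.h>

static void copy_name(const char *input) {
    char name[16];
    strcpy(name, input);  /* classic unbounded copy into a fixed-size buffer */
    printf("hello, %s\n", name);
}

int main(int argc, char **argv) {
    copy_name(argc > 1 ? argv[1] : "short and safe");
    return 0;
}
```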
The principle of Defense-in-Depth is that, if the enemies are powerful enough, it's inevitable that all protections will be overcome. As in the Swiss Cheese Model (https://en.wikipedia.org/wiki/Swiss_cheese_model), a cliché in accident analysis, eventually something will manage to find a hole in every layer of defense and pass through. What we can do is our best at each layer of defense to prevent the preventable incidents, and add more layers when the technology permits.
My final words are: at least do something. ASLR was already implemented as a prototype, analyzed, and exploited by clever hackers back in 2002 (http://phrack.org/issues/59/9.html), but only saw major adoption ten years later. It would be a surprise if ASLR-breaking techniques had not improved given the inaction of most vendors.
> "Proof" suggests a level of absolute confidence that this example certainly does not give.
I agree. I should've used "given more empirical evidence" instead of "given a proof".
For real security, I believe memory-safe programming (e.g. Rust) and formal verification (e.g. seL4) are the way forward, although they still have a long way to go.
> You can make the same comment on NX bit, or W^X/PaX, or BSD jail, or SMAP/SMEP (in recent Intel CPUs), or AppArmor, or SELinux, or seccomp(), or OpenBSD's pledge(), or Control Flow Integrity, or process-based sandboxing in web browsers
I can, and I would.
> or virtual machine-based isolation
A little different because a VM can be designed to offer a rigid security boundary (with a solid model behind it) rather than as an ad-hoc mitigation technique.
> So is the conclusion that all mitigations are completely useless? If that's your opinion, I'm fine with agreeing to disagree; many sensitive C programs need to be rewritten in a memory-safe language anyway. But as I see it, as long as there are still C programs running with undiscovered vulnerabilities, and as long as attackers have to add more and more up-to-date workarounds and cracking techniques (ROP, anyone? though the most sophisticated attackers are now moving to data-only attacks) to their exploit checklist, we are not losing the race by increasing the cost of attacks.
> The principle of Defense-in-Depth is that, if the enemies are powerful enough, it's inevitable that all protections will be overcome. As in the Swiss Cheese Model (https://en.wikipedia.org/wiki/Swiss_cheese_model), a cliché in accident analysis, eventually something will manage to find a hole in every layer of defense and pass through. What we can do is our best at each layer of defense to prevent the preventable incidents, and add more layers when the technology permits.
> For real security, I believe memory-safe programming (e.g. Rust) and formal verification (e.g. seL4) are the way forward, although they still have a long way to go.
I think the defense in depth / swiss cheese approach has shown itself to be a failure, and exploit mitigation techniques have been a distraction from real security. It's worth noting that systemd is both recently developed and aggressively compatibility-breaking; there really is no excuse for it to be written in C, mitigations or no. Even if you don't think Rust was mature enough at that point, there were memory-safe languages that would have made sense (OCaml, Ada, ...). Certainly there's always more to be done, but I really don't think there's anything that would block the adoption of these languages and techniques if the will was there.
Before the critique, I want to thank you for all the detailed information (esp. the compiler tips) you're putting out on the thread for everyone. :)
"You can make the same comment on NX bit, or W^X/PaX, or BSD jail, or SMAP/SMEP (in recent Intel CPUs), or AppArmor, or SELinux, or seccomp(), or OpenBSD's pledge(), or Control Flow Integrity, or process-based sandboxing in web browsers, or virtual machine-based isolation."
You can indeed say that about all those systems, since they mix insecure, bug-ridden code with probabilistic and tactical mechanisms that they pray will stop hackers. In high-assurance security, the focus was instead to identify each root cause, prevent/detect/fail-safe on it with some method, and add automation where possible. Since a lot of that is isolation, I'd say the isolation-based method would be separation kernels running apps in their own compartments or in deprivileged, user-mode VMs. Genode OS is following that path with stuff like seL4, Muen, and NOVA running underneath. The first two are separation kernels; NOVA is just correctness-focused with a high-assurance design style.
Prior systems designed like those did excellently in NSA pentesting, whereas the UNIX-based systems with extensions like MAC were shredded. All we're seeing is a failure to apply the lessons of the past in both hardware and software, with predictable results.
"Better defense leads to better attacks, and it in turns leads to better defense. By playing the game, it may not be possible to win, but by not playing it, losing the game is guaranteed. "
Folks using stuff like Ada, SPARK, Frama-C w/ sound analyzers, Rust, Cryptol, and FaCT are skipping playing the game to just knock out all the attack classes. Plus, memory-safety methods for legacy code like SAFEcode in SVA-OS or Softbound+CETS. Throw in Data-Flow Integrity or Information-Flow Control (e.g. JIF/SIF languages). Then you just have to increase hardware spending a bit to make up for the performance penalty that comes with your desired level of security. That trades a problem that takes geniuses decades to solve for one an average IT person with an ordering guide can handle quickly on eBay. Assuming the performance penalty even matters, given how much code isn't CPU-bound.
I'd rather not play the "extend and obfuscate insecure stuff for the win" game if possible since defenders have been losing it consistently for decades. Obfuscation should just be an extra measure on top of methods that eliminate root causes to further frustrate attackers. Starting with most cost-effective for incremental progress like memory-safe languages, contracts, test generation, and static/dynamic analysis. The heavyweight stuff on ultra-critical components such as compilers, crypto/TLS, microkernels, clustering protocols, and so on. We already have a lot of that, though.
"For real security, I believe memory-safe programming (e.g. Rust), and formal verification (e.g seL4) are the way forward, although they still have a long way to go. "
Well, there you go saying it yourself. :)
"Early kernels and web browsers have no memory and exploit protections whatsoever"
Yeah, we pushed for high-assurance architecture to be applied there. Chrome did a weakened version of OP. Here's another design if you're interested in how to solve... attempt to solve... that problem: