Linux: Vulnerabilities in nf_tables cause privilege escalation, information leak (lwn.net)
294 points by pdenton on March 29, 2022 | 217 comments



https://nvd.nist.gov/vuln/detail/CVE-2022-1015

https://nvd.nist.gov/vuln/detail/CVE-2022-1016

https://access.redhat.com/security/cve/CVE-2022-1015

https://access.redhat.com/security/cve/CVE-2022-1016

https://ubuntu.com/security/CVE-2022-1015

https://ubuntu.com/security/CVE-2022-1016

https://security-tracker.debian.org/tracker/CVE-2022-1015

https://security-tracker.debian.org/tracker/CVE-2022-1016

I just spent the whole weekend patching whatever the last kernel vuln was and had to plan around like 20 people's schedules. I thought Meltdown/Spectre was bad; this year is already feeling like that year on repeat.

15 years as a sysadmin, anyone have suggestions for my next career move? Thanks.


If you had to spend the weekend updating kernels, you might want to look at your overall system architecture. Replacing a node with another one running a newer kernel shouldn't be a stressful or time-consuming task; it's part of the normal progress of the system.


This greatly depends on the organisation's size and whether you're on-prem or in the cloud. If you have a small-medium startup with a single product and a fairly simple architecture in AWS this might make sense. But if you're a bank or something with thousands of applications, 100k+ servers on-prem, and a global footprint with availability requirements, this is a vastly different story.

These are obviously two extremes, but you can see there is tons of stuff in the middle too. The larger folks are the ones that are stressed 24/7 even when they have the tools to do it.


The opposite actually. If you're that size it's insane to not have fault tolerant clusters, on-prem or cloud doesn't change much. Time to invest in it.


Yeah, wishful thinking. I work with them every day and get to see under the hood. I 100% agree with you on investing in fault tolerant clusters / automation. There is a massive trend happening to modernise legacy infrastructure. Almost all the big companies will have some internal platform team that is trying to standardise. It's just extremely slow and takes tons of training and education.

I don't think it's the newly built stuff that companies are worried about. It's the old legacy stuff that's on life support that no one wants to modernise, or maybe they cannot.


The existence of small and large companies with bad systems architecture isn't an argument for implementing more bad systems architecture. Small or large, it doesn't need to cost more to design stuff well.


For what it's worth -- I agree with you, and that's why folks are flocking to docker/kubernetes and devops tooling like Terraform. I just think you're missing the scope of it. Say, for example, a big box store: they might have 500 locations, and all of a sudden they buy a few more companies and merge all this stuff under a single brand. All of a sudden they have tons of systems all over the place, lots of existing platforms that need to talk to each other, and staff that are resistant to change. This isn't "newly designed stuff"; it's systems that sort of organically grew over a long time. You're talking lots of different operating systems, networks, front-end/back-end systems, the www corporate site, mobile site, rewards sites, all sorts of internal support applications, POS system backends, etc. They probably have 20+ different large database systems and they might not even know all the apps connecting to them. I actually worked with a company like this on a few cloud projects. It was amazing to see the complexity. This is the type of stuff that's running large parts of companies that you interact with on a daily basis.

I guess what I'm getting at is that, sure, if they design new things they will follow modern patterns, but there are so many things that are not modern. They don't have the time or incentive to just go and rebuild all this stuff. There is zero benefit to their bottom line, unless there is some burning fire, a way they can extract more money, or a way to save tons of money. So they just keep them on life support and run in a keep-the-lights-on mode until something happens. These are the systems all sysadmins just wish went away, and there are many of these types of things all over the place.


Yeah, I agree. Your initial reply to me didn't mention legacy issues, just org size, and I was interpreting it more in the context of a large org building new systems the same unmaintainable way they always have because of inertia / politics / ossified sysadmins / etc.


The problems are almost always around process. But I've also seen, on occasion, some sadly pragmatic reasons for very slow processes, such as critical legacy software for which a replacement can't be found even with 8+ figure checks waved at vendors.

One of the big trends from the 2010s for cloud software was to cloudwash old stacks that had really just been validated not to crash and burn on an EC2 instance, which is why the entire cloud native movement exists to differentiate greenfield cloud architecture services from cloudwashed ones.

Things can be pretty frustrating working with different vendors of different applications and competencies. People will be patching log4j issues for years to come, for example, and that’s probably easier to validate in aggregate than entire kernel upgrades for decrepit, unsupported distros like CentOS 5 that I still hear about being used.


If you are starting from scratch, sure.

Real life is more complicated, and even if the organisation willpower and politics are aligned in a way to _want_ to fix it, this takes a long time.

Chastising someone on HN because they own a system that they probably didn't design and might not have the power to fix seems, at best, a little unfair.


I agree, there are many reasons why bad systems architecture might remain in place due to inertia, lack of resources, organisational politics, etc.

I didn't chastise anyone.


Sorry, that wasn’t addressed to you specifically, just aiming upwards in the thread.


A modernization effort is like adding a new, alien science officer named Spock to your crew. Who has the patience, skill, and charisma to deal with the annoying Vulcan and his cold logic?


I have seen a lot of places, yet not a single one that runs all the latest software everywhere all the time. This is even true for smaller pieces of software like the Linux kernel.


How would fault tolerance help with privilege escalation?

Would it not just mean that you have more computers to update in your redundant/tolerant cluster?


In this context it means you can take individual servers offline without taking your entire service down. So you can then update each server (even on production systems) live without requiring a maintenance window.

For bonus points you're also not babysitting manually provisioned servers but instead have your software installs automated. So any failure on a server or OS update isn't seen as a maintenance piece but rather just terminating the old server and letting your pipeline auto-build a new server. This is often referred to as "treating your servers as cattle rather than pets", though not everyone likes that analogy.


In the context of a cluster, fault tolerance allows you to replace nodes without downtime. With automation a kernel update can then be a routine, low effort, low stress task.

Honestly, a kernel update has to be a routine, low effort, low stress task. It's a common event that should be seen as part of the normal operation of the system, not as some exceptional event that means someone has to work on the weekend.


Theoretically it means that you can be running regular node OS updates as a matter of course simply by replacing some percentage of them on a rapid cadence.

Then there isn’t any stress to doing it, it’s routine and automated.


You just made a bunch of assumptions about what their system is and does.


Not really? I said they "might want to look at their overall system architecture". I didn't say "your system needs to be redesigned". Maybe it's the best it can be given their constraints. Hence the word "might".


Support in a multinational, clock in, clock out.

No stress, (relative to what you are doing now).

You can hand off to another team at the end of your shift.

You got HR, perks, pension etc. You just gotta eat a bit of shit :)

Yes there is a ton of downsides too but look, you are a Linux admin you may as well sit back and use them skills for a while so you can de-stress and get your life back.

Not sure where you are based but there is a massive demand for Linux admins in Ireland (and probably Europe)


Build systems that can cope with the loss of nodes and ideally self-heal if you kill a node.


Do the major distros not share and coordinate such high-impact security issues with each other? 1015 has been tracked in Red Hat's Bugzilla since 2022-03-17. I can't find any information relating to Gentoo, Arch or NixOS, and no fix for Debian or Ubuntu.

Is there a patch one can apply in the meantime? The RedHat suggested mitigation requires disabling functionality that is heavily depended on.


The first two links report the CVE ID as not found. Red Hat & Ubuntu have cookie walls, and the Debian pages are the only ones readable without pressing or clicking or agreeing or disagreeing to anything. However, none of the sites mention a fix.

What is the fix? Surely they don't release this kind of info without a fix.


The oss-security message has the commit hashes; these links (aside from the NVD, which seems to just lack information for the moment) are only needed to figure out when their backports hit your distro kernel. The answer to that seems to be “not yet” for all the linked distros, which is weird.

I do not see a mitigation mentioned, but the impact of both of these seems to be limited to users with the ability to install nftables bytecode, so it seems having user namespaces disabled (if you don’t need them) would make this irrelevant?


Yes, this confuses me as well. It would be good to know what kind of fixes the grandparent has applied.


Further down, "MayeuIC" wrote that 1015 is fixed in the latest LTS and stable kernels. Idk how to link to particular comments here; I'm on the phone and copy/pasting is a bit difficult.

So bottom line - upgrade kernels to latest of your branch, if you're able.


> anyone have suggestions for my next career move?

I'm keeping my eyes open on circus website careers pages. I reckon I'd have way fewer clowns to deal with if I was an actual clown car driver... :sigh:


Maybe look at Linux kernel livepatching, that should reduce your need to reboot as often.


> anyone have suggestions for my next career move?

https://www.goatops.com/


At least point 1 is solved now.


Infosec


> 15 years as a sysadmin, anyone have suggestions for my next career move?

I made the switch to Technical Product Marketing after 15+ years doing Linux sysadmin stuff. This might seem weird at first, but all tech companies have complex products that they are trying to sell to a technical audience. Marketing needs technical folks embedded who can translate between the tech stack and marketing speak. You can probably 2x your sysadmin compensation quite easily, and it offers tons of career growth (developer relations, tons of conference speaker opportunities, become some industry expert, etc).

No idea about your skill set or area of expertise, but here's an example from vmware [1]. Just search for "Technical Marketing". The job typically involves doing technical reviews of competitors (something you'll already do when choosing a product as a sysadmin), reviewing internal marketing content to make sure people are telling the truth, doing talks/training, recording demos, interacting with PM/Eng about product releases, testing and writing about new releases, etc. If you like the technical side and don't mind teaching, this can be a good transition. You basically leverage all the skills you've built over 15 years and apply them to something else quickly.

The kicker here is that you can just apply to companies whose products you already use and know inside and out (giving you a massive advantage compared to other people applying). Say you do tons of AWS stuff: who better to work with marketing on the technical side than a sysadmin who breathes this stuff every day? Or maybe you're doing stuff on Cisco switches [2], or maybe you're some NetApp storage fabric expert [3], same thing. All these companies have technical roles in marketing that want you, and it can range from mega corps to cool startups like GitLab [4].

[1] https://careers.vmware.com/main/jobs/R2204162?lang=en-us

[2] https://jobs.cisco.com/jobs/ProjectDetail/Technical-Marketin...

[3] https://jobs.netapp.com/job/Bangalore%2C-Karnataka-Technical...

[4] https://about.gitlab.com/job-families/marketing/technical-ma...


As an industry, we need to stop pretending that computer security is a possibility. Assume everyone on the Internet has full access to any network-connected computer, and arrange your affairs from that assumption.


SRE. Learning to build the automation that makes changing the kernel a change to a config file (Ansible/Salt/Nix/etc.) and pushing a button.


It's because of user namespaces lol this is gonna be the next decade, sorry


It's yet another int overflow bug too. Something like a kernel should probably be built with saturating arithmetic rather than the stupid C overflow behavior.
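To make it concrete, here is a minimal sketch (not kernel code, just an illustration using the GCC/Clang `__builtin_add_overflow` helper) of what saturating addition could look like in C:

    #include <stdint.h>
    #include <limits.h>

    /* Saturating unsigned add: clamp to the maximum instead of wrapping. */
    static inline uint32_t sat_add_u32(uint32_t a, uint32_t b)
    {
        uint32_t r;
        if (__builtin_add_overflow(a, b, &r))
            return UINT32_MAX;
        return r;
    }

    /* Saturating signed add: overflow with a < 0 means both operands were
     * negative, so clamp to the minimum; otherwise clamp to the maximum. */
    static inline int32_t sat_add_s32(int32_t a, int32_t b)
    {
        int32_t r;
        if (__builtin_add_overflow(a, b, &r))
            return a < 0 ? INT32_MIN : INT32_MAX;
        return r;
    }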


Saturating arithmetic is not free from problems, although it is typically better than the two's complement wraparound that some languages have settled on. I'd rather it just panicked, to be honest.


Which, on x86, one could do by trapping on overflow. That is one of those legacy options people complain about because they aren't used by C. I'm not sure it is as easy on Arm/etc. because IIRC there isn't an integer overflow exception.

For whatever reason, most newer languages (say Rust) don't actually solve this problem either. They could diverge from the norm and do saturating ints (which Arm does have), or throw exceptions on overflow/underflow, but they don't, because that would be too hard when they have to manually check overflow on each operation, since it's not a common feature of many processor arches.

edit: although LEA is one of the instructions which avoid flag updates, so even if you wanted to trap, it probably wouldn't.


Rust does in fact have a story around overflows: https://doc.rust-lang.org/book/ch03-02-data-types.html#integ...


But the Rust story is basically the same as the C story, which is: use something other than the normal operators to do your arithmetic.

There are tons of checked_add()/_sub()/_mul() macros or functions floating around. At least in C++ one could override the global operators if needed.
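In plain C the checked pattern usually ends up looking something like this (hypothetical helper names, built on the GCC/Clang overflow builtins):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Hypothetical checked multiply: returns false instead of wrapping. */
    static inline bool checked_mul_size(size_t a, size_t b, size_t *out)
    {
        return !__builtin_mul_overflow(a, b, out);
    }

    /* Typical use: refuse the allocation rather than computing a wrapped size. */
    void *alloc_array(size_t n, size_t elem_size)
    {
        size_t bytes;
        if (!checked_mul_size(n, elem_size, &bytes))
            return NULL;
        return malloc(bytes);
    }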


You cannot overload operators of integral types in C++, if you mean overloading `int operator+(int, int)` to be saturating. You can create a `saturating_int` kind of type easily though.


Despite its C underpinnings, C++ does provide the tooling for type-driven development like ML languages. I think it is a hard sell in a performance-driven culture to have little types for those purposes, which is why it isn't used as much as it should be.


You can also pass a flag to have all overflows checked, if desired. We do, even on embedded where space is a premium. We’ll take the performance hit.


This was solved by Ada 40 years ago, with modular types added later to satisfy use cases where silent overflow is wanted.


And Ada's solution was the proximate cause of the first Ariane 5 exploding, destroying (IIRC) $1/2 billion in payload, unnecessarily.

In the Ariane 5 case, an ill-tuned and anyway unnecessary out-of-bounds check was compiled into the inertial platform firmware, causing a debug traceback to be dumped to the rocket gimbal controls, which interpreted the traceback as instructions to steer sideways. To be clear: without the trap, the payload would have been delivered unharmed. There was nothing useful that could be done about the out-of-bounds value during the actual launch, and they knew that when they built the system. Extra points for dumping debug details down a channel where it corrupted correct data.

Europeans I have mentioned this to considered what did happen a sensible outcome.


This was the first attempted flight of Ariane 5, which has ended up becoming one of the most reliable launch vehicles of all time. Also, according to NASA, because of its reliability, the Ariane 5 was used to launch the James Webb Space Telescope.

The code reuse in the project was pretty poor engineering, they reused code from a different project which had different flight paths and characteristics, and didn't test it.

The modern equivalent would be copying and pasting code from a completely different project, and then just shipping to production without testing it at all, even locally.

Note that Ada is being used for the Ariane 6.


Yet, integrating over the lifetime of Ariane 5, the overwhelmingly largest single cause of loss-of-vehicle failures was a software bug that would not have occurred with a different implementation language.

Spacecraft failures that can be blamed on failings of a particular language are very rare. There was a planetary space probe that failed arguably because it was coded in Fortran.


I don't agree with your point.

Leaving in debug code, especially for an entire phase of the launch that no longer applied, by bringing in reused and untested code, seems like the real problem. I'm pretty sure integer overflow in this case would be undefined behavior, so you don't know what the compiler would do, especially since things were a lot more wild on the hardware side in 1996. Your point sounds like arguing that preferring Objective-C would prevent all those pesky `NullPointerException`s in Java since it allows sending messages to nil (null), but that's success by coincidence.

If you read the ESA's report [1], there were a lot of things which failed before you even get to the language. This software failure is the only one in the entire Ariane series of rockets. Considering that this is 1996, the list of languages to pick from was pretty slim, e.g. this predates C++ initial standardization (1998), `enum class` and other modern C++ features.

Even using just Ada 83 (which Ada 95, 2005, 2012, and 2022 succeeded and improved upon), I would argue that its forward-looking features of preventing mixed-mode arithmetic, having properly type-checked enums, and other features (like being strongly typed) probably prevented a significant number of failures that would have occurred had other languages available at the time been used.

[1]: http://sunnyday.mit.edu/nasa-class/Ariane5-report.html


You miss that all the Ariane 4s were launched with the same flaw: any of them would have blown up for the same reason, if their vibration had peaked a bit beyond expected amplitude.


It wasn't a flaw in Ariane 4 because it used a very different rocket flight path. This means the BH value which caused the error was physically impossible.

> The value of BH was much higher than expected because the early part of the trajectory of Ariane 5 differs from that of Ariane 4 and results in considerably higher horizontal velocity values.

> o) In Ariane 4 flights using the same type of inertial reference system there has been no such failure because the trajectory during the first 40 seconds of flight is such that the particular variable related to horizontal velocity cannot reach, with an adequate operational margin, a value beyond the limit present in the software.


It is a flaw exactly because, while Ariane 4 should not produce such an acceleration, if it ever did, momentarily and otherwise non-destructively, the spurious trap code would have automatically destroyed the vehicle by dumping garbage to the gimbal.

The Ariane 5 disaster was implicit in the design. Without that built-in flaw, the Ariane 4 inertial platform would have worked correctly as-was, and the first Ariane 5 would have delivered its $half-billion payload successfully, instead of obliterating it.

There is no circumstance where dumping debug tracebacks down a control channel not prepared for that is ever correct. Period. Anyone suggesting otherwise is displaying their blinders.


> There is no circumstance where dumping debug tracebacks down a control channel not prepared for that is ever correct. Period. Anyone suggesting otherwise is displaying their blinders.

We agree here.

Did you read the report?

> The error occurred in a part of the software that only performs alignment of the strap-down inertial platform. This software module computes meaningful results only before lift-off. As soon as the launcher lifts off, this function serves no purpose.

> The alignment function is operative for 50 seconds after starting of the Flight Mode of the SRIs which occurs at H0 - 3 seconds for Ariane 5. Consequently, when lift-off occurs, the function continues for approx. 40 seconds of flight. This time sequence is based on a requirement of Ariane 4 and is not required for Ariane 5.

> m) The inertial reference system of Ariane 5 is essentially common to a system which is presently flying on Ariane 4. The part of the software which caused the interruption in the inertial system computers is used before launch to align the inertial reference system and, in Ariane 4, also to enable a rapid realignment of the system in case of a late hold in the countdown. This realignment function, which does not serve any purpose on Ariane 5, was nevertheless retained for commonality reasons and allowed, as in Ariane 4, to operate for approx. 40 seconds after lift-off.

> The reason for the three remaining variables, including the one denoting horizontal bias, being unprotected was that further reasoning indicated that they were either physically limited or that there was a large margin of safety, a reasoning which in the case of the variable BH turned out to be faulty.

> Ariane 4 should not produce such an acceleration

My point is that it's a velocity, not an acceleration, and hence is physically impossible given the design and flight parameters of the Ariane 4. It's like saying an unsigned byte (8 bits) is okay for a speedometer for a minivan because >255 mph is outside the physical capabilities of the vehicle, but reusing that speedometer code for a fighter jet is not appropriate.

All code is written with these assumptions of value ranges; it's just a matter of how the language handles them, and whether that behavior is predictable (UB or not) or crashes when outside those bounds. A lot of C and C++ code is written with short/int/long without realizing that those are fuzzy boundaries (short <= int <= long) and without ensuring sufficient sizes (e.g. uint32_t). Math code which doesn't account for this can produce wild and usually incorrect behavior.

When enabled, any value in Ada can throw a constraint check if violated, and it just makes it explicit. You're assuming that "don't crash on overflow" is always a good thing based on a single case; there are a lot of systems, particularly mechanical ones, in which a broken sensor reporting a pegged value should cause an immediate halt to operation. Also, in space applications, out-of-bounds values can arise from other conditions as well, such as cosmic rays flipping bits, in which case a computer restart (relying on the redundant system) is the correct and expected behavior.


> Europeans I have mentioned this to considered what did happen a sensible outcome.

Well, yeah. The firmware had been designed for a quite different rocket, the Ariane 4: https://www.bugsnag.com/blog/bug-day-ariane-5-disaster

I don't know how this is really a mark against Ada, it's an issue that would only be discovered by human intervention or by running a simulation of the rocket.


It would have made the Ariane-4 explode, too, if it had ever tripped.

It is not a mark against Ada, particularly, it is a mark against brain-dead insistence on overflow traps.

And, destroying a $half-billion payload because you insisted on leaving debugging apparatus in a production system would seem less obviously correct if the cost of the loss were deducted from your salary. So, people insisting it was a good result may be interpreted as expressing satisfaction with freedom from consequences of irresponsible behavior.


Maybe having devs pay the business costs of buffer overflows, stack corruption, and use-after-free out of their own salary isn't that bad an idea. Good move.


That is the biggest issue with C derived languages, the unsafe parts would be perfectly alright if they weren't the default way of doing things.

Of course, using straitjacket systems programming languages (as they put it) doesn't match hacker culture, so here we are.


Don't you have to add an INTO (interrupt on overflow) to trigger that? I don't remember there being a mode where arithmetic overflows trap automatically.

(And if that’s the case, by now INTO will be a slow microcode implementation, with JO _trigger_into probably being much faster)


Unfortunately, INTO was deprecated by AMD 20 years ago, with the move to 64-bit.

No Intel/AMD CPU supports INTO in 64-bit mode.

It is wise to compile any C/C++ program with options like:

"-fsanitize=undefined -fsanitize-undefined-trap-on-error"

(for gcc) or whatever they are called by the compiler used.

In that case, the compiler will add instructions with the same effect as INTO, but with more overhead than was needed on 32-bit Intel/AMD CPUs.
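A tiny demo of what those flags buy you (illustrative only; the flags are the GCC/Clang ones named above):

    /* Build with:
     *   gcc -O2 -fsanitize=signed-integer-overflow -fsanitize-undefined-trap-on-error demo.c
     * The overflowing addition below then executes a trap instruction
     * instead of silently wrapping; without the flags it is undefined
     * behavior and in practice usually wraps to INT_MIN. */
    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        volatile int x = INT_MAX;   /* volatile so the overflow is not folded away */
        int y = x + 1;              /* signed overflow happens here */
        printf("%d\n", y);
        return 0;
    }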


Are you sure it is more overhead? The overflow sanitizer adds a single JO to the critical path. INTO has been microcoded for decades in 32-bit mode.


The guaranteed overhead is in size, because JO has either 1 or 5 extra bytes per overflow check, compared to INTO. If you want the exception place to be known precisely, the program counter must be saved separately for each check, which adds a few extra bytes per check (e.g. 5 extra bytes for a CALL instruction, which allows the use of a short JO, so there are 6 extra bytes per check in total). Because there are many checks in a program, the extra program size is non-negligible.

The execution time should be about the same for INTO and JO. In the normal case, both INTO and JO are ignored. In the error case, JO is guaranteed to be mispredicted, so its long execution time, added to the time needed to fetch and execute the following instructions required to match the effect of an INTO (e.g. saving state and a second not-predicted jump or call that might be needed to reach the actual exception handler), will also be about the same as what would have been needed for an INTO exception, if not longer.


If this was really a concern, it would have likely not been deprecated in the move to 64-bit, and would have had support in other architectures as well.

And if you're really worried about size, you could use "JNO 1f; INT xy; 1f:", with xy having to be system-wide and well known; it would only take 3 extra bytes (or just 2, if you use HLT and the OS knows what the JO/HLT combo means). The branch predictor will likely predict this as not-taken in any tight loop.


Almost all older architectures, except the very low-end CPUs like the 8-bit microprocessors, had integer overflow exceptions, not only divide-by-zero exceptions, so they did not need any instruction like INTO.

Other architectures, like IBM POWER, have much more powerful conditional trap instructions.

The 64-bit ARM ISA is the only other important architecture where integer overflow checking must also be done by combining a conditional jump with either a function call or a supervisor call (i.e. the equivalent of INT).

It is true that if INTO would have never been defined in 8086, it would not have been a great loss, because it can be emulated, as you say.

Nevertheless, its deprecation by AMD is no reason for praise. At the same time, they also deprecated a few other instructions, which was a good choice, because those were much less useful. On the other hand, there were a very large number of other instructions that could have been deprecated instead of INTO, which were much less useful than INTO, but they were kept nevertheless.

It was pretty clear that INTO and BOUND were not put on the deprecated list based on their degree of usefulness, but mainly because it was obvious that the CPU designers would be able to get away with it: all the important software companies were in the habit of compiling their release versions with all run-time checks disabled, so they would not complain about the omission.

If software vendors had ever been punished for choosing to omit the error checks, then they would have demanded CPUs that still have good performance when the checks are enabled, and the CPU designers would have been prodded to improve such features.

Nowadays, due to the hype about security flaws, the CPU vendors are introducing a large number of run-time checks, but most of them seem rather misguided, because in the attempt to protect against programming errors some of the methods used e.g. for limiting the targets of jumps also prevent the use of certain good implementation techniques for some programming language features.

Instead of the many dubious new "security" features of Intel/AMD and ARM CPUs, I would much prefer to see improvements in the efficiency of overflow checking and bounds checking, because these are checks that are required in any program, no matter how well it is written, while most of the new "security" features are mostly useful to avoid problems created by programmer mistakes.


> That is one of those legacy options people complain about because they aren't used by C

Another one would be bound, which eventually got dropped.

https://www.felixcloutier.com/x86/bound


> That is one of those legacy options people complain about because they aren't used by C.

Have you seen how the C math library does error handling? It silently writes to errno. I can just imagine the runtime using the trap to do just that and continue as if nothing happened.


Unfortunately the default is like you say.

However, all C/C++ compilers have options to detect and report or trap on all common errors, but few people use these compile options, because of a usually baseless fear that they might reduce performance more than is acceptable.

In the cases where the performance is indeed too low for correct execution, the complaints should be directed at the CPU manufacturers who are guilty of this (by neglecting the architectural features needed for handling many kinds of exceptions, which were considered mandatory in older CPUs), instead of improving the performance of the programs by cheating, i.e. by omitting the error checks and accepting that the programs may behave horribly wrongly in rare cases.


There is a long-standing principle in system design: do not check for an error condition that you cannot do anything about. It is the responsibility of whoever is so equipped either to ensure that you cannot encounter such an error, or to have provided a path to deal with the error.

In C++ you can throw an exception, and count on anybody who cares to act on it, resolving the conflict.


x86 can trap on int overflow? I hadn't heard that before, I should look into it, but is there some special little-known trick? I'm not an x86 wiz obviously, but I thought I had basic familiarity with it.


It was removed from amd64 because it was not actually useful in practice.

I.e., sometimes you might want the check, but other times you certainly don't, and turning it off and on at the right times is too hard to get right. And, the ability to turn it on, even when it was never used, slowed everything down.

You would anyway need to be able express the choice in the languages used. If you had such a way, a compiler could insert extra instructions to check or to suppress checking where it is not wanted, but making such a choice wherever arithmetic happens would be a big burden to programmers already just barely hanging on.


It was not removed because it was not useful. Generating exceptions on overflows is extremely useful.

It was removed because it was seldom used by Microsoft and by the other large companies which mattered, so the AMD and Intel designers took advantage of this bad habit of many programmers, in order to simplify their CPU design work.

It was seldom used because programmers have always preferred to obtain the highest speed, even with the risk of erroneous computations in rare cases.

The reason is that a lower speed is noticed immediately, possibly leading to a lost sale, while occasional errors may be discovered only after a long time; and even when they are discovered, the programmers who made this choice are not punished in proportion to the losses that might have been caused to the users, which in most cases are difficult to quantify anyway.

A normal policy is to always check for integer overflow, out-of-bounds accesses and so on. Such checks are provided by all decent compilers.

Only when performance problems are identified and if it is certain that there is no danger in omitting the checks, the checks should be omitted in the functions where this matters.

Selecting whether run-time checks must be done or not is very easy. You just need to use the appropriate sets of compiler flags for the 2 kinds of functions, those with enabled checks and those with disabled checks.

It is sad that the C tradition has imposed that the default compiler flags are with disabled checks, but that is not an acceptable excuse. Any C/C++ programmer should always enable the checks, unless there are good reasons to disable them.

If they do not do this, it is completely their fault and not the fault of the programming languages.

However the C/C++ standard libraries have the problem that they lack a good standard way of handling such exceptions (i.e. a better way than the UNIX signals). That means that if you do not want to abort the program in case of overflows or out-of-bounds accesses, it can be difficult to ensure a recovery after errors.


You may insist on this, but the fact that it does not exist in any convenient form leaches strength from your argument.

In fact, harmless overflow is extremely common, and relied upon. Trapping by default, while it would call attention to some bugs, would also turn many working programs into crashing programs.

Saturating arithmetic would sometimes produce better results.


I believe that you are confusing integer overflow, which applies only to the signed integers of C/C++, with the arithmetic operations on modular numbers, which are improperly called "unsigned integers" in C/C++.

When computing with modular numbers, you rely on the exact modular operations. That is not overflow.

There exists no "harmless overflow". The overflow of "signed integers" is an undefined operation in the C/C++ standard.

The actual behavior of integer overflow is determined by the compile options. With any decent compiler, you may choose between 3 behaviors: trap on overflow, which aborts the program unless the trap is caught; display an error message and continue execution with an undefined result of the overflowed operation (typically with the purpose of also seeing other error messages); or ignore the overflow and continue execution with an undefined result of the overflowed operation (the default option).

Relying in any way upon what happens on overflows (of "signed integers"), when choosing a compiling option that does not trap on overflow, is an unacceptable mistake, because what the program does is unpredictable.
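A minimal illustration of that unpredictability:

    #include <limits.h>

    /* This "check" is itself undefined behavior when x == INT_MAX, so an
     * optimizing compiler is allowed to (and at -O2 commonly does) fold it
     * to `return 0;`, silently removing the programmer's overflow test. */
    int will_overflow(int x)
    {
        return x + 1 < x;
    }

    /* The well-defined way to write the same check. */
    int will_overflow_ok(int x)
    {
        return x == INT_MAX;
    }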

Relying on what modular arithmetic does with "unsigned numbers" is correct, except that in most cases they are not really used as modular numbers, but the programmers expect that modular reduction will never happen. When it happens and it was not expected, it may cause similar problems like overflows, even if the result could have been predicted.

You are right that saturating arithmetic is the main alternative to trapping on overflow. It can be used in the same way as infinities are used with floating-point numbers when the overflow exception is masked.

Unfortunately, saturating arithmetic does not exist in the standard C/C++ languages. Therefore in C/C++ you may choose only between trapping on overflow and dangerous undefined behavior.


Saturating on overflow is as bad as the two's complement wraparound on unsigned if you were not expecting/relying on it. C lacks a type where unsigned overflow is an error or UB. Having it be UB is acceptable, because then you can choose a compiler option (-ftrapv) that traps it. But I have heard this is unreliable, e.g. what happens if the overflow happens in library code?

Ada does this correctly, and lets you choose between exception and wraparound. UB is not an option in the language spec, though you can turn off the checks as an optimization option in GNAT, or maybe turn it off at specific lines of code with a pragma.


As far as I can find an integer overflow just sets the overflow flag and you need an explicit INTO instruction to check for the flag and run the trap handler.


> not free from problems

is a massive understatement…


C# has a compiler flag for checked arithmetic, i.e. overflow will throw an exception. You can't use C# for kernel development of course, but the idea has existed for decades (for much longer than C# has existed).

That flag is not set by default in the new project templates for whatever reason. There's also an escape hatch, the unchecked keyword, for the rare cases where you actually want wrapping, e.g. when implementing hash tables.


> You can't use c# for kernel development of course

The Midori team kind of disagreed with that, and most System C# features keep landing in regular C# a bit per version, but I digress.


> Midori team kind of disagreed with that

Also Singularity https://en.wikipedia.org/wiki/Singularity_(operating_system) and a few random attempts like SharpOs https://sourceforge.net/projects/sharpos/ and Cosmos.


Singularity actually uses a micro-kernel in C++, with everything else in Sing#.

An approach similar to what Meadows has taken with NuttX.

I didn't mention it because I expected the possible reply that Singularity wasn't 100% pure Sing#.


-fsanitize=signed-integer-overflow produces rather decent code with GCC.


It's buffer overflow. Integer overflow doesn't cause anything by itself there.


Read the report more closely: it's a buffer overflow because the bounds check failed due to:

"Effectively this implies that the compiler is free to emit code that operates on `reg` as if it were a 32-bit value. If this is the case (and it is on the kernel I tested), a user can forge an expression register value that will overflow upon multiplication with `NFT_REG32_SIZE` (4) and upon addition with `len`, will be a value smaller than `sizeof_field(struct nft_regs, data)` (0x50). Once this check passes, the least significant byte of `reg` can still contain a value that will index outside of the bounds of the `struct nft_regs regs` that it will later be used with."

Which is the classic way to exploit integer overflow to bypass bounds checks (I've personally written code like this, which thankfully was caught before it got this far).
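A stripped-down sketch of that class of bug (this is not the actual nf_tables code, just an illustration of the arithmetic described in the quote):

    #include <stdint.h>

    #define REG32_SIZE     4
    #define REGS_DATA_SIZE 0x50     /* plays the role of sizeof_field(struct nft_regs, data) */

    struct regs { uint32_t data[REGS_DATA_SIZE / 4]; };

    static uint8_t stored_reg;      /* the validated register is kept in a u8 */

    /* reg and len are attacker-controlled. For reg = 0x3fffffff and len = 0x20,
     * reg * 4 == 0xfffffffc and the 32-bit addition wraps to 0x1c, so the
     * bounds check passes even though the register value is nonsense. */
    int validate(uint32_t reg, uint32_t len)
    {
        if (reg * REG32_SIZE + len > REGS_DATA_SIZE)
            return -1;
        stored_reg = (uint8_t)reg;  /* only the low byte (0xff here) survives */
        return 0;
    }

    /* Later, at rule evaluation time, the truncated register indexes the
     * 0x50-byte area: data[0xff] is far outside the struct -> OOB access. */
    uint32_t eval(const struct regs *r)
    {
        return r->data[stored_reg];
    }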


Were you trying to contradict that? You've only confirmed that the bug is a buffer overflow. Take any language with bounds checking: you won't be able to reproduce this bug there by merely overflowing integers.


There are over 1,412 nodes in the nftables syntax tree (as of 2019).

I know. I wrote a Vim syntax highlighter for nftables.

And it is still failing.

Just imagine how much untested surface area that is for the `nft` CLI.

My work: https://egbert.net/blog/tags/vim.html


How odd, your website gives me "ERR_SSL_VERSION_OR_CIPHER_MISMATCH" on Win64 > Chrome.


They have a good cert rating but ipv6 is not responding [1] and some clients will be denied based on cipher policy. I would be surprised if that applied to Chrome but one can always check. The DNS entry for ipv6 should be removed if not used.

[1] - https://www.ssllabs.com/ssltest/analyze.html?d=egbert.net&hi...


Thank you for the input. I really don’t care for Chrome users and their inability to do pure TLSv1.3. Also, my server decides the selection of algorithms in TLSv1.3, not the web browsers: and that’s ChaCha/Poly.

Yeah, it’s also a JS-free website.

Migration to IPv6 is a work-in-progress.

Hurricane Electric ISP also has a weird login condition; they want you to create a second account so you can pass your IPv6 certification (first was for DNS/IPv4). So I am looking for a secondary DNS provider for IPv6.


Dude your config is FUBAR. Why enforce TLS1.3 when you don’t understand that the way cipher suites are negotiated changed in TLS 1.3?

> Also, my server decides the selection of algorithms in TLSv1.3, not the web browsers

SMH. Either use Caddy with an empty config file or follow Mozilla’s recommendation: https://ssl-config.mozilla.org/#server=nginx&version=1.17.7&...

    # modern configuration
    ssl_protocols TLSv1.3;
    ssl_prefer_server_ciphers off;
Also lol are you trying to host your site via HE’s free tunnel?


Would the cypher negotiation be the problem? I checked that Chrome supports Chacha/Poly, and it seems it does.

- https://chromestatus.com/feature/5355238106071040

Chrome seems to support TLS 1.3 since v70, and I'm on 99.

There's only the 0-RTT/EarlyData, as far as I can tell, that may be messing things up; is it required for TLS 1.3? It's not enabled by default yet (still in dev?).

- https://chromestatus.com/feature/5447945241493504

- https://developers.cloudflare.com/ssl/edge-certificates/addi...


Yes, you’re not supposed to specify the cipher suites with TLS1.3

This guy also forces secp521r1 (the NSA curve which is impossible to implement correctly, is unsupported by Chrome and eventually by Firefox, and is dog slow) instead of using DJB’s x25519. This is what roleplaying as an SRE looks like.



The linked Mozilla bug report provides an excellent reason to drop P-521.

If you look at the telemetry for last actual Firefox release, only 8 (yes, eight!) out of 1.67 BILLION handshakes used a P-521 curve. This is a good indication that P-521 isn't needed, at least for certificates verified by web browsers.


Well, I need it. So, I get to have this. Isn’t the Internet awesome?


Thanks for the details!


FUBAR, in terms of speed, but of course.

because, TLSv1.3 negotiation capability is there.

My server dictates the selection, lest the client gets the sideway algo slip.

try it through your enterprise TLS transparent proxy.

also, no HE IPv6 tunnel.


> I really don’t care for Chrome users and their inability to do pure TLSv1.3.

You "don't really care" for a majority of users, based on what? TLS 1.2 is perfectly fine to use with strong ciphers. I salute your static website which is free of JS/cookies/bloat/external stuff (similar to my own blog), but at the same time you're actively sabotaging accessibility.


Not concerned with accessibility as opposed to security.

If they are astute in cybersecurity, they will be using other (and secured) browsers.


What is your threat model here?

Also, I use Vanadium with JS disabled, which is already decently hardened imho. Yet, your choice of key exchange bars me from your site.



Thank god I don't use Chrome then.


Of course, it’s Chromium-based which in turn is Chrome-based, or something.


Choose a better set of keys.


> also a JS-free website

That's great. I would love to see more JS-free sites and more lightweight sites.


> I really don’t care for Chrome users and their inability to do pure TLSv1.3.

I am confused why you would even provide a link to a website which is configured to be inaccessible to most readers. May as well say "My work: an unpublished manuscript; message me for a copy".


Because a real cybersecurity aficionado would use the right tools.

And my website is for those who do use the right tools.


Knocked the IPv6 from its DNS records … for now.


Yet another CLONE_NEWUSER and fly away to victory privilege escalation. Really seems like enabling that was premature.


Exploitable uninitialized stack variables in 2022! Remarkable.


Apparently nobody wants to take the 'secure default' performance hit of setting CONFIG_INIT_STACK_ALL_ZERO, as it will look slower compared to other distros.



As far as I can find, that flag seems to be Clang-specific? Which distros even use Clang? Also, since the kernel is not pure C, not all safety options are safe: at one point a few distros enabled stack overflow protection, only to end up with a kernel that randomly corrupted application stacks.


> Which distros even use clang?

Although not a proper distro, Android's Linux kernel has been using Clang for about 5 years now.


GCC 12 will have -ftrivial-auto-var-init=zero
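For context, the bug class looks roughly like this (illustrative userspace sketch, not the kernel code):

    #include <stdio.h>

    struct reply { int status; char payload[32]; };

    /* `out` is never fully initialized: whatever happened to be on the stack
     * leaks through `payload` (and reading `status` in the !ok case is UB).
     * With -ftrivial-auto-var-init=zero (GCC 12 / Clang) or the kernel's
     * CONFIG_INIT_STACK_ALL_ZERO, it would start out as all zeroes instead. */
    void handle(int ok)
    {
        struct reply out;
        if (ok)
            out.status = 0;
        printf("%d %.32s\n", out.status, out.payload);
    }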


Yeesh.


I don't understand: does this lead to privilege escalation by parsing a crafted netfilter rule? I am under the impression that I need CAP_NET_ADMIN or root privs on my machines to load said rule anyway, right? So this affects deployments where regular users are able to send netfilter rules to the kernel, right?


> In order for an unprivileged attacker to exploit this issue, unprivileged user- and network namespaces access is required (CLONE_NEWUSER | CLONE_NEWNET)


My understanding is that you can supply netfilter bytecode which reads/writes register memory outside of the registers struct. So the rule bytecode you provide can do out-of-bounds accesses when the rule is executed, rather than when it is parsed.

You can do this either with CAP_NET_ADMIN, or by creating your own user+network namespace as an unprivileged user (which may not be allowed on your system either).
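The unprivileged path is basically this (minimal sketch; whether it succeeds depends on your distro's unprivileged-userns settings):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* On kernels that allow unprivileged user namespaces, this succeeds for a
         * normal user and makes the process "root" inside the new user namespace,
         * with CAP_NET_ADMIN over the new network namespace -- which is enough to
         * load nftables rules there. */
        if (unshare(CLONE_NEWUSER | CLONE_NEWNET) != 0) {
            perror("unshare");   /* e.g. EPERM where unprivileged userns is disabled */
            return 1;
        }
        puts("now have CAP_NET_ADMIN inside the new user+net namespace");
        return 0;
    }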


With user namespaces all of that is accessible now. For decades upstream hasn't cared about privesc let alone root -> kernel privesc, but now everyone's root lol


You can't have privilege escalation if everyone is root, modern problems require modern solutions.


Ironically, this privesc is in fact only possible because everyone now has root via user namespaces.


Out-of-bounds accesses and uninitialized stack data, in an extremely weak language that doesn't check any of that and happily compiles the hidden footgun, give you an escalation-of-privileges vulnerability.

Perhaps Rust would have prevented this in the first place. But the entirety of Linux and the ancient UNIX philosophy is a giant labyrinth full of cobwebs, riddled with hidden traps, landmines and trip-wires beyond exploring.

It must be the world's largest and most endless minesweeper game, discovering all those C-style vulnerabilities in the Linux kernel.


I'm starting to get sick of the neverending "rust good, c bad" sentiment on HN.

Bugs occur in rust code too. It's not a silver bullet to fix all problems associated with software development. Each language will have its own set of issues and quirks. It's easy to pick on C because it's been around far longer than most things. Give it 40 years and people will be saying "rust bad, new thing good".


That's not an argument for sticking with the old programming language that apparently is impossible to use safely (if you think that you can use it safely, what makes you think that you are better than the all-star cast of C programmers that have contributed CVEs over the years?). There were better alternatives to C in the 80s, there most certainly are better alternatives to C these days. I'm so sick of the Uncle Bob school of thought that all that is needed for better software engineering is discipline. If we haven't managed to impose that discipline yet after five decades, why would we believe that it will magically appear in the future? My guess is that a programmer that had the requisite mythical discipline would gravitate to programming languages like Rust or Ada anyway, because that developer wouldn't mind tooling that helped with it.


IIUC, this is just another case where unprivileged users are now allowed to do what once was allowed only to superusers. As long as only root was allowed to add net filter rules, what did it matter if they could do bad stuff? They're root already!

Now, in places where security didn't matter, it suddenly does. Thus, it's not about bad coding habits, but inadequate care in extending privileges to untrusted users. The code should have been cleaned up first.


No it's both. There's a ton of issues like overflows in the kernel but people have prioritized looking at unpriv -> kernel or unpriv -> root, and not root -> kernel. Now many of the root -> kernel vulns are unpriv -> kernel.

The issue is still the code, it's just the impact that has changed.


> Bugs occur in rust code too. It's not a silver bullet to fix all problems associated with software development.

Not being a silver bullet doesn't mean it's not an enhancement.

Of the two bugs mentioned in this announcement, one is an uninitialized local variable, which is allowed in C but not in many other languages (like Java or Rust). The other one is an implicit cast from an enum to a signed integer; in languages where this particular kind of cast must be explicit, not only is it easier to see when it's a signed integer, but it's also more probable that it would be cast to an unsigned integer instead.

> Each language will have its own set of issues and quirks. It's easy to pick on C because it's been around far longer than most things.

It's easy to pick on C not because it's been around for longer than most other popular programming languages, but because it has a larger amount of footguns. It's true that Rust has plenty of quirks of its own, but most of them will not cause security issues.


Bugs absolutely occur in memory-safe languages. Log4j was a thing.

But a huge number of exploitable vulnerabilities are completely prevented by memory-safe languages. We've spent decades trying like hell to come up with other solutions to prevent these kinds of bugs in languages like C and utterly failed. Nobody knows how to write C or C++ programs of any meaningful scale free from memory safety errors without calling for total rewrites and ballooning budgets by 50x.

> Give it 40 years and people will be saying "rust bad, new thing good".

Of course. Rust will have systemic problems and some new thing will mitigate those problems. This is a good thing, not a bad thing. Why should there be any expectation that a language be the desired language for engineering projects until the end of time?


I never said new languages are a bad thing. I only made the point that people mindlessly bandwagoning and repeating the same talking points endlessly is an annoying thing.


Is it mindless bandwagoning? "Memory unsafe languages are an industry-wide emergency" should be a more-or-less constant discussion in the industry, IMO.


Memory safety is certainly a topic worth talking about, but on HN it always ends with the same conclusion: use rust. Is that really the only solution?

I've never seen anything quite like the rust community. They're hell-bent on recruitment. More so than any other language platform I've seen. Personally, I'm just sick of it. It's fine if you don't feel the same way. I was just voicing my frustrations.


> Memory safety is certainly a topic worth talking about, but on HN it always ends with the same conclusion: use rust. Is that really the only solution?

No. Of course not. It appears to be the most common solution, and community matters, so I suspect it will end up being a dominant solution. But you can (mostly) solve this problem with completely orthogonal approaches. Getting all of your code to run on hardware with MTE (Arm's memory tagging extension) enabled is another approach that prevents entire classes of problems.

I'm not a member of the Rust community. My actual work focuses more on the Log4j-style stuff than buffer overruns. But I really don't know what people who are upset about the state of memory safety can do to prevent the "ugh, here come the rust nerds" complaints other than simply never raising the problem - which obviously isn't the right approach.

10 years ago the C and C++ community complained about a different thing whenever memory safety came up. "UGH, stop talking about GCed languages. GC advocates are such aggressive recruiters." Now the complaints have just reoriented themselves.


No, it isn't the only option; plenty of languages are available, especially if automatic memory management is an option in the specific deployment scenario.

Usually we land in Rust-only discussions either because of overly active advocacy that should know better than to tire the audience, or because of a defensive attitude from unsafe-language folks who push back on every single reference to writing secure software as if it were a sales pitch for Rust.


> No it isn't the only option, plenty of languages are available

My point exactly.

> to write secure software as a sales pitch for Rust.

You guys already do a good job of writing the sales pitches. I'll leave that to you.


I always mention secure system programming languages going back to 1958. Don't remember Rust being that ancient, and have plenty of critical remarks about Rust as well.

So maybe address that "You guys" to someone else.


Seriously! Kinda like those food services people who harp on and on about gloves. Clean gloves prevent this, clean gloves solve that. And they're not content to just use gloves themselves. They keep evangelizing to recruit everyone to their practice.

It gets annoying to those of us who know how to do safe food prep without wearing gloves.


I'm starting to get sick of the neverending sentiment that Rust is the only option for writing secure software, and of any remark about security getting dismissed as a Rust sales pitch.

There are plenty of languages to choose from to replace C and its kind.


Of course there are. Swift might be a viable alternative. There are also up-and-coming languages like Zig. But, like it or not, Rust has the best combination of features, mindshare, and community to make it a viable path for businesses today or in the very near future.


Only for domains where automatic memory management is not an option: kernels, hypervisors, high-integrity computing, GPGPU, ...

Also, I should note that NVidia preferred to go with Ada/SPARK instead of Rust for their automotive firmware.


Sure, Ada/SPARK is another option. Listing out the various potential choices people can make is not especially interesting to me. I don't personally care about Rust. I care about people taking action to solve a frankly embarrassing problem in our industry.


Liability turning into reality would be a good push.

https://www.thomashelbing.com/en/whitepaper-templates-checkl...


That would, depending on implementation, either have limited impact (if it only applied in the context of a commercial relationship), or destroy FOSS (if it didn't).


The usual argument. Health inspections don't destroy the little business selling sandwiches on the corner, nor do street regulations destroy those building their own home-made car.


Plenty of languages are far better suited for most use cases, yes, we have that. But to replace C, there are not that many candidates. I don't mean Rust is such a candidate.

Maybe Zig is currently the closest thing to such a candidate, but it doesn't even pretend to aim at this, and it aligns with the "C ABI as the interoperation mode for foreign function interfaces" policy, like the rest [1] of programming languages.

[1] https://ziglearn.org/chapter-4/


From 1958 up to the mid-1980s, many OSes were written in something other than C, and in the 15 years that preceded that period there were approaches beyond pure Assembly.

Granted, it is a matter of the market being willing to move beyond it, and the OSes written in it, but the matter isn't technical per se; rather it is economic, political, about human behaviour, ...


Yes, we totally agree that this is not a purely technical issue. It is not completely unrelated, though: inertia here is also fed by technical considerations.


While Rust probably wouldn't have caught this bug, it is worth noting that the class of memory bugs C and C++ have is especially pernicious. Bugs caused by manual memory management in these languages are often highly exploitable, and come from a common and necessary part of the language. As a result you end up with somewhere around 70% of security vulnerabilities being caused by a feature that just doesn't exist in other languages. The reason Rust gets so much hype is that it removes the last excuse for C/C++ (GC time).


Yes. I understand the argument. It's only been reiterated on HN hundreds of times now.

> it removes the last excuse of C/C++

Odd way of looking at it. I never ask myself "what _excuse_ can I give myself for using this tool"?

I'll continue using C and I don't need an excuse. I prefer C.


But do your users prefer C? For software engineering to be a professional discipline, we really need to make choices based on the product outcome rather than based on what feels fun to program.


I imagine they do. If they didn't, they wouldn't be using it. They would instead be using some equivalent piece of software written in a trendy language.


The large majority of users cannot make well informed security risk decisions like this. Engineers should do the right thing and help these users. In the same way, I can't make some meaningful risk assessment for using a bridge or riding an elevator. Civil Engineers don't just get to say "well, if users are concerned about the safety of my bridge then they can make a different choice so I'm going with the stuff I personally like working with." Why do Software Engineers get away with this?

Outside of small hobbyist projects, the industry has an obligation to provide users with safe software.


Right. So in your opinion, is it irresponsible to write a project intended for production in C or C++?


I believe that it is irresponsible to

1. Start new projects intended for production that have nontrivial security threats in C or C++

2. Not have a plan to categorically prevent memory safety errors in legacy codebases over the next decade or so, whether that be by transitioning to new languages or by applying rigorous hardware-level memory tracking


I bet if liability was a common thing in software delivery, those languages wouldn't be the first option after a couple of lawsuits.


Yeah can you imagine if it actually mattered if companies made choices that they knew were inevitably going to lead to zero-click exploits on internet-enabled devices? Somebody sitting down to write a media decoder in C today knows that this means a steady stream of exploits harming their customers.


I don't think your argument is all that strong. Memory bugs are not equivalent to vulnerabilities. To get to a security vulnerability you first need a logic bug that makes incorrect assumptions about the program state, the hardware, the API, the input, or something else.

Also, what is so bad about crashing bugs? To me, as a programmer, they're very good news. It means you found a bug, you (hopefully) have a crash/log to analyze, and more importantly it means the program didn't just silently continue executing and corrupt state/data without anyone knowing.


Memory bugs are often vulnerabilities. Efforts to classify them as “very unlikely to be exploited” almost always end up turning into “someone with sufficient interest can exploit this”.


>someone with sufficient interest can exploit this

Sure, but that applies to any technology on any platform at any time.


It absolutely does not.


The problem isn't that we don't understand what you're saying.

It's that it's parroted a thousand times every week.


As you can see from these vulnerabilities, it's more often buffer overflow and not allocation.


Most languages other than C/C++ (e.g. Rust) do bounds checks by default. Also, C is pretty much the only language that uses null-terminated strings, which are a very common source of overflows.
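For illustration, here's a tiny C sketch (purely hypothetical, just to show the bug class): the out-of-bounds write compiles cleanly and usually runs without any diagnostic, whereas a bounds-checked language would stop at the bad index.

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        char name[8];
        const char *input = "definitely-longer-than-eight-bytes";

        /* No bounds check: strcpy writes past the end of name[],
           silently corrupting whatever sits next to it on the stack. */
        strcpy(name, input);

        /* And because strings are just "bytes until a NUL", a missing
           terminator turns later reads into out-of-bounds reads too. */
        printf("%zu\n", strlen(name));
        return 0;
    }

In most memory-safe languages the equivalent indexing either fails to compile or aborts at runtime instead of quietly corrupting memory.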


> Give it 40 years and people will be saying "rust bad, new thing good".

Why is this an argument? Yeah, personally, I'd be very happy if in 40 years there's something better than Rust. It's been a solid few decades of C prominence, and moving from lang to lang should of course not be something that happens every few years. But on the scale of 40 years? I'd hope we aren't all still using something made today in 40 years, let alone a few years old.


> Why is this an argument?

Who said it was an argument? It's merely an observation of what I've witnessed here on HN: endless sentiment about rust and how everything should have been written in it.

The linux kernel was initially written 30+ years ago. Rust didn't exist. C did. Complaining about it being written in C does nothing to benefit anyone.

> I'd be very happy if in 40 years there's something better than Rust.

Me too.


> Complaining about it being written in C does nothing to benefit anyone.

For what it's worth, I agree. It's already written, so rewrite it yourself or use a Rust-based OS or work on Rust internals/specs in order to improve the language.

But I also believe that sometimes it can be a legitimate discussion. It can serve as a reminder that these languages often have footguns that even experienced developers who insist they can use C effectively will eventually fire.


Rust is good and C is bad though.

> Bugs occur in rust code too.

Generally speaking, not these ones.

> It's not a silver bullet to fix all problems associated with software development.

No one said otherwise.

> Each language will have its own set of issues and quirks.

But not memory safety ones.

> It's easy to pick on C because it's been around far longer than most things

Not at all. Lots of languages existed before C that had improved safety. It's easy to pick on C because, with regards to memory safety, it's awful.


There are ways to manage the risk of buffer overflow in C without changing the language, but C programmers don't want to do it, because traditionally they believe they need to be brave and do it the hard way.


One could even add basic string and array types, so that the ecosystem could optionally slowly move away from pointer based types, but it will never happen.
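As a rough sketch of what such an opt-in type could look like (the names here are made up; this isn't any existing library):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* A length-carrying view over a byte buffer. */
    typedef struct {
        uint8_t *data;
        size_t   len;
    } byte_span;

    /* Checked accessors: refuse the operation instead of going out of bounds. */
    static bool span_get(byte_span s, size_t idx, uint8_t *out) {
        if (idx >= s.len) return false;
        *out = s.data[idx];
        return true;
    }

    static bool span_set(byte_span s, size_t idx, uint8_t val) {
        if (idx >= s.len) return false;
        s.data[idx] = val;
        return true;
    }

    int main(void) {
        uint8_t buf[4] = {0};
        byte_span s = { buf, sizeof buf };
        uint8_t v = 0;

        span_set(s, 2, 42);              /* in bounds: succeeds */
        span_get(s, 2, &v);              /* reads back 42 */
        bool ok = span_set(s, 9, 1);     /* out of bounds: rejected, nothing corrupted */
        return ok ? 0 : 1;
    }

Nothing stops anyone from writing this today; the point of the parent comments is that it never becomes the idiomatic default.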


> Give it 40 years and people will be saying "rust bad, new thing good".

If there's a significant improvement without critical downsides - I really hope they do. What's the reason not to?


There's no reason not to. We'll just have to suffer through thousands more HN posts saying "new thing better, they should have used the new thing" whenever somebody finds a new kernel vulnerability.

It does nothing to help the situation. It's just complaining from people who won't be satisfied until all software is written entirely in rust. Why not contribute to the kernel and try to fix some bugs instead?


I believe there will always be some people cheering from the side, not directly contributing. But they're a useful barometer for the current weather, and for some time now we've known a change is coming. Who knows if the Rust kernel modules would've happened without so many people reminding us that we don't have to suffer from the same bug classes forever.


> It's just complaining from people who won't be satisfied until all software is written entirely in rust.

I don't care what memory safe language they choose, rust is just the obvious choice for this domain.

> Why not contribute to the kernel and try to fix some bugs instead?

Who says we don't? My company contributed a 0day to the Linux kernel just recently.

Maybe we're all saying "new thing better" because we're sick of people writing garbage software and making users unsafe? Maybe we actually know better than you?


Rust doesn't provide integer overflow safety unless explicitly requested. This is something refinement types, which ATS2 provides, would catch.


This isn't exactly true: overflow panics in debug builds, and the checks are only omitted by default when compiled in '--release' mode (they can be re-enabled there too). Rust also provides plenty of methods to make the programmer's intent explicit and perform 'safer' arithmetic, such as checked_add, wrapping_add and saturating_add.

See: https://doc.rust-lang.org/book/ch03-02-data-types.html?highl...


And unless there's an unsafe block, integer overflow cannot lead to a memory safety issue.


I know, like why didn't they just write everything in Rust from the start! Oh yeah, Rust wasn't around.

Well, they should all stop what they're doing and rewrite everything in Rust. Oh hang on, that's millions of lines of code.

It's not Unix philosophy, it's legacy. And it predates Rust.


Oh, it is UNIX philosophy alright, given that C was created to make UNIX portable, and security best practices were disregarded, offloaded into lint, which most people keep ignoring.

> Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions.

https://www.bell-labs.com/usr/dmr/www/chist.html

> Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.

-- C.A.R. Hoare in his Turing Award speech in 1981.


Uninitialized stack is just negligence. Compilers can trivially initialize all stack variables. Yes, it's slower, but Android and Windows already chose the slow-but-secure default. Linux just decided to be insecure.
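For anyone curious what that looks like in practice, here's a minimal sketch; the flag and config names below are the ones I believe recent Clang/GCC and the kernel use, so double-check against your toolchain:

    /* Build with automatic initialization of stack variables, e.g.:
     *   clang -ftrivial-auto-var-init=zero leak.c
     *   gcc   -ftrivial-auto-var-init=zero leak.c   (GCC 12+)
     * The kernel-side equivalent is CONFIG_INIT_STACK_ALL_ZERO.
     */
    #include <stdio.h>

    int main(void) {
        int hole[16];   /* never initialized by this code */

        /* Without the flag this prints whatever happened to be on the
           stack (the classic info-leak pattern); with it, every element
           reads back as zero. */
        for (int i = 0; i < 16; i++)
            printf("%d ", hole[i]);
        putchar('\n');
        return 0;
    }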


Trivial initialization is not a panacea.


Absolutely correct! That's the answer/solution. Not rewrite everything in rust.


Unix philosophy has plenty of problems: https://web.mit.edu/~simsong/www/ugh.pdf


That book is truly outdated. By ~1999-2000 most BSDs, and GNU/Linux since 1996, had solved all those issues with better shells, faster tools, toolkits like Motif (and later Qt), and multimedia thanks to SDL, DRI and FFmpeg.

It's like criticizing Windows 9x's lack of security and stability in the Windows 10 era. Absurd.

Heck, that book praised Mac OS back in the day, and just look at what Mac OS has been turned into since Mac OS X.

Also, thanks to the Unix philosophy, my Fluxbox+Rox+Roxlib-based "DE" is much faster and more featureful than most light desktops out there.


Of course it's outdated. But many of the complaints are still just as true: two-letter command-line options, the expansion of * in place (particularly if there's a file with that name), the role of the shell versus the app in parsing arguments, coredumps on crashes versus interactive debuggers, losing many of the great things that Symbolics computers did that most computers still don't, the preference for simplicity over correctness, the way file versioning works (it doesn't exist)... it's full of great timeless analysis. I'm only partway through it myself and it's easy to find things that are not solved, or cannot be solved, without massively changing the OS. It goes into great depth about the philosophical problems regardless of time.


> Two letter command line options

That's a con against RSI.


Plenty of languages older than C would have caught that.


As an early career developer I feel helpless with the vast world of cyber security looming all around me, and not that many people thinking very much about it.

It feels kind of like COVID in 2022. Obviously everywhere. Probably not going to hurt me? Could end my career.


> Could end my career.

Very unlikely.

Stick to good practices. If you are asked to do something that's bad for security, raise your objections in written form (email/tickets).

Security isn't only your responsibility; it's also the product manager's and the security team's responsibility.

All together this means that it's very hard for a company to convincingly blame a single developer for an incident. Maybe they fire you as a scapegoat, but that isn't very likely, and it should be far from career ending.


> Security isn't only your responsibility

This.

As a developer, your security-aligned goals should include code review, readable/auditable code, configuration options that support testing/QA, features that operate on a principle of least surprise to the user, and other things like these. Security is ultimately a blended practice.


Don't worry, embrace the lack of care about writing secure systems; there will be plenty of job opportunities in cyber security until the culture finally changes.


I felt the same way back when my career started in the late 90's, with what felt like daily new kernel updates for security issues and tons of realizations that the "old way" of doing things was insecure.

You'll make it. There will be bad times and there will be very long nights and weekends, and you'll probably have a few events in your life that will make you say "ok, I'm done" but you'll survive and make it through.

Try to be mindful of your mental and physical health as dealing with these issues can take a toll on you, but do know that you will survive and make it through.


I’m an old developer and I’ve been worried about the same thing for most of my career. We’ll either be fine or you’ll have a far easier time switching careers than me. :)

But we’ll likely be fine. Just keep a healthy respect for security, keep learning and think through best practices.


> Could end my career

If it did, half of us would be jobless. Make reasonable choices and agree on them with the team and the stakeholders. For extra safety, do not accept jobs above your level of knowledge of security issues.


Just being practical and communicative will get you farther than you think.


Is this something that only affects linux servers, or are desktop users affected as well? Apologies if this question is too noob-ish.


This is what I gathered from looking around:

1016 comes from an uninitialized value. 1015 is an int8 overflow and out-of-bounds access (a generic sketch of that bug class follows the version list below). Both are C footguns (though not exclusive to C). The latter arguably might not have happened under a stable/specified ABI.

1015:

    Introduced in 5.12
    Fixed in 6e1acfa387b9, 2022-03-17
    In LTS, fixed in 5.10.109 and 5.15.32
    Also, in 5.16.18 and 5.17.1
1016:

    Introduced in v3.13-rc1
    Fixed in 4c905f6740a3, 2022-03-17
    Fixed in same point releases as above, plus 5.10.109
    Doesn't look fixed in older LTSes yet
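For readers wondering what the int8-overflow class generally looks like, here's a deliberately simplified, hypothetical sketch (not the actual nf_tables code; the names and shapes are invented):

    #include <stdint.h>
    #include <string.h>

    /* Imaginary fixed-size register file, validated once and trusted later. */
    static uint8_t regs[64];

    static int store(uint32_t off, const void *src, uint8_t len) {
        /* The bound is computed in 8-bit arithmetic, so off + len can wrap
           around and still look "small enough" to pass the check... */
        int8_t end = (int8_t)(off + len);
        if (end > (int8_t)sizeof(regs))
            return -1;

        /* ...while the full-width off used here can point far past regs[]. */
        memcpy(&regs[off], src, len);
        return 0;
    }

    int main(void) {
        uint8_t payload[10] = {0};
        /* off = 200 wraps the 8-bit check but indexes well past regs[64]. */
        return store(200, payload, sizeof payload);
    }

The point isn't this exact code; it's that once a width-truncating conversion sneaks into a validation path, the check and the use no longer agree.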


What's the practical exposure to CLONE_NEWUSER | CLONE_NEWNET?


Most Linux distributions these days are allowing all/most users (not just root) to create these namespaces by default, which allows normal unprivileged users to access security bugs in nftables (like this one) (amongst other things).
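Concretely, the step that puts an unprivileged process in a position to talk to nf_tables looks roughly like this (a minimal sketch; error handling and namespace setup details omitted, and some distros disable unprivileged user namespaces via sysctl):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        /* As an ordinary user, request a fresh user + network namespace.
           Inside them the process holds a full capability set (including
           CAP_NET_ADMIN), which is what the nf_tables netlink interface
           checks before accepting commands. */
        if (unshare(CLONE_NEWUSER | CLONE_NEWNET) != 0) {
            perror("unshare");   /* e.g. blocked by seccomp or sysctl policy */
            return 1;
        }

        /* From here the process can open an NFNETLINK socket and reach the
           nf_tables code paths in its own namespace. */
        puts("namespaces created");
        return 0;
    }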


Most containers.


Most containers are going to block unshare() via seccomp, no?


Therein lies an interesting detail. Docker does block unshare in default configurations, using its seccomp filter.

However in Kubernetes, by default, Docker's seccomp filter is disabled. At the moment you need to re-enable it on a pod by pod basis. There is work to allow a default cluster-wide setting but that isn't at GA yet.


Most containers run as root inside the container, which means they can access nftables in the container.

One of many reasons running as root inside a container is a bad idea.


Most containers would not have CAP_NET_ADMIN and not be able to access nftables.


My understanding is that containers actually can access nftables with CLONE_NEWUSER even without CAP_NET_ADMIN.

EDIT: Apparently the Docker default capabilities don't allow CLONE_NEWUSER: https://opensource.com/business/15/3/docker-security-tuning


Except the default seccomp policy is not used for Kubernetes containers.

I didn't really think about this vector where you CLONE_NEWUSER in a container... definitely on systems that allow unprivileged users to do this it is a problem.


    root@ee375d5150bc:/# pscap -a
    ppid  pid  name  command  capabilities
    0     1    root  bash     chown, dac_override, fowner, fsetid, kill, setgid, setuid, setpcap, net_bind_service, net_raw, sys_chroot, mknod, audit_write, setfcap

That's ubuntu.


> Most containers run as root inside the container

Is that actually surveyed / quantified somewhere? I can't say I see that too often in professional environments and even home stuff sees a lot of standardisation around separate users (https://docs.linuxserver.io/general/understanding-puid-and-p...)


Who would have thought it!

Where are these admins who demand this configuration?


I'm going back to kernel 2.6 .


It sounds crazy, but our CentOS 6 servers (which run 2.6.32) using CloudLinux 6 ELS (which provides kernel hot patching for vulnerabilities like this, plus continued security updates for Apache, PHP, MySQL, Glibc, OpenSSL, OpenSSH, etc through 2024) are our most reliable ones. It's given us a couple years more breathing room to let the Alma/Rocky debate settle and work on migrating legacy applications to newer platforms without having to stress about being on an unsupported release.


It is OK until new hardware is purchased that won't run on the older kernels. For pure VM stuff, 2.6.32 is fine (Go supports it without patches), but the newer kernels have a much better scheduler and other CPU optimizations. I saw about a 25% efficiency gain from upgrading to 4.x-series kernels after Spectre etc.


Is this remotely exploitable or is it only a local vulnerability?


Local


Makes sense for a program which uses the command nft /s


I know you are probably getting downvoted, though your comment made me chuckle really hard; thank you for that.

People sometimes forget that sarcasm is a nice tool in criticism (though here it's just for fun).


Here come the Rust people saying we should build everything in Rust; it will fix all your problems and solve world hunger.


You could also just compile your kernel with things like stack zeroing enabled and eat the 0.1% performance hit in exchange for not getting owned by classic exploits :(


The safe systems programming people would rather point to the list of systems programming languages since JOVIAL in 1958, and make the point that Multics, Burroughs and VAX/VMS actually had higher security assessments than UNIX, primarily by not having C powering their kernels, but rather systems languages whose security was part of the whole OS design.


Yeah, and I'll never stop saying we should use memory safe languages. Deal with it.



