Making C Less Dangerous in the Linux Kernel [video] (youtube.com)
149 points by reddotX on Jan 26, 2019 | 14 comments



Kees is one of the few developers in the world doing the boots-on-the-ground, rough-and-tumble, soul-withering grunt work of actually making the Linux kernel more secure. It's not glorious work, but it's perhaps the most important work anyone's doing in the kernel these days. The world runs on the Linux kernel, and from a security standpoint, it's really been a mess (see: Dmitry's Syzkaller talk). Too many contributors are drive-by, scattershot "my boss told me to upstream this stuff" types who don't really care about the incremental suckage they inject into the kernel. Bravo to Kees for doing what he's doing.


Indeed, and that is why I mostly criticise C: we need solid foundations, and UNIX clones aren't going anywhere for the foreseeable future.

So whatever can be done in a safer language should be, and for use cases like this we really need to improve what it actually means to use C.

Linux is a very good example that even with quality gates, safety errors creep in and something has to be done about it.

Really kudos to Kees and everyone else involved on these projects.


Indeed, we need more people like him.


There's a misleading statement in the "undefined behavior" slide. Allow me to nitpick, since this subject is so full of misunderstandings and confusion.

"What are the contents of uninitialized variables? ... whatever was in memory before now!"

This may be true or false - as the slide itself says, "with undefined behavior, anything is possible!".

Besides, the subject of accessing uninitialized variables is more nuanced than "undefined behavior". Among other things, the effects depend on the variable's type ("unsigned char" does not have trap representations):

https://stackoverflow.com/questions/11962457/why-is-using-an...

https://stackoverflow.com/questions/6725809/trap-representat...

It's also important to note that C++ has different rules on this. For example, the extract_int function in the last link is valid C, but not valid C++ (in C++ you'd use std::memcpy to achieve the same thing in a valid way).
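
As a rough illustration of the memcpy approach mentioned above (which works in C as well as C++), here is a minimal sketch. The helper names load_int and store_int are hypothetical; this is not the extract_int function from the linked answer, only an example of copying an object representation through bytes instead of casting pointers.

```c
#include <string.h>

/* Hypothetical helper: reads an int out of a byte buffer by copying its
 * object representation, rather than by casting the pointer. */
int load_int(const unsigned char *bytes)
{
    int value;
    memcpy(&value, bytes, sizeof value);  /* value is fully written before use */
    return value;
}

/* Hypothetical inverse: exposes an int's bytes through unsigned char,
 * the type the comment above notes has no trap representations. */
void store_int(unsigned char *bytes, int value)
{
    memcpy(bytes, &value, sizeof value);
}
```

The buffer passed to either helper is assumed to hold at least sizeof(int) bytes.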


Also worth noting: the ISO C++ working group is trying to reduce the amount of UB, while the ISO C working group doesn't have any ongoing papers in this direction.


[flagged]


There is a genuine, non-pedantic difference between a variable having an arbitrary value and reading it causing undefined behaviour. For example, say you don't care what value a variable has so long as it's even:

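    #include <stdio.h>

    void foo(void) {
        int i;           /* deliberately left uninitialized */
        i -= i % 2;      /* undefined behaviour: reads i before any write */
        printf("%d %d\n", i, i);
    }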
The numbers printed could be odd, because the compiler is allowed to do anything it likes. It could even print two different numbers out!

If you still think this is all pedantry that can't happen in practice, here's an example where compilers are known to do strange things when the behaviour is undefined:

    void baz(int param);  // Some other function, defined elsewhere

    int bar(int param) {
        int uninit;
        if (param == 0) {
            uninit = 0;
        }
        baz(param);
        return uninit;  // Undefined behaviour unless param == 0
    }
In this snippet of code, you might think that you're safe so long as you don't look at the return value of bar(). But in fact the optimiser may conclude that param must be zero (because anything else would be undefined behaviour), so baz() is always called with a parameter of zero even if bar() was called with a different argument. A problem very similar to this was discussed as a source of possible vulnerabilities in the Linux kernel [1] (although I don't know if any actual vulnerabilities were found).

[1] https://lwn.net/Articles/575563/
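
For contrast, here is a minimal sketch of the bar() snippet above with the undefined behaviour removed by giving the variable a definite initial value. The default of -1, the name bar_initialized, and the body of baz() are all assumptions made for illustration; the original comment only says baz() is "some other function".

```c
/* Hypothetical stand-in for "some other function": it just records
 * the argument it was called with. */
static int last_seen;

static void baz(int param)
{
    last_seen = param;
}

int bar_initialized(int param)
{
    int result = -1;  /* assumed default; any definite value removes the UB */
    if (param == 0) {
        result = 0;
    }
    baz(param);       /* the compiler can no longer assume param == 0 */
    return result;
}
```

With the initializer in place, every path returns a well-defined value, so the optimiser has no undefined path from which to infer anything about param.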


How was the original statement perfectly valid? Let's try with two different compilers:

gcc 8.2: https://godbolt.org/z/ZcsChr

clang 3.8: https://godbolt.org/z/pVDCPi

It looks like at least gcc disagrees that reading uninitialized variables is a way to find out what's in memory.


I didn't even know that scnprintf was a thing. Cool! I learned something just by skimming the video.


I think the presenter mentions it's only in the Linux kernel though, not available in userspace.


The userspace version is snprintf.


There's a great write up here on how assuming scnprintf behaved like snprintf led to memory bugs:

http://blog.infosectcbr.com.au/2018/11/memory-bugs-in-multip...


The speaker mentions it, seems like the return value is different: https://i.imgur.com/Jj2fMZu.png


Yes. snprintf() returns the length of the string that would have been written had the output not been truncated to fit the buffer. If you accumulate strings into a buffer and use that return value to advance the offset and shrink the remaining size, bad things will happen.

scnprintf() returns the number of bytes actually written (excluding the terminating NUL).
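
To make the difference in return values concrete, here is a minimal userspace sketch. my_scnprintf is a hypothetical stand-in written for this example; it is not a libc function and not the kernel's actual implementation, only an approximation of the behaviour described above built on the standard vsnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace approximation of scnprintf(): returns the number
 * of characters actually stored in buf, excluding the terminating NUL. */
int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;                /* nothing can be stored */

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;                /* assumption: report errors as "nothing written" */
    if ((size_t)n >= size)
        return (int)(size - 1);  /* output was truncated to fit */
    return n;
}
```

Formatting the 12-character string "hello, world" into an 8-byte buffer, snprintf() returns 12 while this sketch returns 7. Code that does `off += snprintf(buf + off, size - off, ...)` can therefore push `off` past the end of the buffer, whereas the scnprintf-style return value never exceeds what was stored.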


The C standard's snprintf is not like scnprintf.



