
    #define MAXLINE 1000   /* maximum input line length, as in the K&R example */
    int i;

    // use heap memory as many modern systems do
    char *line = malloc(MAXLINE);
    char *longest = malloc(MAXLINE);

    assert(line != NULL && longest != NULL && "memory error");

    // initialize it but make a classic "off by one" error
    for(i = 0; i < MAXLINE; i++) {
        line[i] = 'a';
    }
So, you create something that does not fulfill the C library invariant of what constitutes a "string", and then pass it to a copy function that assumes this invariant? It isn't a fair thing to do, and frankly, I doubt many beginner programmers care about things like this. Yes, they may run into such a "defect" and be very miserable for a while, but that will just teach them about debugging and, most importantly, invariants.

Zed, I appreciate your work, but if this is the direction you'll be taking with these articles, then don't bother.




> So, you create something that does not fulfill the C library invariant of what constitutes a "string", and then pass it to a copy function that assumes this invariant?

From the article:

> Some folks then defend this function (despite the proof above) by claiming that the strings in the proof aren't C strings. They want to apply an artful dodge that says "the function is not defective because you aren't giving it the right inputs", but I'm saying the function is defective because most of the possible inputs cause it to crash the software.

Zed's point is not that K&R is bad because their example code doesn't match C library invariants, he's saying it's bad because it encourages people to write code that blindly assumes C library invariants will always hold.

Certainly, if you're going to write C code there are some things that really do require blind trust (for example: that your code will be compiled by a conformant C compiler), but "all strings are safely null-terminated" is incorrect so often, and the cause of so many historic security vulnerabilities and crashes, that perhaps we shouldn't be encouraging new C programmers to rely on it.
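
To make that concrete, here's a minimal sketch (not from the article; the function name and signature are my own) of a copy that takes the destination size explicitly instead of trusting that the source is null-terminated:

    #include <stddef.h>

    /* Illustrative bounded copy: copies at most dstsize - 1 bytes and always
     * terminates dst, so a source that is missing its '\0' cannot run past
     * the end of the destination buffer. Returns the number of bytes copied. */
    size_t copy_bounded(char *dst, size_t dstsize, const char *src, size_t srclen)
    {
        size_t n = 0;
        if (dst == NULL || src == NULL || dstsize == 0)
            return 0;
        while (n < dstsize - 1 && n < srclen && src[n] != '\0') {
            dst[n] = src[n];
            n++;
        }
        dst[n] = '\0';
        return n;
    }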


So in a dynamically typed language, we then add type checks everywhere?

Because we wouldn't want a function to accidentally process data that wasn't meant for it, no?


The equivalent errors in dynamically typed languages don't lead to critical security vulnerabilities like they do in C. You get a nice exception and a clean crash, not a hook to grab root.


Realistically, just about every mainstream dynamic language today has its major implementation(s) written using C or C++. Some of them have significant amounts of C code underlying their common libraries, as well. Some of the most popular third-party libraries or modules are in fact nothing more than thin wrappers over existing C or C++ code.

Don't think that you're escaping C or C++ just because you're using Ruby, or JavaScript, or Python, or Perl, or Tcl, or even Java. Don't think that you aren't as vulnerable using a dynamic language as you are using C or C++ directly.

Furthermore, it is quite easy for dynamically typed languages to suffer from very serious security vulnerabilities. There was one affecting Ruby and Ruby on Rails that was widely publicized just a few days ago. You can read more about it at: http://news.ycombinator.com/item?id=5002006


But the beauty of modern programming languages is that the responsibility has gone from individual programmers to the actual implementation developers. You would expect that the people who implement the runtimes and libraries are far, far more knowledgeable, experienced and trustworthy. And it often is so -- the end user (the programmer) can't be trusted to know all about security implications and possible vulnerabilities, let alone how to exploit them!

The whole idea that "well, it's insecure because you program in an insecure way!" is outright idiotic, and should be killed. If that means getting rid of C and C++ and the people who write in these languages (hey, myself included!) then so be it. The faster the better. Sure, there are cases in which it's hardly possible, but I'd take a 10 times slower computer for 1) fewer bugs 2) more advanced software 3) cheaper software 4) far fewer security concerns any day.

Now, I'm going to close the C++ project and go write some Python. Makes me feel happy and far less stressed, although C++ does damn well in comparison to C.


But using a language like Ruby or PHP, for instance, doesn't really lead to fewer bugs, or more advanced software, or cheaper software, or fewer security concerns in practice.

What it often actually leads to is untalented developers creating a lot of bug-ridden and vulnerable code extremely quickly. It's efficiency in all the wrong ways.

Do you remember that Diaspora social networking project that received a lot of hype a couple of years ago? It was a Ruby on Rails app, and the early releases were absolutely filled with some particularly nasty bugs and security flaws. The only reason they were eventually fixed is because the code was made public, and people pointed out these flaws. There is a lot of Ruby code out there, for instance, that isn't public, yet is still riddled with the same types of problems.

That's not to say that the same isn't true for Python, or Java, or C#, or C++, or any other language. But we shouldn't be claiming that using a language like Ruby or Python somehow leads to more secure code. It doesn't, and it's dangerous to think that it does.


> But using a language like Ruby or PHP, for instance, doesn't really lead to fewer bugs, or more advanced software, or cheaper software, or fewer security concerns in practice.

Which domain are we talking about? Of course web-based applications have their own problems, but imagine if idiomatic Ruby or PHP code were vulnerable to buffer overflows, double-free/use-after-free or format string vulnerabilities on top of the current problems; would you still say that languages don't matter when it comes to software development problems and issues? Essentially, what you're saying is that modern programming languages aren't any better in practice in these regards than C. Honestly?

Of course no language can prevent outright bad code, but a language, by its design, can eliminate issues related to, for example, type safety and memory safety. Consider Rust as an example of this: by design, the language eliminates these classes of issues, so code is less prone to bugs and has no security concerns related to them, leaving more time for validating correct behavior and fixing misbehaving parts.

> But we shouldn't be claiming that using a language like Ruby or Python somehow leads to more secure code. It doesn't, and it's dangerous to think that it does.

What on earth am I reading? What are the equivalents to buffer overflow or format string vulnerability in Python or Ruby? How do I execute arbitrary machine code with Python or Ruby if malicious input is given to the vulnerable program?
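
For anyone unfamiliar with the C side of that comparison, a format string vulnerability boils down to something like this (a deliberately minimal sketch; the function and variable names are made up):

    #include <stdio.h>

    void log_message(const char *user_input)
    {
        /* Vulnerable: user_input becomes the format string, so input
         * containing %x or %n directives can read or even write memory. */
        printf(user_input);

        /* Safe: the input is passed as data, not as a format string. */
        printf("%s", user_input);
    }

In idiomatic Python or Ruby the worst a stray string typically gets you is an exception, which is exactly the asymmetry being pointed out here.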


Still, there is a whole class of memory errors and exploits that you can stop caring about once you have managed code. The tradeoff is obviously performance, although as Java/.NET show us, not necessarily too much of it.


>I'd take 10 times slower computer for 1) fewer bugs 2) more advanced software 3) cheaper software 4) far less security concerns any day.

Fair enough, but as soon as you build your slow app the competitive market will want to buy the version that runs 10x faster. In some cases it won't matter, but where the software has to run in real time it very much does. There's no escaping C.


Sure, there's no escaping C, but that's mainly because of the investments we've put into it. The same goes for C++, though it's far more manageable from a security point of view. Today we have modern languages which are only a tad slower than C, yet which guarantee safety and control. See Ada 2012 for example. Also, optional unsafe/unmanaged code blocks can really help with maintaining performance in critical parts while keeping the non-critical parts safe/managed. This goes as far as optional garbage collection for certain objects and manual management for others. That kind of flexibility is something C doesn't provide, which makes C a horrible language in today's world.

If only someone would start rewriting de facto low-level infrastructure such as kernels (say, Linux) and userspace tools and programs (servers such as Apache, language implementations such as those of Python and Ruby, libraries, ...) in something like Rust or an equivalent that guarantees both type and memory safety, has a strong emphasis on concurrency, encourages immutable state, and so on.

Maybe one day we simply won't have to care so much about what's "secure" and what's "vulnerable", because the concept of software vulnerability is destructive. Yes, it employs people, but these people create no real value; they just fight the destructiveness of vulnerable software. They would be worthless in an ideal world.


There is a very important concept in software called "good enough". Sure, there are languages that could be better than C/C++, but 90% of what we see on a desktop is written in C or C++. For a bad language, that's good enough, I would say.

It would be different if we had a language with the same performance characteristics and dramatically better high level features, but to this point we still don't have it. That is why software developers are in no rush to move from C to another of the languages that have been discussed lately.


I would rather attribute the current pace of things and the popularity of C and C++ to what has been invested in those languages. Tons of big projects are written in C and C++. Many of them were begun at a time when performance was a major issue, unlike today. Also, finding contributors for C and C++ projects is going to be far easier than for projects written in, say, Go or Rust or any other reasonably suitable language. Not to mention libraries.

For a typical new desktop application, C and C++ have been long dead for at least a decade now, thanks to C# and .NET. It's a tad different on Linux and Mac though.

If we were to start from scratch, I'm sure C wouldn't have the popularity it had 20 years ago. The language is inferior by design by modern standards. Yes, there are domains where it's still relevant, but the consumer PC (let alone mobile) is not one of them. If C were relevant there, I'm sure we'd rather write mobile apps in C instead of, say, Java.


> There was one affecting Ruby and Ruby on Rails

What does it have to do with Ruby, besides the fact that RoR is written in Ruby?


Why not use a modern statically typed language instead?


>Don't think that you're escaping C or C++ just because you're using Ruby, or JavaScript, or Python, or Perl, or Tcl, or even Java. Don't think that you aren't as vulnerable using a dynamic language as you are using C or C++ directly.

Actually this is completely backwards.

Think exactly that you are NOT AS vulnerable as using C or C++ directly.

That it's C/C++ underneath has little importance.

It is FAR MORE difficult and FAR LESS common to reach runtime/interpreter bugs than to produce bugs of your own in the higher level language.

>Furthermore, it is quite easy for dynamically typed languages to suffer from very serious security vulnerabilities.

Of a different kind, that doesn't pertain to the current discussion.


So they have infinity minus one problems. You still have critical systems crashing or silently misbehaving.


Does the absence of type checks in dynamically typed code result in remote execution vulnerabilities typically?

It's twenty goddamn thirteen, stop writing trivial buffer overflow vulnerabilities already. And that applies whether you are a novice or an expert.


When you use a dynamically typed language you have CHOSEN to live without those concerns -- you traded checking for flexibility.

In a statically typed language, where you already made the effort of using types, it would be a shame for your program to die or cause havoc because you didn't also think about enforcing some invariants.

Plus, those kinds of errors in C can cause buffer overflows, privilege escalation and such.

In a dynamic language it's usually just a halt and a stack-trace.

(And actually there is a move to do exactly what you say -- add type checks to dynamic languages. See Dart and TypeScript, with their optional type annotations and checks. But if it were possible to get those checks optionally by pure inference, without adding anything to a dynamic language's syntax, most people would jump at it instantly.)


You are being too kind. I was expecting some earth shaking discoveries, only to find the author doesn't understand C.


I concur; I honestly wonder whether this is a matter of hubris becoming more important than accuracy.

The accurate fact is that K&R C is a book about C. It is not the end-all, but rather an introduction to the language. Sure, it has thorns. Sure, you'd be a fool to adopt the style from it; that speaks more of the culture of its readership than of the book itself, however. The authors are very honest that their samples are an attempt to engage the reader's attention in the language, especially the ingenue, newcomer, non-professional C programmer.

To that end, the book succeeds; new C programmers get an introduction, a light read, a good set of nomenclature to understand the topic further, and so on. It is not intended, in spite of the cultural proclivity towards these things, to be "A Bible of C".

And if it were, no professional C coder worth their salty words these days would be without the New Testament, right alongside K&R on the neglected end of the bookstack. That book is of course "Expert C Programming: Deep C Secrets", which explains rather a lot more about the thorns of professional C, and more, in a manner as comfortable as that of K&R, the authors of C as well as of a book about C.

In my opinion, Peter van der Linden has already done for K&R what Zed doesn't seem to have the humility to do: proven its value to the newcomer in becoming one step closer to a professional.


Agreed. Once you know C, Deep C Secrets takes it to a whole new level and is a great body of work. My 3 C books are:

1. K&R 2. C Traps and Pitfalls 3. Deep C Secrets.


Zed and his writings are not to be taken too seriously.


I suspect this one is a diversionary tactic.

I guess Zed Shaw suffers from nerd burnout. As in, a sort of emotional burnout from having had to deal with nerds all the time in the past - or at least, that's what I get from some of his writings anyway. So I imagine him popping this stuff in as a sort of early warning system. It's all true enough to be right, and true enough to get his point across, BUT IT'S NOT TRUE ENOUGH FOR A NERD. So any time somebody complains about his strcpy example, or 0-terminated C strings, or whatever, that's his nerd alert. This person is not worth dealing with, and now he can block them, or set up a mail filter to put their email in the junk folder, or whatever, without having had to invest any time in finding this out the long way.

There was also a bit in one of his essays about how Ruby fans were always these stupid armchair pop psychologists.


I think it functions less as a "nerd" alert and more as a "recognizes I may not be as profound as I like to imagine myself" alert.

I mean, he's right... but he's not being profound. You may as well tell me that I should be careful about losing precision while using floats. Yes... no shit?


Yeah, when your entire career and persona is built around being the only intelligent person in an industry full of idiots, it is natural to need to drive all the knowledgeable people out of your personal space.


"C is a ghetto"


I'm not trying to be "kind". I usually like reading Zed's technical work and am just expressing my opinion about this particular piece. I agree that debating the theoretical provability of correctness of programs may be interesting in some academic courses, but using it to bash K&R for their book's contents is, as I said, unfair.


He certainly understands C programmers well enough to understand that when they write code like the snippet he critiqued, it's like handing a kid a loaded shotgun.

It will lead to misery.


Yes, we are all sure you understand C better than ZShaw.

I expect you will point us to your prodigious output in the language, and that it was only by accident that you forgot to include any points of critique in your comment.


"if this is the direction you'll be taking with these articles, then don't bother"

But this thread on the book is a pretty good resource on issues and gotchas for newbies.

I largely use Python; I've dabbled in C and always mean to learn more. K&R to me is the touchstone for that, just because so much of programming culture stems from it. I find knowing these historical patterns helpful for understanding how programmers talk.

For someone like me, respectful critique of its style and decisions helps separate the good from the less-helpful. Whether Zed intended to start that discussion or not, it's still helpful.


>So, you create something that does not fulfill the C library invariant of what constitutes a "string", and then pass it to a copy function that assumes this invariant? It isn't a fair thing to do

Well, life as a program is not fair either.

The problem is that your function can be used in many contexts, including by other people. You should not expect them to be fair; you should make your function robust.
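
As a rough sketch of what "robust" could mean here (the names and error convention are illustrative, nobody's actual API), the function can simply refuse to copy anything it cannot verify is a terminated string within a known bound:

    #include <string.h>

    /* Illustrative defensive copy: dst is assumed to hold at least maxlen
     * bytes. Returns 0 on success and -1 if src is NULL or contains no
     * '\0' within maxlen bytes, instead of walking past the buffer's end. */
    int copy_checked(char *dst, const char *src, size_t maxlen)
    {
        if (dst == NULL || src == NULL)
            return -1;
        if (memchr(src, '\0', maxlen) == NULL)
            return -1;   /* no terminator found: not a valid C string */
        memcpy(dst, src, strlen(src) + 1);
        return 0;
    }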



