Pro tip: when you develop with gcc, don't settle for anything less than
gcc -ansi -pedantic -Wall
At university I often had people coming to me and asking, "My program compiles correctly, but it keeps crashing / doing weird things; what did I do wrong?" I always said, "Add '-ansi -pedantic -Wall' to the compile flags and come back to me when you've cleared that wall of warnings down to nothing." Every single time the problem was directly or indirectly caused by something the warning messages pointed at.
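A hypothetical sketch of the kind of bug that sails through a default build but gets flagged by -Wall (the function and its name are made up for illustration):

```c
/* With -Wall, gcc warns "suggest parentheses around assignment used
 * as truth value" here; without it, this compiles silently. The
 * assignment was meant to be a comparison, so the branch is never
 * taken and the function always returns 0 -- even for zero input. */
static int is_zero(int x)
{
    if (x = 0)          /* meant: x == 0 */
        return 1;
    return 0;
}
```

The crash-inducing cousins of this bug (format-string mismatches, implicit declarations, uninitialized reads) are caught the same way.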
It's a shame that the default choice of GCC flags makes this compiler pretty much useless for work.
The warnings are there to help you. You can choose to ignore them and that's fine, as long as you do it deliberately and with a damn good reason, not because you don't even know they're there.
-std=cNN specifies the base language standard. Much existing code depends on GCC extensions to that standard (for better or for worse), so it is often necessary to use -std=gnuNN instead (specifying base language dialect + extensions).
Even when you do have a good reason to ignore them, you should still either disable that specific type of warning or, better yet, disable that specific warning for that specific spot in the code using gcc's diagnostic pragmas.
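For instance, a sketch of spot-suppression with GCC's diagnostic pragmas (the callback and its names are illustrative, not from any real API):

```c
/* Callbacks often must match a fixed signature even when they ignore
 * an argument; silence -Wunused-parameter for just this function
 * rather than for the whole build. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
static int on_event(int event_id, void *user_data)
{
    return event_id;    /* user_data deliberately unused */
}
#pragma GCC diagnostic pop
```

The push/pop pair keeps the suppression scoped: code after the pop is checked with the original warning settings again.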
I think even the most skilled programmer can benefit from making their code pass -Wall completely, even if just for the sake of sharing with other developers (but we all make mistakes and it'll likely catch some of those too!).
There's a reason that for quite a number of these warnings, Go has either designed the causes out of the language (no implicit casting, only manual conversions), or promoted them to compiler errors (unused variables and imports).
It was named when "ANSI or not ANSI?" was the interesting question, and stuck around for compatibility. It should be kept and its semantics should not be changed, but it should be called out as deprecated. At least the gcc man page does say "In C mode, this is equivalent to -std=c90. In C++ mode, it is equivalent to -std=c++98."
Yes, but if more developers used these flags the world would be a better place. Whenever I have the time, I build with these flags and report bugs back to the upstream developers. Then I drop the flags and move on with my life :)
Furthermore, I wish I had known earlier how much I could contribute to the open-source community by simply downloading a product, walking through its documentation, and submitting corrections and small usability improvements to both the interfaces and the documentation. It's an easy way to start learning your way around the code and internal architecture.
It may be a developer mindset sure, but look at the post you are commenting on... It's a series of posters made for developers to influence their choice of memory handling functions. Context matters quite a bit.
-ansi and -pedantic are solely about standards compliance; that's often not what I want. Skipping them means my code might not be portable to other compilers, but a lot of other compilers implement much or all of GCC's extensions. If the extensions help you write better, safer code faster, use them! I wholeheartedly agree about -Wall, though, and would add -Werror and -Wextra, and would recommend skimming the warning list for other things that might be appropriate but aren't enabled by default.
Half the people in this thread seem to have missed the point. This article is not supporting 100% abstinence of these functions. It is doing exactly the opposite - and showing up people who say "never do this" or "that is always wrong" as being as ridiculous as those who blindly support abstinence from sex. This kind of tunnel vision, learning by rote, and refusal to teach what is really happening, is exactly the same stupid attitude that creates bad programmers, as it creates unbalanced teenagers.
Perhaps, just like condoms, memcpy, strcpy, and strcat are dangerous or ineffective 10% of the time. But the other 90% of the time, when used correctly, they are perfectly safe, fun, and essential to learning and growing. Avoiding them completely does not create a healthy relationship between a language and its programmer. I realize that not everyone in this thread is a C programmer, but this prevailing attitude of saying "this should never be done/used" just because you've heard it from someone else is a really petty, annoying, and persistent aspect of programmer culture.
Perhaps, but tried-and-true advertising methodology is: 1) find an undeniable grain of truth, 2) wrap your message around it, so the consumer is confused into thinking your message is, by association, also true.
You might be right about the sarcastic message 'wrapping', but these posters only work because the messages they're wrapped around are true.
I certainly didn't come away from those posters going, 'dang, those guys... so over the top; there's nothing wrong with using strcpy, you're making a fuss about nothing'.
I'm a beginner C programmer, maybe you can share your experience with me.
I've found bugs in libraries where sprintf was used instead of snprintf and it caused crashes. So I know from experience that there are cases where snprintf was the better choice. And I've certainly coded a few off-by-one errors in my time, so I can sure understand how easy it is to get the destination buffer size wrong!
I'm not doing anything performance-critical with strings. Under what circumstances should I use sprintf instead of snprintf and what benefits will I see?
If you understand the difference between snprintf and sprintf, what trouble it can cause, and where to use each, it sounds like you're already doing great. What would be bad is if you had the blind notion that "snprintf is good" and "sprintf is bad" without knowing why.
As for an example of when you might want to use sprintf: if you compile to C89 (ANSI), snprintf isn't included in the standard, so you're left with sprintf if you want your code to be portable.
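Where snprintf is available (C99 and later), a small sketch of using its return value to detect truncation; format_pair is an illustrative name, not a standard function:

```c
#include <stdio.h>
#include <string.h>

/* Build "key=value" into buf, reporting truncation instead of
 * silently overflowing like sprintf would. */
static int format_pair(char *buf, size_t size,
                       const char *key, const char *value)
{
    int n = snprintf(buf, size, "%s=%s", key, value);
    /* snprintf returns the length it *wanted* to write (excluding
     * the terminator); n >= size means the result was truncated. */
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A caller with `char small[8]` asking for "host=example" (13 bytes with the terminator) gets -1 back rather than a corrupted stack.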
A better example is strcpy, though: a function which is completely normal and safe when used correctly, but someone with the wrong idea might reach for strncpy as a "safer" version and cause themselves more trouble for not knowing what is going on. As if to prove my point, you can see someone advocating this exact thing in another comment tree in this thread. They've probably heard from someone that the "n" versions of functions are safe. If they'd actually looked into it, they'd realize strncpy doesn't do what you might think.
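A short demonstration of the strncpy pitfall being described, assuming a source longer than the destination (helper name is illustrative):

```c
#include <string.h>

/* When src is longer than dstsize, strncpy stops after dstsize bytes
 * and does NOT null-terminate dst -- printing dst afterwards would
 * read past the end of the array. */
static void copy_truncated(char *dst, size_t dstsize, const char *src)
{
    strncpy(dst, src, dstsize);
    /* Force termination ourselves, sacrificing the last byte. */
    dst[dstsize - 1] = '\0';
}
```

Copying "a string longer than eight" into a `char dst[8]` this way leaves "a strin": silently truncated, and terminated only because we did it by hand. Neither behavior is what a naive strcpy-replacer expects.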
I think the areas where C is a good and effective solution are shrinking and safer languages are becoming more common and faster (even C++)
The C language itself, not just its library, doesn't allow for good string handling. String handling should be built deeper into the language, and yes, even though it is possible to write safe C code, it's very hard. So only try when it's worth it.
Saying C is "very hard" is useful to dissuade the unprepared from writing buggy, vulnerable software, but for the experienced developer with modern tools, C is at worst "tedious". Some simple habits, like always writing your malloc() and free() calls at the same time in balanced pairs, can make C quite manageable. Add valgrind, unit tests, and static analysis, and I'd have much more confidence in a good C program than in an average program in a weakly typed language.
For people not familiar with it, Valgrind will run your program in a VM and trace memory accesses. It detects when you read from uninitialized/unallocated memory, don't free your memory, etc. Almost every time I have had a non-logic bug in C code, it had a corresponding warning in Valgrind.
"malloc() and free() calls at the same time in balanced pairs"
There is no stable correspondence between number of malloc calls and number of free calls. I might allocate things in 10 places that get freed in one, or vice-versa. A simple example would be "parse a packet and send the built packet (through a message queue) to handling code."
While it's true that there are times when you can't achieve it, I meant to suggest limiting oneself to design patterns that facilitate simpler memory management, and implementing both halves of the memory-management equation at the same time.
Naturally there's a tradeoff; if redesigning your code to allow one malloc() to one free() would introduce more bugs in logic than it would solve in memory issues, then it's not worth it.
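One way to sketch that "balanced pairs" discipline is the create/destroy convention, where each allocation site has exactly one obvious release site; the packet struct and names here are illustrative:

```c
#include <stdlib.h>
#include <string.h>

struct packet {
    size_t len;
    unsigned char *data;
};

/* The only place packet memory is acquired... */
static struct packet *packet_create(const unsigned char *src, size_t len)
{
    struct packet *p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->data = malloc(len);
    if (!p->data) {
        free(p);        /* unwind the partial allocation */
        return NULL;
    }
    memcpy(p->data, src, len);
    p->len = len;
    return p;
}

/* ...and the only place it is released, written at the same time. */
static void packet_destroy(struct packet *p)
{
    if (p) {
        free(p->data);
        free(p);
    }
}
```

The parse-here/free-there message-queue case still works under this convention: whoever ends up owning the packet calls packet_destroy, and there is still exactly one function that knows how to release it.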
I'm pretty sure the "it's very hard" refers to string handling in C, and I'm also fairly solid in my agreement that unless you have a very, very good reason, that is one of the places most people are better off staying away from (especially once you start playing with C's Unicode constructs).
(but your point is totally valid with regard to C in general~)
BString relies on undefined behavior for security:
The reason it is resilient to length overflows is that
bstring lengths are bounded above by INT_MAX, instead of
~(size_t)0. So length addition overflows cause a wraparound
of the integer value, making them negative and causing
balloc() to fail before an erroneous operation can occur.
I'm curious as to who uses a garbage collector in C. It seems like if you're using C, you probably are in a situation where you want as close control of that kind of stuff as possible (short of assembly).
Expanding - in particular, Boehm seems to only run collections on allocations. If your code is structured such that you don't allocate during periods when you need more precise control (already a good idea, if you are using malloc!) then this won't have an impact.
It's a very conservative garbage collector at the end of the day. It can be tweaked so as to be completely bare bones, and the performance impact is very benign. Rather, memory consumption is its weakness.
> safer languages are becoming more common and faster
> though it is possible to have safe C code, it's very hard
A dull knife is pretty safe for most people. It's still possible to shove it into your eye and blind yourself, but other than those extreme cases it won't cause much injury when used in the regular manner. However, it is also extremely inefficient at the purpose it was designed for: cutting things.
There are, of course, safer alternatives to a knife. EMTs use special tools designed to fit a seatbelt or cloth into a small slot and slice through without any risk to a person; they also have specially designed shears which make it difficult to cut flesh, but easily cut through nylon and leather. Utility knives have retractable blades to reduce injury, and other tools are designed to fit specific materials into slots and make cutting people impossible.
All of those are purpose-driven and application-specific solutions, however. For the most high performance and general purpose application, a really sharp fixed-blade knife is still the most precise and efficient tool for the job. When wielded correctly it is still safe and efficient. But the practitioner is not protected from harming themselves; it's expected that they know what they're doing. And really, it's not that hard to learn how to use it correctly.
But I totally get that it's easier to use a dull knife or scissors than learn all about knives, and it gets the job done.
While your point is true, I want to object to your metaphor. I object for safety purposes. In fact, a dull knife is more dangerous than a sharp knife for most anyone who needs to cut things.
You need to press much harder with a dull knife and sometimes even to saw down into the object. You may even need to get a firmer grip on the object you're cutting to counter all that force. Those are very dangerous behaviors. A sharp knife that cuts easily is much, much safer.
Not true. A dull knife is dull; it's not dangerous at all, because it basically can't cut anything. A half-dull or half-sharp knife is dangerous: it can cut, but you don't know how much, and variations in the blade make it unpredictable, in addition to the behaviors you described.
I think I qualified my analogy correctly. For general purpose, high-performance efficiency, you can't beat a knife. However, there are tons of specific applications where a tool other than a knife is preferred. That's the idea behind the phrase "the right tool for the job".
I wouldn't say a hatchet or a machete is a "crappier version" of a knife. Surely a hatchet is much better suited for chopping down a tree, for example. On the other hand, a knife would be very inefficient at the job, even if it could get the job done eventually. However, a hatchet is arguably less safe than a knife for many kinds of jobs, and a knife for hatchet-jobs, etc.
So really I guess my point was the idea of a "safer" language is dumb, because not every tool is "safe" for every job, and not every job is suitable for a "safe" tool.
There's nothing here to prevent a buffer overrun. If your src doesn't end in 0, or your dest is too small, you'll be reading memory you shouldn't and/or obliterating memory past your dest buffer. Your method is pretty much on par with an inlined strcpy.
It still does. Microsoft also banned a lot of C functions from internal usage. For codebases that need to build in VC and other compilers, I'm not yet sure how to really solve this apart from _CRT_SECURE_NO_WARNINGS.
The thing I didn't like about Microsoft's approach is that they also flagged some of the safer C functions, and introduced a new bunch of non-standard functions, the *_s functions. Those functions were eventually added to the C11 standard (as the optional Annex K) at Microsoft's suggestion, but until that point the MS C compiler would admonish programmers to write unportable code.
Make damned sure you know what you're doing. That means making sure you have enough memory allocated to avoid overflows, and that any input is sanitized before putting it down. Meaning, if you're using a function that's expecting a null terminated string, make SURE it's null terminated before copying. Or that you know the exact length to pass into a length specified function.
The problem isn't necessarily the functions themselves, it's coders who make assumptions that don't pan out to be true.
Let's assume your string struct is solid. Does that mean you can safely use it with `printf, fprintf, sprintf` (e.g. printf("%s", string->value))? Or must you also write custom versions of those functions? How deep does this rabbit hole go?
You don't have to write custom versions of any of those functions; just use the char pointer in the struct instead of a bare char pointer. Keeping track of the length of your strings gives an easy way to provide the 'n' in all of those 'n' functions, and has other advantages besides. But the use of such a struct in and of itself, of course, provides no guarantees of safety. There is no such thing in C anyway :)
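A minimal sketch of such a length-carrying string struct, kept null-terminated so it still interoperates with printf("%s", ...); the type and its fixed capacity are illustrative choices:

```c
#include <string.h>

/* The length rides along with the buffer, giving an easy 'n' for any
 * of the n-functions; the buffer stays null-terminated for %s. */
struct str {
    size_t len;
    char buf[64];          /* fixed capacity, for the example only */
};

static int str_set(struct str *s, const char *src)
{
    size_t n = strlen(src);
    if (n >= sizeof s->buf)
        return -1;         /* would not fit; caller decides what to do */
    memcpy(s->buf, src, n + 1);   /* +1 copies the terminator too */
    s->len = n;
    return 0;
}
```

As the comment above says, this provides convenience, not guarantees; nothing stops code elsewhere from scribbling on buf directly.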
if you want it space padded. In principle you can bound your space usage and avoid an snprintf with such constructs; in practice, it's probably better to still use snprintf (if you're using standard-library string functions at all).
Unfortunately, I don't know of one. My recent C work has dealt with text only in a very limited capacity (parsing and building packets in an ASCII format; for the latter, vectorized write buffers are a poor man's ropes).
"The n in strncpy describes to what size the destination buffer (not string) should be padded with '\0'."
This is false.
From the man page: "The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated."
And the following code prints foo:ar on my system.
Edited to add: huh, scratch that. Obvious error in above test :-P. Testing it with strncat like I had meant to, it seems it is in fact padded, not just (possibly) terminated. Interesting, and very worth knowing if you are trying to move a probably small string to a large buffer under time pressure.
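A tiny check confirming the padding behavior just described: per the C standard, when the source is shorter than n, strncpy fills the rest of the destination with '\0' bytes, not just one terminator.

```c
#include <string.h>

/* Returns 1 if strncpy padded every byte after "foo" with '\0'. */
static int strncpy_pads(void)
{
    char buf[8];
    memset(buf, 'x', sizeof buf);        /* poison the buffer first */
    strncpy(buf, "foo", sizeof buf);     /* copies 4 bytes, pads 4 more */
    for (size_t i = 3; i < sizeof buf; i++)
        if (buf[i] != '\0')
            return 0;                    /* found leftover garbage */
    return 1;                            /* fully padded */
}
```

That padding is also why strncpy can be slower than expected when moving a short string into a large buffer: it always writes all n bytes.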
The alternative to gets() is fgets(). In fact, the gets() function was removed entirely in the C11 standard.
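A minimal sketch of the fgets replacement; unlike gets, fgets takes the buffer size (so it cannot overflow) but keeps the trailing '\n' when one fits, which callers usually want stripped. The helper names are illustrative:

```c
#include <stdio.h>
#include <string.h>

/* Strip the first '\n' (if any) by overwriting it with '\0'. */
static void chomp(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* gets()-like convenience on top of fgets: bounded, newline-free. */
static char *read_line(char *buf, size_t size, FILE *in)
{
    if (fgets(buf, (int)size, in) == NULL)
        return NULL;        /* EOF or read error */
    chomp(buf);
    return buf;
}
```

If the input line is longer than the buffer, fgets simply stops early and leaves the rest for the next call, instead of writing past the end the way gets would.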
However, despite the n functions being generally safer, you're still propagating misinformation by touting them as secure alternatives.
The original use of the n functions was to manipulate strings stored in fixed-size arrays. If you don't know what you're doing and just blindly use strncpy() as a strcpy() replacement, you could end up truncating your strings.
OpenBSD's l functions, on the other hand, were specifically designed with security in mind.
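Since strlcpy is a BSD extension rather than ISO C, here is a minimal workalike sketch showing the calling convention the l functions were designed around (my_strlcpy is my own name, not the OpenBSD implementation):

```c
#include <string.h>

/* strlcpy-style copy: always null-terminates (when size > 0) and
 * returns strlen(src), so the caller can detect truncation with a
 * single comparison. */
static size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Typical usage: `if (my_strlcpy(buf, name, sizeof buf) >= sizeof buf) { /* handle truncation */ }`. Contrast with strncpy, which neither guarantees termination nor reports truncation.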
As somebody who did little other than C for just over a decade, my answer would be "FreePascal", which lets you do similar low level things, but has a few "safeties" built in. Objective C looks to be a little better with strings as well, but I'm not very familiar with it. (dabbled in iOS years ago)
The people who brought us the [in]famous "Why Pascal is not my favorite language" article would have done well to look at their own glass house.
OTOH, C does make a great portable assembler if you are using it to implement another language (which is exactly what we did at one of my jobs in the early 90s)
... Had to explain to my daughter this morning why I was laughing at the "abstinence" posters ...
I haven't used pascal for closing in on 20 years, and don't miss it one bit.
For you younger people reading this who have never been exposed to Pascal, go dig up that article and scroll down to section 2.1, and just think about that for a minute. Ask yourself, 'Is this the kind of computing environment I want to work in?'
FYI: FreePascal has Dynamic Arrays that are automatically resized by the runtime, which addresses section 2.1.
2.4 - separate compilation was added in Turbo Pascal 4. (one could argue that the result is Modula, rather than Pascal - so be it)
2.2 - initialization of module data was added in Turbo Pascal 5. Yes, "static" data has to be at the module level instead of hidden within individual routines. Bug or feature? (let the jihad/crusade commence...)
Serious question: Aren't the C-based problems simply hidden from the programmer there? That is, the problems still exist, but you can no longer address them, qua Python, Ruby, Java programmer. That seems worse, not better.
Depends on the implementation. CPython may have problems due to C bugs. Java is self-bootstrapping, though, and has no relation to C except for interfacing with external programs using C calling conventions. In this case the problems are not hidden; they truly do not exist, except for problems introduced by other programs you interface with, not by your code.
You're quite wrong regarding Java. Much of its standard library is implemented in C and C++, and there have been frequent security vulnerabilities found in the language that are a result of buffer overflows or similar memory corruption bugs within the underlying C/C++ implementation.
Well, yea. This is exactly what you do in a few levels of microcorruption.
You can always say "don't do X and you'll be fine." But that's kind of like saying "don't point a gun at someone" and the gun will be completely harmless. That's the trick, isn't it? That seems to be the point of the "True bugs wait" site, at least. It only takes one mistake.
I agree, especially with things like strcpy and strcat, but printf is kind of a special case because it's a very, very simple rule to follow. Plus, most of the time it's illogical to use it without a format specifier (or with a user-supplied one).
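The simple rule in question, sketched: never pass externally supplied data as the format argument (the helper is illustrative):

```c
#include <stdio.h>

/* The safe pattern: the format string is a constant, and externally
 * supplied data is only ever passed as an argument. */
static void render_user_input(char *out, size_t outsize,
                              const char *user_input)
{
    /* BAD:  snprintf(out, outsize, user_input);
     *       any '%' directives inside the data get interpreted,
     *       leaking memory via %x/%s or writing it via %n.
     * GOOD: constant format; the data is only ever data. */
    snprintf(out, outsize, "%s", user_input);
}
```

The same rule applies verbatim to printf, fprintf, and syslog-style functions; gcc's -Wformat-security can flag violations at compile time.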
Not that the negative correlation between abstinence only programs and pregnancy + STI rates deters the people who push them. When you start to view public health policy through the lens of individual morality (if not piety in some cases), the actual effects of a given policy effectively become irrelevant next to the intent of the policymakers.
They're a parody of ads promoting abstinence from sex (probably targeted at high school aged kids). It looks like a typical US campaign. I couldn't find the original ads, so I can't answer your last question.
A lot of older pregnancy awareness campaigns were all about abstinence in the USA. It is only recently that people have realized it doesn't work and things have been changing to advocating condom use etc.
It's not "older", it's "Republican". There are still states that predominantly teach abstinence (looking at you, Texas, https://www.dshs.state.tx.us/abstain/), and any mention of contraceptives as both a means to reduce the risks of pregnancy, and of STDs, are mostly at the teacher's discretion, rather than a mandatory part of the course work.
Here, runtime, make some results and put them into this (character array) buffer I'm providing for you. Oh, the buffer wasn't big enough? Well, I'm sure the stuff after the buffer's end that you wrote over wasn't all that important, anyway. Probably just a few neighboring local variables that my function was storing nearby on the stack, or maybe the return address on the stack, or maybe the heap management data (block size field, free list pointer, ...) around the block of memory I allocated from the heap.
I wonder what the next pass of my loop will do now, or what will happen when this function returns???
Because C strings are arrays of characters terminated by a null ('\0') character, and arrays in C don't have bounds checking and don't come with a length attached.
If you allocate a 10-element character array, it can hold strings of up to 9 characters (plus the null terminator). But, the library functions ("runtime") don't know how big it is.
So if you try to copy a 20-character string into that 10-element buffer, it will copy 21 characters. The first 10 will go into the array, and the next 11 characters will go... somewhere else. Depending on where in memory your array and other things are, something important will probably get overwritten with garbage.
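The scenario above, with the overflow avoided by using snprintf's size bound instead of strcpy (the helper name is illustrative):

```c
#include <stdio.h>

/* snprintf writes at most size - 1 characters plus the terminator,
 * so a 20-character source into a 10-byte buffer gets truncated
 * instead of trampling whatever lives after the buffer. */
static void copy_bounded(char *buf, size_t size, const char *src)
{
    /* strcpy(buf, src) here would write strlen(src) + 1 bytes,
     * regardless of how big buf actually is. */
    snprintf(buf, size, "%s", src);
}
```

Truncation is still data loss, so real code should also check snprintf's return value against the buffer size, but the failure mode is now predictable rather than memory corruption.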
I would like C to have a native string type, where the primitives could be function pointers, replaceable in those cases when custom handling is needed (e.g. inside a kernel). Object Pascal is sort of like that.
D has this. A lot of people assume that D is just C++ with some GC and some improvements, but it's really more like C with GC and some improvements -- including a native String type.
There's also a string module in the standard library that includes a lot of convenience methods, including ones for interoperating with C and for working with Unicode.
For someone like me with a background in dynamic languages like Python and Lua, as well as some background in C, D was a great fit. Unlike when I started learning C++, D felt like a very natural extension of C to include GC and lots of modern language features.
Lamentably, not many people are interested in D, so there aren't yet a lot of third-party libraries, and many of those that exist are abandoned. But the core language and standard library are great, and there's a nicely growing and incredibly fast web framework (vibe.d). And Facebook has started supporting the language. So hopefully it will start seeing some growth.
I've gone down this route of thinking many times. Inevitably, as soon as I think of all of the features that I would want in C, I simply end up with a language that's not C any more... for precisely that reason. C isn't that kind of a language; it's a high-level assembly language and little more. Most of the things that C is lacking, it is by design, because of performance reasons and/or need for fine-grain control of the machine.
As another respondent mentioned, there are other languages that fill that need (C++, D, Rust, Nimrod), basically all of the systems languages that are designed to do C's job easier and safer (but probably not faster).
I've only seen "only abstinence is 100% effective" mocked from a policy perspective. It is true that abstinence is 100% effective on a personal level, but teaching it is remarkably ineffective at a population level.
There is room to mock it on a personal level, by pointing out that we are willing to engage in far more dangerous activities without 100% guarantees on safety. 'Seatbelts fail, only not driving is 100% effective'.
>I've only seen "only abstinence is 100% effective" mocked from a policy perspective. It is true that abstinence is 100% effective on a personal level, but teaching it is remarkably ineffective at a population level.
The phrase "only abstinence is 100% effective" is a slogan for teens, not a claim that only abstinence only sex education is 100% effective (which is clearly false).
>There is room to mock it on a personal level, by pointing out that we are willing to engage in far more dangerous activities without 100% guarantees on safety. 'Seatbelts fail, only not driving is 100% effective'.
It depends on how much a person wants to avoid having an abortion or unplanned pregnancy, and how much value they get out of sex.
> The phrase "only abstinence is 100% effective" is a slogan for teens, not a claim that only abstinence only sex education is 100% effective
You don't have to claim that a stupid course of action is perfect to deserve mockery. Following the stupid course of action will suffice.
> It depends on how much a person...
No, the success of abstinence-only education as public policy does not depend on a single person's wants, desires, and incentives. It depends on the wants, desires, and incentives of an existing, imperfect population.
Claiming that "our policy would work if only people were moral and rational in X, Y, and Z ways" is irrelevant if people are known to not be moral and rational in X, Y, and Z ways, which is indeed the case with respect to sex-ed. At the end of the day it either works to reduce teen pregnancy or it doesn't, and in this sense abstinence-only education doesn't work.
The reason is that that statement is, in fact, a lie.
The effectiveness of any form of birth control is usually quoted as two numbers: a perfect-use rate and a typical-use rate.
The perfect rate assumes that you are able to follow the instructions exactly every time. For example, perfect use of a hormonal pill means that you take it every single day, at the same time, without ever forgetting; perfect use of condoms means that you use a condom every time you have sex, before beginning sex, that you put it on correctly and that you stop immediately if it breaks; et cetera. Even the much-lambasted "withdrawal" and rhythm methods have really quite good efficacy, assuming you implement them perfectly.
Of course, people aren't robots, especially when it comes to sex, and that's why we have typical-use statistics that reflect the reality of the situation: People forget to take the pill, or take it at the wrong time. People skip using the condom, just this once-- and forget about "pulling out". And people who were abstaining, well, don't.
It may be the case, if we ignore certain unpleasant factors, that abstinence, done perfectly, has 100% efficacy for preventing pregnancy. But we need only to glance at the statistics to see that, in the real world, held to the standard we hold any other procedure to, it is the very worst form of birth control anyone has come up with yet. And that makes your "true statement" nothing more than a bald-faced lie.
Look it up. About 5% failure rates, only twice as bad or so as condoms.
Promoting abstinence results in higher rates of teen pregnancies and STDs than pretty much every other policy. That's the reason for ridiculing it - far from 100%, it actually has a negative value, working against its stated goals.
Didn't I see a Law & Order about a woman drugging men and using a tazer to stimulate ejaculation? There's no such thing as 100% effective (http://lesswrong.com/lw/mp/0_and_1_are_not_probabilities), though abstinence actually practiced (rather than simply intended) is certainly the most effective approach.
How do you mean? strcpy and friends aren't buggy - they work according to the spec, and you can't just go and change an API that literally billions of lines of code depend on. You can only provide new, better alternatives.
If you mean the buggy software that uses the unsafe C APIs (and is not careful enough), then, that's exactly what those posters are about.
pickle is a (de-)serialization library for Python. It allows arbitrary code evaluation, so shouldn't be exposed to the outside world (for the same reason that we have JSON parsers and don't use JS eval() to parse JSON).
> "If you’re serializing and de-serializing Python objects, use the pickle module instead – the performance is comparable, version independence is guaranteed, and pickle supports a substantially wider range of objects than marshal."