There are lots of C gotcha lists out there; the "Undefined behavior" series is good, and I think Koenig's C Traps and Pitfalls book from 1989 is still worth reading (even though it's pre-ANSI C).
I agree; we should start with assembly language. Students will never understand why these are dumb things to do until they can visualize the assembly code they get translated into.
C, being the glorified assembly language that it is, would be an excellent intro to programming if we didn't insist on teaching it as a high-level language.
A beginner's course in C will tell you "uninitialized variables are bad", and "don't use pointers after free", but it won't explain the stack frames or heaps that underpin so much of our computing.
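Both of those rules trace straight back to what the stack and the heap are doing underneath. A toy illustration (my own, not from any course material):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int x;                  /* automatic variable: whatever bits happen
                                   to be in its stack slot */
        printf("%d\n", x);      /* undefined behavior: reading an
                                   uninitialized variable */

        int *p = malloc(sizeof *p);
        if (p == NULL)
            return 1;
        *p = 42;
        free(p);                /* the allocator is now free to reuse or
                                   unmap this heap block */
        printf("%d\n", *p);     /* undefined behavior: use after free */
        return 0;
    }

Without a picture of stack frames and allocator bookkeeping, "don't do this" is just a rule to memorize; with it, both bugs become obvious.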
I happen to think that having C blow up in your face (and fixing it) is one of the best ways to learn the subtleties of programming.
A lot of students would not respond well to that kind of approach. It's too much at once; you're trying to teach them to think like an idiot savant computer, and teach them common patterns for translating their thoughts into code, and hit them with a bunch of low-level details about pointers and stacks and instructions? Some exceptional students would thrive. Others would dig in and do sort of okay, passing the test and then maybe learning it properly later on. Many -- maybe most -- would just get discouraged and switch majors to something where they'll never have to deal with null pointers or case fall-through or (god help us) malloc alignment along dword boundaries.
If you want to do it that way, you've got to take one thing at a time. Colin's idea of starting with assembly language isn't nearly as crazy as it sounds; I know several people who learned that way. Personally, I would start with very simple, linear programs in a high-level language like Python. Let them get a feel for it, and fiddle around. Then gradually start introducing more and more difficult concepts, like if statements, or looping, and how to use these strange new wonders. Later -- much later -- you'll be able to talk about the low-level details exposed by a language like C. If you try it too early, you'll either get head-explosions or blank stares.
(Incidentally, head explosions are preferable. One of the greatest horrors of teaching is that there will always be students who stare at you with looks of blank incomprehension, and it always might be your fault.)
I mostly agree with this answer. However, in my experience, starting with a high-level language (like python) is a double-edged sword.
a) It fails to get people interested in low-level things like call stacks and frames. Someone once told me, "My language frees me up to think about the really important things: like solving the problem at hand. I don't have to worry about things like the stack and where my memory is going or coming from." Perfectly valid point, maybe, but I didn't learn about these things because I needed to; I did it because I was interested and curious. I think C and its pointers played a big role in helping me "get interested".
b) People take the _awesome_ things in high-level languages for granted. When I first moved into python from C, functions as first class entities blew me away! Lisp macros were like learning to do magic. People who have only used a high-level language don't give these features the credit they deserve. (Alright, this might just be a pet peeve of mine...)
I think that C _should_ be taught to students, at a slower pace and in more detail than it was taught to us. And if it occasionally blows up in their face, I think they'll be better off for it.
The abstraction of C over assembly is just high enough to make it completely unclear to students how a machine could execute it. Teaching assembly is much more valuable from a CS perspective (though perhaps not from a Java/C++ school perspective).
My school has a CS 100 course that everyone has to take, but in the second half of the semester teams have to use PIC Assembly to program a microcontroller on an RC car/truck (that can go up to 20mph) to navigate an obstacle course with its four IR sensors. There's also a concurrent CS 120 class that is basically a standard C course.
Unfortunately the assembly scares off a lot of people from looking at the Computer Engineering degree, as does the very minimal breadboard wiring they have to do, since they assume that's all CE students do. It's a tradeoff between drawing more people into a degree program and starting off low level. Why do you think so many schools teach Java or C# and never touch assembly? If your school has to justify keeping a program alive, it has to make the program appealing to a larger audience, which unfortunately means an easier first few semesters and big-industry popular languages, so you can keep subtly hinting at good jobs on graduation.
BCPL barely had types. Everything was an "int" and you stored pointers, addresses, floats and so on in that same type (no concept of sizeof(int) != sizeof(void*) in those days!)
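(For anyone who hasn't run into it: on a typical 64-bit LP64 system today, int is 32 bits and pointers are 64, which is exactly the assumption that model breaks. A quick check, assuming any modern hosted C compiler:)

    #include <stdio.h>

    int main(void)
    {
        /* Prints 4 and 8 on an LP64 system: an int can no longer hold a
           pointer, so BCPL's "everything is a word" model no longer maps
           onto the hardware types. */
        printf("sizeof(int) = %zu, sizeof(void *) = %zu\n",
               sizeof(int), sizeof(void *));
        return 0;
    }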
rwmj mentioned BCPL, but most of the languages C defeated lost by virtue of being too high-level, not too low-level. Pascal was probably the best of the lot, and "Why Pascal is not my favorite programming language" is an excellent contemporary explanation of the issues with it, none of which apply to modern Pascal dialects, but many of which apply to Fortran and PL/1, the mainstream high-level languages of the time.
On microcomputers, high-level languages in general were not viable. They were too inefficient. Jonathan Sachs wrote the first version of Lotus 1-2-3 in 8088 assembly, the same language as the operating system it ran on. Later versions of both were written largely in C. It was competing with VisiCalc, which was originally written in 8080 assembly and, as I understand it, sort of cross-compiled to the 8088, and thus missed out on many optimization opportunities.
Really, though, programming languages don't win broad adoption by sucking less, except very indirectly. You rarely choose the programming language you're going to learn next according to its aesthetic merits. You usually choose it because you're working on a system that's written in it, that can most easily be called from it, or that has good support for it as a development language. You're not going to write an AJAX app in C++, but in JavaScript, or maybe CoffeeScript, because debugging and performance improvement would be a nightmare if you use the LLVM JavaScript backend. You're not going to build a search engine on Lucene using Lua, no matter how much you love Lua, because you'll spend all your time on getting Lua to talk to Java. (I don't know, maybe with GCJ this is practical? I'm not going to find out.)
And, although I wasn't there at the time, I think people adopted C in the 1980s not because they rationally weighed its advantages and disadvantages relative to PL/1, Pascal, Forth, and Bliss, but because the only internet-connectable machine their university department could afford was running BSD Unix, and if you wanted to write software for it, especially if you wanted to modify any of the large volume of software that already existed, you'd damned well better write it in C. So you learned C. And then a lot of the first-class programmers who cut their teeth on Unix moved on to the microcomputer world, where different incompatible versions of Pascal were vying with different incompatible assembly languages, and they brought C with them.
I think you mean "declaring the variable inside the loop initialization", which is not a bug in C99 and not subtle anywhere else.
The intended bug is that 0.1, in floating-point, is not exactly 0.1. On x86 (and probably anything with IEEE-754 floating-point) the loop counter reaches 0.5 successfully, is a little bit high when it should have reached 1.0, remains high up to when it reaches slightly more than 2.2, and then suddenly becomes low when it should have reached 2.3, and continues to be slightly low and get progressively lower thereafter, all according to whether the sum is being rounded to a value that's slightly too high or slightly too low. The consequence is that the loop runs for an extra iteration with x ≈ 3.5, or 3,5 if you're from, say, Argentina.
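A minimal sketch of the kind of loop being described; the original code isn't quoted here, so the float counter, the 0.1 step, and the 3.5 bound are my guesses at it:

    #include <stdio.h>

    int main(void)
    {
        int n = 0;
        /* 0.1 has no exact binary representation, so the running sum
           drifts away from the true multiples of 0.1.  By the time it
           "should" equal 3.5 it is actually a little below 3.5, so the
           test passes one extra time. */
        for (float x = 0.0f; x < 3.5f; x += 0.1f)
            n++;
        printf("%d iterations\n", n);   /* 36 on a typical IEEE-754 setup,
                                           not the intended 35 */
        return 0;
    }

(Note that the counter is declared in the loop initialization, which is the part the comment above guessed was the bug; that's perfectly legal C99.)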
If someone has enough SO reputation (I don't), you might mention on pmg's post that casting malloc is a good idea, even though it's not necessary in C. Otherwise, should your code ever move to a C++ environment it'll be an error -- not even a warning.
That sounds like a good reason to omit the cast. Compiling C code as C++ is a bug: You can take valid C code, run it through a C++ compiler without any errors or even warnings being printed, and have it behave differently.
If you want to use C code from a C++ program, you MUST compile it with a C compiler and link it in.
It's also a bad idea, because if you forget to include stdlib.h the compiler won't warn you, since it assumes you know what you're doing when you cast an int (the return type assumed for an undeclared function) to whatever pointer type you need.
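A sketch of that failure mode, with hypothetical code, compiled in C89/C90 mode where calling an undeclared function is still allowed:

    /* #include <stdlib.h> forgotten */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* With no prototype in scope, a C89 compiler assumes malloc
           returns int.  The cast silences the "integer to pointer"
           diagnostic that would otherwise flag the problem, and on a
           platform where int is narrower than a pointer the returned
           address can come back truncated. */
        char *p = (char *) malloc(32);
        strcpy(p, "oops");
        printf("%s\n", p);
        return 0;
    }

Drop the cast and the missing header shows up immediately as a "makes pointer from integer" diagnostic on the assignment.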
One of the commentators mentions some... frustration... with void main()
Out of curiosity, is there any reason to have a return value from main() in something like a microcontroller, where there's nobody and nothing (that I know of) to care what main() returns?
It's very unlikely... but yes, just possibly. There are some very wacky calling conventions out there; among them, "caller reserves space for result on the stack after the function arguments".
If you run into this particular variety of crazy, declaring main as returning void will result in the compiler looking in the wrong place for main's parameters, with obvious and very rapid breakage resulting.
I'd say that you don't need to worry about this, except that microcontrollers are exactly the sort of niche environment where you're likely to encounter craziness -- so it's better to play it safe and declare main correctly.
Starting the program is almost entirely implementation-defined in freestanding environments (i.e. those without a host OS), so I don't think your example works.
Of course, there are reasons not to use "void main" - for one, a strict C99 compiler will not compile it.
If you're running on a microcontroller with no OS underneath, you don't even need to use main(). You of course need some entry point for your loader to start executing, but you can pick whatever convention you want.
The only reason to have a return value for main() is if you yourself care, and sometimes you might. For my real-time operating system class I have a Startup.s file that ends up calling the OS's main(), and while I don't currently care about its return value, I might want to flash certain LEDs if it returned something other than 0. If I had to ship it, I might also want it to log something to persistent memory (like an SD card) so I could debug a customer's problem under the same circumstances without being there with a debugger. I'm also working on getting dynamic applications onto the system, and the OS might be interested in their return values.
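Roughly what I have in mind, sketched in C rather than the real Startup.s; flash_error_pattern() and log_fault() are made-up names for whatever the board support code provides:

    extern int main(void);
    extern void flash_error_pattern(int code);   /* hypothetical LED helper */
    extern void log_fault(int code);             /* hypothetical SD-card logger */

    /* Called from the assembly reset handler once the hardware is set up.
       main()'s return value only matters because this code chooses to
       look at it. */
    void c_startup(void)
    {
        int rc = main();
        if (rc != 0) {
            flash_error_pattern(rc);
            log_fault(rc);
        }
        for (;;)
            ;   /* nothing to return to on bare metal */
    }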
http://www.literateprogramming.com/ctraps.pdf
http://blog.regehr.org/archives/232
http://blog.regehr.org/archives/226
http://www.andromeda.com/people/ddyer/topten.html
http://news.ycombinator.com/item?id=1990244