C for Python programmers (2011) (toves.org)
314 points by bogomipz on Nov 7, 2016 | 162 comments

I've been programming in Python for a long time and recently took up an OS class which exclusively used C. Syntactic differences aside (declarations in C can get pretty hairy), the steepest learning curve while writing anything useful in C is with a) pointers and b) memory management which this guide doesn't seem to cover.

From my experience, the best way for learning C has been [0] Build Your Own Lisp and Zed Shaw's [1] Learn C The Hard Way.

That and of course spending countless hours debugging segfaults and memory leaks.

[0] - http://www.buildyourownlisp.com/

[1] - https://learncodethehardway.org/c/

Uncommon and slightly extreme viewpoint: the best way for a python programmer to learn C is to learn a bit of assembly first.

Python is a high-level language: it provides tools to manipulate abstractions easily. C is a low-level one: it gives you access to chunks of memory (and far enough ropes to hang yourself while shooting yourself in the foot while drowning)

Assembly is not hard to learn. It has a reputation for being hard because it is tedious to use and because optimizations can be very intricate, but learning basic assembly does not take long. Learn about MOV, ADD, JMP, INT, a few conditional jumps, learn how negative numbers are represented, what a memory mapping is, and what a program counter and a stack pointer are.

Then, go back to C. Read a bit about how function calls are made, memory allocated, and you will see that C is actually a high-level assembly language. All the hard parts of C will become obvious: a pointer is a variable that stores an address, a stack overflow, a segfault, a memory leak, all these will make sense very easily within that framework.

Completely agreed!

In the past, I had always avoided C because I didn't understand why many aspects of it are the way they are (e.g. pointers). Then I took my university's CPU architecture courses. That sorted that fear right out since I had to go from the ground up, learning everything I'd previously largely avoided or ignored - everything from transistors and the basic logic gates (AND, OR, XOR, NOT, etc) they consist of to full adders, all the way up to pipelined CPUs. Naturally, we learned assembly as part of this.

C made an awful lot more sense after all of this! I still don't like using it, but that's a lot more to do with my understanding of its dangers (I prefer to let the compiler do the hard work of verifying my programs make sense, a la Rust &c), rather than a fear born of ignorance.

Alternatively write some basic constructs (loops, functions etc.) in C and inspect the generated assembly. You have to fiddle with compiler settings a bit because the optimized output isn't very helpful but for me it was extremely helpful to see what ASM code my compiler tool chain actually produces (to understand the typical function call structure for example).

Doing simple reverse engineering challenges is also a fun and easy way to get a pattern-matching kind of feeling for assembly.

>Uncommon and slightly extreme viewpoint: the best way for a python programmer to learn C is to learn a bit of assembly first.

I agree. I also sort of learned it that way. The difference was that I learned Turbo Pascal first, before C, and BASIC before TP, but then either while learning TP or C, I also learned some assembly and something about the fundamentals of computer hardware and microprocessors (as related to programming, not electronics) in an interleaved manner. Was lucky to have access to some very good books on all these (and many other topics) from the British Council Library in the city I was in at the time. Gave me a solid grounding in many topics related to computers.

This approach might lead to programmers thinking that they can do things in C because they have a particular mapping to assembly in mind. This might lead to undefined behavior. Casting pointers around for example is no problem in assembly (there are no types...) but has undefined behavior in many circumstances in C.

That doesn't mean that there is no defined behavior to achieve the same thing, though. Platform portability just requires a jump through a few more hoops and then some.

C is a high-level language. It's portable across a variety of platforms, while low-level languages are tied to a hardware platform. Assembler is a low-level language.

Sounds very reasonable. Any specific books or tutorials you would recommend?

I personally found Programming From The Ground Up[0] to be phenomenal, though a little dated. Alternatively google around for NASM (or MASM if you're on windows) tutorials, such as asmtutor[1].

[0]: http://download.savannah.gnu.org/releases/pgubook/

[1]: http://asmtutor.com/

Any recommendations for assembly learning material?

Why not use LLVM IR instead, which is simpler?

From my experience in teaching C beginners, I think declarations are one of those things in C that people either get frustrated by very early on and give up, only memorising a small set of fixed patterns for the common cases; or see the pure elegance of the system, once it's presented to them in the right way. The two things to keep in mind to become one in that second group are "declaration follows use", and the operator precedence (array and function-call are on the right and have higher precedence than pointer indirection on the left.) Declarations must not be parsed left-to-right; in the same way that

    2 / 3 + 4 * 3
is not parsed left-to-right but as (2 / 3) + (4 * 3),

    int **foo[10];
is parsed as

    int *(*(foo[10]))
What "declaration follows use" means is that the ultimate type of the object is exactly as it says after all the operators are applied (in the correct order): when the array subscript is applied to foo, and then that object dereferenced twice, the type of the expression is int. Thus foo is an array of 10 pointers to pointer to int, foo[i] a pointer to pointer to int, * foo[i] a pointer to int, and * * foo[i] an int.


If the system were elegant it wouldn't be quite so wildly inconsistent between prefix and suffix notation. "Middle-out" is not elegant.

I mean, I've been writing C for 20 years and would still prefer to break down a declaration with intermediate typedefs rather than write "pointer to function taking array of pointers to int-returning functions taking a void pointer, which returns a pointer to int-returning function taking a void pointer". If you can write that as a declaration right first time you can give yourself a serious pat on the back.

Challenge accepted, no cheating, no references, just editor and me. Hope that I got it right:

    int (*(*p)(int (*[])(void *)))(void *);
The process:

    ?which-returns (*p)( ?taking           )
                 ? (*p)( int (*[])(void *) )
             int (*(*p)( int (*[])(void *) ))(void *)
Testing with cdecl.org [1]:

    declare p as pointer to function [taking] (array of pointer to function (pointer to void)
     returning int) [, which] returning pointer to function (pointer to void) returning int
F-yeah. Can I has my pat? Seriously, it is not so hard if you used each component at least 100 times. That's few years of practice of non-hiding behind typedefs. You can't learn if you don't practice. (I'm not arguing that this should not be decomposed.)

[1] http://cdecl.ridiculousfish.com/?q=int+%28*%28*p%29%28int+%2...

IMHO pointers are mostly hard at the beginning because the asterisk-operator is overloaded: On one hand it declares (and optionally initializes) a pointer (type * x = ...;) and on the other it dereferences it (type y = * x;). At least for me that was the thing that puzzled me the most. I would prefer something like that:

  // declare and initialize x
  @uint32_t x = 42;

  // dereference x
  uint32_t y = ~x;
I.e. have different symbols for the type and the dereferencer ('@' and '~' as an example).

//edit: thanks for the help, fixed snippets - HN's parser does not get pointers either apparently :)

I've been programming C++ for too long to really understand why people find pointers hard, but I do remember that I did too at the beginning, and I do remember my epiphany (reading on the train, funny how I remember such senseless details 15 years later yet can't memorize a 4-item shopping list): given a declaration of

    struct Foo {
        int a;
    };
    Foo* f;
, the following two are equivalent:

    int b = f->a;

    int b = (*f).a;
Never had it spelled out for me like this, probably because it's so obvious once you get it - thought I'd share in case it gives anyone else an 'aha' moment.

The trick to teaching pointers is to model memory with boxes. You have a stack, and a heap, and you draw arrows to represent pointers. At some point, you need to reveal that those arrows are just numeric indices, but the visual helps early on. The * and & operators just let you follow the value in the pointer. I've had a number of students who were baffled up until I showed them this model.

Notational Machine for the win!

Back in the 80s the programming books I used always showed memory as little lines of boxes.


In general, thinking of memory as an array of bytes is helpful, and each byte has an address which is itself a series of bits that can be stored in memory --- and the concept that ties it all together and what I find makes a lot of people just "get it" is the fact that memory can itself store addresses that denote locations within it.

That's a perfect example of how knowing some assembly code helps with learning C.

i also had trouble with pointers while learning C, and, like everyone else, it seems so obvious now that i wonder what the hell my problem was. like learning to ride a bike, or drive a car with a manual transmission. i learned 8086 assembler before C and had no difficulty with indirection there.

for what it's worth, your "aha" scenario would have just confused me more. even today, i have to think a bit before your second example makes sense to me.

You can consider them the "same" thing if you regard the asterisk as binding closer to the var instead of the type.

Which the declaration syntax hints at:

   int i, *p;

Can you elaborate?

My point is that the concepts "having a datatype, that stores an address" and "reading (typed) data stored at a specific address" are different, but share the same keyword (which is in this case just one symbol - maybe because C programmers have to use it often).

E.g. assuming C99's bool (0|1) datatype, you might define the "negate" functionality, which is often done by '!'. So we have something like that:

  // init a new bool
  bool success = 1;
  // negate the bool
  if (!success) {
For me that makes sense. However if we take the pointer approach (as it is implemented), it would look like this:

  // init a new bool
  bool success = 1;
  // negate the bool
  if (bool success) {
I hope this makes sense. For an experienced programmer * 's role is obvious from the context, but for me that was the most troubling concept when starting with C's pointers.

> My point is that the concepts "having a datatype, that stores an address" and "reading (typed) data stored at a specific address"

They are different things, but C's syntax was deliberately designed to imply the former from the latter. "Declaration reflects use" is the term they used for it.

The idea is that in something like:

    int *i;
You don't read it like, "Declare a variable 'i' whose type is 'int star', which is a pointer to an int". You read it like "declare variable 'i' whose type is such that taking 'star i' (i.e. doing a dereference) would give you an int". The type in this case is a pointer to an int.

This is also why function pointer syntax is so totally bizarre in C.

"Declaration reflects use" was a neat idea, but I think in practice it ended up causing more confusion than it solved. At the time, maybe they thought users would be tripped up by compound type expressions and thought it would help if they focused on the operations performed on the variable being declared.

In practice, it turns out that composed types don't seem to be that hard.

Wow this is really interesting, I did not know that, and it explains their choices. And C's function pointer types are indeed written in a bizarre way.

I definitely think of (int * i) as (i :: Ptr Integer) and (* i) in an expression like 42 + * i as (* :: Ptr Integer -> Integer).

Yeah, as long as we’re mixing notations, it’s really:

    *i :: Integer
From which you can conclude i :: Ptr Integer.

And the dereference operator does have the type you specified, but not in an lvalue—there the operator is really a mixfix one:

    (*_ = _) :: Ptr a -> a -> a
This isn’t specified directly by the standard, but follows from the rules about how an assignment operator must examine the structure of its first operand.

> In practice, it turns out that composed types don't seem to be that hard.

Depends on what you mean by hard. Hard to read or hard to reason about?

* -> * and * -> * -> * are easy to read and reason about, i.e. Int -> Char -> Int. But what about types of types?

(* -> * ) -> * is confusing to reason about IMO. i.e. Applicative Functors:

(<$>) :: Functor f => (a -> b) -> f a -> f b

Not really intuitive at all, the only reason I can understand it is that I know what functors and Monads are.

I'm not talking about higher-kinded types, just type annotations that are more than just a single bare identifier. Stuff like:

    (Some, Tuple, Type)
    (Para, Meter) -> ReturnType

I'm probably misunderstanding you, but can I rephrase your comment as saying 'int i is the natural way of declaring a pointer to an int, and not int i'? Because if so, I don't recall that from the K&R (and I've read it front to back several times, not that that means much - I can't even remember if they use int* i or int *i, and I don't have my copy at hand here).

Alas, HN mangled your asterisks.

> I don't recall that from the K&R

I don't recall either, but this Wikipedia mention of "declaration reflects use" cites K&R:


Uh sorry, the first one had the asterisk next to the variable name, the other next to the type.

Yes I agree about the 'declaration reflects use'; what I meant was: does that imply that the declaration should consider the asterisk (the 'make this a pointer' part) to be 'part of', and thus right next to, the variable name or the type?

I'm strongly in the 'type' camp myself, and therefore I think that int *p; is nonsense; only to be used out of necessity when declaring multiple pointer variables on one line. So I'm wondering if 'declaration reflects use' reaffirms that, or contradicts it.

I don't see how the bool example follows; in what way is negating a boolean like dereferencing a pointer w/r/t types? ! is bool->bool; * is * P -> P.

The reason the same symbol is used is because of the way C types are read:

    int x; // (x) is an int
    int *x; // (*x) is an int
    int **x; // (**x) is an int
    int x(double); // (x(1.)) is an int

Yeah, sorry, my example sucks.

However, I still believe a different operator would make more sense.

And what of [a-zA-Z0-9_]? Why should we allow them in both types and variables and functions? That's really confusing.

My elaboration is a bit late ... but.

    int i, *p;
Tells me that i is an int. And so is *p.

This had a way of always confusing me. For clarity, I'd declare them as:

  int i;
  int* p;

Think of it from the perspective of types:

    int *p; /* What is the type of p? Pointer to int */
    int a = *p; /* What is the type of *p? int */

I first learned pointers with Pascal and believe it was simpler with the ^ operator, but it has been so long I don't remember.

For some reason I also found pointer/references easy in Pascal, but struggle with them in C. I also find it easier to work with direct/indirect addressing in assembler than in C. I'm not entirely sure why - seemingly the differences between C and Pascal syntax and behavior for references seem trivial -- and yet somehow in Pascal it feels intuitive, while in C it feels complicated.

If I recall correctly (long long time ago) there was only one pointer type in Pascal and it was considered a number that it was legal to modify without warnings.

C insists that int * , char * and int are totally different types that you should not mix. It makes sense most of the time but it can be confusing when you do not realize that internally they typically are the same thing <insert here the disclaimer about 32/64 bits systems>.

I think it seems simpler because Pascal differentiates the type ^integer (“pointer to integer”) from the expression i^ (“dereference i”) and didn’t originally have an address-of operator—you could only get the address of a new anonymous value.

It's been so long, I don't really recall - but I was using object pascal which is fairly similar to free pascal:


It's quite similar to c - yet I find it a little bit clearer.

For me the reason pointers were hard is because I didn't get clear feedback when I dereferenced them. When pointers are taught conceptually they don't tell you that there is a kernel terminating the program because you access invalid memory.

I confused that termination with the idea that my pointer has never been dereferenced in the first place. Now, I don't know if the kernel intervenes before the invalid address is going to be dereferenced or just after it. The thing is, I understood the concept, but getting this error was giving me doubt since I made an error of accessing valid memory.

Once I understood that I tried to force a normal int to become a pointer and eventually it worked (needed two ints and an 8-bit variable to cast it to a 64 bit address), proving to me that I understood the concept.

Accessing (valid) memory and pointers should be explained together.

I recall them being a way of seeing how much you studied at uni...or some professor wanting to bring up the morris worm

I agree, the notation was the hardest part of learning about pointers. Having a 'cheat sheet' - a program with examples of common ways of using pointers - helped me quite a bit.

I well remember that if how arrays decay to pointers was properly explained to me in the beginning, I would have "got" pointers a lot sooner

You may be interested in the SPECS design. They used a prefix ^ for this purpose.


This exactly captures the struggle I've been having with pointers in Go - I have no problem grasping what a pointer is and how to use it, but I keep getting tripped up on the syntax because of exactly this.

I was about to tell you to escape it but I actually can't find a way to print an asterisk on HN

EDIT: you can print one by leaving a space after it: * aa

Yeah thx, I just realized that myself.

Do it like * this

I love Zed but I couldn't recommend Learn C the Hard Way to a beginner. The book starts out good, but then greatly ramps up complexity around chapter 17 (heap and stack memory allocation) without a good introduction to many new concepts, explanations are just completely missing. Most people I know who've all tried to go through the book ultimately start to lose interest and give up there unfortunately.

LCTHW is for absolute beginners. It lacks accuracy necessary for people writing C. It's good as an introduction, perhaps to capture interest, but no more than that.

"Stack" and "heap" are implementation details. I don't think the C standard mentions those words at all.

>> Syntactic differences aside (declarations in C can get pretty hairy), the steepest learning curve while writing anything useful in C is with a) pointers and b) memory management which this guide doesn't seem to cover.

Coincidentally, I was making small talk at the office today with an undergrad (EE major) intern who's unenthusiastically taking a course in C programming to satisfy requirements. I told her that although those two points you brought up were precisely why traditional CS types generally hate/avoid C, it's imperative that she grasp these concepts as soon as possible (along with picking up an HDL) if a career in hardware floats her boat.

Coming from a formal hardware background myself--and ever so envious of the traditional CS types who were always hacking the wee hours away--I ended up picking up a used copy of Cormen and took some CS baby self-steps over a few semesters, using both Facebook Puzzles[1] and Project Euler[2] as a condom so it'd be much harder to become impregnated with my own bullshit. And yet, to this day, I still feel like a bag of suckass compared to our resident graybeards; last time I hacked some C was only a few months ago to make a JEDEC[3] interpreter that metaprograms source in an obscure, obsolete proprietary language, but I'm admittedly ashamed to share the source with anyone out of fear of affirmation that I still suck hind tit at C after all these years. ><

[1] https://web.archive.org/web/20091130184215/http://www.facebo...

[2] https://projecteuler.net/

[3] https://www.jedec.org/standards-documents/docs/jesd-3-c

I enjoyed K&R C, but I know it's a bit outdated - I was thinking of picking up a more modern C book.

Build Your Own Lisp implicitly claims you don't need to know Lisp to learn from the book:

> We will be covering many new concepts, and essentially learning two new programming languages at once.

Do you think that true? Did you know Lisp before reading it?

(I've done the first half or so of SICP so I know some Lisp.)

Right now I'm working on designing and implementing a virtual 16 bit CPU and an assembler, I guess the logical next step would be an OS class, then something like Build Your Own Lisp.

(For context, I write Python applications at work, but the work done by our embedded system C guys seems more interesting; I'm trying to learn enough C that I could be useful on one of their projects.)

A slightly more modern book that's a really excellent book to follow K&R is Hanson's C Interfaces and Implementations.

It's a book written in a literate programming style that describes how to build a flexible and modular library of data structures.

You will end up learning only the subset of Lisp that this book implements. So this book is definitely not as overwhelming as it sounds.

In my case, I did know some lisp (Clojure) before starting out, but I'd strongly recommend that you don't hold back based only on that requirement.

I wrote a blog post that goes over pointers in C that may be useful:


It covers it, but only very slightly (see below; I skimmed it exactly to look for this). My observation is that this is really hard for many people. I helped a couple of my coursemates with a background mainly in Python, and this was the only thing they struggled with, and they struggled hard. It forces you to be very observant and conscious about the situation. Until you get that far, you will continue to suffer. C is really unforgiving.

> Every once in a while, you'll see a C program crash, with a message like “segmentation fault” or “bus error.” It won't helpfully include any indication of what part of the program is at fault: all you get is those two words. Such errors usually mean that the program attempted to access an invalid memory location. This may indicate an attempt to access an invalid array index, but typically the index needs to be pretty far out of bounds for this to occur. (It often instead indicates an attempt to reference an uninitialized pointer or a NULL pointer, which we'll discuss later.)

As always, I recommend valgrind.

  $ gcc -Wall -g prog.c -o prog
  $ valgrind ./prog
  ==25494== Process terminating with default action of signal 11 (SIGSEGV)
  ==25494==  Access not within mapped region at address 0x0
  ==25494==    at 0x400532: main (prog.c:5)
  ==25494==  If you believe this happened [...]
This helps a lot while debugging. Note that it is also helpful when debugging memory leaks: it can give you the exact line in your code where you allocated memory (e.g. with malloc()) that you did not free later.

Very cool, this would help a lot indeed.

Are there other such tools a casual C programmer should be aware of?

Turn on all compiler warnings (-Weverything for Clang and RTFM for GCC (no, -Wall doesn't turn all the warnings on)).

Scanbuild (LLVM based, Linux and macOS)

Perf (recent Linux) or DTrace (macOS, SmartOS, Solaris and FreeBSD)

Valgrind (Linux, macOS) with the KCacheGrind/QCacheGrind (CPU usage), Valkyrie (leaks and undefined behavior) and Massif-visualizer (memory usage) GUIs

The compiler sanitizers (GCC, LLVM, Linux and macOS)

strace (Linux)

HeapTrack (Linux, a true hidden gem, best memory profiler ever)

Binutils (Linux, macOS)

And really, get to know your debugger. Both GDB and the Visual Studio debugger are extremely powerful. If you think a debugger does breakpoints and nothing else, you really, really need to get to know your debugger better. LLDB is getting there too, it will catch up soon (surprising given how long it took the other 2 to mature).

For people from a Python background, know that both GDB and LLDB are available as a Python shell with full access to the C program's internals. You can add triggers to execute Python callbacks, set conditional breakpoints and even gather stats and have them displayed in Matplotlib or an IPython Notebook.

Any good references to gathering stats using debugger and visualizing with Python?

Good luck finding anything. It is in many performance ninjas' arsenals, but it is very little documented. Here is an untested mix of some other scripts that might more or less work. At least it can get you started:


Yes, there are gazillions of useful tools, but another important point is that C is tightly coupled to *nix systems and the best platform to develop C is definitely a Linux-based system (ignoring embedded systems for the meantime, where C also shines), as otherwise most quality information that you find in the interwebs won't work for you.

This has changed since 2011. Both GCC and LLVM/Clang support

export CFLAGS="-ggdb -fsanitize=address"

And few other "sanitizers". They add a runtime to the binary so you get the backtrace, detect runtime "silent" errors (undefined behaviors) and overall help develop C/C++ programs.

Surprised you didn't try C++ if you found C quite difficult. C++ teaches memory management but without the insanity level dialed to 11 like with C (or more so with Assembly).

I've always discouraged learning Python first. Python disconnects you so much from what is really going on and does so much for you that you don't develop algorithm and conceptual skills and computer knowledge.

All the developers where I'm at, from SQL to Python, Node to Angular, everyone, all say the same thing about Python and other "hyper-package-assisted" languages.

> That and of course spending countless hours debugging segfaults and memory leaks.

Maybe that's because you think that:

> the best way for learning C has been [0] Build Your Own Lisp and Zed Shaw's [1] Learn C The Hard Way.

Care to be more specific about what you mean? What would be your preferred way of learning C?

Really, how can you get proficient at programming C without spending countless hours debugging segfaults and memory leaks?!

The resources you refer to are abysmal, criticism is available and widespread online if you want details.

You get proficient at C programming by properly understanding what you learn rather than spend countless hours in trial and error cycles.

There isn't much magic to it and the concepts are quite simple. In my opinion if you're looking to get up to speed quickly instead of carefully writing code then C is not for you.

To be honest I learned from K&R but I guess it's too outdated and not really about "modern C". But of those resources one of them seemed quite fun and probably harmless ([0]) and the other one ([1]) pretty practical and "bad style" but error-free as far as I read through it, with the added benefit of teaching newbs that tools matter a lot and getting them pretty fast into using make, valgrind etc....

In your opinion, what would be:

1. the modern equivalent of K&R?

2. a solid reference to keep around and consult afterwards?

(with 1 and only 1 answer to the questions above, please, not a list, 'cause every time I search "learn c" I get lists upon lists of high quality comments - which is bad because you can't actually dismiss them as rubbish, you actually have to read them -, that only throw you into paralysis by analysis, so you just say, "fuck it, just grab K&R or LCTHW, then grab a C open source project you feel like hacking on and start banging your head on it"...)

1. For "modern C" it would be K&R, then learn Haskell; especially take note of its type system and how you can apply what you learn there in C.

2. The C standard draft, your compiler(s') manual and man pages if you're on POSIX. Also pencil and paper for drawing your arrays.

Clarification about 1: Haskell has good solutions to a lot of problems that have plagued programming, learn them because they are not language-specific but are mindset-specific. The advantage is that once your mindset changes, you start paying attention to off-by-one errors and potential overflows and overruns. You also start paying attention to using the correct types in a C program and you start taking advantage of enums to reduce the space of possible states in each function in your program. Once this clicks, memory management is a breeze.

Also learn the mindset of solving the problem at hand and nothing but the problem at hand. C is not a language for creating abstractions everywhere, it's a language for writing down as few steps as possible to solve that specific problem.

This ^

Syntax and data structures are usually the easiest part of learning a new language. Leveraging implicit conventions and trying to build anything useful is much harder.

This is still a useful reference for anyone looking to jump into C.

I understand pointers pretty well, having programmed them in TTL logic, assembly language, Pascal, and BASIC (via PEEK and POKE). Python is a relative newcomer to my tool chain.

What confuses me about pointers in C is simply the syntax, and I'm sure it's because I just don't write enough C to be able to read it with any kind of fluency. I find C programs to be harder to read than any other language that I'm regularly in contact with. Of course this could be because of what C is used for.

Both of these are listed under "Stuff that should be avoided" here http://www.iso-9899.info/wiki/Main_Page#Stuff_that_should_be...

Use the address sanitizer and valgrind on your tests and you'll keep your sanity.

Thanks for this. I've failed C courses in university and in MOOCs when things get to pointers.

In your opinion, did learning C make you a better programmer in Python?

Just knowing C probably not, but knowing how PHP is implemented internally did make me write PHP differently; and I suspect the same would be true for any scripting language.

(for example being much more conscious about pass by reference, the cost of various functions, the overhead of calling functions and doing data conversions etc).

That's why you shouldn't put anything on the heap in C.

No mention of malloc or the struct keyword? You'll probably want to learn about that before dealing with real C code.

> #define forever while(1)

> Expert C programmers consider this very poor style, since it quickly leads to unreadable programs.

So why even mention it??? There are more important subjects which could have been introduced in this space.

"Python for C programmers" would probably make much more sense (following "C for Assembly programmers").

I agree. My computer science classes were taught almost entirely in C. Learning C-like languages is pretty easy, but I constantly feel like I'm using Python incorrectly. I feel like I'm trying to take what is elegant in C and implement it in Python, even though what is elegant in Python is very different.

I think it's largely the opposite for current students. My university's CS program is taught entirely using Python and other higher level languages. CS students here don't have to learn C at all anymore; if you want to learn C here, you have to get a Computer Engineering degree.

Aren't most OS courses taught in C? At least two years ago when I took operating systems (in the US) we had to hack up Linux source in C.

Edit: And thinking on it our computer graphics course was in C++ (yea yea C != C++).

There's a huge variety in CS programs and you can't really assume anything about the languages they all use. My understanding is that accreditation for CS degree programs is based on concepts, not implementation details like programming language.

My OS class (a while ago) used assembly language for a simple VM, with one or two assignments requiring us to modify the VM itself to implement new instructions required for new OS features (task switching, virtual memory, etc.).

A lot of CS programs aren't even accredited. My university (a highly ranked US one) doesn't have any accreditation for its CS program.

Most OSes are written in C. So to get good use out of the OS, you interface with its libraries, which are in C (to get networking, shared memory, message passing, process scheduling and all that good stuff).

I wrote C wrappers for Ada, and it was a pain. Much easier to use C (or C++) to get to OS functionality.

As long as OS are written in C, it will be with us. Also useful for embedded.

see: /usr/include

OS (or computer graphics) courses aren't necessarily mandatory for CS degrees. In fact, if I'd put my mind to it, I'm pretty sure I could have gotten a CS degree without too much programming at all by loading up on as many math and theory courses as possible.

Personally I did study C via a couple of electrical engineering courses I took.

At the same time, understanding the underlying ideas of memory management, process isolation, IPC, IO/CPU sharing is really, really useful for not making silly mistakes.

Oh absolutely. My point was mainly that just having a CS degree isn't a guarantee of, well, anything really.

I've heard many colleges have switched to Python for Comp Sci. I've seen linear algebra and computer graphics courses done in Python, too (including writing a ray tracer). The goal is to learn the concept without the language getting in the way. Performance isn't a big concern (with poor performance you can still play with sparse data structures and other optimizations).

While I imagine it was good to get familiar with writing C, even assignments for operating systems classes didn't have all of the performance concerns or strict checking that production code would, right? (but C still is probably the best choice for other reasons in an OS class)

While we used some C for manipulating kernel stuff, we implemented a lot of the algorithms in Python since it let us do multi threading, multi processing, signals and a couple of others and my college uses a lot of Python...A bit too much Python now that I think about it.

My undergrad OS course was taught in Java (shudder). It was a very theory heavy class, and the only code we actually wrote was a toy scheduling algorithm.

It really depends on the school you attend. The CS program at my school was very theory heavy, and light on programming. The entire department used (and still uses) C because you can use it for every class and every professor knew it. I think from a computer science perspective, C was preferred because you have very low-level control and it is a very verbose language where everything is performed explicitly.

Interesting. My college used a lot of Python for nearly all the classes except for language seminars and some parts of other classes.

I asked my professors why and they argued that Python "doesn't get in the way" (static typing, compilation etc.) allowing us to focus more on learning the algorithms and theory.

I think I'd prefer to use Python in most cases. My data structures class was taught by a professor who LOVED a ten year old book. Unfortunately, more than half of the assignments in the book wouldn't compile with GCC. Everyone I know who took that class said it was the hardest class they took in college because you basically had to rewrite every program from scratch.

I still preferred C to Java. I'm sure Python would have made that class way easier, though I do love pointers.

CS50 used C almost exclusively, that was very recent.

Why not "Assembly for C programmers"? I would say both are useful directions.

I've written a Python for Java programmers book - http://antrix.net/py4java

Do you have a link to C for Assembly Programmers? A google search doesn't turn up anything.

Most books from the early days were essentially that, although they weren't called that because it was just assumed. The K&R is, in a way, even if it doesn't mention assembly. It's obvious to see how a for loop relates to the assembly instructions (of course it isn't today, with all the trickery compilers do underneath).

Yeah, K&R assumes at least some familiarity with assembly programming - it uses assembly-level terms like "address" and "register", without ever defining what they mean. The chapter on pointers starts out with:

"A pointer is a variable that contains the address of another variable." [1st edition, 1978]

It was just an idea. C is as low as most people bother to go these days.

"C For Assembly Programmers" would be an interesting read. I've looked at some mission-critical C code before, where much of the language is abandoned in favor of "safety"; in those cases I got the feeling that C was being used as little more than an Assembly preprocessor.

But how do you learn assembly?

Assembly Programming for Chip Designers

I learned basic 16 bit x86 assembly by making simple COM executables and boot sectors with MASM. I don't know if it's still possible to run COM on modern Windows (this was in the 32 bit XP days), but getting started with boot sectors using BIOS interrupt service routines for IO should be an interesting project. You should be able to use the assembler output directly in a VM as a disk image this way, or dd it to something and watch it run on bare metal.

You can actually inline assembly into your C code. I think this could be the easiest way to practice it.

There are lots of tutorials on the web.


Oh how we've come full circle: in the "old days", this would be "Python for C programmers".

Offtopic: when I saw "C for Python programmers", I immediately thought it was going to be this beaut/horror - https://twitter.com/UdellGames/status/788690145822306304

People are criticizing this, but this is very valuable to someone like me, who just wants to brush up a bit on my C, having some minor experience, and a lot of python experience, so thanks!

Not to diminish the usefulness of this brief intro (it is very well written) but, as the saying goes "a little knowledge...".

Two of the biggest stumbling blocks in C (and its ilk) are pointers and memory management... which this brief intro barely mentions.

It actually reminded me of many programming courses I've taken over the years where the instructor spends an inordinate amount of time on the easiest concepts and quickly glosses over the meaty bits.

Soon this will be me - freshman CS student who will be learning C the second semester. A lot of people say it will be difficult, but there is a certain fun in building things from scratch.

It's actually required if you want to write efficient software. Not saying it can't be done e.g. in Python, but you have to know what's going on underneath.

This is oft-repeated, but I don't really agree in most cases anymore. These days 90% of software like web application front ends & back ends, mobile applications, desktop applications, data pipelines, etc. are all written with some library in a language like Java, C++, Obj-C, Python, Ruby, or maybe Go. Arguably modern software is going to be more efficient for the vast majority of cases if the author sticks to the best practices for their chosen platform/library than trying to roll their own C implementation or tweak what's going on underneath. Writing C from scratch for anything other than an embedded device or low-level system library is probably ill-advised.

"Java, C++, Obj-C, Python, Ruby, or maybe Go."

If you want to write efficient software, which is certainly not always necessary, you're looking at a ~50x factor difference between the top and the bottom of that list, which even in this age of "let's just spin up another couple dozen instances" is non-trivial.

50x? That would surprise me, though I'm no expert in Ruby or Go.

The problem with writing things in C or C++ is that in the time it takes you to finish the first working cut of a nontrivial program, you could have done the same thing in Java and finished four or five tuning cycles. If you stop there the Java will most likely be faster.

How often do you get to write, say, a driver or a utility where "fast enough" won't do and there is enough of a business case to wring out those last few cycles?

Sure, even within languages you can be looking at nearly a 50x difference based on things like framework & database choice. Hence my suggestion that raw language speed should not be a deciding factor for most projects these days.

For example, look at the difference between the gemini-postgres and dropwizard-mongodb: https://www.techempower.com/benchmarks/#section=data-r12&hw=...

I think you're both correct... the big question here is whether there is a good library to do the heavy lifting in your app. If there is, then choice of library will matter more than the app language, as long as there are bindings for your language of choice.

IF, and this is a big IF, you're doing something weird enough that you have to do the heavy lifting yourself, choosing an efficient language will matter a lot. At least for the heavy-lifting parts - you could still implement most of the app in an easy inefficient language, and only do the heavy parts in C.

That's what happens in Python, and that's why my Python apps are only 3-10 times slower than pure C ("only" considering that Python is like 100x slower than C).

Only if your app is CPU-bound. How many are?

And even then often a better algorithm is going to get bigger gains than a faster language.

Face it, often the available processing time is very much finite, and the slower version just doesn't cut it.

And the same holds for memory. Your system has only so much of it, and for applications with many small objects most high level languages are very memory-inefficient if written in an idiomatic style. This can easily amount to a size factor of 20x in what's processable at all (disregarding execution time). Add garbage collection (which can "drown" processing) and the factor might even be higher.

Try adding a few Integers to a Java map and you will be surprised how inefficient it is.

I suppose in many applications it is. My career has pretty much been in Web apps and generally most work is I/O-dominated

"If you want to write efficient software, which is certainly not always necessary, ..."

Just take Harvard's CS50x on edx. It's probably the best intro to programming online and primarily uses C.

Depending on your university's CS program, it might cover the equivalent of 2-3 semesters' worth of material without feeling like drinking from a firehose.

If you already know how to program, then you could just pick up K&R and work through it.

Take a look at K&R C, and work through the problems without looking at the answers. Though it's a bit dated, I'd still recommend it to anyone who wants to learn C.

K&R = Brian Kernighan and Dennis Ritchie.

The book is small and pretty good. I found it useful. C is a small language, thus the book is small (I think they say something to that effect in the foreword).

It's a book famous enough to have its own Wikipedia entry.


Absolutely agree. It's not a long book either.

It's not a long book, but it certainly isn't an easy book, either. I've been doing C for several years and some of the exercises still give me pause.

You can also get the answer book.

Hopefully you will be learning C++ and not straight up C. Modern C++ has pretty much solved all the shortcomings of C, and it's sad that people are still stuck learning archaic C for no good reason. If you are actually learning C++, do NOT even look at K&R C. It is extremely outdated and straight up teaches bad practices.

That is definitely some kind of https://xkcd.com/386/ - moment for me.

I totally disagree. I am not sure if your comment is satire or not; I would do it the other way around. Learn C as it is fundamental to modern OSes and just learn C++ if you have to (e.g. maybe later at a job, or because you need a specific library, like OpenCV). However I do not think that fully grasping C++ is a worthwhile endeavor, as there are many things implemented due to historical and/or compatibility reasons (to C) and not because they provide a real benefit. Not that C is perfect but I consider it less fucked-up than C++.

That is the path to write unsafe and unidiomatic C++.

CppCon 2015: Kate Gregory "Stop Teaching C"


Although I am not the big C++ fan, this is a wonderful talk and she has a great approach to teaching programming languages. Thanks!

Two totally different languages with two totally different use cases. Their use cases should not overlap if applied correctly.

C++ is a very high level language compared to C. C++ has low level constructs which closely resemble C but should only be used for high performance implementations of high level abstractions.

You are right, it is crucial to emphasize to beginners that these languages are different (and that C++ is not some kind of add-on to C, which is - considering its name - a valid guess, but we have the same discussion with Java and JavaScript). But I think learning C, Python and JavaScript (and maybe some Java) to a degree that lets you get useful work done is a lot simpler (meaning it takes less time) than learning C++.

Not that I would discourage anyone from learning C++ (ok, probably, I might) I just consider it a bad PL for people starting out with programming.

Not only that, but Cython really takes away a huge amount of performance bottleneck encountered in Python code by generating pure C from a Python language superset.

More details here: http://scikit-learn.org/stable/developers/performance.html

There are still some of us that write (and hire for!) straight C.

Yet I only find job postings where they're looking for someone to code C/C++. I still don't know what that means. Does that mean both or one or the other? Or that they're still undecided as to what language to use? Maybe they don't know what language they are using..?

Basically what this means is that you will compile all your code using a C++ compiler, but because C++ is backwards compatible with C, they just call it "C/C++".

A lot of developers still write in a C-esque fashion because that's what they were taught in school or they just haven't learned a new way of doing things. Developers still use the C standard library for file IO instead of C++ fstream. Developers use new and delete with raw pointers and manage the memory themselves instead of using a vector. I was recently on a project where all the code was written in an old C style: variables declared at the top of each function, raw pointers everywhere, usage of C strings, etc. Memory leaks and segfaults were abundant. Uninitialized and unused variables a'plenty. And they were using a C++ compiler.

In my group's case, it means both. We expect new hires to know C++ for most of our work, but for portability across different architectures C is sometimes still necessary.

In cases like that, I think it'd be so much better if people wrote "C and C++" to make it clear that both are indeed a requirement.

Firstly, let me qualify all of this by saying I try to be a true polyglot and use the best tool for the job. My regular languages over the past few decades include C, C++, Lisp, Java, Clojure, Smalltalk, Python, Ruby, C#, Haskell, Rust, and many more. As such, I have no attachments to any particular language.

C++ and C, while both using the letter C, are for all purposes entirely different languages with different goals. Use the one that meets your goals best. While I can agree that K&R is outdated in some areas, there are still plenty of reasons to learn and even use C. If for no other reason, some of the most popular, active, and most used code bases in the world are written in C.

C is essential if you want to understand a lot of the ecosystems around you, fix them, interact with them properly, and maybe one day contribute to or patch them if necessary. I know some people will argue that if they know C++ (or programming in general) they can read C, but hands-on experience is the only real way to learn. You will not know why things are done or recognize when they are done poorly/incorrectly. Moreover, C is a great teacher of how things work, and sometimes also how not to do things. To be ignorant of C is generally to be woefully unaware of how a huge part of a lot of the software you probably use works.

I worked/work in game programming, distributed computing, AI, and many other fields. I find C to be extremely valuable in all fields I have touched, at home and at work. If nothing else, learning about memory, byte ordering, alignment, packing, bit manipulation, etc. is a huge solid base. C++ can teach you a lot of that too, but it deals with certain issues on fundamentally different levels (especially modern C++) and has just as many potholes and "here be dragons" areas. It is true that I don't start as many new projects in C as I used to, but it doesn't stop me if, let's say, the main thing I need to do is interact with something already in C/embedded, or have a certain level of control that C affords me while also being able to hire people that can work in it (factors when evaluating project constraints).

One argument I consistently hear with regard to C that you hint at is that if you learn C first, before C++, whenever, it will somehow taint you. I would argue that with every language, you are not learning the language if you do not know how to write programs in it idiomatically, or even when to counter-balance idiomatic code with code that does what you need to do (ex: boost performance or other tradeoffs). At that point, you are not a programmer but a parrot or like someone who memorizes math formulas but can't apply them.

Programming is not regurgitating countless lines of syntax, it is critical thinking, problem solving, creativity, and many more things. Learn C, and learn other languages. Do not listen to anyone who tells you to learn any particular languages. Match the language with your task, problems, and other constraints. When in C, do C. When in C++, do C++. When in Python, don't do C. And so on. It's really not that hard for anyone experienced.

What could be really interesting for a Python programmer is Nim (http://nim-lang.org/)

Python like syntax, statically typed, garbage collected, C like perf.

...and it can be embedded into Python or embed Python.

LearnXinYminutes tuts are pretty cool. I learned C++ before I learned Python (CPython ~2.0) before I learned C:

C++ ("c w/ classes", i thought to myself):

- https://en.wikipedia.org/wiki/C++

- https://learnxinyminutes.com/docs/c++/

Python 2, 3:

- https://en.wikipedia.org/wiki/Python_(programming_language)

- https://learnxinyminutes.com/docs/python/

- https://learnxinyminutes.com/docs/python3/

C:
- https://en.wikipedia.org/wiki/C_(programming_language)

- https://learnxinyminutes.com/docs/c/

And then I learned Java. But before that, I learned (q)BASIC, so

I like the style this uses to explain the else-if construct. A few fundamental concepts are explained and combined in a way that makes not only else-if obvious but also anything else built on these fundamentals.

SICP also uses that style throughout and I love that. Wish I could explain things that well.

The best way to learn C for Python programmers is to dive into the CPython interpreter or C-API extensions. Also use Cython annotation outputs to see how Python translates into C calls.

As someone who first learned Python and then picked up C, the two biggest challenges for me were: string parsing and mixed (or unknown)-type collections. Took me a good while to change my mindset since those operations are so easy & widely used in Python.

There are many libraries for "high level"-like string operation for the C programming language, e.g. https://faragon.github.io/sstring.h.html

"C does not have an support for accessing the length of an array once it is created"

Well, there is sizeof(array)/sizeof(array[0])

This only works for arrays allocated on the stack, not for heap-allocated arrays via malloc or similar.

I remember trying that, and I think it only works in the scope where you declared the array. So for example if you pass the array to a function, the size info is lost. Might be because of the "arrays decay to pointers" thing

It works, but keep in mind that arrays != pointers.

Not as detailed, and more high-level, but this post on going from JavaScript to C might be interesting for some too: http://thinkingonthinking.com/learning-c-for-javascripters/

while writing c i found this free book helpful as a reference. http://publications.gbdirect.co.uk/c_book/

Let me point out two alternatives. The first is Cython, which looks very Pythonic, compiles to C, and has produced amazing projects, including uvloop, a drop-in module for asyncio in Python 3 that will speed up asyncio.



Note how GitHub claims it's all mostly Python code ;) That's because Cython like I said looks Pythonic.

There are other examples, but I think this is one of the ones that comes to mind the most for me.

There's also D, which some call "native Python". Unlike with projects like Go and Rust, you can have your object-oriented programming (optional, like in C++), concurrency/parallelism, and other goodies like built-in unit testing: when you compile your code, your unit tests are run but not included in the final binary.



If it's been more than a few years since you've evaluated D you might want to check it out again; it may be worth your time. D is a language I knew about for years, and only recently have I come to appreciate it for its many features.

D has things like array slicing, [OPTIONAL] garbage collection, and an amazing web framework called Vibe.d with its own template engine called Diet-NG.



One thing I like is that Vibe.d is not just a web framework but a full networking stack too. It also supports Pug/Jade-like templates (see Diet-NG), which are compiled when you compile your project, so your website runs off a native executable using fibers instead of threads. Vibe.d is undergoing a period where the core is being rewritten to be more compartmentalized, so that you can pick and choose which parts you need: MongoDB, PostgreSQL, layout engine and other goodies. There's even a templating library called Temple, with a syntax based on eRuby (though the syntax can be tweaked), that supports Vibe.d.


Another vote for Cython. All python programmers owe it to themselves to take a look at Cython. The fact that it's compatible with (almost?) all existing python libraries and you can almost certainly run your existing python code without any changes to the source code makes it very easy to get started with. Then you can start making (often relatively minor) changes to your source code and often get 10-100x speed ups on CPU bound tasks. And the resulting libraries are all automatically importable back into normal cpython.

Is there anything like this for Rubyists?
