C Primer (enlightenment.org)
353 points by chauhankiran 10 months ago | 84 comments

Not a bad summary. Albeit with some odd choices.

[EDIT: Reading a little more, I'm revising my opinion of it as a summary downward. There's quite a bit that's amazingly tortuously explained, or slightly wrong. E.g. '//' doesn't have to be the first 2 chars of a line; it comments out everything from wherever it appears to EOL. He refers to both parentheses and braces as braces! The explanation of needing ';' at EOL is not helping any beginner, who'd be better off with the traditional "all C statements end with a ';'".]

Couldn't call it a primer though as it takes an express trip through the features from my first program to pointers to functions in a dozen page downs.

I don't think "downside of function pointers is the added mental load to handle the indirection" rates as enough of a warning, at all. How about letting our beginners trip up on arrays and pointer arithmetic, in a few example programs, before giving them the keys to all the self-inflicted weaponry?

"void An undefined type, used often to mean no type (no return or no parameters) or if used with pointers, a pointer to “anything”"

Undefined? That strikes me as a way for the beginner to get totally the wrong idea.

It's used to indicate no return value or no parameters in function declarations. A void pointer is generic: it converts implicitly to and from any other object pointer type. You can't dereference it or do pointer arithmetic on it until it has been converted to a pointer to a real type. It is not undefined.

"Explanation of needing ';' at EOL is not helping any beginner who'd be better off with the traditional "all C statements end with a ';'".]"

Almost all C statements and declarations end with a semicolon. But compound statements don't.

Thanks. Missed 'individual'... Mind I'd stick with the K&R approach and refer to compound statements as blocks in a beginner's primer. What I wouldn't do is "explain" in a way that will probably have a beginner rescanning multiple times - I had to do a double take:

"Other lines that are not starting a function, ending it or defining control end every statement in C with a ; character"

"In C an array is generally just a pointer to the first element."

No, arrays are not pointers.

"A String is generally a pointer to a series of bytes (chars) which ends in a byte value of 0 to indicate the end of the string."

No, a string is by definition "a contiguous sequence of characters terminated by and including the first null character". It is not in any sense a pointer.

Anyone who thinks arrays or strings are "really" just pointers should not be writing C tutorials.

(See section 6 of the comp.lang.c FAQ, http://www.c-faq.com/ .)

Indeed, "arrays are pointers" is a polite fiction we use to more easily pass array references around, but there are subtle differences that the compiler usually hides from you. Until it doesn't.

I wouldn't call "arrays are pointers" a polite fiction. I'd call it a dangerous misconception that makes it more difficult to understand what's really going on.

The language is partly at fault for a couple of rules that implicitly convert or "adjust" arrays to pointers in certain contexts, but the solution is to explain the differences, not to gloss over them.

Ok. But I do not blame people who unify the two things, because C does not really stress the difference.

On the contrary - even in small details, like for example the equivalence of p[i] and *(p+i), so you can legally write 3[p] in C... (== p[3])

It's precisely because C makes it easy to confuse the two concepts that it's very important to stress when and how they're different.

What creates the confusion in C is that arrays try very hard to decay into pointers at any occasion, and on top of that you have "fake" arrays when they're declared as function parameters (something that's probably the most insane "feature" of C IMO. And don't get me started on the a[static n] syntax, which solves nothing and introduces even more confusion on top of adding yet another completely new meaning to the 'static' keyword). Most crucially, they behave very differently when it comes to using sizeof.

It's also very important to understand the difference to be able to understand `const char *p = "abcd"` vs `char p[] = "abcd"` or `struct s { /* ... */; char *data; }` vs. `struct s { /* ... */; char data[]; }`.

C really does stress the difference. Look up the definitions of "pointer" and "array" in the C standard. They're entirely distinct concepts.

p[i] is equivalent to *(p+i), and that's possible because both the indexing "[]" operator and pointer addition "+" operator require a pointer and an integer as operands. (The pointer needs to point to an element of an array object, but that's not enforced at compile time.) For p[i], it's very common for the pointer operand to be an array expression that "decays" to a pointer.

(3[p] is valid because pointer addition, like integer and floating-point addition, is commutative. It could have been defined to require a pointer as the left operand and an integer as the right operand. The decision was made when the distinction between integers and pointers wasn't as strong as it is in modern C.)

Curious, where are you quoting from on the definition of a string? I wouldn't imagine the C standard differentiates strings from other pointer types other than when talking about string literals.

Exactly. I think this point is most easily observed through extern.

It's been way too long since I learned C for me to put myself in the shoes of somebody attempting to learn it for the first time, but it's a bit steep for a "primer", isn't it? I mean, just looking at the examples, it goes from introducing structs to complex examples dealing with malloc, free, enums, pointers, NULL, arrays and for loops.

I also think the section about "the machine" is interesting but it's a bad idea in a C tutorial. When you code C you code for the C abstract machine, not your CPU. It's very important to understand that difference, especially if you want to write portable code. For instance integer overflow is most likely very well defined on your machine, but it's not in the C virtual machine. C is not a macroassembler. Furthermore I really don't think the details in this section are relevant in the early stages of learning C.

There also are a few technical inaccuracies:

>So bytes (chars) can go from -128 to 127, shorts from -32768 to 32767, and so on. By default all of the types are signed (except pointers) UNLESS you put an “unsigned” in front of them.

That's true on most systems but "char" can be either signed or unsigned (IIRC on ARM they default to unsigned) and the C standard only gives minimum ranges, so 32bit chars and 64bit shorts are theoretically possible (and can occur on systems where the smallest addressable value is larger than 8 bit such as some DSPs). It's pure nitpick of course but why even bother mentioning that in a C intro? Introduce stdint.h instead, that's actually very useful for any C coder.

>You can have a pointer to pointers, so de-reference a pointer to pointers to get the place in memory where the actual data is then de-reference that again to get the data itself

>So keep this in mind as a general rule - your data must be aligned.

>Note that in addition to memory, CPUs will have “temporary local registers” that are directly inside the CPU.

Are we seriously talking about nested pointers, alignment and CPU registers before we've even introduced printf? If a newbie manages to get past that and continue through the examples I congratulate them.

I don't want to sound too harsh, it's definitely very commendable to try and share your knowledge with other people and writing tutorials is very time consuming and not always very rewarding. That being said I definitely wouldn't advise anybody to start learning C using that document, especially since there's already a wealth of resources for learning C, both online and off.

> Are we seriously talking about nested pointers, alignment and CPU registers before we've even introduced printf?

No, printf is introduced in the very first hello world example. But the concept of format strings is not explained, nor are variadic functions, and no documentation for printf is linked.

Also, the example

    #define MY_TITLE "Hello world"
    printf("This is: %n", MY_TITLE);
is really quite broken. %n needs an int * argument and writes the number of characters written so far to it. A string constant is not only of the wrong type, even more importantly it is not writable (though not const).

Of course a modern compiler with -Wall would catch this, but the tutorial never mentions any flags you might want to pass to your compiler...

That has to be a typo where "%s" was meant.

I'm curious how you manage to confuse the two...

Oh, in the Dvorak layout, "n" and "s" are next to each other. Maybe that's how.

Perhaps it was supposed to be "%s\n".

> 32bit chars and 64bit shorts are theoretically possible

Indeed. I worked at Cambridge Silicon Radio. Their Bluetooth chips often had two processors: a XAP for the BT stack and a Kalimba DSP for audio processing. They had 16-bit and 24-bit chars respectively. About 1 billion such chips have been sold, so it's not just some wacky architecture invented by a mad bloke in a shed.

Writing code for the XAP2 processor from CSR gave me headaches. Did not help that the compiler is some old forked off version of GCC.

It's been a few years but it was GCC 3.3 as I remember, always being called in C89 mode. Worst part was that there was a debugger, but it couldn't actually be used. All you get is printf()

Thank god it's not GCC 2.95. I still have nightmares about embedded systems stuck on that compiler.

I still occasionally see the people responsible for that. I'll pass on your regards :-)

The article makes clear it's a "practical" tutorial for the purposes of EFL on common architectures. EFL is a set of mostly low-level libraries underlying Enlightenment. Things that are only theoretically possible (like 64-bit shorts) are irrelevant if they don't occur on the platforms they plan on supporting, while a lot of basic stuff won't be relevant because it's not likely to be something you'd use as part of those libraries.

I think most of the things you point out would be valid for a generic C tutorial intending to teach C, but less so for a tutorial intending to teach the bits of C the author thinks are most important to be able to understand and/or contribute to EFL.

EFL is famously known as not being a good example of good and safe C quality.


I'm not really sure I see who could benefit from something like that.

If the coder already knows the basics of C and you want to bring them up to speed on the details they'll have to know to work on your library then I don't think you'll need to explain that "A function is a basic unit of execution" or that "Memory to a machine is just a big “spreadsheet” of numbers" for instance. This is clearly meant for complete novices, but at the same time it introduces too many concepts at once to work as a tutorial to learn C from scratch.

Instead focus on how your libraries are architected, how the pieces fit together, the rationale for why things are this way, etc. See for instance "Linux Device Drivers"[1] as an example of what I'm talking about. I know C and this "primer" doesn't give me anything valuable if I actually wanted to hack on their libraries.

On the other hand if the coder really doesn't know what a function or a struct is then it's probably a better idea to tell them to follow some actual C tutorial and come back when they're done because this particular text will probably overwhelm and confuse them, not actually give them the skills necessary to actually contribute decent C code anyway.

[1] https://lwn.net/Kernel/LDD3/

>When you code C you code for the C virtual machine, not your CPU

I code for my cpu, when I code C.

You might think you do but I can assure you that your C compiler is not aware of that.

Here's a quick example to demonstrate what I mean. Consider the following code that increments a variable and returns it. If the increment overflows it returns 0 instead:

    int increment_or_0(int a) {
      int res = a + 1;
      if (res < a) {
        /* Overflow */
        return 0;
      }
      return res;
    }
If you code "for your CPU" and your CPU is, say, x86-64 then you might think that this code ought to work. You add one to a and if the result is smaller than the original value of a then an overflow occurred. With 2's complement arithmetic that makes perfect sense. Yet built with gcc 7.3.0 using -O3 the compiler generates the following assembly:

    lea    0x1(%rdi),%eax
In other words you have the incrementation but the overflow check is nowhere to be found. That's a perfectly valid optimization for a C compiler to make, because your code relies on an undefined behavior.

In this case, if you really want to write your test that way, you can use the "-fwrapv" flag to tell GCC to assume fully defined integer overflow, but then you're coding in a nonstandard dialect of C.

I believe K&R refers to it as an "abstract machine", not a "virtual machine." On (most) platforms, there's nothing virtual about it.

Ah yeah, when I wrote it it felt wrong, although I think in context it's not ambiguous. I'm going to correct it, thanks.

You are correct about every published version of the C standard. C-next plans to standardize on 2's complement for signed integers, for what it's worth.[0]

(For developers: additionally, as a non-standard feature, most popular open source compilers (i.e., gcc and clang) support an option called "-fwrapv", which does not treat 2's complement overflow as Undefined Behavior. This is useful for getting a large legacy codebase working on newer compilers, if it makes assumptions about 2's complement. Until the new C standard is published it's probably best to avoid relying on 2's complement overflow in new code.)

[0]: https://twitter.com/jfbastien/status/989242576598327296

Signed overflow will still be undefined behavior even after the planned change (making signed integers 2's complement).

This is explicitly mentioned in the page linked in that tweet:

> Status-quo If a signed operation would naturally produce a value that is not within the range of the result type, the behavior is undefined. The author had hoped to make this well-defined as wrapping (the operations produce the same value bits as for the corresponding unsigned type), but WG21 had strong resistance against this.

I think you're talking about code that doesn't work, while the person you replied to is talking about code that works. Yes, one should always strive to write conformant code, but in practice, for example, no large x86 C project can 'just' be recompiled for PowerPC or ARM or x64.

My point is that it's not possible to write code that works in C if you don't actually target the C abstract machine, unless you're using compiler extensions. It's not even about portability, as my example shows, it's just about basic correctness and having your code behave as expected. Knowing what kind of assembly to expect as the output of the compiler for a given C code is a useful skill for an advanced C programmer looking to write optimized programs but it's really an advanced topic. When I learned C in high school I only had a very fuzzy understanding of what a register was and how memory worked, it didn't stop me from writing useful C code.

That's why I think it's a bad idea to give all these nitty-gritty details about CPU registers and the way the heap and stack are laid out in RAM because in practice C doesn't say anything about that (well, there's the register keyword I suppose). For a C tutorial it's more important to teach the C abstract machine and its rules than the way they map to a given piece of hardware.

Take for instance this sentence from the article:

>You can even tell the compiler to make sure it has an initial value. If you don't, its value may be random garbage that was there before in memory.

I don't think that's a good thing to say. What you want to say is that the value is undefined and reading it is forbidden because that's what it really is. If you decide to use an undefined value somewhere in your code the compiler might swoop in and optimize it away, not use "random garbage that was there before in memory". Undefined is not random, undefined is not garbage, undefined is undefined. You might print it twice and get two different results. You might have a condition "if (uninitialized > 0)" and a condition "if (uninitialized <= 0)" and both may run, or maybe none will. Because there's no such thing as "random garbage in memory" in the C abstract machine, only defined and undefined values which are semantically different.

I have mixed feelings about this, since it's got a lot of great information for someone getting started with C. I can see why you'd want to avoid scattering dire warnings all over the place, but it skips blithely past a whole host of hazards, stopping briefly only to mention that synchronization can be a cause of threading bugs. It's leading people down the garden path, and then leaving them at the bottom where there's a hungry tiger.

> I can see why you'd want to avoid scattering dire warnings all over the place, but it skips blithely past a whole host of hazards

Much like the enlightenment codebase itself, unfortunately. If you want to see some quite bad C code, just check out the source.[0]

[0]: https://what.thedailywtf.com/topic/15001/enlightened

Wow. Glad I use Qt!

The threading section has a reasonable warning sign before the tiger…

All threads can read and write to all memory within that process. This is extremely dangerous, so avoid threading until you have fairly well mastered C without threads.

If you must use threads, even if you are experienced, sticking to a model where threads share as little data as possible and very carefully hand off data from one thread to another and then never touch it again is far safer.

Whenever I was trying to learn C (and as ever making small bits of progress), what I always wanted was a great visualisation system for pointers. I've seen various attempts at this online but I've never really found something that felt intuitive. I think a good on-paper means of reasoning about pointers would be a huge help in becoming fluent in C.

What about this analogy? You have a piece of paper with a street address on it (the pointer value). You go to that street address (dereference the pointer), open the mailbox, and inside is another piece of paper with what you're actually looking for (like someone's name). For pointers to pointers, add another level of indirection.

And after you've learned that (basics of pointers in C), a nice quick test of your knowledge (but not a comprehensive one) can be to read this standard declaration of main() in C (based on ANSI C, BTW, as described in 2nd edition of K&R C book, not sure if changed in later versions of standard):

  int main(int ac, char **av)
and then try to figure out what the following mean (what type they are, and whether each of them is a pointer or a pointed-to thing, or both):





I was once teaching a class on C to a group of experienced programmers (10+ years each, but new to C), and after I explained the char ** stuff, one of them said "Now it is overhead transmission" :)

I don't think really anyone has difficulty understanding the concept of a pointer. It's their usage that trips people up, and that most tutorials I've seen gloss over, which is where all the confusion stems from.

Exactly. There is the concept and the syntax of pointers, and those are relatively straightforward to understand. I gave up on learning C++ — twice — when I was a kid because no book explained why there are pointers. It’s all very “a pointer points to something! Now on to the next chapter.” (The third time was the charm, luckily.)

One simple example showing value semantics would’ve made it click for me, and things only get more interesting when you talk about performance.

I had the same problem.. I kind of flopped and floundered around for a week or 2 before it just suddenly clicked one day and I was off to the races. Pointers are still my favorite part of C, but I struggle to explain them to new programmers.

Old school now, but back in the late 90s I used GNU ddd. See some example screenshots here: https://www.gnu.org/software/ddd/manual/html_mono/ddd.html#S...

I used C Tutor when I was teaching C to undergrads.

Here is a visualization I created to illustrate a linked list: https://goo.gl/12juwA

Hey! I actually did see this before because I was searching for exactly that. I think I found it helpful in a specific way - maybe I should try using it for longer to see if it sticks in my mind. I think that's what I'm really lacking - a kind of mental visual short hand that holds together in my head.

> what I always wanted was a great visualization system for pointers

Look at 68000 assembly language. In particular indexed addressing modes.

In C really that's almost all they are.

David Malan's CS50 course has some nice material on this. Some of the clearest, best-presented and most enthusiastic lectures I've seen.

Where can we watch them?

There are multiple websites and communities. /r/cs50 [0] is popular. Also, there is cs50.tv [1]. Tons of information on where and how to work through the course. Many options.

[0] https://reddit.com/r/cs50

[1] https://cs50.tv

As far as I can tell, only the last lecture is available.

K&R is the C primer. Read the entire comp.lang.c FAQ for more advanced stuff.

I love the K&R, as a good example of a concise, well-explained, with exercises, introduction to a language. I measure every other programming book relative to K&R.

That said, I wish there was a 21st century update to the K&R book, to take into account improvements made in at least the C99 standard, maybe even C11, update the programming style a bit to avoid some of the more egregious risky behaviors (always use strncpy), and add some coverage of the preprocessor. All of which while still maintaining the clarity of the original book.

"always use strncpy"

I disagree; you should almost never use strncpy. Its name implies that it's a "safer" version of strcpy, but that's not what it is at all. If the target array isn't big enough, it can easily end up not containing a null-terminated string.

As it happens, I wrote about this a few years ago. http://the-flat-trantor-society.blogspot.com/2012/03/no-strn...

Alpha is certainly not big endian. And MIPS today is probably little endian more frequently than big endian.

A little light on pointers, considering how important they are and what a conceptual stumbling block they can be.

I wrote C/C++ for years in a scientific setting and didn't completely conquer the damn things until I had to write some assembly for a hobby project.

It's interesting that this is from the enlightenment website. It's been years since I tried their desktop/window manager. Glad to see they are still around.

I checked the link expressly to see if it had indeed come from the Enlightenment Project. If you want an easy way to try it again, I recommend Bodhi Linux, which packages Ubuntu fundamentals with a modified Enlightenment Desktop. It amazes me every time I use it. So gorgeous, and resource intensive (remember when Enlightenment was considered "heavy"? All the major DEs are far heavier now).

The UI was great, it had some kind of Amiga feeling with its gorgeous graphics.

The actual internals not so much, to the point not everyone is pleased with its use on Tizen.

Nostalgia... Rasterman was light years ahead of his time with enlightenment. Same programmer's league as Torvalds and Bellard.

See also: Summary of the "C" language (https://www.csee.umbc.edu/courses/undergraduate/CMSC104/spri...)

It always amuses me which "names" some "people" put in "quotes", and if they distinguish between "C" and C, or JAVA and Java. Probably some subtle technical distinction a programmer wouldn't understand...

Great introduction for a beginner. C remains my favorite programming language, even after all these years. It's one of the most simple languages, yet also very powerful. Yes, it is easy to make horrible bugs- but that's the price you pay for speed.

I would consider it a very compressed summary of key terms and aspects and not something to build knowledge from if you are a beginner. The trip to a healthy understanding of all these concepts goes through a lot more detailed examples while solving relevant problems and realising the motivations for the existence of each concept.

This is great, wish I'd seen something like this when I was first interested in C. Most introductions are afraid to go into anything complicated, which is ultimately patronising to a reader and doesn't help in the long term.

Incidentally, a long time ago I was looking for GUI libraries under Linux, and looked at Enlightenment code. I was very, very impressed. I expected a mess, because Enlightenment was mostly (superficially) known as a fancy window manager with all the doodads, bells, and whistles. What I found instead were cleanly structured libraries with great concepts and nice C code.

Some people would disagree with you...


redis is also sometimes held up as an example of well-written C code.

I often find tutorials about C language miss details about standard object-oriented programming with C. How to do inheritance, type checking, fields and methods, ref counting and so on. The practices are used in almost any C software today but rarely covered systematically.

Color me interested. Where can I find a codebase that uses this? Especially inheritance.

Like here https://www.codeproject.com/articles/108830/inheritance-and-.... For examples of this you can check gnu tools like gcc or gdb, or glib-based libraries or the kernel source.

Who still calls a log significand a "mantissa"?

Especially in an easy-to-understand, cover-the-basics kind of article.

This is the kind of article that touches on so many issues and aspects, not giving the experienced any new knowledge, while giving a fake sense of "completeness" to beginners, letting them build an overly simplified and artificial view.

You understand what linking is when you understand how addresses in machine code work, you understand the call stack when you understand the principal layout of variables in memory, etc.

Yours is not ideal. There are always multiple paths, and none is perfect. Let's all keep walking, mentors and mentees, noobies and experts.

BTW I understood the power stack when I attempted to code a language...

I actually found it great and took some information from it, so it was useful for me at least :)

"This is not a theoretical C language specifications document. It is a practical primer for the vast majority of real life cases of C usage that are relevant to EFL on todays common architectures. It covers application executables and shared library concepts and is written from a Linux/UNIX perspective where you would have your code running with an OS doing memory mappings and probably protection for you. It really is fundamentally not much different on Android, iOS, OSX or even Windows.

"It won't cover esoteric details of “strange architectures”. It pretty much covers C as a high level assembly language that is portable across a range of modern architectures."

> int bobs

> struct sandwich

> enum filling

C for people who know what EFL stands for. I believe I know C but I can’t for the life of me think of what EFL might mean. Yes I’ll google it but what does that say about this author?

EFL are Enlightenment Fundamental Libraries - the underpinnings of the Enlightenment Desktop Environment. And the author is writing on the Enlightenment website: it's clearly intended for current developers/users of Enlightenment, who would probably know what EFL stands for. So it doesn't say anything about the author, who is writing for a known audience.

Enlightenment Foundation Libraries?
