
So you think you know C? - thewisenerd
https://hackernoon.com/so-you-think-you-know-c-8d4e2cd6f6a6
======
awalton
If you write a quiz like this, and "Don't Know" is an answer but "Undefined
Behavior" and "Implementation-Dependent" are left off, you're doing it wrong.

~~~
cpkpad
Indeed. The quiz doesn't start with "What does the ANSI C standard guarantee
about..." It says "So you think you know C." C is a cross-platform assembly
language. I use it when I need to code to the metal. I know exactly how it
will behave in most of those scenarios.

I understand, in the abstract, I might be on an EBCDIC system, and things will be
different. But that's not the reality of C. C isn't just a formal
specification.

I spent a while on 1, thinking about how my compiler would handle that. 2 I
was 90% confident about. 3 and 4 I was 100% confident about.

~~~
rkangel
> C isn't just a formal specification

No, C IS just a formal specification.

What you are talking about is C on your particular compiler on your particular
platform. If you are writing C code just to be used for that scenario then you
can use all of the implementation understanding you want.

If, however, you want to write C that will have consistent behaviour across
multiple compilers and multiple platforms, then you need to limit yourself to
the behaviours that the standard guarantees. Otherwise the compiler behaviour
may change and your program behaviour will change.

Even between different versions of the same compiler implementation defined
behaviour can change (although assumptions about sizes probably won't).

~~~
yoklov
A lot of the implementation defined behavior can be ensured with static
assertions (either via new-fangled C11 _Static_assert or via the old fashioned
negative length array hack), particularly WRT sizes.
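
A minimal sketch of both techniques, assuming a C11 compiler for the first form. The macro name `OLD_STATIC_ASSERT`, the helper `assumed_int_bits`, and the particular assumptions being checked are illustrative and not from the thread.

```c
#include <limits.h>
#include <stdint.h>

/* C11: compilation fails with the given message if the condition is false. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(int) >= 4, "this code assumes int occupies at least 4 bytes");

/* Pre-C11 fallback (hypothetical macro name): declares a typedef of an array
 * whose length is -1 when the condition is false, which is a compile error. */
#define OLD_STATIC_ASSERT(cond, name) \
    typedef char static_assert_##name[(cond) ? 1 : -1]

OLD_STATIC_ASSERT(sizeof(int32_t) == 4, int32_is_four_bytes);

/* Storage width of int in bits; the assertions above already guarantee
 * that this is at least 32 wherever the file compiles. */
int assumed_int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

On a platform where an assumption does not hold, the build stops at the assertion instead of producing a binary that misbehaves at run time.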

~~~
rkangel
Absolutely, and that's a big step forward from writing code that will silently
do bad things on a different platform. I use that technique on my current
project to assert a load of things that I know are true.

It doesn't get you as far as the ideal goal which is to write code that will
silently _work_ on another platform. Static asserts mean that you then have to
go and write more code when they happen.

------
tptacek
This is pretty silly. Being fluent in C has not much at all to do with
language quirks. Since (I assume) the 1980s, the comp.lang.c crowd has been
reminding us that compilers are fundamentally unknowable and that sizeof(char)
could very well be a random walk averaging 1.35. Meanwhile, in the real world,
network and kernel programmers have been using some variant of u_int32_t since
forever to ensure that they can parse network packets and driver data
structures by copying or even casting things into structs.

Also, why would anyone write ' ' * 13? For most of these questions, if your
first instinct is "that doesn't look like good C code", _your answer is better
than the "right" answer_.

~~~
paxcoder
Actually, sizeof(char) is defined to be 1 (byte), and has been since C89. The
number of bits in a byte is implementation-defined.

However, "it's implementation specific and it's most probably this, or that if
you use such and such a compiler flag" is different from "I don't know".

As a rule of thumb, don't assume things about C's abstractions: read the
standard instead, or ask friendly humans who do read it whether your
assumptions check out.

~~~
greenshackle2
> read the standard

Casey Muratori, who's a better C programmer than me, is of the opinion that
you should check your compiler's behavior rather than read standards.

(Not that you should take his word as gospel, I don't fully agree myself, but
it's worth considering.)

~~~
pksadiq
> you should check your compiler's behavior

There is one thing I have noticed in gcc and clang which is done against the
standards: The sign of bit sized int in a struct.

eg: In a struct with member 'int a:8', the standard says that 'a' can be
signed or unsigned (based on the machine's default for the sign of char). But in
gcc and clang, this is always signed regardless of the machine.

~~~
lmm
> eg: In a struct with member 'int a:8', the standard says that 'a' can be
> signed or unsigned (based on the machine's default for the sign of char). But
> in gcc and clang, this is always signed regardless of the machine.

Wait, are you saying gcc/clang's behaviour is permitted by the standard, or
not?

~~~
pksadiq
> Wait, are you saying gcc/clang's behaviour is permitted by the standard, or
> not?

It's permitted. The same way as a char can be signed or unsigned:
Implementation defined. But in this case, it's always signed regardless of the
architecture (against how a char is handled).

From the C11 draft (N1570), J.3.9 (Implementation-defined behavior): Whether a
‘‘plain’’ int bit-field is treated as a signed int bit-field or as an unsigned
int bit-field (6.7.2, 6.7.2.1).

But say, another compiler (ARM compiler) treats this differently [0]: Until
version 5 of the compiler, such an int was unsigned by default. Later versions
defaulted to signed (as do gcc/clang).

[0]
[http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc....](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15377.html)
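
A small sketch of how one might probe this on a given compiler. The struct and function names are hypothetical; the only portable guarantee shown is that an explicitly `signed` bit-field reads back negative values.

```c
#include <stdbool.h>

struct fields {
    int        plain    : 8;  /* signedness is implementation-defined */
    signed int explicit_s : 8; /* always signed, on every conforming compiler */
};

/* Reports whether a plain `int` bit-field behaves as signed here. */
bool plain_int_bitfield_is_signed(void)
{
    struct fields f = {0};
    f.plain = -1;         /* reads back as -1 if signed, 255 if unsigned */
    return f.plain < 0;
}

/* Stores -1 in the explicitly signed field and reads it back. */
int read_explicit_signed(void)
{
    struct fields f = {0};
    f.explicit_s = -1;
    return f.explicit_s;
}
```

Spelling out `signed int` or `unsigned int` in the declaration removes the dependence on the compiler's default.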

------
kosma
I did this test as a break from sitting in front of Keil C51. These questions
highlight a very important issue: there are platforms and compilers out there
that will bite you in the ass if you try to unconsciously apply what you
learned while using gcc.

Under Keil C51, alignment is always 1. This is legal C.

Under Keil C51, int is 16-bit. This is legal C.

Under Keil C51, sizeof(pointer) is anything from 1 to 3. This is an extension,
but a very popular one.

Don't assume that just because you can write crash-free code on x86, you
"know" C.

~~~
zokier
> Under Keil C51, sizeof(pointer) is anything from 1 to 3. This is an
> extension, but a very popular one

I think this is the funniest thing in this all. You might cross all your t's
and dot your i's, and you still may trip in a pitfall when your
compiler/platform vendor happens to actually diverge from the standard. Such
is life with a decades-old language with at least as many implementations.

~~~
kosma
Such is the price of using C on Harvard machines. It's even more pronounced on
'51 as even the smallest $.5 chip has four (!) different address spaces
(direct/register ram, indirect ram, xram and code) - and bigger chips have
even more as they support memory banking. The real question is: why doesn't C
support Harvard architecture natively, without ugly hacks?

------
te_platt
I'm reminded of an old story about a man trying to fill a job for driving a
stage coach. He asks each applicant how close they can drive to the edge of a
cliff without going over. The first few applicants brag about being able to
drive right up to the edge. The job goes to the first to respond "I don't know, I
try to stay as far away as possible".

------
bjterry
I sort of understand the point of this post, but I basically disagree with it.
From a pedantic perspective, you aren't writing "good C" if your code isn't
100% standards-compliant, but if we applied this same rigor to dynamic
languages, no one would get anything done in them. How do you know your Ruby
code works? The community has settled on the following solution: because it
passes thousands of tests on every aspect of every function. Extending that to
C code, your code works if it passes your tests.

Test on every platform and every compiler you support, or it's not going to
work. Avoid undefined behavior, sure, that's probably good practice. But there
is no need to bend over backwards for compilers from the 90s targeting the
AS/400 if you are writing a linux application that only needs to support
Ubuntu 16.04. If your only means of avoiding broken behavior is "we're going
to try really hard not to do anything undefined or implementation-defined" you
will fail. But if you pass your functional tests, your user doesn't care
whether you are invoking undefined behavior with every function call.

~~~
nialv7
Well I would agree that undefined behaviors are unavoidable. Indeed, if you
insert checks into your code to completely rule out UB, that will have a huge
impact on performance.

On the other hand, I don't agree passing tests is enough. No matter how many
tests you run, you can't guarantee there's no bug in your code. This is
important especially when security is a critical requirement.

------
pipio21
The right answer to all these questions is:

"Nobody in their right mind should use code like this in production".

You should avoid code like this at all cost. If you want to use this code
because you feel smarter, you will be creating an enormous number of problems in
the future and you are actually dumb.

The way to win the game is not playing it. Especially if you are programming
nuclear plants.

If you see this code what you have to do is replace it, not understand it. It
works today but when you change the compiler or the architecture 10 years from
now, it will be such a huge pain in the ass to find all the bugs and undefined
behavior it creates.

~~~
HelloNurse
And the key to not playing this undefined behaviour game is using integers of
known size. C has had them since C99, so using int, short, char etc. like in the
article is reckless and inexcusable.

------
hal9000xp
In 2008, I started to learn the C language by solving these amazing puzzles:

[http://www.gowrikumar.com/c/index.php](http://www.gowrikumar.com/c/index.php)

I think this page contains high quality C puzzles.

If you know something similar, let me know, I would love to solve them.

~~~
pksadiq
A note: 'int main()' and 'int main(void)' are different in C (but the same in
C++).

Most online puzzles use the former (I have also seen this on HackerRank),
but the standard says that main should be defined with a (void) parameter list
if it takes no arguments (or in some other implementation-defined manner).

~~~
72deluxe
Yep, we write C++ at work and the technical director loves writing (void) in
parameters due to years of habit after writing C. To me it looks redundant (I
went straight to C++, skipped C).

------
ninjakeyboard
I don't even know C and I got 100%.

~~~
Kenji
See? C is so intuitive : )

------
jcoffland
I know the correct answers to these questions on Linux with an ARM or x86 and
on 8-bit AVRs and that's good enough for me.

~~~
pksadiq
Do you know the value of 'i' after executing the following statements (assume
it's written inside the main function):

int i = 0; int j = sizeof (i++);

// Thanks. :-)

~~~
wruza
Zero with great amount of indeedity.
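
A short sketch of why the answer is zero: the operand of `sizeof` is not evaluated unless its type is a variable-length array, so the increment never runs. The function name is hypothetical, and `size_t` is used for `j` instead of the `int` in the question.

```c
#include <stddef.h>

/* Returns the value of i after `sizeof (i++)` has been "executed". */
int i_after_sizeof(void)
{
    int i = 0;
    size_t j = sizeof (i++);  /* only the type of i++ (int) is inspected */
    (void)j;                  /* silence unused-variable warnings */
    return i;                 /* still 0: the increment was never evaluated */
}
```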

------
bnegreve
Another C quiz, (which I find a little bit more interesting):

 _Will it optimize:_ [http://ridiculousfish.com/blog/posts/will-it-optimize.html](http://ridiculousfish.com/blog/posts/will-it-optimize.html)

edit: it's actually about gcc rather than the standard itself, still pretty
interesting though.

------
dxhdr
I don't understand this critique of an outdated version of the language. Let's
see questions written in C99 using stdint types. Dealing with integer widths
is a non-issue.

Yes, inc/dec operators take extra care. Structure packing requires platform
awareness, similar to how endianness can matter for data manipulation.

~~~
kosma
Unfortunately licenses for outdated versions of outdated compilers for
outdated versions of an ancient language still sell for 3995€ a seat - I know
because I'm using one right now. Nothing beats good old 8051 in terms of
price. :)

------
tome
It seems the padding one is wrong for an even simpler reason. You don't know
what sizeof(int) is.

------
cjensen
Real world things in C that may bite you:

(a) char may be signed or unsigned so

    
    
      char a, b;
      ...
      /* if char is signed, a negative b is sign-extended before the OR */
      short c = ((short) a << 8) | b;

will not work to construct a 16-bit value from two 8-bit values.

(b) NULL is not a useful macro. For example,

    
    
      printf ("%p", NULL);

can have unexpected results.

Those things mentioned in the article? Outside of DSPs and GPUs, you can
ignore them.

~~~
problems
Yup, I deal with the unsigned char issue at work all the time, I've got a pile
of code with casts to unsigned all over the place due to decoding some legacy
data formats. I'm not sure what the cleanest way to fix it is, but I find
myself writing nasty stuff like:

    
    
        (((short)((unsigned char)a)) << 8)
    

Which works but goddamn is it ever hard to read. Bracket highlighting makes it
just bearable. Fortunately I've abstracted this kind of stuff out as much as
possible so I only have it in a few places, but it still bites me in the ass
sometimes. Curious if anyone has a cleaner way of expressing this?

~~~
cjensen
Definitely use the new stdint.h types[1]. That will make your intent a lot
clearer. Also consider creating some macros for all common operations (like
creating a uint32_t out of four uint8_t).

[1]
[http://en.cppreference.com/w/c/types/integer](http://en.cppreference.com/w/c/types/integer)
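
One possible shape for such helpers, written as inline functions instead of macros. The names `make_u16` and `make_u32` are hypothetical. Because the parameters are `uint8_t`, a negative plain `char` argument is converted to the range 0..255 before any shifting happens.

```c
#include <stdint.h>

/* Builds a 16-bit value from two bytes, most significant byte first. */
static inline uint16_t make_u16(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((unsigned)hi << 8) | lo);
}

/* Builds a 32-bit value from four bytes, most significant byte first.
 * Each byte is widened to uint32_t before shifting so no bits are lost. */
static inline uint32_t make_u32(uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0)
{
    return ((uint32_t)b3 << 24) | ((uint32_t)b2 << 16) |
           ((uint32_t)b1 << 8)  |  (uint32_t)b0;
}
```

A call such as `make_u16(a, b)` then replaces the nested casts at each use site.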

------
nano_o
There is an interesting paper that recently explored the de-facto C standard:
Into the Depths of C: Elaborating the De Facto Standards, by Memarian et al.
[http://www.cl.cam.ac.uk/~km569/into_the_depths_of_C.pdf](http://www.cl.cam.ac.uk/~km569/into_the_depths_of_C.pdf)

------
sickbeard
We teach people to write proper code, then they get interviewed on the most
dumb half-assed code that nobody writes in the real world because we like
mental gymnastics

------
blt
IMO, this is actually a fairly good test of thorough C knowledge. All of these
issues are regularly described in articles, blogs, forums, HN posts, etc.
Anyone who has programmed C for a few years and put in effort to master the
language should know this stuff.

------
wyldfire
IMO the real answer is to validate any assumptions by using UBSan wherever
possible. Many compilers and many targets are supported.

Even if by some chance half of your C team knows all of the ins and outs,
subtle things like this can slip by even when under review.

------
netule
This reminds me of a presentation called "Deep C (and C++)" (2011):
[http://www.slideshare.net/olvemaudal/deep-c/](http://www.slideshare.net/olvemaudal/deep-c/)

~~~
Nadya
I was about to post something similar, Ctrl+F and found this post. :)

"Deep C" illustrates the difference well between people who know C and people
who _know C_. A bit of it is language lawyering, but some of it can be
practical - although perhaps specific to certain projects or environments.

IMO this sort of knowledge is most practical if you're maintaining legacy
software or in specific environments that you would otherwise be making a
mistake in. People writing evergreen software or in specific targeted
environments don't really need to worry about all these types of "Gotch'ya!".

------
TillE
The lack of specified type sizes is something that bit me recently. On pretty
much every x86-64 C/C++ compiler, sizeof(long) is 8, except on MSVC where it's
4. Thank goodness for explicitly sized, now-standard types like int64_t.

~~~
marcosdumay
sizeof(long) >= 4

sizeof(int) >= 2

sizeof(long) >= sizeof(int)

That's all you can expect.

~~~
klodolph
You literally cannot expect the first two. On word-addressable hardware, you
could easily get sizeof(long) = sizeof(int) = 1. This happens on DSPs.

~~~
marcosdumay
Yes. Didn't think about machines where the memory isn't composed of bytes.

The best approach is still to not expect anything.
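
For reference, the portable guarantees are stated as value ranges in `<limits.h>`, not as `sizeof` results. A sketch that checks them at compile time, assuming a C11 compiler; the function name is hypothetical.

```c
#include <limits.h>

/* Guarantees the standard does make, expressed as ranges. */
_Static_assert(sizeof(char) == 1, "true by definition");
_Static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
_Static_assert(INT_MAX >= 32767, "int covers at least a 16-bit range");
_Static_assert(LONG_MAX >= 2147483647L, "long covers at least a 32-bit range");
_Static_assert(INT_MAX <= LONG_MAX, "long's range includes int's");

/* Storage width of long in bits on this implementation. */
int long_storage_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}
```

On a DSP with 32-bit bytes, all of these hold while `sizeof(long)` is 1, which is the case klodolph describes.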

------
rekshaw
Don't know C. I got all the answers right...do I know C?

~~~
yellowapple
Yep. You're now a C programmer. Congrats! Linus Torvalds is happily awaiting
your first patch to Linux.

------
dwarman
Sharing data structures in RAM between 64 bit and 32 bit and 16 bit
architectures brings this home painfully. Data communications is equally
painful, possibly more so because one has no knowledge of the receiver's
architecture. Sending multi-byte values between big- and little-endian
machines, for example.

The test could also have included questions on things like packing bools and
enums into structures. Again critical to get right when sharing or
transmitting structs, or anything bigger than a char.

~~~
3r1kB
Yep, unfortunately a universal 'Network-Byte_order'[1] still doesn't exist.

Things like this were especially a pain with industrial network protocols that
we used in our applications during the transition from PowerPC to Intel on OS
X. All the CFSwapInt16HostToBig() and CFSwapInt16BigToHost() stuff...

And earlier, before I learned about '#pragma pack', it took me a while to find
out where all these extra zeroes suddenly came from :-D

[1]
[https://en.wikipedia.org/wiki/Endianness#Networking](https://en.wikipedia.org/wiki/Endianness#Networking)
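
One way to sidestep both padding and byte order is to serialize field by field into a byte buffer instead of copying the struct. A minimal sketch, assuming a made-up 5-byte wire format; `struct record` and the function names are hypothetical.

```c
#include <stdint.h>

/* In-memory form: most compilers insert padding between tag and value. */
struct record {
    uint8_t  tag;
    uint32_t value;
};

enum { RECORD_WIRE_SIZE = 5 };  /* 1 + 4 bytes, no padding on the wire */

/* Writes r into out[0..4], big-endian, independent of host layout. */
void record_encode(const struct record *r, uint8_t out[RECORD_WIRE_SIZE])
{
    out[0] = r->tag;
    out[1] = (uint8_t)(r->value >> 24);
    out[2] = (uint8_t)(r->value >> 16);
    out[3] = (uint8_t)(r->value >> 8);
    out[4] = (uint8_t)(r->value);
}

/* Reads the same 5-byte format back into a struct. */
void record_decode(const uint8_t in[RECORD_WIRE_SIZE], struct record *r)
{
    r->tag   = in[0];
    r->value = ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16) |
               ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
}
```

The shifts operate on values, not on memory, so the same code produces the same bytes on big- and little-endian hosts.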

------
chetanahuja
To all those kvetching about weird C (and by extension, C++) behaviors... what
is your alternative for an unmanaged, system level language with broad
compiler/library support?

~~~
xenadu02
We're still fighting to get people to acknowledge that C is a problem. Way too
many software engineers are still in denial about that.

D, Rust, and Swift are the only languages I know of that have "systems
programming" as a focus.

------
dreta
THIS is why C is bad. It’s not because there are no classes, and you have to
manage your own memory, or because it has unsafe pointers. It’s this. And it’s
the same reason why C++ is awful even before you get to the "++" part. I wish
that by now we had a proper ASM abstraction language that allows the
programmer to work with the architecture they’re programming for, instead of
all the “undefined behaviour” we have to deal with since the 70s.

------
user5994461
How to recognize a terrible quiz in any programming language?

Simple. It keeps asking code questions about undefined behaviors that MUST
NEVER be used in real world programming.

#StandUpAndLeaveTheInterview

------
Kenji
I knew it was Undefined Behaviour, but I played along and tried to guess what
it would return on GCC/Windows/x86/no optimisation, in the assumption that the
author is clueless about UB and just ran some tests and did a quiz about it.

------
OMGWTF
Would question 4 have a different answer if it was:

    
    
      unsigned int i = 16;
      return (((((i >= i) << i) >> i) <= i));
    

? (I changed `int` to `unsigned int`.)

~~~
guyzero
Nope, on 16-bit platforms a 16-bit left shift has a very different result from
a 16-bit left shift on a 32 bit platform.

It's basically a question about knowing when you'll get burned by the size of
int varying.

~~~
OMGWTF
So?

    
    
      (i >= i)                           // 1
      ((i >= i) << i)                    // 0x10000 or 0
      (((i >= i) << i) >> i)             // 1 or 0
      (((((i >= i) << i) >> i) <= i))    // ((0 or 1) <= 16) == 1
    

Where did I go wrong?

~~~
loeg
> ((i >= i) << i)

UB.

~~~
OMGWTF
Ah, you're right.

    
    
      > If the value of the right operand is negative
      > or is greater than or equal to the width of
      > the promoted left operand, the behavior is
      > undefined.
    

([http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf](http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf) - ISO/IEC 9899:201x (C11 Working
Draft N1570): 6.5.7 Bitwise shift operators)

~~~
loeg
Yeah. It seems like something easy to define to me, at least in the case the
compiler can recognize the shift is a constant. But that's the standard for
you.
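
Note that in the modified snippet the left operand `(i >= i)` still has type `int`, so with a 16-bit `int` the count 16 equals the width of the promoted left operand. When the count is not known to be in range, a guard is needed. A sketch of one such guard; the helper name and the choice to return 0 for oversized counts are assumptions, not something the standard prescribes.

```c
#include <limits.h>

/* Left shift that is defined for every count: counts at or beyond the
 * width of unsigned int yield 0 instead of undefined behaviour.
 * Assumes unsigned int has no padding bits. */
unsigned shl_checked(unsigned value, unsigned count)
{
    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    return count < width ? value << count : 0u;
}
```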

------
dep_b
> And I had to learn it the hard way too.

I'm worried. What blew up?

~~~
vortico
Well, the author is a Ukrainian nuclear engineer. ~

------
golergka
Correctly answered all the questions.

I'm not a C expert, by far; I only have an instinct that if it's about C and
memory layout, I shouldn't rely on my understanding and check it instead. On
every architecture I'm targeting.

------
bogomipz
I enjoyed this post. I am curious: do any of you have any C-specific blogs that
you read regularly? Maybe not exclusively C but predominantly C or mostly
C-related?

------
Sir_Cmpwn
I realized these were trick questions at #5. My answers till then were "well,
on gcc on i686, probably ____". The trick questions are in poor taste imo.

~~~
acveilleux
I would've thought the first question was a big red flag but I guess in
Windows and x86 linux int is 32 bit even with 64-bit processors. Not the case
in MIPS64 (under IRIX) or on DEC Alpha.

I had my formative C experience in the DOS era with 16-bit compilers and then
SGIs with 64-bit compilers (and I had the "fun" of porting grad student
written code from SGI to Linux) so you get burned with expectations about int
length. Packing makes it even worse as many RISC machines (like MIPS) are word
addressed and non-aligned loads require a bunch of bit fiddling or you get a
Bus Error (SIGBUS).
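
A common way to read a multi-byte value from an arbitrary offset without dereferencing a misaligned pointer is to copy it out with `memcpy`. A minimal sketch; the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t from any byte address. The bytes are interpreted in
 * the host's byte order, exactly as a direct load would have done. */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);  /* byte-wise copy, valid at any alignment */
    return v;
}
```

This replaces a cast such as `*(uint32_t *)p`, which is the pattern that raises SIGBUS on strict-alignment machines.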

------
paulftw
While I get the overall message, I'm not sure the last question is that
unpredictable: I++ + ++I. If postfix is executed first it returns 0 but prefix
returns 2 and the result is 2. If prefix is executed first it returns 1 and
then postfix also returns 1, result of the addition is again 2. What else is
ambiguous about this snippet? I can't think of a parse tree that'd evaluate
addition before any of the increments - that'd be a syntax error. If the code
were (++i + i++) there could be alternative interpretations, but again ++
requires lvalue...

~~~
loup-vaillant
The order of evaluation in such an expression is entirely unspecified. There
are points in C programs that separate a "before" and an "after". The
semicolon is a typical such point. There are others. Now between two adjacent
points anything goes.

In this particular example the evaluations and effects are allowed to be
interleaved:

    
    
        pre    = I + 1;      // evaluating the pre-increment
        post   = I;          // evaluating the post-increment
        I++;                 // effect of pre-increment
        I++;                 // effect of post-increment
        return pre + post;   // evaluating the sum
    

Simply put, pre and post-increment are allowed to occur "simultaneously".

~~~
thick18cm
what you are saying may be correct, but you are "overloading" the term "order
of evaluation" to encompass atomicity (atomicity of operations like i++); is
that kosher?

~~~
loup-vaillant
Indeed I am. What I want to convey is, between 2 sequence points, C is
actually non-strict. Any effect is like unsafeInterleaveIO from Haskell. Which
is why I rarely mix up effects together or with a computation in a single
instruction.
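
A sketch of that style: each side effect gets its own statement, so the order is fixed by the statement boundaries. The function name is hypothetical, and the left-to-right order chosen here is one reading of what `i++ + ++i` was meant to do.

```c
/* Computes post-increment value plus pre-increment value of the same
 * variable, with every modification in a separate statement. */
int sum_post_then_pre(int start)
{
    int i = start;
    int post = i;   /* value produced by i++ */
    i += 1;         /* side effect of i++ */
    i += 1;         /* side effect of ++i */
    int pre = i;    /* value produced by ++i */
    return post + pre;
}
```

With `start` equal to 0 this yields 0 + 2, and unlike the one-line form every compiler has to agree on it.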

------
maxxxxx
This is a good reminder that once you deal with different compilers and
operating systems a lot of assumptions don't work anymore.

------
davidgerard
> That’s right, Mozart also wrote in C.

 _applause_

------
kilon
You confuse knowing with expertise.

------
mikestew
In my case, I'd normally say that Betteridge's law applies. But I did
surprisingly well on the quiz.

Though I'd agree with the comments on how this is kind of silly. The answer is
not always "I don't know" but rather, in some of those cases, "I didn't define
my data types well enough to know for sure, depending on CPU architecture and
the compiler".

~~~
mark-r
I think the point was that if you're writing portable code to the standard,
you _can't_ know the CPU architecture and the compiler!

------
smegel
> Yes, the right answer for every question in “I don’t know”.

Do you think you know the English language?

------
chris_wot
Even when you know the standard, you will still get undefined behaviour.

------
shmerl
So the author thinks undefined behavior is the esoteric knowledge of C?

------
faragon
C is beautiful.

