

Dangerous Embedded C Coding Standard Rules (2011) - momo-reina
http://embeddedgurus.com/barr-code/2011/08/dont-follow-these-5-dangerous-coding-standard-rules/

======
kabdib
Well, let's see.

#1: Avoiding shifts in favor of division: Nope. Crappy compilers and bad
architectures will yield bad results. Know what you're doing.

#2: Use a typedef instead of a bare C type, then. Also, on many systems space
is king (the last embedded system I wrote had 6 bytes left over in code space,
and 16-bit operations were incredibly expensive and nearly unaffordable).

#3: Just make the code clear. This rule is lunacy.

#4: Well, maybe. Except that variable initialization can often be made quite
compact by the runtime (e.g., an expansion of a compressed data segment into
RAM), and so save you code space.

#5: Again, the compiler can screw you here. Also, it's possible to take the
address of a constant and have it be wiped out (for naïve implementations,
anyway).

My basic rules:

#0: Don't make assumptions. Know what the heck is going on, fractally. Look at
the generated code.

#1: Be a responsible engineer. Design and code for production and testing, and
for the next guy. This often means not being clever, but if you are being
clever, be forthright about it and don't leave your brilliance as a puzzle.
(Don't be clever just to be clever. That will get you fired, no foolin').

#2: Avoid platitudes. Be wary of people who tell you never to use something,
or to always use a particular construct. Be suspicious of code that follows
someone else's dogma. [I once worked with a code base where the engineer had
used the ternary ("?:") operator _everywhere_ because he thought it was more
efficient than using 'if'. Of course it was just guesswork on his part; not
only was his code a train wreck, but he had never measured his results. We
fired that guy.]

Embedded systems are just software; they're not magical, and they're not
(usually) that hard, yet for some reason people are afraid of them. Some of my
best moments in the industry have been writing these little things; they're
fun as hell.

~~~
gioele
> #1: Avoiding shifts in favor of division: Nope. Crappy compilers and bad
> architectures will yield bad results. Know what you're doing.

Which compiler, among those still used in 2014, does not convert a division by
2 into a shift?

~~~
angersock
Memory escapes me...is a shift guaranteed by the standard (and if so, which)
to sign-extend or not sign-extend?

~~~
kps
No, it isn't.

A right shift of a signed, negative value is implementation-defined (which
means, something in particular happens, and the compiler or its documentation
is supposed to tell you what).

A left shift of a signed value is undefined (which means, the compiler is
allowed to launch the missiles, or fail to launch the missiles, whichever is
worse for you) if a nonzero bit could conceivably get into the sign.
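
A minimal sketch of the two cases, assuming a 32-bit int (the result of the
first case is merely what most compilers choose, not a guarantee):

    #include <stdio.h>

    int main(void) {
        int neg = -8;

        /* Implementation-defined: most compilers emit an arithmetic
           (sign-extending) shift, giving -4, but the standard only
           requires the implementation to document its choice. */
        printf("-8 >> 1 = %d\n", neg >> 1);

        /* Undefined: shifting a 1 bit into (or past) the sign bit.
           With a 32-bit int, (1 << 31) is undefined behavior. */
        /* int bad = 1 << 31; */

        /* Well-defined: unsigned shifts have no sign bit to corrupt. */
        printf("1u << 31 = %u\n", 1u << 31);
        return 0;
    }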

------
sophacles
This is one of those blog posts where the comments/discussion are really good
- maybe even better than the article itself. I recommend reading the
discussion on-page if you haven't.

------
wtracy
Another reason to pay attention to #2: Most CPUs perform simple arithmetic
operations fastest on their native word size. On a 32-bit machine, addition
and subtraction on 8-bit variables are probably slower than on 32-bit
variables. (Often the generated code will widen to 32 bits, perform the
operation, then truncate back to 8 bits.)

I'm not an authority on this topic, so I don't want you to take away the
message "always use the machine word size". What I _do_ want you to take away
is that there are many cases where an "obvious" optimization is
counterproductive in practice. Always start with simple, working code, and
never optimize without benchmarking.

~~~
alecrn
For this case, the "fast" int types are provided, e.g. uint_fast8_t is the
fastest unsigned integer type that is at least 8 bits wide; uint8_t is
required to be exactly 8 bits.

[http://pubs.opengroup.org/onlinepubs/7999959899/basedefs/std...](http://pubs.opengroup.org/onlinepubs/7999959899/basedefs/stdint.h.html)
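
A small sketch of the three families from stdint.h (the printed sizes are
platform-dependent):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t       exact = 200;  /* exactly 8 bits (an optional type)  */
        uint_least8_t small = 200;  /* smallest type with >= 8 bits       */
        uint_fast8_t  fast  = 200;  /* "fastest" type with >= 8 bits;
                                       often 32 bits on a 32-bit target   */

        printf("exact=%zu least=%zu fast=%zu bytes\n",
               sizeof exact, sizeof small, sizeof fast);
        return 0;
    }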

------
mcguire
With regards to Bad Rule #3, "Avoid >= and use <," if I saw

    
    
        if (speed < 100) ... else if (speed > 99)
    

in code, my only response would be to question the author's sanity.

But then, I question the sanity of C standards committee members all the time,
specifically regarding his actual complaint related to Bad Rule #2 and
unspecified-width integer types[1].

[1] [http://www.netrino.com/Embedded-Systems/How-To/C-Fixed-Width...](http://www.netrino.com/Embedded-Systems/How-To/C-Fixed-Width-Integers-C99)

~~~
pascal_cuoq
The phrase “C standards committee” is eminently ambiguous, as the standards
that define the C language (e.g. ISO/IEC 9899) are the work of a committee.
These committee members have constraints coming at them from all sides and do
a very good job of maintaining compatibility with the past and future-proofing
the C language at the same time.

This has nothing to do with coding standards, which any fool can define (and
some have), and for which the only instance that really deserves the name of
“standard” is MISRA-C. The other such “standards” have a scope typically
limited to a company and its providers, and reflect the style and
idiosyncrasies of the person who was in charge of defining it (they are rarely
the work of a committee).

------
optymizer1
I'm not completely on board with #4 (initialization of variables). I agree
that 'declare, then assign' _within the same block_ has no speed gains over an
initialized declaration, only downsides.

However, if the assignment can happen in a _different block_ (say, inside an
'if' block) you can save one memory write, depending on how often the if
condition is satisfied.

This obviously optimizes for speed at the expense of maintainability, and it's
for the programmer to make intelligent trade-offs, but the fact is that one
method is faster than the other.

Silly example that decrements 'a' repeatedly if condition is non-zero:

    
    
       //temp = 0 inside if block
       int dec(int a, int condition) {
           int temp;
    
           if (condition) {
               temp = 0;  //<-- memory write depends on condition
    
               //compute something here in a loop
               while (temp < a) {
                   a -= (temp++);
               }
           }
    
           return a;
       }
    
       //temp = 0 at declaration
       int dec2(int a, int condition) {
           int temp = 0; // <-- memory write always executed
    
           if (condition) {
               //compute something here in a loop
               while (temp < a) {
                   a -= (temp++);
               }
           }
    
           return a;
       }
    
    

We can disassemble the output to verify that dec() will only write to memory
if the condition was satisfied (see 400458), while dec2() will always write
into temp (see 400482):

    
    
        # <dec>
        400448:  push   %rbp
        400449:  mov    %rsp,%rbp
        40044c:  mov    %edi,0xffec(%rbp)
        40044f:  mov    %esi,0xffe8(%rbp)
        400452:  cmpl   $0x0,0xffe8(%rbp)  # if (condition)
        400456:  je     400473 <dec+0x2b>
        400458:  movl   $0x0,0xfffc(%rbp)  # temp = 0 <-- depends on condition
        40045f:  jmp    40046b <dec+0x23>  # while
        400461:  mov    0xfffc(%rbp),%eax
        400464:  sub    %eax,0xffec(%rbp)
        400467:  addl   $0x1,0xfffc(%rbp)
        40046b:  mov    0xfffc(%rbp),%eax
        40046e:  cmp    0xffec(%rbp),%eax
        400471:  jl     400461 <dec+0x19>  # loop
        400473:  mov    0xffec(%rbp),%eax  # return a
        400476:  leaveq
        400477:  retq
    
        # <dec2>
        400478:  push   %rbp
        400479:  mov    %rsp,%rbp
        40047c:  mov    %edi,0xffec(%rbp)   
        40047f:  mov    %esi,0xffe8(%rbp)
        400482:  movl   $0x0,0xfffc(%rbp)   # temp = 0 <-- always executed
        400489:  cmpl   $0x0,0xffe8(%rbp)   # if (condition)
        40048d:  je     4004a3 <dec2+0x2b>
        40048f:  jmp    40049b <dec2+0x23>  # while
        400491:  mov    0xfffc(%rbp),%eax
        400494:  sub    %eax,0xffec(%rbp)
        400497:  addl   $0x1,0xfffc(%rbp)
        40049b:  mov    0xfffc(%rbp),%eax
        40049e:  cmp    0xffec(%rbp),%eax
        4004a1:  jl     400491 <dec2+0x19>  # loop
        4004a3:  mov    0xffec(%rbp),%eax   # return a
        4004a6:  leaveq 
        4004a7:  retq
    

The above code was compiled with gcc 4.1.2 on amd64/linux. gcc -O2 and gcc -O3
completely do away with the 'temp' variable and generate fewer instructions.

~~~
ant512
I'd suggest that, if a variable with an undefined value is a valid state for
the program, the declaration is in the wrong place. How about this
alternative?

    
    
       int dec(int a, int condition) {
           if (condition) {
               int temp = 0;
    
               //compute something here in a loop
               while (temp < a) {
                   a -= (temp++);
               }
           }
    
           return a;
       }

~~~
optymizer1
That's the obvious next step. (Strictly speaking, it is valid C89:
declarations are allowed at the top of any block, and it's only declarations
mixed in after statements that require C99. Which standard your toolchain
supports may or may not be relevant for people doing embedded programming.)
Personally, I'm all for using newer standards and declaring variables locally
when it makes sense. Sometimes I like to see all variables at the top of the
function though.
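
A minimal illustration of where the standards draw the line:

    int f(int n) {
        int a = 0;         /* C89: fine at the top of the function block */

        if (n > 0) {
            int temp = n;  /* C89: also fine, at the top of the inner block */
            a += temp;
        }

        a++;
        /* int c = a;         C99 only: a declaration after a statement */
        return a;
    }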

------
deletes
Completely agree with no. 5. After a year with C I realized that macros should
only be used as include guards for headers and in #ifdefs to enable
compatibility.

If you need macros to hack C to enable some functionality not inherent to the
language, you should change the approach or switch to a different language.

Also macros are usually used to "speed up" the code. You should never optimize
before finding the real bottleneck in your code. And thus always use functions
instead of macros.
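
One concrete hazard that functions avoid is double evaluation of macro
arguments; a sketch (the names here are made up):

    /* Classic pitfall: the macro argument is evaluated twice. */
    #define SQUARE_MACRO(x) ((x) * (x))

    static int square_func(int x) { return x * x; }

    int demo(int i, int j) {
        int a = SQUARE_MACRO(i++); /* expands to ((i++) * (i++)):
                                      undefined behavior */
        int b = square_func(j++);  /* j * j, and j increments exactly once */
        return a + b;
    }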

~~~
theseoafs
> If you need macros to hack C to enable some functionality not inherent to
> the language, you should change the approach or switch to a different
> language.

I think if you continue to do serious C hacking, you'll find a lot of places
where macros are legitimately a good choice.

For example, I've found that macros are very, very useful when it comes to
implementing error handling in a robust way. In C, the only real way to detect
whether a function has failed is to return an error code. The caller then has
to test for the error code and take the appropriate action if there was an
error. I'm working on a side project now that was doing a LOT of this because
it does a lot of IO and memory allocation (and basically any operation you do
on a file or any call to `malloc` or a function that calls `malloc` can
spontaneously fail in C). So my source code, after a while, ended up looking
more or less like this:

    
    
        if ((f = open_file()) == NULL)
            return ERROR_CODE;
        if ((s = allocate_string()) == NULL)
            return ERROR_CODE;
        if (write_string_to_file(s, f) < 0)
            return ERROR_CODE;
    

... and so on, and so forth. It got to the point where basically all of the
function calls I did needed to be manually checked for an error, which made
all of my code really messy and hard to read. The only real way to refactor
these checks in pure C without sacrificing robustness is to use a macro:

    
    
        #define MY_ASSERT(c) do { if (!(c)) return ERROR_CODE; } while (0)
        MY_ASSERT((f = open_file()) != NULL);
        MY_ASSERT((s = allocate_string()) != NULL);
        MY_ASSERT(write_string_to_file(s, f) >= 0);
    

This has the same effect as the code above, without sacrificing
comprehensibility or readability.

> Also macros are usually used to "speed up" the code. You should never
> optimize before finding the real bottleneck in your code. And thus always
> use functions instead of macros.

I think a lot of people on HN and /r/programming fail to understand why
premature optimization is a bad thing. A lot of programmers hide behind the
"premature optimization is the root of all evil" thing to justify their pre-
existing biases or their inefficient code. The point is that you shouldn't
_spend a great deal of time_ optimizing functions or algorithms unless
optimizing those functions or algorithms will be productive. It's wasteful to
spend a bunch of time optimizing your IO operations, for example, if 95% of
the time is being spent in the database. The macro-vs-function debate has
nothing to do with that. Writing a macro definition takes no more effort than
writing a function definition, so you shouldn't be opposed to writing

    
    
        #define SOME_OPERATION(x) (((x) * 100) / 3 + 500)
    

rather than

    
    
        int some_operation(int x) { return (x * 100) / 3 + 500; } 
    

That isn't premature optimization; you're just guaranteeing that the compiler
won't introduce the overhead of a function call.

~~~
drivers99
The first block of code seems more readable to me than the second. I'm used to
seeing the first; it's idiomatic C. I don't understand why it says "do ...
while (0)" in the macro, for instance. Wouldn't you want to handle different
errors in different ways anyway? In your example, you're just going to leave
the file open and the string allocated? Isn't trying to influence how the
compiler compiles it by using a macro instead of a function a clear example of
(premature) optimization?

~~~
theseoafs
> The first block of code seems more readable to me than the second. I'm used
> to seeing the first; it's idiomatic C.

The benefits of the macro (IMO) aren't obvious from the example I gave. Like I
said, I'm currently working on a personal project where macros have been the
saving grace. Attaching a conditional return to every individual statement was
making my code extremely hard to maintain (even for myself, who wrote it) --
wrapping common error cases in a macro made everything a whole lot simpler,
and allowed me to work on my algorithms at a high level without having to run
my eyes over the same error-checking `if` statements a million times.

> I don't understand why it says "do ... while (0)" in the macro, for
> instance.

It's a standard idiom for statement-like macros: wrapping the body in do { ... }
while (0) turns it into a single statement, so the macro composes correctly
with if/else and still requires a trailing semicolon.
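
A sketch of what the wrapper buys you:

    #define ERROR_CODE (-1)

    #define BAD_ASSERT(c)  if (!(c)) return ERROR_CODE  /* bare 'if' */
    #define MY_ASSERT(c)   do { if (!(c)) return ERROR_CODE; } while (0)

    int bad(int ready, const void *f) {
        if (ready)
            BAD_ASSERT(f != NULL);
        else
            return 0;  /* BUG: this 'else' binds to the macro's hidden
                          'if', so it runs when the check PASSES and
                          never runs when !ready */
        return 1;
    }

    int good(int ready, const void *f) {
        if (ready)
            MY_ASSERT(f != NULL);  /* one statement: behaves as written */
        else
            return 0;
        return 1;
    }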

> In your example, you're just going to leave the file open and the string
> allocated?

I sort of just made that example up off the top of my head -- you're right
about the memory leak.

------
__david__
Even better rule: Know the limitations of your target processor and compiler
and code accordingly.

Example 1: Some (bad) compilers just ignore the const keyword (and therefore
treat consts like normal variables). Use #defines with these compilers (or get
a better compiler).

Example 2: Some (older, smaller) processors don't have an integer division
instruction. Avoid division _at all costs_ on these processors unless you are
not worried about space or execution speed.
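
For Example 1, a small sketch of the difference, assuming such a toolchain
(the names here are made up):

    #include <stdint.h>

    /* A naive compiler may keep this object in memory and load it at
       run time instead of folding it into the code. */
    static const uint16_t kTableSize = 64;

    /* Substituted before compilation, so it always folds; unlike the
       const object, it is also a genuine constant expression. */
    #define TABLE_SIZE 64

    static uint16_t table[TABLE_SIZE];  /* OK in C89/C99 */
    /* static uint16_t t2[kTableSize];     error: not a constant expression */

    uint16_t read_entry(uint16_t i) {
        return table[(i < kTableSize) ? i : (TABLE_SIZE - 1)];
    }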

~~~
twoodfin
_Some (older, smaller) processors don't have an integer division instruction.
Avoid division at all costs on these processors unless you are not worried
about space or execution speed._

...unless you know your compiler is capable of converting the division into
cheaper operations via strength reduction, per your well-considered rule.

~~~
kps
… which will (normally) only happen if the dividend is unsigned.

Another reason to prefer unsigned ints is that the C standards make signed
overflow _undefined_. This is becoming increasingly important as code for
embedded systems is built using ‘big’ optimizing compilers, rather than the
simple ‘portable assemblers’ assumed by old rules of thumb like manually
converting divides to shifts.

    
    
      “If you lie to the compiler, it will get its revenge.” — Henry Spencer
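
A small illustration of the first point (the instruction counts are typical,
not guaranteed):

    #include <stdint.h>

    uint32_t div_u(uint32_t x) {
        return x / 8;  /* typically a single logical right shift */
    }

    int32_t div_s(int32_t x) {
        /* C division truncates toward zero (-9 / 8 == -1), but an
           arithmetic shift rounds toward minus infinity (-9 >> 3 == -2),
           so the compiler must add 7 first when x is negative: a few
           extra instructions on top of the shift. */
        return x / 8;
    }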

------
dllthomas
Regarding division and shifts, one thing worth noting is that you can now
(with C11) bust out static assertions to make sure no one changes the divisor
and the shift count incompatibly. It's still probably not worth doing unless
there's a good reason to make sure that particular operation is a shift rather
than a multiply.
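
A sketch of how that might look in C11 (the constant names are made up):

    #include <assert.h>  /* static_assert */

    #define DIVISOR      8
    #define DIVISOR_LOG2 3

    /* Catch anyone who changes one constant but not the other. */
    static_assert(DIVISOR == (1 << DIVISOR_LOG2),
                  "DIVISOR and DIVISOR_LOG2 are out of sync");

    unsigned scale(unsigned x) {
        return x >> DIVISOR_LOG2;  /* intended to equal x / DIVISOR */
    }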

