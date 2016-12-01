/*
* A program to print The Twelve Days of Christmas
* using the C fall through case statement.
*
* Jim Williams, jim@maryland, 2 May 1986 (but first written ca. 1981)
*/
/*
* If you have an ANSI compatible terminal then
* #define ANSITTY. It makes the five Golden rings
* especially tacky.
*/
#include <stdio.h>
char *day_name[] = {
    "",
    "first",
    "second",
    "third",
    "fourth",
    "fifth",
    "sixth",
    "seventh",
    "eighth",
    "ninth",
    "tenth",
    "eleventh",
    "twelfth"
};

int
main()
{
    int day;

    printf("The Twelve Days of Christmas.\n\n");
    for (day = 1; day <= 12; day++) {
        printf("On the %s day of Christmas, my true love gave to me\n",
               day_name[day]);
        /* No break statements: each case falls through to every case below it. */
        switch (day) {
        case 12:
            printf("\tTwelve drummers drumming,\n");
        case 11:
            printf("\tEleven lords a leaping,\n");
        case 10:
            printf("\tTen ladies dancing,\n");
        case 9:
            printf("\tNine pipers piping,\n");
        case 8:
            printf("\tEight maids a milking,\n");
        case 7:
            printf("\tSeven swans a swimming,\n");
        case 6:
            printf("\tSix geese a laying,\n");
        case 5:
#ifdef ANSITTY
            /* \033 is the ESC character that starts an ANSI escape sequence. */
            printf("\tFive \033[1;5;7mGolden\033[0m rings,\n");
#else
            printf("\tFive Golden rings,\n");
#endif
        case 4:
            printf("\tFour calling birds,\n");
        case 3:
            printf("\tThree French hens,\n");
        case 2:
            printf("\tTwo turtle doves, and\n");
        case 1:
            printf("\tA partridge in a pear tree.\n\n");
        }
    }
    return 0;
}
main = mapM_ putStrLn $ reverse verses
  where
    verses = combine <$> enumeratedSlices songLines
    combine (i, s) = "On the %s day of Christmas, my true love gave to me\n%s" % dayNames !! i $ concat s
    enumeratedSlices = zip [0..] . init . tails
switch (x) default:
    if (false);
    else if (valid_command_message(x))
        case CMD1: case CMD2: case CMD3: case CMD4:
            process_command_msg(x);
    else if (valid_status_message(x))
        case STATUS1: case STATUS2: case STATUS3:
            process_status_msg(x);
    else
        report_error(x);
To be clear, it isn't that I find the code difficult to grok: if you don't understand how switch works, please stop using C.
The issue is that this particular use of the switch is a lazy performance optimization: if we are going to list the cases at all, the switch should simply replace the valid_* calls entirely. It is as if the author doesn't want to update the switch cases but still wants the code to keep working. Did they just forget the switch exists, or is it that they can't edit it? I just don't get it, particularly when any modern C compiler supports "give me a warning if I don't consume all the possible switch values in this enum", which is the feature that this code should be using, not if < END_*.
BTW: it isn't clear to me that this would even be a performance optimization; in an ironic twist, many compilers are going to choose to compile that switch statement into the moral equivalent of two if statements with range checks, as default has to be implemented as a range check anyway, and if you work out how many range checks you are doing combined with the code cache benefits of having explicit branches instead of implicit jump tables, this switch statement is going to feel extra repetitive when you look at the machine code.
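For what it's worth, the compiler-checked version mentioned above might look roughly like this. This is only a sketch: the enumerator values, the `msg_class` enum and the `classify` helper are made up for illustration, and the warning relies on GCC's or Clang's `-Wswitch` (which `-Wall` turns on).

```c
/* Hypothetical message set; the names follow the snippet in the thread, the values are assumed. */
enum msg_type { CMD1, CMD2, CMD3, CMD4, STATUS1, STATUS2, STATUS3 };

enum msg_class { MSG_COMMAND, MSG_STATUS, MSG_ERROR };

/* There is deliberately no default label: with -Wswitch, adding an
 * enumerator to msg_type without adding a case here produces a warning. */
enum msg_class classify(enum msg_type x)
{
    switch (x) {
    case CMD1: case CMD2: case CMD3: case CMD4:
        return MSG_COMMAND;
    case STATUS1: case STATUS2: case STATUS3:
        return MSG_STATUS;
    }
    /* Reached only when x holds a value that is not a named enumerator. */
    return MSG_ERROR;
}
```

Building with `-Wall -Werror` would then turn a forgotten case into a build failure rather than a silent trip through the error path.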
Oh please. C is perfectly usable without knowing switch is actually a bunch of gotos. The idea that one has to know all the hidden edges of a language to use it is pointless, because every language has something like that. Even Python can be a minefield if one tries hard abusing metaclasses, and I'm pretty sure 95% of the Python users don't know how to use them.
If you pursue the idea of knowing every deep place to the extreme, the only language left for general developers to use is Go. A language so concerned with removing hidden features that almost no features are left.
switch is hardly a hidden edge, it's an important control structure in the language. You want a "hidden edge", look at designated initializers [1] which I just heard of yesterday (thanks to HN), despite having 17 years of experience using C. Or bitfields, or function pointers.
[1] http://www.drdobbs.com/the-new-c-declarations-initialization...
I'd never write something like this in code that might make it to production. The odds of the next guy misunderstanding it, or of me misunderstanding it in six months, are way too high. I'd only consider using it as part of an IOCCC entry.
(And to preempt any cries of "well, you just don't understand C very well," I do have a winning entry in the IOCCC.)
if(x) return true; else return false;
Why not just
return x;
It's clear the author is trying to optimize something, but they don't even start with the obvious places first.
And there is pretty much never an "obvious" place to optimize. Profile first. If you want to call it "obvious" if something takes up 90% of the runtime in the profiler when it shouldn't, then that's acceptable, but don't just look at the code and say "obviously this part is slow."
To be fair, in embedded programming, you are sometimes stuck with the compiler a particular vendor hands you.
> And there is pretty much never an "obvious" place to optimize. Profile first.
Yes, a thousand times yes! ;-) I remember writing a rather convoluted piece of C code a couple of years back that involved stuffing data into a data structure, then looking up data in that data structure. My first thought was "I'm gonna need a hash table", but I used an array and qsort/bsearch, so I could get the rest working; when the rest was working (as in giving me correct results, but at glacial speed), I ran a profiler, fully expecting it to tell me the array/qsort/bsearch-thing was wasting huge amounts of time. I was rather surprised that it amounted to ~2% of the total running time. I've had people tell me before to profile, then optimize the parts that matter, and not to make any assumptions about what parts of my code are going to need optimization. And I did believe the people telling me this, but only at that moment did I understand how right those people were.
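A minimal sketch of the array plus qsort/bsearch approach described above, in case it helps anyone. The `struct entry` layout and the `table_sort`/`table_find` names are assumptions for illustration, not the code from that project.

```c
#include <stdlib.h>

/* Hypothetical record type: an integer key with an associated value. */
struct entry { int key; int value; };

/* Three-way comparison on the key, usable by both qsort and bsearch. */
static int compare_entries(const void *a, const void *b)
{
    const struct entry *ea = a, *eb = b;
    return (ea->key > eb->key) - (ea->key < eb->key);
}

/* Sort once, after all the data has been stuffed into the array. */
void table_sort(struct entry *table, size_t n)
{
    qsort(table, n, sizeof table[0], compare_entries);
}

/* Binary search in a sorted table; returns NULL when the key is absent. */
const struct entry *table_find(const struct entry *table, size_t n, int key)
{
    struct entry probe = { key, 0 };
    return bsearch(&probe, table, n, sizeof table[0], compare_entries);
}
```

Lookups are O(log n) instead of a hash table's expected O(1), but as the story above suggests, whether that difference matters is something only a profiler can tell you.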
> "obviously this part is slow."
Telling whether a given piece of code is slow (in the sense of "this part could be optimized to run 10x as fast") or not is not too difficult, IMHO. Telling whether it accounts for a significant percentage of the total running time is. Very hard.
As an aspiring C programmer, is there anywhere I can read about interesting applications of the switch statement? Pretty much everything I've learned has come with neat little break;s and default at the end of each case.
The remaining 0% is stuff like Duff's Device.
Any context where you would want "if x == this, goto here; if x == that, goto there" is a potential place to use the switch statement. Duff's device is a good example, but there are many things that can be built this way.
https://en.m.wikipedia.org/wiki/Duff's_device
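For readers who haven't seen it, here is a sketch of the memory-to-memory variant of Duff's device. The `duff_copy` name and the zero-length guard are my own additions; the switch jumps into the middle of the unrolled do-while to handle the remainder, then the loop runs full passes of eight.

```c
#include <stddef.h>

/* Copy count bytes from 'from' to 'to', unrolled eight at a time. */
void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;                      /* the loop below always copies at least one byte */

    size_t n = (count + 7) / 8;      /* number of passes through the do-while */

    switch (count % 8) {             /* jump into the loop body to copy the remainder first */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```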
Can you provide an example? We use state machines all the time, but I have not seen a good example of mixing other flow control statements into a switch/case outside of Duff's Device.
That code does timer, radio clock decoding, scrolling, blinking and all that using multiple 'protothreads'. As you can see, the main loop calls sleep() and waits for a timer interrupt (or anything) to fire to wake it up, so it's not even running most of the time; only when the 'tick' timer fires.
[0] https://gist.github.com/buserror/9407adb6d52153e16caad5e8a08...
I use Duff's device to concurrently read and debounce multiple inputs in my old DJ MIDI controller.
http://dunkels.com/adam/pt/expansion.html
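To give a feel for the trick, here is a stripped-down sketch in the spirit of that expansion. Every name in it (`struct task`, the `TASK_*` macros, `count_ticks_task`) is hypothetical and this is not the actual protothreads API; it only shows how a switch on a saved line number lets a function resume in the middle of a loop.

```c
/* Hypothetical names throughout; this mimics the idea, not the real library. */
struct task {
    int resume_at;    /* line number to resume from, 0 means start from the top */
    int ticks_seen;   /* state must live here: locals do not survive a return */
};

#define TASK_BEGIN(t)   switch ((t)->resume_at) { case 0:
#define TASK_WAIT(t, c) do { (t)->resume_at = __LINE__; case __LINE__: \
                             if (!(c)) return 0; } while (0)
#define TASK_END(t)     } (t)->resume_at = 0; return 1

/* Waits for three timer ticks. Returns 0 while waiting, 1 when finished. */
int count_ticks_task(struct task *t, int timer_fired)
{
    TASK_BEGIN(t);
    t->ticks_seen = 0;
    while (t->ticks_seen < 3) {
        TASK_WAIT(t, timer_fired);   /* returns here until the timer has fired */
        t->ticks_seen++;
        timer_fired = 0;             /* consume the event */
    }
    TASK_END(t);
}
```

The main loop would call `count_ticks_task` each time it wakes up; the case label hidden in `TASK_WAIT` is what lets the call jump straight back inside the while loop.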
*to = *from++;
*to++ = *from++;
Personally, I don't like what I consider an anti-pattern that he used in his refactored functions. I'm talking about doing this:
if (condition) return true;
return false;
instead of just:
return condition;
The only justification I've seen/heard is that it makes it more "debuggable", you can single-step through and see that the expected path is taken, i.e. that the condition is properly evaluated.
Still, I hate it and would much rather check that some other way (by inspecting the return value before the function exits, for instance).
bool result = (condition);
return result;
(Also, at least for C/C++, you generally want condition and each case on separate lines, so that you can actually put breakpoints on each thing separately - tool support for multiple independent breakpoints on a single line is still an utter shambles, and you're best off just assuming nothing supports it.)
Unconditional breakpoints that aren't being hit are a lot cheaper (if they even cost anything at all - which usually they don't) than conditional breakpoints that are always being hit...
bool result = (condition);
if( result )
{
    do_some_shit();
} else
{
    do_some_different_shit();
}
I would add that when using the standard `_Bool` type (i.e. `bool` if you include `<stdbool.h>`), every value assigned to it is converted to its 'logical value' first. So even if condition is an integer or pointer, if you did what you suggested, `condition` would still be converted into `true` or `false` on the return, exactly as if you had written the `if`. It's exactly the same as doing `return !!condition`.
Now if your function is not returning a `bool`, then that conversion won't happen, but of course that doesn't matter if `condition` is already a logical value of 0 or 1 (which it is in this case). And IMO, if your value isn't a logical value I'd much rather see `return !!condition` instead of the `if` anyway (though I concede that those not familiar with C may not immediately recognize what that syntax does).
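A small sketch of the difference, assuming a C99 compiler. The three function names are invented for the example.

```c
#include <stdbool.h>

/* Returning through bool converts any nonzero value to 1. */
bool has_flags(unsigned flags)
{
    return flags;
}

/* With an int return type no such conversion happens. */
int raw_flags(unsigned flags)
{
    return (int)flags;
}

/* So !! is needed to get the same 0-or-1 result from an int function. */
int normalized_flags(unsigned flags)
{
    return !!flags;
}
```

So with `flags` set to `0x40`, `has_flags` and `normalized_flags` both give 1, while `raw_flags` hands back `0x40` unchanged.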
Other similar ones are things like:
if ((count > 10) == true)
instead of:
if (count > 10)
while not done
(or while not found)
....
# set done or found here based on some condition
function(filename) {
    if (!filename) {
        this.setState( {x : null});
    }
    var s = filename;
    if (s) {
        this.setState({y :''});
    }
    else {
        this.setState({z: false});
    }
}
And I've seen plenty of:
x == true ? true : false
if (x == 'true') ...
var data = {option: 1}
var context = this;
this.someMethod.bind(this, context, data, data.option)
Such practices are all too common ... should be the rare exception, really.
Even though it is an abuse, these sorts of libraries are fascinating to me: making use of language features in unexpected (most likely unintended) ways.
We asked that they refactor it, and they agreed because the sheer size of it was giving the compiler problems. So they divided it into two separate 25,000 case-statement monsters.
(You may also be interested in -Wswitch-enum.)
The reason for using a switch statement in the first place was so that you'd have constant performance. The argument against the simple implementation was:
> Let’s assume in v2.0 of the system we want to extend the message set to include two new commands and one new status message ... the if-else-if version will handle the change without modification, whereas the existing switch statement treats the new commands still as errors.
Treating the new commands as errors is a feature of the switch statement style, not a problem. They need to be added to the jump table, not implemented with reduced performance.
If you're going to write something this obscure for the sake of "performance", you want to be damn sure it's worth it -- that performance is even an issue, and that this is a large enough improvement to justify not doing the simple thing.
Honestly it feels like the author wants to do it this way because it's clever, and the "performance" reason is just rationalising it.
I run wiki.luajit.org on a VM, using Gollum. We got hit by HN a while back. CPU went from 1% to 2%.
People saying "CPU is commodity" tend to write software which uses all available CPU... for little purpose.
Heck, I worked at a company which did just that. "CPU / memory / disk is commodity, use it." And they did! Soon enough, all CPU / memory / disk was in use, and they had to go back and re-architect their software so that it wasn't crap.
It would have been cheaper to do it right in the first place. But religious beliefs about engineering over-rode actual engineering.
In all seriousness, this is.. interesting! I had no idea you could have a case inside default like that.
(In terms of restrictions: case/default have to be inside the switch statement somewhere, and not inside some additional switch statement nested within it. But aside from that, as demonstrated by the article, you have a good deal of freedom about where they go.)
#define while if // make code faster
#define struct union // use less memory
So this might be useful if you want the jump table performance now on the known values, but you suspect that in the future some maintainer might add more values to the enums and forget about the switch, because maybe the enums are hidden in some header file far away.
But this is C. C is used in a wide variety of contexts, and in some of them the order above changes.
We're told we have "process_command_msg" and "process_status_msg" functions. Each of those functions is already doing a switch or if/else to determine the exact message type. If you care about the cost of those lookups, the correct thing to do is have a single switch statement that handles all the messages at once.
If you don't care enough about the cost of lookups to split up those functions, you should stick with the obvious if/else test.
Another thing you could do to improve the design is to have a "get_message_type" function returning a "COMMAND_TYPE" or "STATUS_TYPE" enum, then switch on that. That can be made efficient if you really care about execution cost (for example, check a single bit, and inline it so the switch can be optimized).
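Roughly what I have in mind for the "get_message_type" idea, as a sketch only. The bit layout (`STATUS_BIT`) is an assumption, the two handlers are stubs that just count calls, and `dispatch` is a made-up name.

```c
/* Assumed encoding (not from the article): bit 7 set marks a status message. */
#define STATUS_BIT 0x80u

enum message_type { COMMAND_TYPE, STATUS_TYPE };

/* Stub handlers that only count calls, standing in for the real ones. */
static int commands_seen, statuses_seen;
static void process_command_msg(unsigned msg) { (void)msg; commands_seen++; }
static void process_status_msg(unsigned msg)  { (void)msg; statuses_seen++; }

/* A single bit test, small enough for the compiler to inline into the switch. */
static inline enum message_type get_message_type(unsigned msg)
{
    return (msg & STATUS_BIT) ? STATUS_TYPE : COMMAND_TYPE;
}

void dispatch(unsigned msg)
{
    switch (get_message_type(msg)) {
    case COMMAND_TYPE: process_command_msg(msg); break;
    case STATUS_TYPE:  process_status_msg(msg);  break;
    }
}
```

New commands or status messages then need no change to `dispatch` at all, as long as they follow the same bit convention.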
Abusing the switch statement not only gives you unreadable and bug-prone code, it simply isn't any better than a sensible approach.
I don't dispute that one can use C for class based coding but it isn't common.
Simulating the C switch statement in Python:
https://jugad2.blogspot.in/2016/12/simulating-c-switch-state...
Not a proper simulation of C's switch, has limitations mentioned in the post, just something I whipped up for fun.
case CMD ... CMD_END: /* whatever */ break;
Refactored implies performance improvement to me, and inlining is almost always faster than the more modular representation that the author ended up with after "refactoring" (putting the comparisons in functions).
Go "O(1)" on all the CMDs or don't, for the sake of ease of comprehension. In the hypothetical, or maybe practical, world[0] where size and computation constraints require that you goto some of them and if-else the rest, I suppose the "1/2 goto, 1/2 if-else" optimization would be one of the later things to consider, after a host of other optimizations.
[0] I admit I don't hack in that world, but I do computational physics, which is more about vectorizing things, not optimization of branching typically.
Never had cause to implement it, though, and never saw this particular hack.
Personal opinion: Unit tests should be used to let you know when you forgot to add in new `case`s.
Checking for switch/case completeness is the job of the compiler and -Wall -Werror.
Rust would give an error when an enum isn't exhausted in a match clause. So the issue being described doesn't exist :D
As to why it's still a popular language... The C99 spec document is shorter than the Ecma-262 (Javascript) spec by around 50 pages. The parts dedicated to the _languages_ themselves (as opposed to the builtins/standard libraries) weigh in at 130 pages for C, or around 300 pages for JavaScript. C, despite its warts, is a small, simple language.
If the C standard is so useful, try programming using the C standard as your sole debugging tool.
I was going to say something like that, but the equivalent code would have a "_" case in the match statement. I don't know Rust yet, but would a Rust programmer typically make a point of not using a "_" (default) arm, so that these are caught at compile time and nothing else can ever slip through?
But when you have a really big enum and need to handle only a small subset of its values, that doesn't work.
When _this_ problem can be avoided in Rust, it can be avoided in C also. The only Rust bonus here is compiler default behaviour.
enum Foo {
    Bar(i32),
    Baz(i32),
}
fn matchfoo(f: Foo) -> i32 {
    match f {
        Foo::Bar(x) => x,
        Foo::Baz(x) => x,
        _ => 42
    }
}
