

Consistently Formatting C - df3n5
http://blog.9lines.org/?p=26

======
taeric
This is great, except for the times when you take advantage of the fact that
programming is sometimes essentially ASCII art. I grant that these cases are
rare, but breaking the formatting really helps there.

And... I just can't bring myself to think that a rigid adherence to formatting
will benefit any codebase as much as folks would like to think. As "edge case"
as my reasons for wanting to break it are, the straw men built up to show why
it matters are even more ridiculous.

And this is completely ignoring that, once again, the largest collaboratively
maintained source tree on the planet (that I am aware of) does not need this
level of formatting adherence. Why do we think other projects do?

~~~
unmole
The largest collaboratively maintained source tree on the planet (that I am
aware of) is the Linux Kernel. Just submit a 'badly' formatted piece of code
to someone like Greg Kroah-Hartman and see what happens.

Quoting a formal study[1]:

"It is not merely a matter of aesthetics that programs should be written in a
particular style. Rather there is a psychological basis for writing programs
in a conventional manner: programmers have strong expectations that other
programmers will follow these discourse rules. If the rules are violated, then
the utility afforded by the expectations that programmers have built up over
time is effectively nullified."

[1] Soloway, Elliot, and Kate Ehrlich. 1984. “Empirical Studies of Programming
Knowledge”, IEEE Transactions on Software Engineering SE-10, no. 5
(September): 595-609

[2] <http://goo.gl/2bHwq>

~~~
taeric
My point was more that I would wager most "auto styler" tools will change code
that is in the kernel. You are really just bringing up more of what I consider
"straw men" in this argument. Terribly formatted code is of course terrible.
Much of what folks argue about being "good code" is just a nitpicky waste of
time. Take this snippet from the kernel:

    
    
                if (pages) {
                    pages[i] = virt_to_page(start);
                    if (pages[i])
                        page_cache_get(pages[i]);
                }
                if (vmas)
                    vmas[i] = vma;
       

This is just an arbitrary snippet, as I really have no familiarity with the
code. But in this snippet alone I already see at least two things that would
make most format-obsessed people go stupid. (Three in the actual code, but I
don't know if the tabs will be visible to folks on this forum.)

Edit: I forgot to add that I realize this segment actually conforms to the
linux style guide. So... yeah, doesn't really say much other than to show that
the kernel crew disagrees with most anyone you will ever hear online. It does
not take too much effort to find snippets that actually do violate their own
rules.

------
super_mario
You are better off putting your options into .astylerc. This is what I have in
mine:

    
    
        # Use K&R formatting style
        style=kr
        
        # Indent class and struct blocks so that the blocks 'public', 'private' and
        # 'protected' are indented. This option is effective in C++ files only
        indent-classes
        
        # Indent 'switch' blocks so that the 'case X:' statements are indented with 
        # the switch block. The entire case block is indented.
        # 
        # For example: 
        # switch (foo)
        # {
        # case 1:
        # 	a += 1;
        # 	break;
        #
        # case 2:
        # {
        # 	a += 2;
        #	break;
        # }
        # }
        # 
        # becomes
        # 
        # switch (foo)
        # {
        #     case 1:
        #         a += 1;
        #         break;
        #
        #     case 2:
        #     {
        #         a += 2;
        #         break;
        #     }
        # }
        #indent-switches
        
        # Indent C++ namespaces (this option has no effect on other file types)
        # Add extra indentation to namespace blocks.
        # For example:
        # namespace foospace
        # {
        # class Foo
        # {
        #     public:
        #         Foo();
        #         virtual ~Foo();
        # };
        # }
        #
        # becomes
        # 
        # namespace foospace
        # {
        #     class Foo
        #     {
        #         public:
        #             Foo();
        #             virtual ~Foo();
        #     };
        # }
        indent-namespaces
        
        # Indent multi line preprocessor definitions ending with a backslash
        # For example:
        #
        # #define Is_Bar(arg,a,b) \
        # (Is_Foo((arg), (a)) \
        # || Is_Foo((arg), (b)))
        # 
        # becomes:
        # 
        # #define Is_Bar(arg,a,b) \
        #     (Is_Foo((arg), (a)) \
        #      || Is_Foo((arg), (b)))
        # 
        indent-preprocessor
        
        # Indent C++ comments beginning in column one. 
        # For example
        # 
        # void Foo()
        # {
        # // comment
        #     if (isFoo)
        #         bar();
        # }
        # 
        # becomes:
        # 
        # void Foo()
        # {
        #     // comment
        #     if (isFoo)
        #         bar();
        # }
        # 
        indent-col1-comments
        
        # Pad empty lines around header blocks (e.g. 'if', 'for', 'while'...).
        # 
        # isFoo = true;
        # if (isFoo) {
        #     bar();
        # } else {
        #     anotherBar();
        # }
        # isBar = false;
        # 
        # becomes:
        # 
        # isFoo = true;
        # 
        # if (isFoo) {
        #     bar();
        # } else {
        #     anotherBar();
        # }
        # 
        # isBar = false;
        # 
        break-blocks
        
        # Insert space padding around operators. Any end of line comments will remain
        # in the original column, if possible. Note that there is no option to unpad.
        # Once padded, they stay padded.
        # 
        # if (foo==2)
        #     a=bar((b-c)*a,d--);
        # 
        # becomes:
        # 
        # if (foo == 2)
        #     a = bar((b - c) * a, d--);
        # 
        pad-oper
        
        
        # Insert space padding after paren headers only (e.g. 'if', 'for', 'while'...).
        # Any end of line comments will remain in the original column, if possible.
        # This can be used with unpad-paren to remove unwanted spaces.
        # 
        # if(isFoo(a, b))
        #     bar(a, b);
        # 
        # becomes:
        # 
        # if (isFoo(a, b))
        #     bar(a, b);
        # 
        pad-header
        
        # Remove extra space padding around parenthesis on the inside and outside. Any
        # end of line comments will remain in the original column, if possible. This
        # option can be used in combination with the paren padding options pad-paren,
        # pad-paren-out, pad-paren-in, and pad-header above. Only padding that has not
        # been requested by other options will be removed.
        # 
        # For example, if a source has parens padded on both the inside and outside,
        # and you want inside only, you need to use unpad-paren to remove the outside
        # padding, and pad-paren-in to retain the inside padding. Using only
        # pad-paren-in would not remove the outside padding.
        # 
        # if ( isFoo( a, b ) )
        #     bar ( a, b );
        # 
        # becomes (with no padding option requested):
        # 
        # if(isFoo(a, b))
        #     bar(a, b);
        # 
        unpad-paren
        
        # Delete empty lines within a function or method. Empty lines outside of
        # functions or methods are NOT deleted. If used with break-blocks or
        # break-blocks=all it will delete all lines EXCEPT the lines added by the
        # break-blocks options.
        # 
        # void Foo()
        # {
        # 	
        #     foo1 = 1;
        # 	
        #     foo2 = 2;
        # 	
        # }
        # 
        # becomes:
        # 
        # void Foo()
        # {
        #     foo1 = 1;
        #     foo2 = 2;
        # }
        # 
        delete-empty-lines
        
        # Attach a pointer or reference operator (* or &) to either the variable type
        # (left) or variable name (right), or place it between the type and name
        # (middle). The spacing between the type and name will be preserved, if
        # possible. To format references separately use the following align-reference
        # option.
        # 
        # char *foo1;
        # char &foo2;
        # 
        # becomes (with align-pointer=type):
        # 
        # char* foo1;
        # char& foo2;
        # 
        # char* foo1;
        # char& foo2;
        # 
        # becomes (with align-pointer=middle):
        # 
        # char * foo1;
        # char & foo2;
        # 
        # char* foo1;
        # char& foo2;
        # 
        # becomes (with align-pointer=name):
        # 
        # char *foo1;
        # char &foo2;
        # 
        align-pointer=name
        
        # Set the minimal indent that is added when a header is built of multiple
        # lines. This indent helps to easily separate the header from the command
        # statements that follow. The value for # indicates a number of indents and is
        # a minimum value. The indent may be greater to align with the data on the
        # previous line.
        # The valid values are:
        # 0 - no minimal indent. The lines will be aligned with the paren on the
        # 	preceding line.
        # 1 - indent at least one additional indent.
        # 2 - indent at least two additional indents.
        # 3 - indent at least one-half an additional indent. This is intended for large
        # 	indents (e.g. 8).
        #
        # The default value is 2, two additional indents.
        # 
        # // default setting makes this non-bracketed code clear
        # if (a < b
        #         || c > d)
        #     foo++;
        # 
        # // but creates an exaggerated indent in this bracketed code
        # if (a < b
        #         || c > d)
        # {
        #     foo++;
        # }
        # 
        # becomes (when setting --min-conditional-indent=0):
        # 
        # // setting makes this non-bracketed code less clear
        # if (a < b
        #     || c > d)
        #     foo++;
        # 
        # // but makes this bracketed code clearer
        # if (a < b
        #     || c > d)
        # {
        #     foo++;
        # }
        # 
        min-conditional-indent=0
        
        # Set the maximum of # spaces to indent a continuation line. The # indicates
        # a number of columns and must not be greater than 120. If no # is set, the
        # default value of 40 will be used. A maximum of less than two indent lengths
        # will be ignored. This option will prevent continuation lines from extending
        # too far to the right. Setting a larger value will allow the code to be
        # extended further to the right.
        # 
        # fooArray[] = { red,
        #          green,
        #          blue };
        # 
        # fooFunction(barArg1,
        #          barArg2,
        #          barArg3);
        # 
        # becomes (with larger value):
        # 
        # fooArray[] = { red,
        #                green,
        #                blue };
        # 
        # fooFunction(barArg1,
        #             barArg2,
        #             barArg3);
        # 
        max-instatement-indent=9

~~~
wging
Do you mean to have indent-switches commented out?

~~~
super_mario
Yes :D.

~~~
blaabjerg
Freak!

------
claudius
That Bash script looks a little icky… just imagine more than one level of
directories, and `` is both non-portable and overly ugly, at least compared
with $()… not to mention that shell expansion (for i in $basedir/*.{c,h})
should work equally well :)

Anyway, I would rather suggest something like:

    
    
      find . \( -name '*.c' -o -name '*.h' \) -exec printf "Doing Stuff on {}\n" \; -exec head -n 1 {} \;
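
A minimal sketch of the difference between the glob loop and the find approach, assuming a POSIX shell. The function names `list_top_level` and `list_recursive` are made up for illustration and do not come from the original script. The sketch spells out both globs instead of using `{c,h}` so that it also runs under a plain `sh` without brace expansion:

```shell
# Glob-based listing: only sees files directly inside the given directory.
list_top_level() {
    for f in "$1"/*.c "$1"/*.h; do
        # An unmatched glob stays literal, so skip names that do not exist.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# find-based listing: descends into every subdirectory as well.
list_recursive() {
    find "$1" \( -name '*.c' -o -name '*.h' \) -type f
}
```

With a tree containing `a.c` and `sub/b.h`, the first function prints only `a.c`, while the second prints both, which is the "more than one level of directories" case.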

~~~
df3n5
Your Bash-fu eclipses mine, thank you for the Bash schooling! I've updated
the script on my original post and gave you credit, hope that's okay.

~~~
claudius
Sure, feel free to do whatever you like with that. Bash is more or less the
only scripting language I know/use, so it is only natural to find shortcuts
over the years :)

~~~
csense
> Bash is more or less the only scripting language I know/use

I feel sorry for you.

Try Python, especially if you're doing anything involving string processing.

~~~
claudius
Thanks, but I used Python once for a little FUSE module (long story…) and
didn’t particularly like it. If anything, I’d rather try Perl - but then, you
can do so much with a sensible Shell and the coreutils, the need to even write
a proper script rather than a large concatenation of pipes is a rare event for
me.

So I guess I’ll stick to Bash and use whatever-is-best if that’s not enough.

------
JasonFruit
Having it as a git pre-commit hook seems dangerous to me — what if there's a
bug in the formatting tool that breaks something in the code? Then you have
non-functioning code committed. It's a lot better if you can just have the
self-discipline to run the formatter when you're done making changes.
Alternatively, you could have it in your build script, so you'll likely catch
the formatter-related problem when you're testing your code.

~~~
wging
Or what about a pre-commit hook that, instead of transforming the version
you're committing, just compares a transformed version to the version you're
trying to commit? You could warn on commit if they don't match, plus maybe
provide an option to override the warning and commit anyway.
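
A rough sketch of what such a check-only hook might look like, under a few assumptions: the formatter reads source on stdin and writes the result to stdout (astyle is documented to work this way when given no file names), and the names `FORMATTER`, `is_formatted` and `check_staged` are invented here for illustration:

```shell
# Assumption: the formatter acts as a stdin-to-stdout filter.
FORMATTER="${FORMATTER:-astyle}"

# True when running the formatter over the file leaves it unchanged.
is_formatted() {
    $FORMATTER < "$1" | cmp -s - "$1"
}

# Checks the staged version of every added, copied or modified .c/.h file.
# Nothing is rewritten; the function only warns and returns non-zero.
check_staged() {
    git diff --cached --name-only --diff-filter=ACM -- '*.c' '*.h' |
    (
        status=0
        while IFS= read -r path; do
            staged=$(mktemp)
            git show ":$path" > "$staged"    # content as staged, not as on disk
            if ! is_formatted "$staged"; then
                echo "warning: $path differs from formatter output" >&2
                status=1
            fi
            rm -f "$staged"
        done
        exit "$status"
    )
}
```

Saved as `.git/hooks/pre-commit` with a final line `check_staged || exit 1`, this would refuse the commit without touching the code. The override already exists in git: `git commit --no-verify` skips the pre-commit hook.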

~~~
df3n5
I like this. If someone writes a version which does this let me know and I'll
post a link on the post.

------
angersock
Two stories regarding coding standards.

~

At my old day job (sprawling planet-eating Java), the codebase had gone
through many hands over the course of ten or so years. The "age" of some code
or feature could usually be determined by looking at how it was written:
newest code used the Java foreach construct, older code (usually by one
particular developer) used explicit for constructs, and the oldest code
reimplemented (poorly) collection classes without generics.

Additionally, the fist (coding style) of each person who'd worked on a section
became apparent: my own code tended to have a large number of anonymous inner
classes and monkey closures, as well as somewhat heavy use of interface and
abstract classes. Another coder liked static factory methods a lot, as well as
very shallow inheritance hierarchies (in order to make debugging simpler). The
oldest code was the poster child for 90s Java:
EnterpriseFactoryProxyFacadeAdaptorSingletonProviders, overly wordy code to
accomplish simple tasks, etc.

It was interesting to open a file or function to implement a feature or bugfix
and then twitch when you realized that it was from one of the more...time-
pressed...veteran... developers. By contrast, opening code in a particular
style (indenting, commenting, etc.) I recognized as my own or another coworker
brought relief.

~

At the same time I was working on this, I worked on a large library with some
friends doing gamedev. We agreed on a coding standard, had some rather arcane
arguments, and ultimately settled on something that drew heavy inspiration
from Insomniac's coding guide--not exactly alike, but we at least had an
opinion on nearly everything they cover.

Whenever working on that code, I have a very different feeling. Despite the
fact that we're all working on it at the same time, and despite the fact that
we all have some remnants of our fists, the codebase still feels fairly
uniform and like home.

Variable naming is basically the same, choice of verbs and methods is the
same, indentation is nearly the same (as are the cases where liberties are
taken), commenting is consistent where it matters (fists show through in
individual writing styles, but documentation for interfaces and function
signatures is standardized), and overall there is a high amount of
consistency.

Even deeper, the amount of cleverness is roughly equivalent throughout the
code--as we use C++, the importance of this cannot be overstated. We've come to a consensus on
when and how to use templates, and in what ways, and when to use defines, and
how we treat inheritance and interfaces and abstractions. We have house rules
on how things should behave; in a language where you can overload and redefine
operators, restraint is important.

~

Overall, having a common coding standard--and moreover, one that everyone has
closely examined and discussed and agreed on--really helps decrease cognitive
load. It gives better "faith" in your codebase, and reduces the annoyance of
working on somebody else's code. It reduces the impedance mismatch of
abstractions as they work together, and helps to reassure you when it's three
in the goddamn morning and why the christ am i getting these compiler warnings
sheezus didnt anybody read through this before committing argh at least we
have unit tests.

It's also something a lot of people seem to dismiss as bike-shedding, and
sometimes it can seem like that. It's at least worth considering, especially
if you work with people you care about--coding on a team should be like a
relationship, and any successful relationship involves developing a common
background and shorthand for communicating.

~~~
SiVal
I think this has been one of the impediments to the commercial use of Lisp.
The syntax of a language is sort of the ultimate style guide. You loop using
the built-in loop construct. You do something for each element using the
built-in foreach construct.

But without even much built-in syntax, much less a universal style guide,
every Lisp programmer creates his own favorite language. That's part of the
joy of it for an individual project and part of the cost for a group project.

------
0x09
There is actually a tool in the clang extras repository for this called
clang-format. It supports three different styles ("LLVM", "Google",
"Chromium"), although I'm not sure where those are documented exactly.

<http://clang.llvm.org/docs/ClangFormat.html> (a little outdated from HEAD)
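
For anyone who wants to try it, here is a small sketch of comparing a file against the three styles, assuming clang-format is installed and on the PATH. The helper names `preview_style` and `compare_styles` are hypothetical; only the `-style=` flag is part of the tool:

```shell
# Prints the file as it would look in the given style, without modifying it.
preview_style() {
    clang-format -style="$1" "$2"
}

# Reports, per predefined style, whether the file would be changed.
compare_styles() {
    for style in LLVM Google Chromium; do
        if preview_style "$style" "$1" | cmp -s - "$1"; then
            echo "$style: already conforms"
        else
            echo "$style: would change"
        fi
    done
}
```

Passing `-i` to clang-format rewrites the file in place instead of printing to stdout, so the preview form is the safer one to experiment with.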

~~~
bla2
Google style is probably
<http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml>, Chromium
<http://dev.chromium.org/developers/coding-style>

------
chris_engel
Hah. I thought the article was about cleaning harddrives :D

