Google's Shell Style Guide (googlecode.com)
227 points by shawndumas 529 days ago | 185 comments



"Indent 2 spaces. No tabs." Well fuck it.

That's the only coding style that remotely makes sense to me: https://www.kernel.org/doc/Documentation/CodingStyle

While I'm at it, half the justifications Google gives for the shell guide are inaccurate. Looks like an "overall, people are used to this style, so we're using it, and we'll try to justify it without knowing why" situation.

This is a much better guide:

http://devmanual.gentoo.org/tools-reference/bash/

Example: Gentoo's [[ ]] explanation actually makes sense.

-----


Thread's gotten too deep and wide to reply to everyone, so I'd like to vent right here at the top...

We have these machines that are meant to take the tedium out of our lives, so let's USE THEM. Why do we not have tools that format things the way I like them? Who cares how the other team members like it? They can use the tool, too. Who cares how it's stored in your VCS? The tool should just be part of the workflow and format spaces/tabs the way I want them and maybe revert to a canonical format for committing to the repo.

Personally, I prefer tabs. I read code best when it's indented four spaces. But John wants to use two spaces in this code. And Ralph likes three. Holy Hell, don't make me wade through someone else's preferences to understand this codebase! Use tabs and people can set them the way they want. Yes, tabs are broken in many places (e.g. Objective-C methods can get so verbose that they need to wrap; convention is to wrap and align the colons; tabbing as far as possible and then spacing to fill breaks things when Ralph formats his tabs to three spaces; in this case, there is indeed a solution: tab to the start of the original line of code, and space to format from there ... but then the IDE has ideas of its own about what should be spaces and what should be tabs...)

Maybe you can read that code with only two spaces of indent, but I can't. I'm gonna need that reformatted to put a larger visual separation between scope changes or whatever required the indentation. Rather than have this debate, a tool should be forged.

-----


> Yes, tabs are broken in many places (e.g. Objective-C methods can get so verbose that they need to wrap; convention is to wrap and align the colons; tabbing as far as possible and then spacing to fill breaks things when Ralph formats his tabs to three spaces; in this case, there is indeed a solution: tab to the start of the original line of code, and space to format from there ... but then the IDE has ideas of its own about what should be spaces and what should be tabs...)

Which is the problem. If you think it's possible to make a tool to solve this, great - do it, I'll happily adopt your tool as a precommit hook and use tabs to indent everything (or go on using spaces - your tool would turn them into tabs, right?) But given that this problem has existed since the '70s and no-one has solved it, I'm inclined to think it's impossible. So I'll use spaces in the VCS, which means agreeing an indent width so that our diffs make sense (and I don't really care whether that's two or four, but it does need to be defined and adhered to across the team).

-----


Disagree. Readability is not about 4 or 2 spaces, it is about a strongly enforced convention that makes these typographic details literally disappear from the eyes of the readers. This effect is very noticeable in book publishing: if you ever notice the typography, the typographer failed at his task. It is the same in code: just follow the rules and they will become your preferred guideline in no time.

And repeated experiences everywhere point to the fact that code text is already very hard to manage, very sensitive to alterations, misreading, etc. I'd rather remove layers and tools than add more.

Edit: obviously, readability is all about convention when the conventions are within the realm of sane, acceptable ones. And by everywhere I mean in teams where the LOC count is above a few thousand.

-----


Disagree with your disagree. There is a difference between the actual code and how the code is styled. It makes sense to have conventions on how to format a for statement but not where to add brackets - making them be on the same line or on their own line makes little difference to noticing bugs and is purely a stylistic preference.

-----


So you mean that the where-the-brackets guidelines found in core FOSS mega-projects, big corps like Google, down to many GitHub personal projects having their own "please follow the guides" in the readme, all of them are misguided, useless, wrong and even, let's say it, plain stupid?

I, on my side, claim that guidelines are there for a reason, and should be followed.

-----


  > Use tabs and people can set them the way they want.
If members of a team set tabs the way they want, how do they limit line lengths to 80 columns, which is a requirement that's specified in the Linux kernel coding style that was linked to in the comment you replied to?

-----


not an issue. by assuming it is two spaces.

it is still better to have to switch to 2 tabwidth before committing to check line length than to be forced to use 2 spaces forever.

-----


How about lining things up? Take a long function call where you want to wrap the params:

    hello( world,
           solarsystem,
           universe
           ... );
You can't if you just use tabs - you would have to mix tabs and spaces.

-----


Use

   hello(
     world,
     solarsystem,
     universe
   );
which doesn't require perfect line-up with the first entry, and is still clear to the reader what's going on.

-----


this is how i code. 1st pass: func( ); 2nd pass: put in the args.

this ensures the closing paren immediately and makes it possible to comment out things that are inside the parens.

-----


so that's the reason why people are set on using spaces? man :(

-----


This looks at it only from the context of the single dev's machine. I care more about the consistency of the source code that's checked into the repository and not getting false file diffs based on white space differences, and other formatting discrepancies. And for some reason, this seems harder than it should be.

I'd like to see a diff tool designed around this.

-----


That wouldn't be hard at all. All you need to do is assign a default style and create the diffs in that style.

The hard part is creating a tool that can change between styles. Such a tool would need to be made for every language separately which is a pain. I've been wanting to try something like this for years but have never built up the courage.

-----


This is part of the purpose of lint tools, and they exist for many languages. e.g. http://www.jslint.com/

-----


Sorry, I think I misunderstood the comment I replied to. I thought he was saying it would be hard to do diffs if people restylized each file to their liking.

-----


It really doesn't matter how many spaces or tabs; what's important is that all code and scripts are written in the same style, which greatly reduces the time required to understand someone else's code. At Google, that's the norm.

-----


Yup. After a decade of programming, the only thing I care about is consistency. Even the weirdest code style people can come up with is just fine in a clean, consistent code base.

Mildly surprised to see people arguing for specific tab widths...

-----


Agreed, consistency matters most. As for whether tabs or spaces are used, I've come to realise that I cannot control the tab width when viewing code outside of my editor (obviously!). So, let's say we have an 80 char line length limit. If I set my tab size to 2 spaces, the line length is obviously going to be different for someone who sets their tab size to 4 spaces. Just having the option to control the size of tabs means tabs should never be used, as it will result in inconsistencies across editing/viewing environments. Also, when viewing code online, you generally have a limited viewing area. If you're using tabs, the lines are generally going to be very long, and it's very difficult to view the code. In conclusion, if you care about things like line lengths (you should), and if you care about keeping your code in a certain and consistent format, the only way to manage it is with spaces, imo.

-----


It actually does. 2-space indents are hard to read, period.

-----


Is there any scientific/empirical proof on this? I indent 4 spaces, no tabs, but I would like to see the real rationale there, not just the PEP-8 "rules".

-----


There is this: "Program indentation and comprehensibility (1983) by R J Miara, J A Musselman, J A Navarro, B Shneiderman"

http://www.cs.umd.edu/~ben/papers/Miara1983Program.pdf

I'd classify it as an attempt rather than something decisive but whatever floats your boat.

-----


p866:

  The level of indentation that seems to produce optimal 
  results in comprehension is between 2 and 4 spaces; as the 
  number of spaces increase, the comprehension level 
  decreases.
On the other hand, the sample program had parts that were indented nine times (!) in places. It's an interesting read, but I'd not try to use it in a discussion.

Personally, I don't care; just pick one (hopefully one that matches what most people are using with that language, eg. 2 spaces for Ruby), and get on with it.

-----


Yup, they seem to have used an overly indent-happy indentation scheme. Two separate indents, one for the 'begin' after the 'for' and one for the block body, are superfluous. This raises the concern that their results are biased towards smaller indents.

-----


Zero spaces makes indentation harder to spot than infinity spaces.

Infinity spaces gives less space for code than zero.

Conclusion: use as wide an indentation as possible while still keeping enough space for the code itself.

-----


Is there nothing to be said for consistency?

-----


When I suggest using that algorithm to pick an indent size, I mean to use long-term averages to select a standard, not to indent every line as much as possible individually :P

-----


No. Indent space is just an opinion/preference/habit.

-----


I prefer 2 space indents (tabs)

-----


you get used to it within a few months.. (currently we have tabs though)

-----


I've always been conflicted on this. It's simply untrue that more than 3 levels of indentation indicates bad code. For example, HTML, XML, and JSON often benefit from deeper levels of indentation.

Tabs have the benefit of having a dynamic size, so I could simply adjust the tab width within my editor. Spaces also do not play as well with editors. Unfortunately, browsers offer no means to change the width of tabs, so it's impossible to properly display code using tabs. This is especially frustrating when reading the source of a page indented using tabs. It would be much easier to read with spaces, but not as easy to read as if I were able to set tab width.

-----


On tabs vs. spaces, read the old classic: http://www.jwz.org/doc/tabs-vs-spaces.html

tl;dr: don't use tabs.

I read that piece many years ago and thought who cares, but I've later changed my editor to never use tabs. He's right.

Tabs are nice in theory, but in practice you end up with garbled code. It's better to train yourself to get used to seeing code bases formatted slightly differently, just one of the things you have to get over, IMHO. Like camel notation versus underscores versus dashes.

-----


> It's better to train yourself to get used to seeing code bases formatted slightly differently, just one of the things you have to get over, IMHO. Like camel notation versus underscores versus dashes.

That's a pretty good point. Indentation is indeed just one small aspect of the overall code style - why would you want to have that configurable and nothing else?

-----


Thanks for posting the original s-expr based, tab bashing article!

It has always been my own argument in favour of tabs:

- The original argument against tabs was based on Lisp indentation rules.

- Languages with very complex indentation rules like Lisp should use spaces. There's no exception to this rule.

- Languages from the C family should use Tabs, as Lisp indentation rules don't apply to them.

- Nowadays almost all popular languages can be considered to be part of the C syntax family. Python and Lua are notable exceptions. PHP and Objective-C files should use tabs.

-----


Never let facts get in your way.

http://trac.common-lisp.net/mit-cadr/browser/trunk/lisp/zwei...

MIT's Zmacs, the editor of the Lisp Machine indents with tabs.

-----


Good find, thanks.

However, one very old editor using tabs doesn't mean Lisp programmers actually prefer them. It may even be the cause of tabs hate.

-----


If the editor of the Lisp Machine, which was used by people like Steele, Stallman, Pitman had no problems using tabs, it can't be that Lisp has a technical requirement not to use tabs. Steele wrote a book about C. Stallman wrote a Lisp in C and a C compiler...

Basically your arguments are from a strange parallel world.

-----


it's also been rebutted to death.

-----


Sure they do, add this to your user styles:

pre { tab-size: 2; }

https://developer.mozilla.org/en-US/docs/Web/CSS/tab-size

-----


Thanks, I wasn't aware of this. Hopefully we'll begin to reach a point where tabs are used and sites like GitHub adjust tab-width according to the developer's preference.

-----


It's a pity that elastic tabstops never got any traction. It's a really nice idea: http://nickgravgaard.com/elastictabstops/

-----


This looks horrible if I want the opening brace on the same line as the closing parens around function parameters, but still want the function parameters to be aligned, like this:

  int someCodeDemo ( [tab] int fred,
  [tab]                    int wilma) {
  [tab]                    return fred + wilma;
  }
I guess it needs some vertical space between blocks in order to distinguish them, hence why putting { on the next line works.

In my opinion, it also looks horrible with Lisp code (where I pretty much consider indentation a solved problem).

Edit: Oh god, how do I format this... Ah, there we go.

-----


There is a Sublime Text plugin compatible with ST2 and ST3 which does this. https://sublime.wbond.net/packages/ElasticTabstops

The only problem I've noticed is that it muddies the undo stack.

-----


HTML, XML, and JSON are document description formats. For most programming languages, more than 3 levels of indentation is a red flag.

-----


While it's true of some languages, it's certainly not true of all of them. I'm not sure most is accurate either. JSON is a subset of JavaScript, so it stands to reason that JavaScript could have as many indentation levels as JSON without a loss of clarity. It also stands to reason that any language with hashes, dictionaries, etc could feasibly exceed 3 indentation levels without it being a red flag, particularly if those hashes are in a method that is already indented more than one or two levels.

-----


Well, Torvalds said that about C where you have no indent at the class/namespace level. So I'd argue it can be a bit more for languages that have this (most of the popular ones), 4-5 I guess.

-----


So for those three levels, do I have to count the object and method-level indents, or is it a rule for languages with unindented includes?

-----


Real men and women indent by prime numbers.

-----


+1 for the kernel coding style

I can understand (grudgingly) 4 spaces for Python

But for C/C++ 4 spaces are awful. Maybe because C lines end up being longer?

Tabs, they exist. Use them

-----


Tabs should be abolished. It is a pain to edit source code full of space and tab characters mixed everywhere.

Then there is the additional issue that everyone maps tabs to a different number of space characters.

Only spaces can give a proper consistency without relying on external tools.

-----


What external tool can't interpret a \t character correctly?

To me, the problem is consistency. I'm a 4 space man myself, but tabs don't outright offend me, as long as they're everywhere. I'd rather commit edits with tabs than attempt to convert the whole project to spaces, or worse, commit some spaces and some tabs to a single file/project.

-----


For example, when everyone has a different idea of how many spaces a tab represents, you end up with differing numbers of tabs in your files.

This fully breaks the indentation, which is exacerbated when multiple teams working on the same code base use different editors and tab settings.

So in the end one needs to rely on external tools that go over the source code to normalize the use of tabs.

-----


Indentation should be done with tabs.

However, after a character that's not a tab, i.e. any printable character, spaces should be used.

It is the best of both worlds, and can never be misconfigured by changing the number of spaces that a tab represents.

It is also what tab advocates have been saying for years now.

-----


Something I wrote up a long time ago:

https://docs.google.com/document/d/1HMxg6fv_yig_mVvnxsC8CwTf...

Unfortunately I haven't had time to update the code, and it doesn't work with more recent versions of emacs. Now I'm using vim and all-spaces-all-the-time.

-----


What breaks the alignment is editors inserting spaces instead of tabs. A tab is always the same length, but the number of spaces varies between users.

-----


No, that's what a coding style is for. Either specify tabs or spaces and you enforce it

A tab means 8 spaces. If you want your editor to show something different that's up to you, but it's non standard.

-----


A tab does not mean 8 spaces. That's the whole problem, people keep thinking tabs and spaces have an exchange rate.

If you have a source file with only tabs (and you should, then nobody has to run a re-indent script because they can't read it), you should be inserting tabs. If you have a source file with only spaces, you should map the tab key to insert the correct number of spaces before editing. If you have a source file with both tabs and spaces for indentation, you should re-indent.

-----


For all purposes, yes it means 8 spaces.

Type this in a bash terminal: $ printf '1234567890\n\tx\n'

For the second paragraph: I couldn't agree more. There are even some "geniuses" who indent Python with the 1st level being 4 spaces and the second level as a tab. * sigh *. Really.

-----


What about this?

  printf '1234567890\n1\tx\n12\tx\n123\tx\n1234\tx'
  1234567890
  1	x
  12	x
  123	x
  1234	x
Tabs are not fixed width, that's the whole point of their existence.
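
A quick way to see this without depending on the terminal's own tab stops is to pipe the same lines through `expand`, which turns tabs into spaces at a chosen width. This is just a sketch; the `render` helper is a name made up for the example.

```bash
#!/usr/bin/env bash
# Render the same tab-separated lines at two different tab-stop widths.
# `expand` is the standard utility that converts tabs to spaces.
render() {
  local width=$1
  printf '1\tx\n12\tx\n123\tx\n' | expand -t "$width"
}

render 8   # every x lands in column 9 (counting from 1)
render 4   # every x lands in column 5 (counting from 1)
```

Each tab is replaced by however many spaces it takes to reach the next stop, so the x's line up no matter how many characters come before the tab.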

-----


Yes, this is how it works for alignment, so you don't need spaces to align the x there

This is a non issue when you have nothing to the left of the code (that is, when indenting)

(and the x should be under the 9 btw)

-----


Which only works on small hobby projects, startups and a few companies of the likes of Google, Facebook and friends.

In the real corporate world, it is a whole different thing.

Good luck enforcing coding styles across multi-site projects, with rotating developers and off-shoring subcontractors.

-----


Yes, it is hard. There are ways of putting a trigger in your source code control to enforce it, but it's not always possible.

Still, projects like this have something very thorough and that goes much deeper than just tabs vs spaces: http://www.stroustrup.com/JSF-AV-rules.pdf

-----


tabs for indentation, spaces for alignment

-----


And if you mix both, every language that relies on indentation (e.g. Python) throws you a nice warning or even an error.

-----


The only exception I can think of for the 'use tabs' rule in languages without Lisp syntax is precisely Python.

Therefore, to me, your 'every language' reduces to a list of one.

-----


Indentation is also significant in Haskell, for example.

-----


If there was an automated tool that could apply that I would agree with you.

-----


I always felt that using tabs for indentation and spaces after tabs for "prettying" by aligning, say, function parameters nicely separates the semantics of white space.

Then again, I haven't had the chance to define the coding style guide of anything bigger than a hobby project, so I might be lacking pragmatic experience.

-----


My editor automatically converts [TAB] to four spaces. Works just fine for me.

-----


And?

Every editor can do it, I'm not complaining about pressing the space bar 4 times.

I'm complaining about the appearance of code indented with 4 spaces.

-----


Well, it shouldn't be a problem if you keep your source files clean and properly use tabs for indentation. Then everyone can set their editor to render tabs at their desired width, whether it's 2, 4, or 8 spaces.

It's ludicrous that these space-wielding heretics (to borrow from the kernel style guide) are keeping everyone from having that.

-----


Every editor, or at least every good editor for programmers, can convert 4 (or whatever) spaces to tabs. If it bothers you that much, you can set up your editor so that it tabifies files when opening them and untabifies when saving. This way you won't even know if the file used spaces or tabs - and that's how it should be.

In short: configure your editor to display to your tastes and save to whatever style guide you use in your organization. Problem solved.
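
Outside an editor, the same round trip can be sketched with `expand` and `unexpand`. This assumes GNU coreutils (`--first-only` and `--initial` are GNU extensions), and `to_tabs` / `to_spaces` are names made up for the example.

```bash
#!/usr/bin/env bash
# Assumes GNU coreutils: --first-only and --initial are GNU extensions.

# Leading runs of spaces -> tabs, given the indent width the file is stored with.
to_tabs()   { unexpand --first-only -t "$1"; }
# Leading tabs -> spaces, at whatever width the reader prefers.
to_spaces() { expand --initial -t "$1"; }

stored='    if true; then
        echo  "hi"
    fi'

# Stored with 4-space indents, viewed with 2-space indents.
printf '%s\n' "$stored" | to_tabs 4 | to_spaces 2
```

Only the leading whitespace is touched, so the double space inside the `echo` line survives. Git's clean/smudge filters are one place commands like these could be wired in, though that setup is not shown here.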

-----


You might be shocked by how many editors can't convert indentation spaces to tabs.

You might also be surprised by how many source files with indentation spaces have off-by-one indentation errors.

-----


> You might also be surprised by how many source files with indentation spaces have off-by-one indentation errors.

I will add that to my list of arguments against using spaces for indentation. It is so true.

-----


Oops. Good luck then; that is bound to conflict with other editors. I think the Visual Studio editor (or Eclipse?) used to have 4 characters for tabs; if everybody is on Visual Studio then you are fine. Of course, if everybody in the shop has the same editor / editor settings then life is easier.

vim has 8 spaces for tabstop and that's what its docs say:

"Note: Setting 'tabstop' to any other value than 8 can make your file appear wrong in many places (e.g., when printing it)."

-----


Why is this still a thing that people have to care about? Can IDEs still not automatically convert between company-style and personal-style?

(Sad that tabs solved that problem decades ago (by having a byte that means "+1 indent" instead of having to build some ascii-art that looks like an indent) but people screwed up the implementations so badly :( )

-----


Proper indentation (tabs for indentation, spaces for alignment) is still pretty hard to configure in most editors in my experience. Even in emacs I had to install an external package to get it to behave properly (and it still breaks from time to time).

Failing that I'd rather people just used spaces and no tabs. Using tabs for alignment just means it's going to look fucked up everywhere else. At least spaces are consistent.

-----


There's a lot of fiddly bits.

And my IDE is called "vim".

Sometimes it's called "ed".

-----


Hah... if you think 2 space indents are bad in shell (and C++), you should try writing Python with them (yes, Google coding style, in a fit of madness, insists that engineers try to keep Python blocking straight with 2 space indents). Remember, some of these hapless folks are maintaining large bodies of production Python code (as opposed to itty bitty automation scripts).

-----


Google mandates 4 space indentation according to this: http://google-styleguide.googlecode.com/svn/trunk/pyguide.ht...

-----


Oh.. hmmm.. so this has changed since the (first and) last time I tried to get a piece of Python code through a review at Google. Looks like better sense has prevailed finally. (There's a version number on that public document but, as far as I can see, no version history, so I can't tell when the change happened.)

-----


I think 4 spaces is for public code and 2 spaces for internal code. FWIW, I got used to 2 spaces quickly enough, and now it doesn't bother me using either 2 or 4, other than sometimes forgetting to change the setting on vim when changing between projects.

-----


Yeah, there are some minor differences between the internal and external style guides. Tab width is one of them (it's 2 spaces internally)

-----


Wayback Machine shows the document recommending 4 spaces when it was first released in 2009.

-----


Looks like Google's external style guide is saner than the internal one :-) I was talking about the internal style. I can't even begin to fathom the reasoning for the internal style.

-----


Weird, because the whole Google App Engine source uses 2 space indents.

-----


It's fun how the "tabs vs. spaces" war is relevant only where it doesn't really mean anything.

See, most of the languages (I've seen so far) have a well-defined rule of indentation: the thing you get when you do "select all, indent file" in your IDE / editor. At this point it doesn't really matter whether you use spaces, or 2-space tabs, or 4-space tabs, or a mix, or whatever - there's only one way non-whitespace characters can be positioned, so it will (in theory) look the same in every editor after you reindent it.

The only place where the choice of tabs vs. spaces actually matters is when you want, for some reason, to break the default rules of indentation for your language and position something manually. But in this case, there's no debate; tabs are not suited for precise, manual positioning. Only spaces will be guaranteed to make the code look the same everywhere.

-----


Umm... No...

Yes, the IDE does this, but indent-length and tabs/spaces are almost ALWAYS definable within the IDE's settings, and are more related to the IDE than to the language you are currently editing.

-----


I am extremely annoyed when I try to click at the beginning of a line and I clicked a bit too far left and the cursor gets placed _before_ a space because spaces are being used instead of tabs.

-----


Totally. I love the kernel style over anything else. Even the GNU style for C isn't something I am very fond of.

Still there are some languages where 4 spaces indent is fine. But two is just horrible. Also I like tabs. Convert tabs to spaces in the editor and it works out pretty fine. 2 spaces is just too cluttery.

-----


Gentoo is extremely thought out and over engineered. They seem to do most everything right. . . It's just so much work though.

Note also on the [[ ]] usage... you'll see that [[ ]] won't work for boolean algebra ie. '||' and '&&'

-----


"The majority of people prefer it this way" seems like the perfect justification for a coding standard. Consistency is way more important to me than perfectly matching up with some specific ideal.

-----


The Gentoo documentation is a programming guide and doesn't advise on style. The Google style guide doesn't explain [[ because it is out of scope.

-----


The sheer number of replies to this point is staggering, for what really ought to be a total non-issue in this world.

Someone needs to make some kind of GitHub integration that lets you download code using whatever esoteric formatting you prefer, then transform back to some given standard on commit. Then everyone can finally just agree to disagree and get on with life.

-----


That reference is the best I've seen. Thanks for that.

-----


Re: long pipelines, the guide suggests:

  # Long commands
  command1 \
    | command2 \
    | command3 \
    | command4
However, by ending each line with the pipe the continuation is implicit and the backslashes may be omitted:

  # Long commands
  command1 |
    command2 |
    command3 |
    command4

-----


I usually use the trailing | but I can certainly see the sense in the leading pipe.

It makes it clearer that the lines following are continuations; especially the last in the pipeline.

-----


I consider the leading pipe clearer, and could be sold on it, but ending each line with an inconvenient backslash character (non-US layout) makes me strongly favor the trailing version.

The trailing version seems more natural with operators like comma, but less natural with operators like minus, and is far clearer with semicolonless languages. I suspect if I were to go through old code, I'd find both uses, but generally I prefer trailing.

The exception to this is languages like Haskell, though indentation could serve the same purpose.

-----


Pipes, and logical operators (&&, ||) can continue lines in bash. I find it convenient, and generally clear, to break longer constructions using these operators with them.
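
As a rough sketch of what that looks like (the `count_errors` and `has_errors` helpers are invented for illustration):

```bash
#!/usr/bin/env bash
# A trailing |, && or || tells bash the command is unfinished,
# so the next line is read as a continuation without a backslash.
count_errors() {
  printf '%s\n' "$@" |
    grep -c '^ERROR' ||
    true   # grep exits 1 when nothing matches; keep the status at 0
}

has_errors() {
  local n
  n=$(count_errors "$@") &&
    [[ "$n" -gt 0 ]]
}

count_errors 'ERROR disk' 'INFO ok' 'ERROR net'   # prints 2
has_errors 'INFO ok' || echo "no errors"          # prints "no errors"
```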

-----


I'm surprised there isn't more commentary on when and how to use external but standard or common commands like grep, sed, awk, perl, join, find, tar, parallel, etc.

It's one thing to use bash consistently everywhere but as a heavy multi-machine shell user I've been bitten by incompatible or missing external utilities more often than I care to admit. You might be surprised how many systems aren't using the GNU utilities, have them running in a weird mode or are using ancient versions of them.

Maybe Google is religious about keeping all their environments identical?

-----


And when you bring OS X into the picture, guaranteeing portability becomes a nightmare, especially when it comes to scripts that have anything to do with networking.

-----


I almost always use set -e with my bash scripts; this way you will always notice if an invoked command failed or not; your script will not report that everything is OK if part of the process actually failed.

https://www.gnu.org/software/bash/manual/html_node/The-Set-B...

Also: set -x is the best debugger in the world; really.
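
A small sketch of both options. Each demo runs in a child bash so the options don't leak into the calling shell; `run_demo` and `trace_demo` are names made up for the example.

```bash
#!/usr/bin/env bash
# With set -e the script stops at the first failing command.
run_demo() {
  bash -c '
    set -e
    echo "step 1"
    false            # simulated failure: the script exits here
    echo "step 2"    # never reached
  '
}

# With set -x every command is echoed to stderr before it runs.
trace_demo() {
  bash -c '
    set -x
    name=world
    echo "hello $name"
  '
}

run_demo || echo "run_demo stopped with status $?"
trace_demo
```

Note that set -e has exceptions (for example, commands tested by `if` or followed by `||` don't trigger it), so it is a safety net rather than a guarantee.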

-----


Yeah, surprised that got no mention.

I thought the ./* wildcarding for safety was really cool. They also could have covered find -print0 | xargs -0 for safety.

I'm glad they talked about "$@" being almost always the right thing to do. That's been a hard won learning experience for me in the past...
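
For anyone who hasn't been bitten yet, here is a minimal sketch of all three points. The helper names are made up, and `-print0` / `-0` are assumed to be available (they are in GNU and BSD find/xargs).

```bash
#!/usr/bin/env bash
# Report how many arguments a function actually received.
count_args() { echo "$#"; }

forward_quoted()   { count_args "$@"; }  # each original argument stays intact
forward_unquoted() { count_args $*; }    # arguments are re-split on whitespace

# List file names under a directory, one per line, surviving spaces in names.
list_names() {
  find "$1" -type f -print0 | xargs -0 -n 1 basename | sort
}

forward_quoted   "a b" "c"    # prints 2
forward_unquoted "a b" "c"    # prints 3

demo_dir=$(mktemp -d)
touch "$demo_dir/-rf" "$demo_dir/two words.txt"
(cd "$demo_dir" && printf '%s\n' ./*)   # ./-rf cannot be mistaken for an option
list_names "$demo_dir"
rm -rf -- "$demo_dir"
```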

-----


Another useful trick is to have exit traps, which do stuff on exit.

http://linux.die.net/Bash-Beginners-Guide/sect_12_02.html
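
A minimal sketch of the idea, using a temp file as the thing to clean up (`with_cleanup` is a made-up name, and the body runs in a subshell so the trap fires when the function finishes):

```bash
#!/usr/bin/env bash
# The function body is a subshell, so the EXIT trap runs when the
# subshell ends, not when the calling shell does.
with_cleanup() (
  tmp=$(mktemp)
  trap 'rm -f "$tmp"' EXIT
  echo "$tmp"   # report the path so the caller can check it afterwards
  false         # even a failing last command still triggers the trap
)

path=$(with_cleanup) || true
[[ -e "$path" ]] || echo "temp file was removed on exit"
```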

-----


The most important point:

    Shell should only be used for small
    utilities or simple wrapper scripts

-----


"It should be possible for someone else to learn how to use your program or to use a function in your library by reading the comments (and self-help, if provided) without reading the code."

I know a few "Use the source, Luke" people who will rage at that.

-----


The best argument I've seen here is that methods should be black boxes. Comments should reveal the input, output, and side effects of a method. There are two compelling reasons for this:

- I can add to my own code, even if it's years old, without re-reading every line of every method I need

- I can contribute to a code-base built by multiple people without reading every line they've written

-----


That is a particularly useful approach in functional languages, where the function declaration itself is actually a fairly concrete guarantee of what the function will do.

Shell scripts are almost the complete opposite end of the spectrum:

Shell script functions are usually only created as a last resort.

Global side effects (creating temporary files, changing global system state, global variables, etc.) are what shell scripts are all about.

There are rare snippets of shell scripting that are different, using local variables and doing some sort of calculation, but that is the exception, not the rule.

While ideally comments would be prolific, poetic, and perfect, some commenting is always better than none and most developers have bad habits of not commenting their code, so pushing them gently in the direction of more, not less, usually works.

-----


> Shell script functions are usually only created as a last resort.

Hence why they'd prefer you write Python, not shell.

-----


>> While ideally comments would be prolific ...

I would rather say that comments should be succinct rather than prolific. Better to explain the tricky parts of the code in comments, and document method/function/class behavior with javadoc, pod, pydoc, etc.

-----


You could have just said: "comments should be succinct" :)

-----


Agreed. That's the purpose of the --help flag and equivalents.

-----


Why would you need to re-read every line if you are looking at methods that have a single responsibility and that responsibility is clearly communicated through the name? (and parameter types/names, return types/names, in languages where some of those things are available)

-----


The purpose may be communicated by the name, but the behavior can't (reasonably) be.

-----


When I read that and also the section about using spaces instead of tabs I immediately thought - uh oh, here comes the nerd rage!

-----


The rest of the languages are here: https://code.google.com/p/google-styleguide/source/browse/tr...

-----


Style guides are the worst to me. Taking some of the small joys left in programming away, while making you feel like a cog at the same time. Especially since many are outdated or just plain wrong, and very difficult to change once established.

Tools that take the AST and output standardized code for peer review and documentation sound a little better. They would not deal well with the only human problem really worth having a style guide for: naming things. But at least humans aren't forced to jump through hoops. And the naming thing could possibly be settled with an interface that asks something like "what do you want everyone to call the 'BitWarper' symbol?" for all named symbols.

But well written software is no place to express individuality - we just want it to work and not make our eyes bleed when we have to fix it! Even better then, just have machines generate and test all the code based on systems of higher order rules and style guides in situations where factory manufactured code is necessary. The outputs should be reasonable if the requirements are well specified (NASA style). Humans can come in after and do the real fun work in optimizing and finding clever hacks (if environment is not mission critical and such liberty can be safely taken).

Having humans program character by character, with their bare hands, while also suppressing creativity, is unnecessary in this Post-Industrial Age.

-----


Some things I don't particularly care for (2 space indent? SRSLY?!!), but it warms the cockles of my heart to see an 80 character line limit.

Yes, it matters.

-----


Yeah, Google has 80 column limits for almost every language (except for Java, it's pretty difficult there). So do Mozilla and all major open source projects I can think of (again, Java being the exception).

It really does a lot for readability. I recently summed up my reasons for doing it in all of my projects: http://ubercode.de/blog/80-columns

-----


The argument-ender for me is: Look, when I'm at the DC, on console, with an 80 column screen, I cannot make my terminal any wider than 80 characters, and the site is down.

This also applies for a lot of remote-access tools -- serial console as well as direct, so even if I'm not at the DC, once I'm on those interfaces, something's likely fucked up or headed that way.

-----


An 80 character line limit almost mandates use of a 2 space indent, for most languages.

Personally I prefer a line limit closer to 100, and 4 space indents in most languages. Some languages end up with a lot more indenting than others owing to structural literals and lambdas, and they benefit more from a smaller indent.

-----


I'm happy with 80 columns and 4 spaces in C, Python, etc. Don't see the mandate. 2 spaces would make reading harder.

-----


I was surprised how quickly I got used to it. Occasionally I have to trace blocks but not very often, and the tradeoff feels wonderful (it's much more rare that I catch myself re-formatting things to avoid the 80char limit).

-----


I manage 80 chars (with very few exceptions) and a 4-space indent. Copious use of bash functions helps.

-----


Not me. Limiting your code to 80 characters is stupid. Code != prose and you read it differently, so the prose rules don't follow.

Skinnier prose is easier to read because of the sequential nature of it, which is why I do think comments blocks should be limited to 80-100ish characters. But code? No.

-----


Thanks for `go fmt`.

-----


I had the exact same thought. The Go community is the only one (that I know of) where such discussions are rare.

Write your code, run "go fmt", add, commit.

-----


I've seen many scripts wrapping steps inside a main() function and calling it just after declaration.

If the justification is to localize variables that would otherwise be global (and that will still look "global" to the rest of the program), that's OK.

But it's mostly done for no good reason, out of personal taste; it usually only makes sense to the author and has no real benefit.

I prefer the principle "consistency means not having to read useless steps" over "add useless steps for consistency". That's how I think when I read code.
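
For anyone who hasn't run into the pattern, a minimal sketch of what is being discussed. The function names and the greeting are invented for illustration; the point is only that `local` keeps `target` and `name` out of the global scope:

```shell
#!/bin/bash

# Hypothetical worker function.
greet() {
  local name="$1"   # local: not visible outside this function
  echo "hello, ${name}"
}

# All top-level steps live here instead of at file scope.
main() {
  local target="${1:-world}"
  greet "${target}"
}

# The only statement executed at file scope.
main "$@"
```

Whether the extra wrapper earns its keep in a ten-line script is exactly the judgment call above.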

-----


What's wrong with tabs?

-----


Nothing that hasn't been said a billion times already -- and likewise for spaces. A fairly comprehensive overview: http://c2.com/cgi/wiki?TabsVersusSpaces. No, the clever people in this thread are not going to put the issue to bed for once and for all.

-----


I like how the golang team solved the issue. Ship a formatter in the core package. Discussion closed.

-----


And, the formatter accepts a tabs option.

-----


It's still an issue, but arguably one you only have when starting a new project. I have yet to hear a single reason for tabs or spaces that justifies changing the indentation for a whole established code base - unless it had inconsistent indentation to begin with.

-----


Well, sometimes I need to align my lines in a certain way (e.g., if function arguments are on multiple lines, I want them to align with each other). It's often impossible to do with tabs, because they are fixed width, so I have to add spaces. But when other people view my code, and their tab width is different, it all goes to chaos. That's why I prefer spaces over tabs.

-----


You're doing something wrong, then. Here's a block of dummy code indented with tabs; copy-paste it into your editor, change the tab widths, and observe that the indentation stays readable.

  void my_function(int some_parameter, int another_parameter,
                   int the_third_parameter) {
  	if (some_parameter != another_parameter) {
  		here_we_are_in_an_if();
  	}
  	function_calls_work_fine_too(some_parameter,
  	                             the_third_parameter);
  }

-----


Maybe I don't understand the argument: what if `int the_third_parameter` lies on an odd column. How would one indent to that column using only tabs of an even shift width?

-----


The optimal mechanism is to indent with tabs, and align with spaces (that is, you have the same number of preceding tabs as whatever you're trying to align with, and then use spaces from there).

Though setting your editor up to do that automatically is an incredible pain, and it kind of forces you to use a monospace font.

-----


Ah, I didn't see the distinction between "indent" and "align."

Anyway, it seems like this system necessitates that the leading white space on a given line must be a mixture of both tab characters (\t) and spaces, unless one sets their editor to insert space characters as tabs (which is standard), and is obviously living in a state of sin.

-----


On every line that is right-aligned, yes. Most lines would have only tabs.

-----


Who isn't using a monospace font for code editing? I didn't even realize this was a thing.

-----


The primary reason people use monospace fonts is historical inertia. It's definitely worth trying out a proportional font if your environment is amenable to it (some tools don't cope particularly well, for sad and disappointing reasons).

-----


I tried doing that. It looks uglier than Sloth from the Goonies.

-----


IIRC, Stroustrup uses a non-monospaced font in his C++ book.

-----


Tab to the indentation of the previous line, then pad with spaces (T is a tab, S is a space):

    def foo():
    TTbar(asdf, zxcv,
    TTSSSSqwer, oiuy)

No matter what anybody uses for tab stop size, the alignment won't get messed up.

-----


Nothing intrinsically, but there is value in consistency, and Google's decided on spaces. One of the rationales for these guides is to give the programmer a way to make a decision about questions where there is no single correct answer.

-----


If you use tabs, then you cannot have a column limit because the line length changes depending on what the tabstops are.

In Linux they say tabs are 8 characters (which, no, they are not) so that they can have an 80-column line limit. Anybody programming with a tab narrower than 8 characters can't get the column limit right (the number of characters on a line changes with the indentation level).

Tabs are invisible characters that don't have a standard width, so they are always causing problems like this. They are used because many programmers use editors where they would have to actually press space multiple times to indent/unindent (i.e., bad editors) and because source control doesn't know when an indent change actually means something vs. just being cosmetic.

-----


One possible reason is that if you copy a line with tabs from the terminal, you're actually going to end up with the tab converted to spaces in the clipboard. If you use spaces everywhere, you won't have this problem.

-----


Coding in the terminal does have its drawbacks.

That can't be an argument for using spaces in code, right? Rather, the terminal should keep track of which parts of its display were generated by a tab character in case the user copies the output?

-----


Mostly consistency, but also because it's kinda funky when you have an 80 character line limit (also in the style guide). Tabs are one character but displayed as multiple.

-----


Another one from the summaries:

I don't see how

    $((${X} + ${Y}))

... is more recommendable than ...

    $(( X + Y ))

for example.

But well, style guides are a good thing, and this one can help many people who are not used to dealing with shell scripts follow some basics.

Edit: and it can help people who are used to them when working in a team.
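
To make the comparison concrete, a small sketch assuming bash. The two function names are made up; both forms give the same result for ordinary integer values:

```shell
#!/bin/bash

# Guide-style form: explicit ${...} expansion inside the arithmetic.
add_with_braces() {
  local X="$1" Y="$2"
  echo "$((${X} + ${Y}))"
}

# Bare-name form: names inside $(( )) are looked up as variables.
add_bare() {
  local X="$1" Y="$2"
  echo "$(( X + Y ))"
}
```

One difference that does exist: an unset bare name evaluates as 0 inside `$(( ))`, whereas an unset `${X}` expands to nothing and can leave a malformed expression.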

-----


Unfortunately, Firefox 24 (on Windows) has problems rendering the XSL stylesheet and gives crap output (basically the full text content with no line breaks). It seems to work on Firefox 21/Linux, though. IE8/Windows, Chrome 30/Windows, and Chromium 25/Linux also work.

-----


Under the section 'When to use Shell', why does the style guide say 'If performance matters, use something other than shell'? I was under the impression that since shell script is low level, it should have superior performance.

-----


I'm going to have to bookmark this site. I'm no Bash expert, but with some Bash and Java and a bit of Python here and there, sometimes I feel like I could save the world.

-----


>> ...One example of this is Solaris SVR4 packages...

I wonder what Google are using Solaris for, and if it's just legacy stuff.

-----


Google appear to use Oracle Hyperion Financial Management for consolidated reporting and planning (search their job pages - they usually have ads) - I would suspect that this runs on Oracle boxes running Solaris.

-----


Google is likely supporting Solaris for customers or for users of their open source software.

-----


These rules look like they came from the low-intelligence paper belt. We do not write rules like that. They must simply be part of the tool. Otherwise, they do not count. What they have now done is create an opportunity for someone who knows that he is incompetent to invent a new job for himself, that is, "checking up" on this style guide with his more competent colleagues, who, unlike him, are productive in writing code. Rule number one: Anybody who wants to "enforce" this kind of rule must demonstrate that he is capable of writing a parser that can apply it. It is simply bad practice to create those kinds of opportunities. It is bad practice to create those kinds of jobs. Therefore, these kinds of documents must be rejected.

-----


I think that's a bit harsh.

Of course we write style guides and while I like the idea that it's part of the tool, that's rarely the case.

I don't follow the objection about a colleague helping ensure consistency across a team. I'm really not sure why competence comes into that equation either.

Agree with the idea any style guide should be automated. Several CI servers I know of can incorporate style checkers and their reports into their workflow so this can be made really hands off, even to the point of automatically failing the code review stage before it's been lumped in the review queue.

Don't agree at all with the idea this type of doc should be rejected. In fact it's completely wrong to jump into automation without having "found which way is up" manually first time around.

-----


Actually, they came from very smart, well intentioned people, who thought everyone would program Lisp in this century. And of course never saw C (and all its related family of languages) coming and ruling for decades.

All these rules make sense for Lisp, and no sense at all for C.

And they did write the parser that applies those rules: in Emacs you never indent Lisp code, you press a key and the current s-expr automagically indents better than you could ever dream of doing it.

-----


http://google-styleguide.googlecode.com/svn/trunk/shell.xml?...

> It is not necessary to know what language a program is written in when executing it and shell doesn't require an extension so we prefer not to use one for executables.

I disagree with their recommendation against using file extensions for executables, and I'd love to have my mind changed about this.

Using an extension gives you automatic syntax highlighting. It also lets you quickly glean the type of a file when exploring a directory for the first time, which is more helpful than simply knowing whether the file can be executed.

Why does a lack of necessity override those two benefits?

-----


Any serious editor will read the #! line for syntax highlighting, which has the benefit of being much more likely to be correct than the user visible and modifiable file name.

-----


Except when it doesn't have that line because it's a C/C++ header file.

And then the editor doesn't know what to do with that header file without an extension. Yes, if you look at some Google C++ code, they do that.

Really, horrible practice.

-----


What are you doing making C header files executable?

-----


Where did I say I was making it executable?

-----


They were only advocating no extension for executable shell files, not all files.

-----


Yes, I remembered seeing a C++ Google product without the .h extension on header files (or other similar extensions).

But apparently they stopped this nonsense.

-----


It's not google, it's C++ convention.

http://stackoverflow.com/questions/301586/what-is-the-differ...

Btw, many editors understand // -*- C++ -*-

Nevertheless, Google's C++ style guide (https://code.google.com/p/google-styleguide/source/browse/tr...) in fact says that headers should have the .h suffix.

-----


Ah thanks for pointing that out

> Btw, many editors understand // -*- C++ -*-

Maybe, but if I run vim /usr/include/c++/4.2.1/iostream, it doesn't work (only if I set it manually)

-----


(This point is also made lower down).

When you run a program, your concern should be what it is called. Not how it is written.

A language-specific filename extension puts an implementation detail in userspace. If you run a binary executable, you shouldn't care whether it's written in C, C++, Fortran, or any other language (though source files, being used only by developers and compilers, generally do have extensions).

Worse: if you decide for whatever reason to change the implementation language, you're either forced to track down and change all references to the program name, or to retain the (now incorrect) filename extension for backwards compatibility.

And, as noted, magic(5) or the shebang line should correctly identify the file type and language for syntax highlighting -- if not, your editor is broken. Replace it with a shell script, "editor.sh".

file(1) will tell you the types of files in a directory with far greater accuracy than filename extensions can.

-----


That makes sense. I think of shell scripts as "things programmers execute", but that's obviously not always the case.

-----


They're very often things other programs execute.

Cron jobs, other scripts, production jobs, etc.

So having to hunt down and rename everything ... is a PITA.

-----


The extension is unnecessary for syntax highlighting. Shebang lines are supposed to be used for detecting the file type.

As for knowing the file type at a glance, I'm not sure how often I need this. I'm normally looking for the file by name anyway. If I needed to determine the file-type, I'd write a script to parse the shebang lines of executable files in the current directory and generate a list of the files with their hypothetical extensions (based on a hash/dictionary/whatever). I don't need that very often, so, for me, the trade is worth it.
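
Something along these lines, as a rough sketch rather than a finished tool. It assumes bash, and the function name `list_interpreters` is invented; it prints each executable file that starts with a shebang, followed by the interpreter line:

```shell
#!/bin/bash

# Hypothetical helper: for each executable regular file in a directory,
# print "name<TAB>interpreter" if the file begins with a #! line.
list_interpreters() {
  local dir="${1:-.}" file first_line
  for file in "${dir}"/*; do
    [[ -f "${file}" && -x "${file}" ]] || continue
    first_line=""
    # Read only the first line; a short or binary file simply won't match below.
    IFS= read -r first_line < "${file}" || true
    if [[ "${first_line}" == '#!'* ]]; then
      printf '%s\t%s\n' "${file##*/}" "${first_line#'#!'}"
    fi
  done
}
```

Mapping the interpreter to a "hypothetical extension" would just be a lookup table on the second column.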

-----


Your benefits are benefits to people who are going to be editing these files. They are not benefits to people who are going to be using these files (i.e. running the script). Most people are not going to be editing the script, they are going to be using it. Thus, it makes sense to cater to them, and not make them know things they don't care about, like what language the tool they are using was programmed in.

-----


We don't run grep.c, or ansible.py, or git.c, or gunzip.sh, etc...

Explore a directory? file -i directory/*

I follow the same convention as Google, except that I use .bash for libraries instead of .sh (since they state the interpreter should be bash, I think they should adopt my naming instead of .sh)

-----


I'm new to shell scripting and I'm clearly missing something basic. Why can't you use file extensions? This is saying not to use .sh for bash scripts and .rb for ruby scripts, etc?

-----


One reason: let's say you write tool 'foo.sh' in Bash. It does what you want and you move on to other things.

It's suddenly a year from now, and your foo.sh tool needs some new features, or is too slow to do the job any more because your requirements have changed.

You decide it's grown too much for Bash and want to move to Python for maintainability, or Go for performance, or C++ to link with some library you need to use.

Now you have to tell your team (and any other teams that have found your tool useful): "We only want to maintain one version, so don't use 'foo.sh' any more. You have to use 'foo.py' (or 'foo.exe' or whatever). Oh, and have fun changing all YOUR scripts and tools that reference 'foo.sh'!"

That's one reason.

-----


Now I'm wondering how many shops have some legacy script with a 'foo.sh' filename and a #!/usr/bin/python shebang

-----


Well, you do have some sort of deployment process, right? (as in, you're not attaching your scripts to emails)

Just change it to zip foo.sh + foo.py. Then:

    $ cat foo.sh
    #!/bin/bash -e
    ./foo.py

    $ cat foo.py
    # Everything else has moved here.

Much better.

-----


> Much better.

You now have two files to do one thing, and your solution doesn't work (it doesn't pass the arguments from the shell script to the python, or pass the exit code back, and it's making an extra process which can screw up monitoring). You could add the extra code to make it work; but even if you did, you're going to a lot of extra effort to become equal, not better :P
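
For reference, the "extra code" is roughly this, as a sketch assuming bash. The helper name `hand_off` is made up; `exec` replaces the shell with the target, so there is no lingering wrapper process, `"$@"` forwards the arguments unchanged, and the caller sees the target's own exit status:

```shell
#!/bin/bash

# Hypothetical shim helper: replace the current process with the new
# implementation, forwarding every argument as-is.
hand_off() {
  local target="$1"
  shift
  exec "${target}" "$@"
}
```

In foo.sh the last line would then be something like `hand_off "$(dirname -- "$0")/foo.py" "$@"`. It works, but it is still a second file to ship and maintain.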

-----


Most of the things you mentioned can be solved in ~15 minutes, surely not a lot of extra effort.

And I'm not sure how a script that calls another script screws up monitoring.

Plus, if someone cares about high level stuff like monitoring but is worried that people's tools might break if he changes script.sh to script.py later on, I think he needs to sort out the lower level stuff first. Like distribution and packaging :)

-----


But why would you want to? As said before, somebody executing a script doesn't need to see what language it's in.

-----


Portability across OSes, especially for non-shell scripts?

-----


Or you could just ... name it 'foo' to begin with?

-----


Yeah, that's a good reason. I am convinced.

-----


Not really, I'd rather do: https://news.ycombinator.com/item?id=6688782

-----



