Hm. Use one-letter variable names for scoped variables? Excerpt from lines 92-112:
"* The length of an identifier determines its scope. Use one-letter variables for short block/method parameters, according to this scheme:
a,b,c: any object
d: directory names
e: elements of an Enumerable
ex: rescued exceptions
f: files and file names
i,j: indexes
k: the key part of a hash entry
m: methods
o: any object
r: return values of short methods
s: strings
v: any value
v: the value part of a hash entry
x,y,z: numbers
And in general, the first letter of the class name if all objects are of that type."
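To make the scheme concrete, here is a small hypothetical snippet following those conventions (the data is made up):

    files = ["a.txt", "b.txt"]
    options = { verbose: true, color: false }

    # f for a file name, i for an index (per the scheme above)
    files.each_with_index { |f, i| puts "#{i}: #{f}" }

    # k for the key and v for the value part of a hash entry
    options.each { |k, v| puts "#{k} = #{v}" }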
I cross-referenced the Erlang documentation, and it provided no real justification for this rule other than to say that the burden lies on the caller. Is there a history behind this convention? Is there a considerable benefit, beyond more streamlined code, to coding this way?
Firstly, it's basic DRY principles. Rather than have code dotted around your system dealing with data-integrity issues, put it in one place and make your business logic more readable and easier to maintain. When trying to understand a subroutine or class, I should not have to wade through all the lines of code that try to make sure the input is 'safe.'
Secondly, something I've noticed about defensive programming is that it generally creates a culture of "meta rules" that exist only in the system architect's head. When someone else reuses one of their models, the danger is that they don't know all these different rules, because the rules are scattered around different controllers or scripts. This has been a particularly vicious source of bugs in our legacy systems. Domain experience becomes critical.
Lastly, defensive programming is traditionally a big red warning sign that says "there is no data integrity strategy in place for this system." It doesn't matter whether the data is coming from a CRUD form, a relational database or a file... there should be some validation layer that ensures the data is 'correct' before making it 'safe' to reuse across the rest of the system.
I've found that defensive programming is particularly rife in systems powered by MySQL databases. MySQL has traditionally lacked the data-integrity mechanisms of Oracle and other Enterprisey systems, and has thus created a whole crop of MySQL 'experts' who actually don't know anything about data integrity. A system I'm working with this very minute has a function where the first 20 lines of code (out of 28 total) compensate for potential orphan entries in the database. The reason? The admin tool for removing entries does not do cascading deletes. Rather than fix the root of the problem, we have this 20-line check everywhere we need to access the data safely.
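One way to fix that at the root, sketched as a hypothetical ActiveRecord model (assuming a Rails-style app; the model names are made up), is to declare the cascade once instead of re-checking for orphans at every call site:

    class Account < ActiveRecord::Base
      # Deleting an account now deletes its entries too, so no code
      # elsewhere needs to defend against orphaned rows.
      has_many :entries, dependent: :destroy
    end

The equivalent at the database layer would be a foreign key with a cascading delete; either way the rule lives in one place.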
I could go on and on, but in short: defensive programming is one of the major 'code smells.'
I draw a line between defensive programming and paranoid programming. Paranoid programming looks like defensive programming run amok, and you've described it rather well.
Maybe I have a different idea of defensive programming. But for me, defensive programming means making sure that when you call a method (or anything else), the data is assumed to be good, or else it goes KABOOM. That means, for example, asserting that all required arguments exist, rather than silently bailing out with if (something) return;
For example, if I call a Draw method on an object, I assert that the object's Color property has been set. I don't create a default color for it. This ensures that Draw will work as it should: it makes no assumptions about how it is being called, it just makes sure that, when it is called, the input is valid.
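In Ruby terms, a minimal sketch of that style (Shape, draw and color are hypothetical names, not from the poster):

    class Shape
      attr_accessor :color

      def draw
        # Assert the precondition instead of inventing a default color.
        raise ArgumentError, "color must be set before calling draw" if color.nil?
        puts "drawing in #{color}"
      end
    end

    s = Shape.new
    s.color = "red"
    s.draw          # => drawing in red
    Shape.new.draw  # raises ArgumentError: color must be set before calling draw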
Yeah, there are two different things you could call defensive programming: the first is asserting that code assumptions are true (and crashing loudly when they're not); the second is trying to work properly with bad data.
I find the former really useful for quickly finding bugs in my code and documenting a function's input assumptions. In general, I'd rather my code fail early and visibly (e.g. crash in a debugging environment; return an error and log the problem in a production environment) than produce garbage because the input was garbage.
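A sketch of that fail-early policy (the assert helper, the APP_ENV check and the example function are my own invention, not the poster's):

    # Crash loudly while debugging; log and keep going in production.
    def assert(condition, message)
      return if condition
      if ENV["APP_ENV"] == "production"
        warn "ASSERTION FAILED: #{message}"   # visible in the logs
      else
        raise "ASSERTION FAILED: #{message}"  # fail fast during development
      end
    end

    def first_name(full_name)
      assert(full_name.include?(" "), "expected 'First Last', got #{full_name.inspect}")
      full_name.split(" ").first
    end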
I agree the latter is bad for defending against inconsistencies in an internal data model, or as a "precaution" when using an API you don't trust, but it's still vital if you're dealing with data from the outside world, where you do want to try to behave sanely on possibly damaged input (e.g. not ignore an entire RSS feed just because one item doesn't have pubDate set).
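For instance, a hypothetical feed-handling sketch (the item hashes are made up) that salvages the rest of the feed when one item is missing pubDate:

    require "time"

    items = [
      { "title" => "Good item", "pubDate" => "Mon, 01 Jan 2024 00:00:00 +0000" },
      { "title" => "No date set" }  # pubDate missing
    ]

    items.each do |item|
      # One malformed item shouldn't sink the whole feed: fall back
      # to "now" rather than discarding everything.
      published = item["pubDate"] ? Time.rfc2822(item["pubDate"]) : Time.now
      puts "#{item["title"]} (#{published})"
    end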
I agree with your conclusions but for different reasons.
Defensive programming "done right" should be self-documenting ("oh, I see this routine requires this parameter to be non-zero"), and should throw exceptions back up to the UI/app layer so that unmet assumptions can't be ignored by the app programmer.
For many-layered code bases, there needs to be a level below which data is assumed to be valid and can go unchecked (at least in release builds) -- otherwise a single high-level call will trigger the same check many times.
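A sketch of that layering (hypothetical names): validate once at the public boundary, then let everything below it assume validity:

    class ReportBuilder
      # Public boundary: check once, loudly.
      def build(rows)
        raise ArgumentError, "rows must be non-empty" if rows.nil? || rows.empty?
        render(summarize(rows))
      end

      private

      # Below this line, rows are assumed valid -- no re-checking.
      def summarize(rows)
        rows.sum
      end

      def render(total)
        "Total: #{total}"
      end
    end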
I agree about defensive programming, but I think that systems are really good to the degree that there are meta-rules which are predominantly in people's heads. The thing is, the meta-rules should be few and obvious in the code. You should be able to look at it and say "well, it looks like X happens here, so if I add some more X I definitely shouldn't put it with Y."
Defensive programming should be treated as a process: start with all the checks in your code, then move them to a centralized place so that you have a reliable exit point.
For example, I would have used a stored procedure for validating the data.
Or I would have made the smartest decision, used Postgres, and lived happily ever after.
How does using a decent DB remove the usefulness of stored procedures?
I suppose it's a duck-typing thing. A strength of dynamic typing is that you don't need to anticipate all uses of your API. By allowing anything as input, you make things more flexible. For example, instead of asserting that a parameter is a string, just treat it like a string and let the caller deal with the consequences if they pass something else. One positive effect of this is that validation/error-handling code tends to be kept in a single place rather than scattered across the model layer. I'm not sure I completely buy into this sentiment, but that's how I read it.
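A sketch of that reading (the method is hypothetical): don't check the type, just use the interface.

    # Anything that can behave like a string works; callers who pass
    # something that can't deal with the consequences themselves.
    def shout(message)
      message.to_s.upcase
    end

    shout("hello")  # => "HELLO"
    shout(:hello)   # => "HELLO"
    shout(42)       # => "42"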
I really hate other people's style-guides.
Most of it is common sense, but there's always some aspect or another that really grates against my preferences, e.g. "Keep lines fewer than 80 characters" is a pet hate; when did we return to green-screen terminals?
My current company enforces this, and it often creates worse multi-line discord than simply adding an extra 10 to 20 chars to the current line would.
And "Read other style guides and apply the parts that don't dissent with this list" is a little too arrogant for any coder in the entire world -- perhaps should be "Replace any parts of this list that don't conform to your style".
That's kind of the point: the emphasis should be on readability, not "this long, and no longer!", and that applies to every point in any style guide, i.e. it's a "guide", not a rule book.
This is the second time in the past few days I've been told to use two spaces instead of tabs. Can anyone provide any insight as to what difference this makes?
Unlike in some other communities, there is no contention in the Ruby community over whether to use two, four or eight spaces. Everyone is happy to use just two. Furthermore, never use tabs, which includes the practice of mixing tabs with spaces.
That's not to say spaces are better than tabs, or two spaces are better than four spaces. Just that idiomatic Ruby code uses two spaces.
I've always viewed this as a plus for tabs. Personally, I think two spaces is too small. If I'm looking at someone's code and it's hardcoded to two-space indentation, I'm stuck with it. But if it uses tabs, I can set the width to whatever I want. Some people like 2, some 4, and some 8. I prefer to use tabs to denote logical indentation, and then spaces for alignment. But not many people agree with me, so I usually just use spaces.
Generally I agree, but for the pedantic there are other issues. Tabs at the beginning of lines aren't a problem. But, for example, it's common to see multi-line declarations of, let's say, a hash where each line looks like this:
key <tab> => <tab> value
This might look perfectly fine to someone using 8-width tabs, but it will most likely be screwy for anyone using 4-width tabs or smaller. A tab advances the cursor to the next column where the position modulo the tab width is 0.
E.g. one key is 2 chars long and another is 5 chars long, and you aligned them using 8-width tabs. Looks fine. If someone loads the file with 4-width tabs, they see the two lines aligned differently.
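To make that concrete (tabs rendered here as the spaces they would occupy):

    With 8-width tabs, the values line up:

    ab      =>      value
    abcde   =>      value

    With 4-width tabs, the same bytes render as:

    ab  =>  value
    abcde   =>  value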
I've heard all sorts of arguments for and against, but in the end it only matters that you consistently choose either tabs or spaces, and that you use whatever header settings your IDE provides to enforce the decision across all the coders, e.g. for Emacs:
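Something like this file-variables comment at the top of each Ruby file (indent-tabs-mode and tab-width are standard Emacs file-local variables; the exact values here are a sketch, not a prescription):

    # -*- mode: ruby; indent-tabs-mode: nil; tab-width: 2; -*-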
Upvoted, but I would like to add to bazookaaa's point: the easiest way to change this habit is to set your editor to insert two spaces when you hit the tab key; that way you don't have to think about it ever again.
"* The length of an identifier determines its scope. Use one-letter variables for short block/method parameters, according to this scheme: