- Caching and cache invalidation are really hard - if you have to go there, make sure you can disable the cache at any time
- Using regular expressions for anything but processing lines of text means you're probably doing it wrong
- Few programmers can handle concurrency; provide APIs with lots of run-time checks
- Have coding guidelines and enforce them in code review
- Boring code is good code
- Lack of automated tests makes every commit a crap-shoot
- Use consistent logging (with levels); if it's a major performance hit, strip the logging code out of the production build
- If there is no expert on the team, become the expert. If there is, learn from the expert and become one as well
- Always take responsibility for your code
- Sometimes ugly hacks are necessary. Deploy them, but make it a priority to factor them out at the start of the next cycle (don't leave this crap around for 'someone else later')
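On the logging point above, a minimal sketch of what leveled logging looks like with Python's stdlib, where the expensive debug output can be switched off for production. The logger name, format, and messages are made up for illustration:

```python
import logging

# Hypothetical application logger; the name and format are illustrative.
log = logging.getLogger("myapp")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
log.addHandler(handler)

# Development: log everything.
log.setLevel(logging.DEBUG)

def expensive_debug_info():
    # Stand-in for something costly to compute.
    return "lots of detail"

# Guarding the call avoids the cost entirely when DEBUG is disabled.
if log.isEnabledFor(logging.DEBUG):
    log.debug("state: %s", expensive_debug_info())

# Production: warnings and up only -- the debug path above becomes free.
log.setLevel(logging.WARNING)
```

The `isEnabledFor` guard is the cheap alternative to stripping logging code out entirely: the expensive argument is never even computed when the level is off.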
Oh man, I could go on like this for hours.
And since half the people noting this usually just hand-wave about the jwz quote*: regular expressions have very definite limitations, which is why complex parsing is usually done with a second layer on top of REs. Regular expressions are used for tokenizing (AKA "lexing": breaking a stream of characters into individual tagged tokens - this is an operator, that's a floating-point number, etc.), and then a grammar is defined over those tokens and handled by a parser.
If you aren't aware of the limitations of REs, you can just keep adding layers and layers and eventually end up with madness like this RE to recognize RFC822-valid e-mail addresses (http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html).
* "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." -jwz. Funny for people who already know, but not very enlightening otherwise.
Depends. The theory of regular expressions works for all semirings (http://sebfisch.github.com/haskell-regexp/regexp-play.pdf).
On the other hand, if you are using backreferences or something crazy like that, you have left regular languages behind.
The last thing I want to do is start another broad TDD flamefest, but I honestly believe that if a programmer hasn't even tried a test automation framework (xUnit or similar) then they are being professionally negligent. Would you want to work with a surgeon who was too set in his ways to sterilize instruments?
The last time I was doing developer candidate interviews, I would start the phone screen with "what test automation frameworks do you enjoy using and why". If they couldn't answer that, I did my best to make it a very short phone call.
TDD specifically calls not only for unit tests, but for writing them first, i.e. the test _drives_ the development. You write a test that fails, then write code to pass the test.
One can be anti-TDD but still believe in unit tests, full coverage, etc.
I'm in very strong agreement that automated testing (unit, regression, integration, etc.) pays for itself in most sufficiently large projects, but skeptical of TDD specifically - testing during exploratory programming often gives good feedback, but it's just one of many tools.
Although I've been operating with a slightly different (and hopefully less controversial) TDD definition. Instead of getting all hung up on sequence and test-first, I ask "are you using automated tests as the primary way of executing your code as you're working on it". If the answer is yes, then at least that part of the code is test-driven. If the answer is no, then at least that part of the code is not test-driven.
You can write the tests first all you want, but if you're not running them, often, and doing all of the "does this work" through the UI or the debugger, how test-driven are you really?
Hope this helps.
Should you really be looking for people who enjoy using testing frameworks? Testing is a necessary part of development, sure. Enjoyable? No.
If it isn't tested - it doesn't work.
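For anyone who hasn't tried an xUnit-style framework, a minimal sketch using Python's unittest; the `slugify` function and its tests are invented for illustration, and function-plus-tests live in one file only to keep the example self-contained:

```python
import re
import unittest

# The code under test (hypothetical).
def slugify(title):
    """Lowercase, collapse runs of non-alphanumerics into single dashes."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# The xUnit part: each test_* method is an independent, repeatable check.
class SlugifyTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_already_clean(self):
        self.assertEqual(slugify("abc"), "abc")

# Run the suite programmatically (normally you'd use `python -m unittest`).
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest))
```

The point isn't these two trivial tests; it's that every future change to `slugify` gets re-checked for free.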
- How to think. If you can't hold abstract thoughts in your head you should find something other than programming to do.
- How to learn. Programming is a dynamic, constantly changing discipline. You must know how to learn new stuff.
- How to find information. Programming is such a vast field that no matter what you do you'll often come across something you don't know. 99% of the time someone else has had the exact same problem, and the solution is out there if you know how to find it.
If you know how to think, learn and find information everything else will follow.
If you don't know how to effectively search for information that is already available out there then you are probably wasting valuable time re-inventing various wheels.
If there is one thing I would do if I had a company with employees of any kind it would be to teach newcomers in a couple of hours the finer points of using search engines and other online resources.
A friend of mine is a diesel truck mechanic, which to me seems like a job that doesn't have much to do with the Internet. Yet he spends lots of time researching and finding information. When he debugs engines that don't work the Internet is his first stop. Just like in programming there's always someone somewhere that's had the same problem and he's usually able to find it in some obscure place.
And don't forget about asking librarians, either. Some research librarians are so good at finding things, it's scary. They're like meta-meta-search engines, except they're usually much better at teasing the relevant details out of vague questions.
Historical note: there was a time when you could retain everything in the field, and it was around the time I was taking home more than ten trade rags a week that I decided a new tactic was required: knowing where to find information. That was well before the Internet, of course. And now that we have it, it is a fantastic resource for us. Geez! Complete lumps of working code. It's too easy :)
I agree that proper methods of thinking and holding abstract thoughts are necessary to program well; in fact, that is probably the most necessary criterion. But if you cannot do that, the answer is not to give up on programming (if you want to program), but to learn to think and reason abstractly.
All the knowledge that you really must have was developed long ago (look at things like SICP, The Mythical Man-Month, or XP Explained) and remains unshaken.
By contrast, this or that new cool tool for making dynamic web pages, or a brand-new pile of useless abstractions in the next revision of J2EE or .NET, is third-rate knowledge. A Google search is enough in that case.
Take a look at Berkeley's CS 61A course (with that honorable old-school lecturer and a nostalgic 80x24 text-mode console on Solaris 2.7) - all the essential knowledge is there.
In that light, I think that programmers should know the basics of computer science: what references are; how a linked list works; a little about tree data structures; stacks, queues; basic sorting and searching stuff; etc. I'm not saying that you need to be able to talk about everything off the top of your head. I just think a familiarity with the theoretical concepts is good.
On the other end, I see a lot of people who know that PHP interprets strings using single quotes faster than double quotes, but have no idea why NoSQL might scale better (and might actively think it's because of the SQL language).
I want someone to say "a lot of the NoSQL technologies work by having you pre-compute data so that your data structures have the information you want to display together stored together in a structure that can be read more quickly like a hash." It isn't magic.
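A toy sketch of that pre-computation idea, with a plain Python dict standing in for a key-value store (the key scheme and fields are made up): instead of joining posts against authors at read time, the data you want to display together is stored together, denormalized, under one key.

```python
# Hypothetical key-value store modeled as a dict.
store = {}

def publish(post_id, title, author_name):
    # Denormalized write: the author's name is copied into the post record,
    # so rendering later is one O(1) hash lookup -- no join needed.
    store[f"post:{post_id}"] = {"title": title, "author": author_name}

def render(post_id):
    doc = store[f"post:{post_id}"]
    return f"{doc['title']} by {doc['author']}"

publish(1, "Cache invalidation is hard", "pg")
```

The trade-off is the usual one: reads get fast and simple, while writes (e.g. an author renaming themselves) now have to update every copy. It isn't magic, just work moved from read time to write time.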
These are certainly good things to know, but for most people not really essential or even important.
You still need to understand it at least well enough to know that you really shouldn't try to access random list elements. Just like you need to understand arrays well enough to not try to insert things into the middle of them.
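A rough sketch of both costs in Python: random access on a linked list means walking from the head, and inserting into the middle of an array means shifting everything after the insertion point (Python's `list.insert` hides the shift, but still pays for it). The `Node` class here is a minimal hand-rolled list for illustration:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def nth(head, n):
    """Walk n links from head -- every 'random' access pays for the walk."""
    node, steps = head, 0
    while n > 0:
        node = node.next
        n -= 1
        steps += 1
    return node.value, steps

# Build the list 0 -> 1 -> 2 -> 3 by prepending.
head = None
for v in (3, 2, 1, 0):
    head = Node(v, head)

value, steps = nth(head, 3)   # had to traverse 3 links to reach index 3

# The flip side: inserting into the middle of an array-backed list
# shifts every later element one slot to the right -- O(n) per insert.
arr = list(range(5))
arr.insert(2, 99)
```

Same operations, opposite costs: the linked list inserts cheaply but indexes slowly; the array indexes cheaply but inserts slowly.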
It allows you to debug really tough bugs and keep moving.
The guy who taught me that trick was a long-time COBOL programmer; he took pity on me after I wasn't able to crack a certain problem, and it was like someone handed me an electric drill after years of puttering about with a pocket knife.
Split the problem in half and see if you can solve the first half.
If not, recurse on the first half; otherwise, solve it.
Now look at the other half and see if you can solve it.
If not, recurse on the second half; otherwise, solve it.
For debugging it mostly comes down to chopping the program up into bits that are 'verified good' and 'unsure', until you are literally staring at the bug.
Some sort/search algorithms are an almost literal implementation of divide-and-conquer.
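The same recipe written out as a bisect over history (the idea git bisect automates). Assuming a predicate that is false for good commits and true from the first bad one onward, you find the culprit in O(log n) checks instead of n; the commit list and predicate here are stand-ins:

```python
def first_bad(commits, is_bad):
    """Binary search for the first commit where is_bad flips to True.

    Assumes at least one bad commit and that is_bad is monotone:
    False, False, ..., False, True, True, ..., True.
    """
    lo, hi = 0, len(commits) - 1   # invariant: first bad commit is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # bad at mid: culprit is mid or earlier
        else:
            lo = mid + 1           # good at mid: culprit is after mid
    return commits[lo]

commits = list(range(100))                        # stand-ins for commit hashes
culprit = first_bad(commits, lambda c: c >= 37)   # pretend commit 37 broke it
```

Seven checks instead of a hundred; the 'verified good' half gets discarded at every step, exactly as described above.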
edit: the more I think about this, the more I realize that it has pervaded just about everything I do. I'm currently rebuilding a car that has crashed, and I find I use divide and conquer to troubleshoot and to repair. Tomorrow I'll be pulling the engine to replace a cracked gearbox casing, and for sure I'll use it again: to remove the engine, to remove and replace the gearbox, and to put the whole thing back together again, one small 'obvious' step at a time.
edit2: see also: http://news.ycombinator.com/item?id=1695794
It should be noted there's a whole class of bugs where divide and conquer simply doesn't work, e.g. just about anything involving a race condition.
Good points otherwise.
It's not a 100% silver bullet but it will get you quite far.
What should a self-taught programmer know that schooling or tragedy would have otherwise taught me?
For example, I know there are design patterns for data structures. I've modeled plenty of data and nothing has ever blown up on me, but I suspect I could have done it even better with a solid grasp of best practices for data structures. What should I know about these things?
edit: Also, I owe StackOverflow et al a huge chunk of my success, but damn, the "you're not logged in" slidey bar is obnoxious as hell.
It's not about the specific mathematical techniques, but the general way of thinking.
Caveat emptor, I'm just a web dev, not a crazy 3d modeller or anything.
For programmers in particular, I think mathematical functions and set theory are important areas to look at.
Whether you use the language in the long run or not, reading chapter two of _Developing Applications with Objective Caml_ (http://caml.inria.fr/pub/docs/oreilly-book/) and doing its exercises, in OCaml, (http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora020....), will probably help quite a bit, and shouldn't take too long.
* Haskell may help, too, but its emphasis on laziness will also mean many algorithms and data structures won't translate as easily. SML is probably as good as OCaml, but I don't have much experience with it.
Okasaki's _Purely Functional Data Structures_ uses SML to describe a variety of "purely functional" data structures, most of which are immutable, persistent, and exploit laziness for amortized performance guarantees. (There's an appendix with Haskell versions.)
Seriously? You're going to take huge swaths of things off the table to try to force the discussion the way you want it to go?
There is 'general' experience that applies equally across the board, and that experience you either have or you don't; but for the rest, we are all only as experienced as we are familiar with the tools we use to do our work.
Try switching from your 'favorite language' to one that is your least favorite and making some headway, versus the 'experienced' (but relative newbie) on that platform. You won't stand a chance.
Programming and computer science have become too large for anyone to still be an 'all-rounder': you can't do all of 'embedded systems', 'web apps', 'operating system kernels', 'algorithm research' and so on. And if you can, then you're either not human or exaggerating ;)
If HN were a mathematics forum, one would read and comment on news about research in number theory and algebraic topology, not high school math, for it is not particularly insightful or entertaining.
The thing is, the 20% isn't the technical stuff, it's everything else.
If the problem itself is long, hairy and convoluted, then redefine the problem in a different way, i.e. come up with a different paradigm for achieving the same result.
If you find yourself writing a lot of special edge case handlers, then try to redefine the model to implicitly take care of those edge cases.
None of this is easy, but being aware of it goes a long way towards becoming better at it.
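One classic, concrete instance of "redefine the model to remove edge cases": a sentinel head node makes linked-list deletion uniform, because every real node now has a predecessor, so deleting the first element needs no special case. A sketch only; the class and helpers are mine:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def delete(head, target):
    """Return the list with the first node matching target removed.

    The sentinel means the 'delete the head' case is no longer special:
    the loop body handles every position identically.
    """
    sentinel = Node(None, head)
    prev = sentinel
    while prev.next is not None:
        if prev.next.value == target:
            prev.next = prev.next.next
            break
        prev = prev.next
    return sentinel.next

def to_list(head):
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out

head = Node(1, Node(2, Node(3)))
head = delete(head, 1)   # deleting the head works without an if/else branch
```

Without the sentinel you'd need an explicit "is it the head?" branch; the redefined model makes the edge case disappear rather than handling it.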
Now, whenever I find myself getting confused I try to think really hard about where I went wrong. Good code is like good prose. If you have to read it six times to understand what is happening, the writing sucks.
This comment(http://news.ycombinator.com/item?id=1695010), for example, seems on-topic but was downvoted pretty quickly.
Can you imagine how silly it would sound applied to other fields?
"Hey guys, I decided to be a nuclear engineer/attorney/pilot/paramedic. What are the main things I need to know? I'll figure out the rest as I go along."
An even more important requirement is realising when your solution is better and being able to explain exactly why.
Learn when to fix it, when to move on to something else, and when to come back. This is the hardest part.
Likewise, nobody else ever gets it right. Know when to build, when to borrow and tweak, and when to steal.
I keep looking forward to the time I have it all "figured out". As I get older, I've started to think this never happens.
Some of the most horrid WTFs I've seen are people's custom parsing routines. Horrible to code initially, worse to maintain.
A) It's unnecessary and will get "trimmed" down to 2147483647 on 32-bit systems.
B) Clearly the programmer that wrote this has absolutely no idea that his number is going to require a 64-bit long int (on applicable systems) just to generate a random number to make their URL non-cached. What a waste.
Why would you think this calculation would get clamped to the range of a signed 32-bit integer?
I have a lot more advice for systems developers (which can basically be summarized as "you're never as smart as you think you are -- plan for it") but I don't think they generalize as much to "all programmers."
What if it's "checked in" to just another part of the same harddrive (e.g., git)?
That point is about people completely unaware of VCSs.
The programmer is the important element, and the language and tools should be chosen based on the problem.
The key is a way of providing non-obtrusive (to the programmer) hints to the compiler - whether by annotation or the type system or whatever.
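Python's type annotations are one familiar instance of that idea: hints a checker such as mypy can use, while the runtime ignores them entirely, so they cost the programmer nothing at execution time. A tiny sketch:

```python
# Annotations are non-obtrusive hints: tools can read them, but the
# interpreter does not enforce them at run time.
def mean(xs: list[float]) -> float:
    return sum(xs) / len(xs)

# The hints are just metadata attached to the function object.
hints = mean.__annotations__

result = mean([1.0, 2.0, 3.0])
```

A static checker would flag `mean("oops")` before the program ever runs, yet nothing about the function's behavior or speed changes for code that ignores the annotations.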
I believe that sort of extreme skepticism is helpful in writing solid code and thinking about unintended consequences.
For example, programmers talk about ideas, algorithms and abstraction layers, while coders talk about tools - which one is the right tool (Java, of course), which one is best for so-called web development (PHP, of course), and the cool features of C# if they're using that system.
The difference is the same as between, say, GM engineer and a taxi driver.
There are also sysadmins, who know that everything sucks, but some things when fine-tuned suck less. ^_^