However, it's a mistake to think this 'taco bell programming' is somehow a good model for actual programming, or even for some sysadmin tasks. It should really be renamed 'Taco Bell Kludging', because that's mostly what we're talking about: using a quick hack/kludge on the command line to finish a job quickly instead of programming. When it comes to building a scalable, fault-tolerant solution, sometimes the Unix tools just won't do. Don't take the shortcut and cut yourself off at the knees just to save time.
- If the work isn't split up evenly or fed through an event queue, and you instead preallocate jobs to processes, it's possible that one process will take far longer than the rest to finish.
The worst case of this I've run into is Microsoft's EXMerge, which does imports/exports from an Exchange datastore. It can be threaded, but it preallocates work by splitting mailboxes up alphabetically. In one case, a family business, all the heavy users got lumped into one thread because they shared a last name; that thread ran 5x longer after all the other threads had finished.
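The fix for that skew is to hand out work from a single queue rather than pre-splitting it: `xargs -P` behaves that way, giving each worker the next item as soon as it finishes the last one. A minimal sketch, with `sleep` standing in for the real job:

```shell
#!/bin/sh
# Four workers pull items off one shared stream; the slow item (sleep 2)
# delays only its own worker, not the whole batch. sleep stands in for
# whatever per-item job you'd actually run.
printf '2\n1\n1\n1\n1\n1\n1\n1\n' |
  xargs -P 4 -n 1 sh -c 'sleep "$1" && echo "finished job of ${1}s"' sh
```

With EXMerge-style alphabetical preallocation, the worker that drew all the slow items would finish long after the rest; here every worker stays busy until the queue drains.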
- You can run the machine out of some resource (memory/disk/CPU) by spawning a huge number of jobs that hit one subsystem hard. This is tuning-dependent, of course.
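One mitigation is to cap the worker count explicitly instead of forking a process per input. `xargs -P` takes that cap directly; a sketch, assuming a pile of `.log` files to compress (`nproc` is GNU coreutils, so substitute your platform's equivalent):

```shell
#!/bin/sh
# Run at most one gzip per CPU core, rather than launching a compression
# for every file at once and hammering memory and disk simultaneously.
# -print0/-0 keeps file names with spaces intact.
find . -name '*.log' -print0 | xargs -0 -P "$(nproc)" -n 1 gzip
```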
Also, if you're going to seriously script this, I'd recommend make and similar tools rather than shell commands like xargs: those tools are built to run processes in parallel and to avoid repeating work. They also tend to force you to write intermediate steps to disk, which helps with debugging (and can be coded around, or put on a ramdisk, later if it proves to be a performance issue).
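To make that concrete, here's a sketch of the pattern with a throwaway Makefile (hypothetical file names): `make -j` runs the independent steps in parallel, every intermediate lands on disk, and a re-run rebuilds only targets whose inputs changed.

```shell
#!/bin/sh
# Work in a scratch directory with two sample inputs.
cd "$(mktemp -d)"
printf 'hello\n' > a.txt
printf 'world\n' > b.txt

# One pattern rule turns each .txt into a .gz; make tracks which outputs
# are already up to date, so repeated runs skip finished work.
cat > Makefile.demo <<'EOF'
SRCS := $(wildcard *.txt)
all: $(SRCS:.txt=.gz)
%.gz: %.txt ; gzip -c $< > $@
EOF

# -j2 runs up to two recipes at once.
make -f Makefile.demo -j2
```

Contrast with a bare xargs pipeline: if it dies halfway, you typically start over, whereas make picks up where the on-disk intermediates left off.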
Once bash scripts reach a certain size and complexity, I've found they become quite difficult to follow. I don't know whether that's inherent to bash, or to the people who tend to write bash, or to my ability to read bash scripts, but I find larger Python, Ruby, etc. programs a lot easier to follow.
On the other hand, even a 300 line shell script is easier to follow than a 10,000 line Java program.
Sure, it's awesome to use a ready-made tool to get that kind of scalability, but is it really an apples-to-apples comparison?
"Cloud-scale" just means "more hardware than we own" and seems to me like a step backward in computing, to a time when you paid to time-share a relatively powerful machine. The main appeal of cloud computing is outsourcing the ownership of the machines and responsibilities like configuring, storing, powering, and repairing them.
I'm not saying there's no use for cloud computing capacity or the benefits it offers... but for the average business, I think the hype distills down to "you don't have to take care of a bunch of servers."
I'm not dissing ssh, I just don't think automatically running jobs on remote machines is its sweet spot.
For projects that need to be stable, used by non-techies, or upgraded over time, I generally go for something a little more robust. You know, like Google does, etc etc.