My understanding is that unless you add a nonce at each iteration, this wouldn't help at all, and probably would just increase the number of collisions. Is that correct?
I recently had to look at the phpBB password hashing scheme. As I recall, they iterate MD5 a small-ish number of times without adding a new nonce. There's a comment which says their scheme would be improved if they did something - but then it would be slower, which means they'd use fewer iterations, which would make it less secure.
The BSD MD5-crypt password hash algorithm is composed of multiple MD5 iterations. However, if you look at the code, you'll see that the variation of what gets tossed into MD5 only depends on the loop counter (i). So the hash function is not data-dependent. You could unroll all the loops into a single, large function that applies MD5 repeatedly to various inputs.
There is a salt in MD5-crypt, but it is just input data to the function. It does not change the actual hash function that is performed.
Now, your question: "is iterated MD5 of the message alone less secure than iterated MD5 of the data + a salt?" (And a related question that I brought in above: "is a function that itself varies based on the input data more secure than either of these?)
From an algorithmic perspective, there should be no difference. If you can create an MD5+ that calculates exactly the same output as MD5(MD5(data)) but always in less steps than 2*MD5, then you have broken MD5. The easy way to see this if you can create this MD5+, you can also "roll back" an MD5 result to some previous value, a 2nd-preimage attack.
From the perspective of creating a brute-force search device, there is some difference. If you use the salt to vary your function, it starts to take up more logic area. The larger the state space you create, the more RAM your device will require or the more slightly-different logic blocks. This is what Colin is doing with scrypt. It's a good idea and one that we should be moving to in the future.
However, there is no difference between MD5(MD5(data || nonce)) and MD5(nonce || MD5(data || nonce)) from a brute force or algorithmic perspective.
Still, I'm guilty of implementing the brain-dead 'append-a-secret-and-md5-it' scheme, so I'll be using something more sensible in future.
In particular, there are simple attacks when it is used with a hash function that has effective collision attacks, such as MD5.
This is 100% true, but 99% irrelevant. For any good hash, the number of collisions only increases linearly (proof: if it increased any faster, you would have a fast collision-finding algorithm); so you'd need to iterate your hash an unreasonably large number of times before extra collisions outweighed the benefit of slowing down the attacker.
That said: This is exactly why we have PBKDF2. PBKDF1 was just an iterated hash; PBKDF2 was introduced to "reduce concerns about the recursion degenerating into a small set of values".
I'm not sure this has much to do with the security of PBKDF2.