Just remember: in Haskell, persistent and immutable is the default. You turn on other environments as you need them.
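To make "persistent and immutable" concrete, here is a minimal sketch using only the Prelude. The names `original` and `extended` are made up for illustration; the point is that building a new list leaves the old one intact and usable.

```haskell
-- "Updating" an immutable list builds a new value; the old one is untouched.
original :: [Int]
original = [1, 2, 3]

-- Prepending reuses the cells of `original` instead of copying them.
extended :: [Int]
extended = 0 : original

main :: IO ()
main = do
  print original  -- [1,2,3]
  print extended  -- [0,1,2,3]
```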
Why is it the default? So-called "purely functional" programming is a rich, safe environment for most programming problems, and it makes lots of nice things possible, such as trivial parallelization, automatic thread safety, proofs about code via simple equational reasoning, and powerful optimizations.
Why do we care about laziness by default? Like purity, it adds power and expressiveness.
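As a rough illustration of the expressiveness claim, the sketch below shows two things laziness allows: taking a finite prefix of an infinite list, and writing an ordinary function that behaves like a control structure. `firstEvens` and `choose` are hypothetical names invented for this example.

```haskell
-- An infinite list is fine as long as only a finite prefix is demanded.
firstEvens :: Int -> [Integer]
firstEvens n = take n (map (* 2) [0 ..])

-- An ordinary function acting as a control structure:
-- the branch that is not returned is never evaluated.
choose :: Bool -> a -> a -> a
choose c t e = if c then t else e

main :: IO ()
main = do
  print (firstEvens 5)                                    -- [0,2,4,6,8]
  print (choose True 1 (error "never evaluated") :: Int)  -- 1
```

In a strict language, `choose` would evaluate both arguments before the call, so the `error` branch would fire.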
Finally, what would a strict Haskell be like? Here's a discussion: http://augustss.blogspot.com/2011/05/more-points-for-lazy-ev...
Also, unpredictable program behavior (which nearly destroys its usefulness in embedded systems, beyond perhaps what Galois has been doing).
The series requires Haskell knowledge, but this summary is fairly readable if you have a good grounding in programming language theory and implementation.
I think about it this way: imperative languages allow side effects by default, but let you write side-effect-free code. Purely functional languages don't use side effects by default, but let you write code with side effects.
If you use the IO monad version, you can perform whatever side effects you want. It's up to you to use the rest of the code in a responsible way (which means you can write C-style code if you want).
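For a feel of what "C style" in IO looks like, here is a minimal sketch. It does not use the hash table discussed here; it assumes a plain `IORef` from `base` as a stand-in mutable cell, and `sumIO` is a hypothetical name.

```haskell
import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Imperative-style summation: a mutable cell updated in a loop, all in IO.
sumIO :: [Int] -> IO Int
sumIO xs = do
  acc <- newIORef 0                          -- allocate a mutable accumulator
  forM_ xs $ \x -> modifyIORef' acc (+ x)    -- strict in-place update
  readIORef acc                              -- read the final value

main :: IO ()
main = do
  total <- sumIO [1 .. 10]
  print total  -- 55
```

Note that the result type is `IO Int`: nothing stops `sumIO` from also printing, reading files, and so on, which is exactly the "whatever side effects you want" part.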
The ST monad version is quite nice: it lets you use the hash table as if it were mutable, but only inside the ST monad. Seen from the outside, the code is still purely functional, and within the ST monad you're restricted to purely functional programming plus the facilities provided by the ST monad. This makes it a safe alternative when you want a hash table for performance reasons but still want to limit which side effects can be used.
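The same idea can be sketched with nothing but `base`. This is not the hash table API itself; it assumes an `STRef` as a stand-in for the mutable structure, and `sumST` is a hypothetical name. The shape is what matters: mutation happens inside `runST`, and the caller sees an ordinary pure function.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Pure from the outside: the type is [Int] -> Int, with no IO in sight.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0                          -- mutable cell, local to this runST
  forM_ xs $ \x -> modifySTRef' acc (+ x)    -- in-place update inside ST
  readSTRef acc                              -- only the final value escapes

main :: IO ()
main = print (sumST [1 .. 10])  -- 55
```

The type of `runST` prevents the mutable reference from leaking out, and ST offers no way to do I/O, which is why the result can be treated as pure.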