Yes, you have to declare your effects. In practice that means that most of your code returns IO, and isn't constrained anymore. I don't know if this is a library feature, or an essential feature of the language, but it would be very interesting for example to put a GUI together by computing events in functions that returned an "Event" monad, widgets in functions that returned a "GUI" monad, database access in functions in a "DB" monad, etc. Instead, all of those operate on IO.
 A completely subjective assessment.
 I've thought for a short while about how to code that, but didn't get any idea I liked.
Of course, you can do this as a library. In fact, this is an example use case for Safe Haskell which also prevents people from circumventing your types with unsafePerformIO and friends.
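To make the library approach concrete, here is a minimal sketch of a restricted "DB" monad. Everything in it is hypothetical: the names DB, insertRow, allRows and runDB are made up for illustration, and the "database" is just an in-memory list behind an IORef. The idea is that a real library would export the DB type without its constructor, so client code could only build DB actions out of the primitives provided, and runDB would be the single bridge back into IO.

```haskell
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Hypothetical "DB" effect: a newtype over IO that carries a handle to the
-- store. A real library would NOT export the DB constructor.
newtype DB a = DB (IORef [String] -> IO a)

instance Functor DB where
  fmap f (DB g) = DB (fmap f . g)

instance Applicative DB where
  pure x = DB (\_ -> pure x)
  DB f <*> DB g = DB (\r -> f r <*> g r)

instance Monad DB where
  DB g >>= k = DB (\r -> g r >>= \a -> let DB h = k a in h r)

-- The only primitives client code gets (assumed API, for illustration).
insertRow :: String -> DB ()
insertRow row = DB (\r -> modifyIORef' r (++ [row]))

allRows :: DB [String]
allRows = DB readIORef

-- The single bridge back to IO: create an empty store, run the action.
runDB :: DB a -> IO a
runDB (DB g) = newIORef [] >>= g

-- Client code: the type says "database effects only", not arbitrary IO.
example :: DB Int
example = do
  insertRow "alice"
  insertRow "bob"
  length <$> allRows
```

Calling runDB example from main runs the whole thing in IO; inside example there is no way to print, read files, or touch the GUI, because none of the exported primitives allow it.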
Moreover, some existing libraries already take similar approaches. FRP libraries extract reactive systems (like events but also continuously changing signals) into their own types. A button gives you a stream of type Event () rather than an explicit callback system using the IO type. Check out reactive-banana (my favorite FRP library from the current crop) for a nice example.
Similarly, people use custom monads to ensure things get initialized correctly, which has a similar effect to what you're talking about. The Haskell DevIL bindings for Repa come to mind because they have an IL type which lets you load images and ensures the image loader is initialized correctly exactly once.
Sure, in the end, everything will need to be threaded through IO and main to actually run, but you can—and people do—make your intermediate APIs safer by creating additional distinctions between effects.
The thing is IO is generic. So IO (Db) and IO (Gui) are different things.
An example of this (not a very good one, I'm afraid) is the X monad.
The best way to understand IO is to think about working with pure functions in an impure language. Let's say I've given you a promised-pure function which emits commands (re: the "Command pattern" if that's the way you want to see it) and you operate them using side effects. This is a massive inversion of control issue of course, but you can see how it might work.
Further, you might understand that your job is easier due to the purity of the command-emitting function. You explicitly give it all of the inputs you desire and operate it as needed. For instance, you can perhaps run it forwards and backwards as desired. Or weave it in with another "thread" in parallel, knowing that only you must handle races and shared memory: the threads are pure.
Finally, you might understand that the risk of bad programming is borne on your shoulders primarily—side effects are complex and you're the only one handling them.
In Haskell, "you" are the RTS and the pure threads are Haskell programs. The IO monad is nothing more than what it feels like to be "inside" a useful kind of command pattern. Finally, we compartmentalize all side-effects into the RTS so that we only have to get them right once.
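A toy version of that picture, as a sketch only: the Cmd type, greet, runIO and runPure below are invented for illustration and are not how GHC actually implements IO. The program is a pure value describing commands; one interpreter plays the role of "you" (the RTS) and performs real side effects, and a second one runs the same program against canned input without touching the outside world.

```haskell
-- A pure description of a program as a tree of commands (hypothetical type).
data Cmd a
  = Done a                    -- finished, with a result
  | PutLine String (Cmd a)    -- "please print this, then continue"
  | GetLine (String -> Cmd a) -- "please read a line and hand it to me"

-- A pure "thread": it only emits commands, it performs nothing itself.
greet :: Cmd ()
greet =
  PutLine "Name?" (GetLine (\name ->
    PutLine ("Hello, " ++ name) (Done ())))

-- The effectful operator: executes commands using real side effects.
runIO :: Cmd a -> IO a
runIO (Done a)      = pure a
runIO (PutLine s k) = putStrLn s >> runIO k
runIO (GetLine k)   = getLine >>= runIO . k

-- A pure operator: feeds canned input lines and collects the output.
-- If the input runs out, the program is given empty strings.
runPure :: [String] -> Cmd a -> ([String], a)
runPure _ (Done a) = ([], a)
runPure ins (PutLine s k) =
  let (out, a) = runPure ins k in (s : out, a)
runPure (i : ins) (GetLine k) = runPure ins (k i)
runPure [] (GetLine k)        = runPure [] (k "")
```

Because greet is just data, the operator is free to choose how to run it: runIO greet talks to the terminal, while runPure ["Ada"] greet evaluates the same program deterministically, which is the "operate it as needed" freedom described above.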
Purity makes it easier to reason about the semantics of your code. This isn't about parallelism, it's about concurrency, including single-threaded concurrency. Case in point, I recently spent quite a while scratching my head over a bug that happened because someone else had written some code to mutate a piece of shared state when I wasn't expecting it.
But the pure functional programming model is a very high level of abstraction (deep down, every interesting thing in computing is a state machine), and it has a tendency to leak like mad. One such case where it does so is I/O. In fact, you can't even do I/O in a 100% pure language - and that's what the I/O monad is really about; it's punching a hole in the language in order to let the big bad ephemeral outside world in. But in a controlled manner, so that the language's fundamental ethos of purity can be maintained, which in turn makes its laziness manageable. In short, the deeper downer reason why Haskell loves its I/O monad so much is because without it the language would be fairly useless. Anyone who tells you the I/O monad's really about making I/O concurrency headaches less of a hassle has been doing more blog-reading than programming.
So why preserve the illusion? Well, ghettoizing all things stateful lets you take advantage of pure semantics everywhere else in your code, which theoretically makes it easier to reason about and maintain.
As for monads themselves, IMHO they're kind of overblown. It's just another design pattern, akin to Visitor or Strategy or Decorator, only functional instead of object-oriented. Super-useful, applicable in all sorts of circumstances, and easily worth knowing. And, just like object-oriented design patterns, easy to make sound way more complicated than it is if you try to explain the idea to someone else without having fully digested it yourself first.