These are good questions! There are a lot of subtleties here. One of them is that for real one-way functions you generally get a big number and you need some sort of reduction back to the password domain. Let's call it '¬' because most fonts have that character these days. Chaining actually looks like
password → number ¬ new_password → new_number ¬ ...
Second, both the one-way function '→' and the reduction '¬' are not one-to-one mappings. This process will give you a preimage—i.e. given H it can give some password P such that P → H and you are happy—but that password will not necessarily be the original password! Usually any preimage will do, thankfully, but it's still a problem because this means that two different chains can 'merge' into one. This ties very closely to the question you're asking.
As you have surmised, 100% coverage of the input domain is also really hard to achieve because you can't control which words the one-way function (and therefore the reduction) is going to choose! Many rainbow tables just settle for some inaccuracy -- so if you cover 99.9% of the input domain, you don't worry about trying to get all of the other 0.1%.
The reason these two completely different-sounding things are connected is that if you think about it, if you find one of those 0.1% and feed it to your hash function and then the reduction -- there's actually a 99.9% chance that you end up on a chain you already have! So you have these super-short chains which immediately merge into other chains!
If you were trying to compress by a factor of 1000, then you might choose a chain length of 2000, try for 99.95% coverage, and then store these missing 0.05% as a separate lookup table -- then between the two tables you have 100% coverage. Or, you might choose a chain length of 1000 with, say, a 99% coverage, and just accept a 1% failure rate on hashes. Either one is acceptable depending on how you store the chains and how much work you want to do to ensure coverage.