The longer it takes to run the full suite, the more important this is.
Yes, sometimes this is unavoidable and you have to decide that it's not worth dealing with, but then you have to corral those tests off into a "these sometimes don't pass lol" section. The regular CI must not waste people's time with unreproducible failures. And it shouldn't waste downstream users' time with unreproducible or by-coincidence passes, either.
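One way to do that corralling, sketched with Python's stdlib unittest (the class names and the RUN_FLAKY env var are made up for illustration): the quarantined tests only run when you explicitly opt in, so the regular CI run never sees them.

```python
import os
import unittest

# Opt-in switch for the quarantine section; hypothetical name.
RUN_FLAKY = os.environ.get("RUN_FLAKY") == "1"

@unittest.skipUnless(RUN_FLAKY, 'quarantined: "sometimes doesn\'t pass lol"')
class FlakyTests(unittest.TestCase):
    def test_network_dependent_thing(self):
        ...  # whatever unreliable thing lives here never blocks normal CI

class StableTests(unittest.TestCase):
    def test_deterministic_thing(self):
        self.assertEqual(1 + 1, 2)  # the regular suite stays reproducible
```

The same idea works with pytest markers and `-m "not flaky"`; the point is that the split is explicit, not a habit of re-running.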
If you don't hold that line, you let entropy win. Tests that fail "sometimes" become tests that fail most of the time. People get used to re-running until they get the result they want - which means the suite is passing by luck, not telling you anything.
If you need to test RNG or time-based behaviour, make that reproducible as well. Otherwise you end up with a "can't print on Tuesday" bug, do the tests on Wednesday, conclude it works, and ship it.
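A minimal sketch of what "make it reproducible" means in practice: pass the RNG and the clock in as parameters instead of reaching for globals. The function names here (jitter_ms, is_print_day) are invented for the example.

```python
import datetime
import random

def jitter_ms(rng: random.Random) -> int:
    """Back-off jitter; takes an explicit RNG instead of the global one."""
    return rng.randint(0, 100)

def is_print_day(today: datetime.date) -> bool:
    """The 'can't print on Tuesday' check; 'today' is passed in, not looked up."""
    return today.weekday() != 1  # Monday == 0, so Tuesday == 1

# In a test, both are now fully deterministic:
assert jitter_ms(random.Random(42)) == jitter_ms(random.Random(42))
tuesday = datetime.date(2024, 1, 2)  # a known Tuesday
assert not is_print_day(tuesday)     # the bug is reproducible on any weekday
```

With a fixed seed and an injected date you can exercise the Tuesday path on a Wednesday, instead of concluding it works because of when you happened to run the suite.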
There is definitely a debate to be had about coverage, which I think is what your question about ranges of inputs is really about. In some cases it may be possible to just test everything: https://randomascii.wordpress.com/2014/01/27/theres-only-fou...
...but again, realistically, you have to look at your line and branch coverage and decide when it's good enough.
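A toy version of the "just test everything" idea from that link, over all 65,536 16-bit inputs rather than all four billion floats (the function choice here is mine, purely for illustration):

```python
def popcount_kernighan(x: int) -> int:
    """Count set bits by clearing the lowest one each iteration."""
    n = 0
    while x:
        x &= x - 1
        n += 1
    return n

# Exhaustive check against a trivially-correct reference: every uint16 value.
# No sampling, no coverage argument needed - the whole input space is tested.
for v in range(1 << 16):
    assert popcount_kernighan(v) == bin(v).count("1")
```

When the input space is that small, exhaustive beats any clever choice of test cases; the coverage debate only starts once you can't enumerate.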
(One of my "debugging war stories" that I should get around to writing up was to do with touchscreens; we had all of "touchscreen stops working if you change PSU", "sometimes double-press", and "sometimes misses touches". For the latter two it was years before the business even definitively established it was a software fault and not just users damaging the touchscreen, and the eventual test process we settled on was "have a robot press the screen 10,000 times and count the results".)