Define "reproduced". Do we mean "conduct the same experiment multiple times so we can assess the variance of the outcome"? Or do we mean "conduct the same experiment multiple times to figure out whether the first result was a screw-up"?

Those two aren't the same, and I think far too many people assume the point is the latter when, imho, it's actually the former. Pure screw-ups will likely get found out, just as glaring bugs usually are. The insidious case is when your result has a huge variance but you look at only one (or a few) samples and draw conclusions from them, much like the bugs that change the output by only a tiny bit are the hardest to notice.
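
To make the variance point concrete, here's a quick toy simulation. Everything in it is made up for illustration: the "experiment" is just a small true effect (1.0) plus Gaussian noise (standard deviation 5.0), and the names `run_experiment` and `repeat_experiment` are hypothetical. It's a sketch of the idea, not a model of any real study.

```python
import random
import statistics


def run_experiment(rng, true_effect=1.0, noise_sd=5.0):
    """One run of a hypothetical experiment: true effect plus Gaussian noise."""
    return rng.gauss(true_effect, noise_sd)


def repeat_experiment(n_runs, seed=0, true_effect=1.0, noise_sd=5.0):
    """Repeat the same experiment n_runs times with a seeded generator."""
    rng = random.Random(seed)
    return [run_experiment(rng, true_effect, noise_sd) for _ in range(n_runs)]


results = repeat_experiment(1000)

mean = statistics.mean(results)      # estimate of the true effect
spread = statistics.stdev(results)   # run-to-run variability
# Fraction of individual runs that point in the wrong direction (effect < 0).
wrong_sign = sum(r < 0 for r in results) / len(results)

print(f"first run alone:      {results[0]:.2f}")
print(f"mean over all runs:   {mean:.2f}")
print(f"std dev between runs: {spread:.2f}")
print(f"runs with wrong sign: {wrong_sign:.0%}")
```

Under these assumed numbers, a sizeable share of single runs get even the sign of the effect wrong, and nothing about any one of them looks like a screw-up. You only see the problem once you repeat the run enough times to estimate the spread.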

