If you really care, then you actually profile your system and see what takes how much time, under which circumstances. The results of such a profile are almost always surprising.
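As a rough illustration of measuring instead of guessing, here is a minimal C timing sketch. The function name `time_alloc` and the 4096-byte page stride are assumptions made for this example, and `clock()` only reports CPU time, so treat it as a starting point; a real profiler will tell you far more.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: CPU seconds spent on `iters` rounds of
 * allocate, touch, free. Returns -1.0 if an allocation fails. */
double time_alloc(int use_calloc, size_t bytes, int iters) {
    assert(bytes > 0 && iters > 0);
    clock_t start = clock();
    for (int i = 0; i < iters; i++) {
        unsigned char *p;
        if (use_calloc) {
            p = calloc(1, bytes);
        } else {
            p = malloc(bytes);
            if (p) memset(p, 0, bytes);
        }
        if (!p) return -1.0;
        /* Write one byte per assumed 4096-byte page so the memory is
         * really used; volatile keeps the stores from being removed. */
        volatile unsigned char *vp = p;
        for (size_t off = 0; off < bytes; off += 4096) vp[off] = 1;
        free(p);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call it once with `use_calloc` set to 1 and once with 0, for the buffer sizes your program actually uses, and compare. Some compilers can turn the malloc-plus-memset pair into a calloc call when optimizing, so check the generated code before trusting the difference.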
I guess this is a basic cultural difference -- almost nobody in the HN crowd really cares whether their software runs quickly; there is just a bunch of lip service and wanting-to-feel-warm-fuzzies, with very little actual work.
In video games (for example) we need to hit the frame deadline or else there is a very clear and drastic loss in quality. This makes this kind of issue a lot more real to us. If you look at the kinds of things we do to make sure we run quickly ... they are of a wholly different character than "guess that calloc is going to do copy-on-write maybe."
(In fact we go out of our way to not do malloc-like things in quantity unless we really have to, because heap allocation in general is slow to begin with.)
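One common way to avoid malloc-like calls in quantity is a bump (arena) allocator that grabs one block up front and hands out slices of it. The sketch below is only an illustration of that general technique; the `Arena` type and the `arena_*` names are hypothetical, not taken from any particular engine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator: one malloc up front, then pointer bumps. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} Arena;

/* Returns 1 on success, 0 if the backing malloc failed. */
int arena_init(Arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* `align` must be a power of two. Offsets are aligned relative to the
 * malloc'd base, so alignments beyond what malloc guarantees are not
 * honored by this sketch. Returns NULL when the arena is full. */
void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start < a->used || start > a->capacity) return NULL;
    if (size > a->capacity - start) return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Drop everything at once, e.g. at the end of a frame. */
void arena_reset(Arena *a) { a->used = 0; }

void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = a->used = 0;
}
```

The per-allocation cost is a few arithmetic operations and there is no per-object free; the trade-off is that everything in the arena has to share one lifetime.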
The calloc() function contiguously allocates enough
space for count objects that are size bytes of memory
each and returns a pointer to the allocated memory.
The allocated memory is filled with bytes of value zero.
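To make that contract concrete, here is a sketch of what a hand-written equivalent would look like. `zeroed_alloc` and `is_all_zero` are hypothetical names for this example. The quoted text only promises the result (zero-filled memory for count objects); how calloc gets there is up to the implementation, and the overflow check reflects what current implementations typically do, not something the excerpt states.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space version of the same guarantee:
 * room for `count` objects of `size` bytes, all bytes zero. */
void *zeroed_alloc(size_t count, size_t size) {
    /* Refuse requests whose total size would overflow size_t. */
    if (size != 0 && count > SIZE_MAX / size) return NULL;
    size_t total = count * size;
    void *p = malloc(total);
    if (p) memset(p, 0, total);   /* always pays for the explicit fill */
    return p;
}

/* Returns 1 if the first n bytes at p are all zero. */
int is_all_zero(const void *p, size_t n) {
    assert(p != NULL || n == 0);
    const unsigned char *b = p;
    for (size_t i = 0; i < n; i++) {
        if (b[i] != 0) return 0;
    }
    return 1;
}
```

Both `calloc(8, sizeof(int))` and `zeroed_alloc(8, sizeof(int))` return memory for which `is_all_zero` holds. The difference is that the hand-written version always performs the fill itself, while the library call is free to satisfy the same guarantee in whatever way is cheapest on that system.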
You first write your code using standard system functions, choosing the right call for what you're doing. Only if the code then performs badly because of calloc() do you roll your own implementation (most likely in assembly), accepting that your code might not work well in the future because something in the OS has changed since you wrote it.
Choosing between malloc and calloc is not an architectural change, though; it is very easy to replace one with the other. But if you use the right call for the right use case, you benefit from optimizations that the OS provides, and often you could not even achieve them from user space.
Rule of thumb: if you need to allocate a memory region that will be overwritten anyway (for example, when reading a file), use malloc(). If you need zeroed memory, use calloc().
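A minimal sketch of that rule of thumb in C. The helper names `read_bytes` and `byte_histogram` are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* The buffer is completely overwritten by fread, so its initial
 * contents do not matter: malloc is the right call.
 * Returns NULL on allocation failure or a short read. */
unsigned char *read_bytes(FILE *f, size_t n) {
    assert(f != NULL && n > 0);
    unsigned char *buf = malloc(n);
    if (!buf) return NULL;
    if (fread(buf, 1, n, f) != n) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* The counters must start at zero before we increment them:
 * calloc is the right call. Returns 256 counters, or NULL. */
size_t *byte_histogram(const unsigned char *data, size_t n) {
    size_t *counts = calloc(256, sizeof *counts);
    if (!counts) return NULL;
    for (size_t i = 0; i < n; i++) {
        counts[data[i]]++;
    }
    return counts;
}
```

Swapping the two calls shows why the choice matters: calloc in `read_bytes` asks for a zero fill that is then thrown away, while malloc in `byte_histogram` would increment indeterminate values.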
As long as you rely on the guarantees the calls provide and use the right call for the right use case, you get predictability and, very often, optimization.