
That's a nice alternative history fiction.

Here's an early implementation: https://github.com/dspinellis/unix-history-repo/blob/Researc...




You haven't proven it wrong.

Here's the earliest implementation in that repo (in Research UNIX V6; your link is to V7): https://github.com/dspinellis/unix-history-repo/blob/Researc...

    calloc(n, s)
    {
    return(alloc(n*s));
    }

There are several interesting things we learn from poking around V6 though:

- `calloc` originated not on UNIX, but as part of Mike Lesk's "iolib", which was written to make it easier to write C programs portable across PDP 11 UNIX, Honeywell 6000 GCOS, and IBM 370 OS[0]. Presumably the reason calloc is the-way-it-is is hidden in the history of the implementation for GCOS or IBM 370 OS, not UNIX. Unfortunately, I can't seem to track down a copy of Bell Labs "Computing Science Technical Report #31", which seems to be the appropriate reference.

- `calloc` predates `malloc`. As you can see, there was a `malloc`-like function called just `alloc` (though there were also several other functions named `alloc` that allocated things other than memory). (Ok, fine, since V5 the kernel's internal memory allocator happened to be named `malloc`, but it worked differently[1]).

[0]: https://github.com/dspinellis/unix-history-repo/blob/Researc... (format with `nroff -ms usr/doc/iolib/iolib`)

[1]: https://github.com/dspinellis/unix-history-repo/blob/Researc...


OpenBSD added calloc overflow checking on July 29, 2002. glibc added calloc overflow checking on August 1, 2002. Probably not a coincidence. I'm going to say nobody checked for overflow prior to the August 2002 security advisory.

https://github.com/openbsd/src/commit/c7b2af4b3f7e78424f8943...

https://github.com/bminor/glibc/commit/0950889b810736fe7ad34...

http://cert.uni-stuttgart.de/ticker/advisories/calloc.html


It is embarrassing that glibc did not check for overflow in its calloc implementation prior to 2002. That is not only a security flaw but also a violation of the C standard (even the first version, ratified in 1989 and usually referred to as C89).

The standard reads as follows:

  void *calloc(size_t nmemb, size_t size);

  The calloc function allocates space for an array of nmemb objects, each of whose size is size.[...]

and,

  The calloc function returns either a null pointer or a pointer to the allocated space.

So if it cannot allocate space for an array of nmemb objects, each of whose size is size, then it has to return a null pointer.
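To make the requirement concrete, here is a minimal sketch of a calloc-like wrapper that returns a null pointer when `nmemb * size` cannot be represented in `size_t`. The name `checked_calloc` is hypothetical and this is not the code from glibc or OpenBSD; it only illustrates the kind of check the quoted wording implies.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from any libc: a calloc-like wrapper
 * that refuses requests whose total size would overflow size_t. */
void *checked_calloc(size_t nmemb, size_t size)
{
    /* nmemb * size wraps around exactly when size != 0 and
     * nmemb > SIZE_MAX / size, so test that before multiplying. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;

    /* Request at least one byte so a zero-sized call does not depend
     * on what malloc(0) returns on a given platform. */
    void *p = malloc(total ? total : 1);
    if (p != NULL)
        memset(p, 0, total);
    return p;
}
```

With this check, a call such as `checked_calloc(SIZE_MAX, 2)` returns NULL, where an unchecked `n*s` multiplication would wrap and hand back a buffer far smaller than the caller asked for.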


So the (slightly modified) question still stands: Why do both calloc and malloc exist? Indeed it looks like calloc was originally intended as a portable way to allocate memory. It used the function alloc, which apparently was not meant to be used directly; most iolib functions have a 'c' tacked on. So when iolib was reworked into the stdlib, why was calloc kept? saretired suspects backward compatibility, but I don't believe this, because no other c-prefixed iolib function was kept and I couldn't find any code that actually used calloc in the v6 distribution either. So maybe whoever was responsible for malloc/calloc in v7 (I think it was ken, not dmr) thought malloc should be a public function but saw a use for calloc and changed the semantics to be a bit more predictable.



Why are you so sure it was written by dmr? The coding style looks like ken's to me: a) no space after if/while/for/etc b) use of "botch".

Yes, calloc is used in lex, but that is not part of v6... at least not the official distribution; I don't know when he started development. But since he also uses fopen and friends, why shouldn't he be using malloc as well? Changing 'calloc(n, m)' to 'malloc(n*m)' doesn't sound like such a huge change.


It appears that only calloc was in Lesk's Portable C Library [0] while malloc was the name Thompson gave the kernel's memory allocator in V6 [1]. When Ritchie rewrote Lesk's library for V7, he may have simply retained calloc for backward compatibility with existing user space code.

[0] http://roguelife.org/~fujita/COOKIES/HISTORY/V6/iolib.html

[1] https://github.com/hephaex/unix-v6/blob/master/ken/malloc.c


GETMAIN, the malloc() equivalent in MVT-derived IBM OSes, does not always zero memory. IIRC, MVS didn't zero it at all, so you might get anything in there, thus the need for a call that guaranteed zeroed memory for it. (This is from my memory of assembly programming on MVT/MVS up to the 1990s; z/OS apparently[1] does it somewhat differently now, so that some allocations are definitely zeroed.)

[1] http://www-01.ibm.com/support/docview.wss?uid=isg1OA28314


It's a good explanation of why calloc still exists and is useful. Otherwise it would have been dropped from the standard like cfree was.


I think there's a big difference between "why does ... exist" and "why does ... still exist". calloc() may be useful today for reasons completely different from why it existed in the first place. And difference #2 is just an implementation-specific optimisation. There's nothing in the standard that forces calloc to use lazy allocation / virtual memory. In fact, calloc may be implemented on platforms that can't provide this.
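As a rough illustration of that point, the sketch below shows the eager fallback an implementation without lazily zeroed pages could use: malloc followed by memset. The helper names `zeroed_eager` and `all_zero` are hypothetical. From the caller's side the contents are indistinguishable from calloc's; only the cost of getting there differs, and that cost is not something the standard specifies.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zero the block eagerly with memset.
 * Assumes n * size does not overflow; the caller must check that. */
void *zeroed_eager(size_t n, size_t size)
{
    void *p = malloc(n * size);
    if (p != NULL)
        memset(p, 0, n * size);
    return p;
}

/* Hypothetical helper: returns 1 if every byte in buf[0..len) is zero,
 * 0 otherwise. */
int all_zero(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    for (size_t i = 0; i < len; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Comparing `calloc(1024, 4)` against `zeroed_eager(1024, 4)` with `all_zero` or `memcmp` shows the same bytes either way, so portable code can rely on the zeroing but not on it being cheap.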


Thank you for bringing up the implementation-specific nature of #2! If the author is running Linux, then perhaps they've never checked out overcommit vs. not overcommit.


The ways malloc, calloc, and memset are implemented are all implementation-specific. For example, memset, when asked to zero memory, may use cache control instructions such as https://en.wikipedia.org/wiki/Cache_control_instruction#Data....

That tells your cache "pretend that you read all zeroes into the cache line at this address, and mark it as dirty" (that guarantees the zeroes will be written out, even if the caller doesn't write to the cache line).

For small amounts of memory that will be written to soon, that's as good as free, since it doesn't have to read from memory (the naive loop will, as it has to bring in an entire cache line before it can zero out its first byte or word).


You're likely to still see a performance improvement without overcommit. The OS will try to zero free pages in the background, so there's a good chance that it'll have pre-zeroed pages to hand you when you ask for them, rather than making you wait for them to be zeroed on demand.

Of course, there are plenty of systems where this doesn't happen, or there are no pages in the first place, or there's no kernel zeroing stuff for you.


For what it's worth, the numpy 'trick' he used to demonstrate that feature also works on Windows.


Good point, and it makes me wonder why the code is attributed to dmr... it looks like it was written by ken. I suppose recreating such a repo can only be so accurate.



