talloc is based on my library called halloc, which stands for hierarchical allocator. The idea is that every malloc'ed block can be assigned a parent block. Freeing the parent block also frees all its children in a recursive fashion, and that's about it. I cooked up halloc several years ago while implementing an IKE (the IPsec keying protocol) library, and it proved to be very useful for reducing the verbosity of the code without losing any clarity.
Tridge fluffed up halloc to make the API a better fit and more convenient for Samba code, and talloc came out. I still like my version better though, because it is simpler and (subjectively) more elegant.
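To make the idea concrete, here's a minimal sketch of a hierarchical allocator along the lines described above. All the names (hblock, h_alloc, h_free) are made up for illustration; real talloc has a much richer API, and the live_blocks counter exists only so the behavior can be checked. Each block carries a header linking it to a parent, and freeing a block recursively frees its children:

```c
#include <stdlib.h>

static int live_blocks; /* for demonstration only: count of live allocations */

typedef struct hblock {
    struct hblock *parent;
    struct hblock *first_child;
    struct hblock *next_sibling;
} hblock;

static hblock *hdr(void *p) { return (hblock *)p - 1; }

/* Allocate size bytes, optionally attached to parent (may be NULL). */
void *h_alloc(void *parent, size_t size)
{
    hblock *b = malloc(sizeof *b + size);
    if (!b) return NULL;
    b->parent = NULL;
    b->first_child = NULL;
    b->next_sibling = NULL;
    if (parent) {
        hblock *pb = hdr(parent);
        b->parent = pb;
        b->next_sibling = pb->first_child;
        pb->first_child = b;
    }
    live_blocks++;
    return b + 1;
}

/* Free a block and, recursively, all of its children. */
void h_free(void *p)
{
    if (!p) return;
    hblock *b = hdr(p);
    hblock *c = b->first_child;
    while (c) {
        hblock *next = c->next_sibling;
        c->parent = NULL;   /* child need not detach from b mid-free */
        h_free(c + 1);
        c = next;
    }
    if (b->parent) {        /* detach from parent's child list */
        hblock **pp = &b->parent->first_child;
        while (*pp && *pp != b) pp = &(*pp)->next_sibling;
        if (*pp) *pp = b->next_sibling;
    }
    live_blocks--;
    free(b);
}
```

The point is how little machinery the core idea needs: a cleanup path becomes a single h_free(ctx) on the root context instead of a cascade of individual frees.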
No offense intended, but I have to say I was a little bit disappointed by halloc. I expected it to use a custom allocation scheme, like malloc()ing large chunks of memory in one go in order to reduce malloc() time and space overhead and to increase locality of reference. Instead it looks like it just malloc()s every object separately. I'm sure some people will find halloc useful, but in my C++ project where I already use smart pointers I didn't see any advantage in using halloc.
You appear to be mixing two things here. What you described is a form of "slab allocator", whereby all blocks in a certain size range are allocated from one big piece of memory that is sliced into equally sized chunks. This is really fast, it works great for smaller blocks, and it is the way to speed up STL-heavy code. I did some profiling a while back, and a lock-free slab allocator delivered a 10x speedup in code that operated on std::map and strings.
The second thing is locality. An allocator like halloc could theoretically exploit the fact that two blocks are explicitly marked as related and allocate them close to each other. However, for this to work the allocation function needs to be passed a pointer to the parent block, which is not the case with halloc. So the API needs to change, the implementation needs to change too, and then it will no longer be a lightweight implementation of a simple idea, but something else.
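For what it's worth, the slab scheme I mean is roughly this (a hypothetical, fixed-size sketch; the names slab_init/slab_alloc/slab_free are invented, and a real implementation would grow new slabs on demand and handle alignment more carefully). One big malloc is sliced into equal cells, and alloc/free become a couple of pointer operations on a free list:

```c
#include <stdlib.h>

typedef struct slab {
    void *free_list;  /* singly linked list threaded through free cells */
    char *memory;     /* the one big chunk */
} slab;

/* cell_size must be >= sizeof(void *) so a free cell can hold a link. */
int slab_init(slab *s, size_t cell_size, size_t ncells)
{
    s->memory = malloc(cell_size * ncells);
    if (!s->memory) return -1;
    s->free_list = NULL;
    for (size_t i = 0; i < ncells; i++) {
        void *cell = s->memory + i * cell_size;
        *(void **)cell = s->free_list;   /* push cell onto free list */
        s->free_list = cell;
    }
    return 0;
}

void *slab_alloc(slab *s)
{
    void *cell = s->free_list;
    if (cell) s->free_list = *(void **)cell;  /* pop */
    return cell;                               /* NULL when exhausted */
}

void slab_free(slab *s, void *cell)
{
    *(void **)cell = s->free_list;             /* push */
    s->free_list = cell;
}
```

No per-allocation header, no size lookup, no searching: that's where the speedup for small, uniformly sized objects comes from.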
Yes. For example, my app has a hash table which it fills upon every request. After this initial filling, no more hash table entries are added. At the end of the request the hash table is deleted, and a new one is created for the next request. I improved performance a bit by clearing the hash table instead of deleting and recreating it. What gave me an even bigger boost in performance was using a custom allocator that allocates big chunks of memory in a single operation.
The thing is my app is already not doing a lot of work per request so almost any speedup is noticeable.
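The "big chunks, throw everything away at the end of the request" pattern is essentially a bump-pointer arena. A minimal sketch, assuming a fixed capacity and no growth (the arena_* names are made up for illustration): each allocation is just a pointer increment, and arena_reset() reclaims everything at once for the next request.

```c
#include <stdlib.h>

typedef struct {
    char  *buf;
    size_t cap;
    size_t used;
} arena;

int arena_init(arena *a, size_t cap)
{
    a->buf = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->buf ? 0 : -1;
}

void *arena_alloc(arena *a, size_t size)
{
    size = (size + 15) & ~(size_t)15;   /* round up: keep results aligned */
    if (a->used + size > a->cap) return NULL;
    void *p = a->buf + a->used;
    a->used += size;
    return p;
}

/* Discard every allocation in one shot, keeping the chunk for reuse. */
void arena_reset(arena *a) { a->used = 0; }
```

There is no per-object free at all, which is exactly why it beats general-purpose malloc for this kind of request-scoped workload.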
You saw a benefit because you implemented a custom allocator whose behavior was optimized for the memory access pattern you planned to use with it. Any decent implementation of malloc already allocates large chunks of contiguous memory from the operating system and tries to maintain locality among allocations. It's also possible that some of the improvement came from the removal of a function call: the call to your custom allocator could be inlined, whereas malloc has to be a function call.
My understanding of halloc and talloc is that they are general purpose allocators. A pure library approach - that is, without source code analysis and code generation - won't be able to tailor the allocation scheme based on the data structures themselves.
When two blocks are explicitly tied into a parent-child relationship, it gives the allocator valuable locality information. It is another matter that halloc separates the allocation and block-binding steps, so the relationship information is not available at allocation time, and so it cannot exploit these locality hints.
So it's like NSZone in Objective-C. In Obj-C you can associate any allocated object instance with a zone (unless you want to use the default, runtime-provided zone); deallocating a zone will also deallocate all the instances. Useful for things like trees where you can deallocate every node just by deallocating the root.
Writing a pair of constructor/destructor is not a problem.
talloc() adds a layer of complexity that is unhealthy to C programs. Are the few seconds you save by using talloc() really worth the hours spent debugging memory leaks 6 months later, when the project has grown big and complex with multiple collaborators? No.
Introducing side-effects under the table is the best way to confuse other C programmers when they read your code. It's not worth it.
Memory management is part of C, and will always be.
If you don't like it, switch to another language.
What side effects were introduced "under the table"? talloc introduces entirely new functions, it doesn't override standard malloc/free. All of the new behaviors are rather explicit.
As for large projects with collaborators: if the project doesn't already have coding standards or a code review process for new members, then it's already headed down a bad path. And anyone blindly introducing talloc into an existing project without first getting agreement from the other participants and taking steps to smooth the integration is irresponsible.
The issues you give aren't reasons to dismiss talloc. Bugs, implementation flaws, and design limitations are valid reasons. By your argument, anything layered on top of the C standard library should be avoided.
I understand that talloc adds complexity, but what large project with multiple collaborators does not already have some memory management policy?
C forces you to explicitly communicate who is responsible for allocating and freeing each piece of memory. Isn't talloc just providing a mechanism that helps with this?
If talloc is part of a project's memory management policy, contributors not using it consistently are by definition not following the policy consistently. This would be a problem regardless of the mechanism.
You get kmalloc() and kfree(), which are very similar to malloc() and free(). (Okay, you also get kmem_cache_alloc()/kmem_cache_free(), which give you the ability to provide a ctor/dtor, but they are explicit.)
talloc() does too much: the day you have some memory corruption or a leak, you'll start being scared of using talloc().
talloc() is a structured way of doing memory management within C. It's no different than all the other ad-hoc structured (or more often unstructured) memory management schemes that evolve within C programs, except that its semantics are at least documented.
talloc() is memory management. As are other, alternative memory allocation strategies, chosen for better performance in certain applications. The notion that there is only one right way to do things is quaint, even if you're talking about Python.
This feels slightly untrustworthy. What if there are still external references to subfields of a struct? I think it's probably safer to just write explicit constructors (that allocate and initialize) and destructors (to free properly) for the things you use.
I have used talloc for a medium-sized project, and it was quite refreshing. Of course, it all depends on your program's structure. A hierarchical allocator naturally fits hierarchical data structures and containers, so parsers, for example, can benefit immensely in my experience.
You are correct in that you can just write constructors and destructors, but then you still need to worry about error handling, double-frees and whatnot. With an allocator like talloc, you just state the relationships during the allocation, and afterwards only worry about the high-level structure.
One thing which can be somewhat cumbersome is the need to pass around those "contexts"; for example, if you allocate an array of some struct, and you need to allocate additional memory inside that struct, your context will presumably be the array, and so you need to have access to it. One solution I have found which is helpful in most, but not all, situations is the wonderful container_of macro, which lets you recover a pointer to the containing struct from a pointer to one of its members using some pointer trickery. And of course, it does cost some performance, as they say on their homepage.
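For anyone who hasn't seen it, container_of (as found in the Linux kernel) is just offsetof arithmetic. The widget struct below is a made-up example; given a pointer to one of its members, the macro recovers a pointer to the struct that embeds it:

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical example struct. */
struct widget {
    int  id;
    char name[16];
};

/* Given a pointer to w->name, container_of(p, struct widget, name)
 * subtracts the offset of 'name' within 'struct widget', yielding
 * the address of the enclosing widget itself. */
```

In the talloc scenario above, this is what lets code holding only an inner pointer climb back out to the enclosing struct that serves as the context.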
I invariably find myself doing this in C projects because not doing it is a serious violation of DRY--writing possibly hundreds of lines of free (foo->bar); free (foo->baz); free (foo->quux->fnord); ... ; free (foo) is not only hugely error prone, but a waste of everyone's time.
One tricky thing here is that this does break the normal memory management semantics. That is, while one nice thing about malloc/free is that you can drop pretty much any implementation of a user-space memory allocator in there without changing your code, this is no longer the case. Whatever replacement allocator you use will have to change its interface, and probably its data structure, to fit the new model.
Edit: Also, how is this better than doing something like this
Even LGPL is a pretty steep price for the value being offered. The LGPL requires that it be possible to substitute a modified version of the library, which can be a logistical problem. Do you really want to have to link your malloc() as a shared library? How do you let the user substitute inline functions? What do you do when people start bugging you because they weren't able to substitute a modified version of the library due to some technical edge case?
Too much legal baggage for such a small piece of functionality.
I wouldn't call talloc a small piece of functionality. Also, the LGPL explicitly states that dynamic linking suffices, and it also explicitly covers the inclusion of bits from header files. I do agree that the LGPL makes it non-trivial to statically link a proprietary application with the library; personally, I think it would help if the FSF had standard exception language that libraries could use if they don't care about relinking and only care about requiring source for modified versions of the library. That said, I don't write proprietary software, so that doesn't affect me. :)
> Also, the LGPL explicitly states that dynamic linking suffices
Good point, I thought that the requirement was substitutability and dynamic linking was just given as an example of how a library can be substitutable, but it appears that dynamic linking is sufficient.
> and also explicitly covers the inclusion of bits from header files.
The exact meaning of these sections isn't clear to me; if including bits from header files is truly allowed, then you could effectively circumvent the substitutability goal by simply inlining the whole library.
> That said, I don't write proprietary software, so that doesn't affect me. :)
I don't either, but I write a BSD-licensed library that I absolutely want to be usable with proprietary software without any difficulty.
> Good point, I thought that the requirement was substitutability and dynamic linking was just given as an example of how a library can be substitutable, but it appears that dynamic linking is sufficient.
You can satisfy the requirement without dynamically linking (by shipping object files or source code instead), making dynamic linking sufficient but not necessary. In practice, though, nobody seems to attempt anything other than dynamic linking against an LGPL library, or shipping enough source code for static linking (even if not all the source code).
> It's not clear to me exactly the meaning of these sections; if including bits from header files is truly allowed, then you could effectively circumvent the substitutability goal by simply inlining the whole library.
Substitutability represents a technical goal, and the library has to support that goal for it to prove useful. If the library provides the entirety of its functionality via inlines in header files, they clearly don't care about substitutability. And if you modify the library to move all of it into header files, you still have to supply the source for the modified version of the library, so you've broken substitutability but you haven't broken the copyleft. As far as I can tell, the LGPL's goal of substitutability seems primarily aimed at cases where you don't intend to substantially modify the library; after all, you can also change the library's API/ABI and thus make the original no longer substitutable. (Personally, I don't really see the point of the substitutability provisions anymore; I really just want a copyleft on the library itself.)
> I don't either, but I write a BSD-licensed library that I absolutely want to be usable with proprietary software without any difficulty.
Fair enough. However, in that case the copyleft on the LGPL software seems equally onerous, since users of your library would need to ship the source of talloc, even though they don't need to ship the source of your library. (One other interesting provision: the users of your library couldn't use a license that prohibits reverse-engineering.)
In any case, yes, talloc's license effectively makes it unusable by libraries that want to provide an all-permissive license. If you really want to use talloc, you might consider writing to the authors of talloc; they might prove sympathetic to the needs of a BSD-licensed library, and grant a suitable exception. (Personally, I'd like to see some standard wording for an exception similar to the one used in UPX, namely that you can use it under an all-permissive license as long as you use an unmodified version.)
Yes, and you can also have multiple parents with a reference-count semantics.
void * talloc_steal(const void * new_ctx, const void * ptr);
The talloc_steal() function changes the parent context of a talloc
pointer. It is typically used when the context that the pointer is
currently a child of is going to be freed and you wish to keep the
memory for a longer time.
NOTE: It is possible to produce loops in the parent/child relationship
if you are not careful with talloc_steal(). No guarantees are provided
as to your sanity or the safety of your data if you do this.
There's also talloc_reparent where you explicitly choose the parent to change.
Or you could use C++ and RAII. It's an excellent fit for the problem that's been presented.
Though I do understand there are some very special projects that can't use C++ and must use C. I'd say in most cases, the extra machinery you get with C++ outweighs the small performance penalty relative to straight C.
FWIW, with C++, the advantages of linking to tcmalloc make it the default for any code that I write that isn't intended to run as a daemon -- and even there, unless I hit tcmalloc's limit of not releasing memory back, it's still a viable choice.
This is exactly the question I was asking myself.
What if the child block needs to be deallocated before the parent block? What if the block dependency changes?
These functions are meant to be used under particular conditions which aren't clearly specified. When the conditions are met, using them is a clear benefit. But they add complexity and new pitfalls for inexperienced programmers.
I don't think it deals with cycles at all (it's hierarchical, after all), but talloc does allow you to do funky stuff with object ownership - it uses reference counting and features a talloc_steal() function to move ownership.
Apple doesn't advise developers to avoid ref counting, it advises them to avoid using -retainCount. -retainCount is a weak debugging tool, at best. It doesn't take into account autorelease pools, to start, and any framework-specific memory management is ignored. Its use is discouraged for good reason.
For better tools, check out Instruments, MallocDebug or leaks. Or read Apple's Memory Usage Performance Guidelines, specifically the "Finding Memory Leaks" section.  Apple's Technical Note TN2124 "Mac OS X Debugging Magic" even has, in the Cocoa section, an example of how -retainCount can be misleading and confusing. 
Nowhere does Apple advise Cocoa developers to avoid ref-counting. Cocoa's ref-counting/GC'd memory management system is considered a huge advantage for the platform.
Are you mixing up garbage collection and reference counting? I can understand a suggestion to turn off garbage collection if you want to debug a non-garbage-collected, reference counted piece of code. Otherwise, a link to documentation which describes what you are talking about would clarify things.