Are there any good benchmarks which demonstrate the performance characteristics ...

xfs · on May 10, 2023

https://youtu.be/_qaKkHuHYE0 (CppCon) A senior software engineer at Google tried to optimize tcmalloc by replacing a mutex with a lockless MPMC queue. After many bugs and tears, the result was not statistically significant in production systems.

gpderetta · on May 11, 2023

A fully lock-free allocator would be a huge result by itself: you would be able to allocate from a signal handler, have truly lock-free algorithms that do not need custom allocators, or avoid deadlock-prone code in tricky parts like dlclose...

But I guess this was only one part, and malloc wouldn't be fully lock-free.

In any case, the lock-free malloc designs I have seen use NxN queues to shuffle buffers around, but I guess that would be unsuitable for a generic malloc.
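
For readers unfamiliar with the NxN layout, here is a rough sketch of one way to read it: every (freeing thread, owning thread) pair gets its own queue, so each queue has exactly one writer and one reader. The class and method names are hypothetical, and this is not taken from any specific allocator. Python's `deque` only stands in for a real single-producer/single-consumer ring buffer; the sketch shows the routing, not the atomics.

```python
from collections import deque


class CrossThreadFreeLists:
    """Hypothetical N x N grid of queues for handing freed blocks back to their owner."""

    def __init__(self, n_threads):
        # queues[src][dst] is written only by thread `src` and read only by thread `dst`.
        self.queues = [[deque() for _ in range(n_threads)] for _ in range(n_threads)]

    def remote_free(self, freeing_thread, owner_thread, block):
        # A thread freeing a block it does not own pushes it onto its own row,
        # in the column of the owning thread.
        self.queues[freeing_thread][owner_thread].append(block)

    def reclaim(self, owner_thread):
        # The owner drains its column: one inbox per possible freeing thread.
        reclaimed = []
        for row in self.queues:
            inbox = row[owner_thread]
            while inbox:
                reclaimed.append(inbox.popleft())
        return reclaimed
```

The number of queues grows as N squared, which may be part of why this looks unsuitable for a general-purpose malloc with an unbounded number of threads, though that is a guess on my part.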

bluGill · on May 10, 2023

This is impossible to do in a useful way for publication. You can do case studies, but changes in factors that seem minor can make a massive difference in benchmarks. As such, you need to find real-world data, used in a real-world scenario, for your application, and then benchmark it. Even then you have a benchmark useful for your application only, and not worth publishing.

kerkeslager · on May 10, 2023

I disagree. I've gotten a lot from reading publications about people profiling their applications and posting about the results with descriptions of their application design and load. Yes, obviously you can't read that and draw conclusions about how the same data structure or algorithm will perform in your application, but it helps you build an intuition for what is likely to work in applications with different characteristics.

Here's an example: https://queue.acm.org/detail.cfm?id=1814327

If you read that for the first time and don't learn something, I'd be quite surprised.

josephg · on May 11, 2023

Thanks for the link! I read that about a decade ago, and it's still a great read. That article makes it seem ridiculous to implement software any other way.

And yet, yesterday I watched a talk about how ridiculous it is to use mmap to implement a database:

https://db.cs.cmu.edu/mmap-cidr2022/

And I ended up sidetracked watching some talks from CMU about database page buffers and such.

It's weird how people with different backgrounds (OS / kernel people and database people) come to very similar problems in computing with very different perspectives, and they end up implementing very different systems as a result. And in each case, each community thinks the other way is basically wrong.

My understanding is that Linus Torvalds thinks O_DIRECT is a ridiculous flag that database people probably don't want. And from a database perspective, it's crazy how difficult the Linux kernel makes it to write high performance filesystem code that never corrupts data if the system crashes. fsync is a misshapen sledgehammer, and disk write barriers or IO completion events are totally missing from Linux.
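
To make the "sledgehammer" point concrete, here is a minimal sketch of the usual write-temp-file, fsync, rename, fsync-directory dance for replacing a file so that a crash leaves either the old or the new contents. The function name `durable_write` is made up for illustration, and this assumes a POSIX system and a filesystem where rename is atomic; it is a sketch, not a vetted durability layer.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data`, aiming for old-or-new contents after a crash."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # move Python's buffer into the kernel
            os.fsync(tmp.fileno())  # ask the kernel to flush the file to stable storage
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory too, so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that there is no way here to say "write B only after A has landed" without blocking on a full fsync of A first, which is the kind of missing ordering primitive the comment is complaining about.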