
Mimalloc – A compact general-purpose allocator - dmit
https://github.com/microsoft/mimalloc
======
danlark
We tried mimalloc in ClickHouse and it is two times slower than jemalloc in
our common use case
[https://github.com/microsoft/mimalloc/issues/11](https://github.com/microsoft/mimalloc/issues/11)

~~~
MaxBarraclough
To be clear, your program ran at half speed, right? That's far worse than
doubling the time spent in memory-management functions.

~~~
danlark
Yes, our program ran at half speed.

------
nh2
Are there functions available with which I can at run-time query how much OS
memory is used, how much handed out in allocations, how many mmap()ed pools
are used, and so on?

I find that one of the most important features of a malloc library to debug
memory usage.

glibc has these functions (like malloc_info()) -- they are quite buggy in that
they return wrong results, but after patching them to be correct, they are
super useful.
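
For readers who have not used it, here is a minimal sketch of calling the glibc function mentioned above. It assumes a glibc system; `dump_heap_stats` is a hypothetical wrapper name, not part of any library, and the numbers in the output are subject to the bugs described in this comment.

```c
#include <assert.h>
#include <malloc.h>  /* glibc-specific header that declares malloc_info() */
#include <stdio.h>

/* Hypothetical wrapper: writes glibc's XML description of the malloc
   arenas to `out`. Returns 0 on success and -1 on failure, which is
   simply what malloc_info() itself returns. */
int dump_heap_stats(FILE *out)
{
    assert(out != NULL);
    return malloc_info(0, out);  /* the options argument must be 0 */
}
```

Calling `dump_heap_stats(stdout)` prints an XML document with per-arena sections; the exact fields vary between glibc versions.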

~~~
jasonzemos
Any more info/link on how they are incorrect and the patches you need to fix
them?

~~~
nh2
Sure, my patches are linked in the bugs I filed:

[https://sourceware.org/bugzilla/show_bug.cgi?id=24026](https://sourceware.org/bugzilla/show_bug.cgi?id=24026)
\- "malloc_info() returns wrong numbers"

[https://sourceware.org/bugzilla/show_bug.cgi?id=21556](https://sourceware.org/bugzilla/show_bug.cgi?id=21556)
\- "malloc_stats printing size_t fields as unsigned int"

If you're interested in this topic:

After finding one bug after the other due to 32-bit integer overflow in
malloc.c, I just searched for "unsigned int" in that file for fun, and 30
seconds later found what I consider a security vulnerability in realloc():

[https://sourceware.org/bugzilla/show_bug.cgi?id=24027](https://sourceware.org/bugzilla/show_bug.cgi?id=24027)

In certain situations, if you realloc() e.g. 32G + 5 bytes (reallocs this
large can happen in large programs e.g. for data analysis), it'll copy only 5
bytes, and leave the rest as memory garbage.
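
The arithmetic behind "copies only 5 bytes" can be sketched as follows. This is not the glibc code, only an illustration that assumes the length passes through a 32-bit unsigned variable somewhere along the way; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models a byte count stored in a 32-bit field.
   The conversion keeps only the low 32 bits of the requested size. */
uint32_t truncated_length(uint64_t requested)
{
    return (uint32_t)requested;
}

/* Hypothetical helper: how many bytes would be left uncopied if the
   copy used the truncated length instead of the requested one. */
uint64_t bytes_left_uncopied(uint64_t requested)
{
    uint64_t copied = truncated_length(requested);
    assert(copied <= requested);  /* truncation can only shrink the value */
    return requested - copied;
}
```

For a request of 32 GiB + 5 bytes (2^35 + 5), `truncated_length` returns 5, because 2^35 is a multiple of 2^32, and `bytes_left_uncopied` returns the full 32 GiB.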

That experience taught me that open source code being old doesn't mean anybody
ever read it.

~~~
rurban
ptmalloc2 is only maintained by glibc, but not really. Since they cannot
maintain it, they want to get rid of it. The upstream maintainer released a
better version, ptmalloc3, which glibc refused to adopt. It needs one more word
per alloc.

------
c-smile
Looks like the same idea as Konstantin Knizhnik's thread_alloc:

[http://www.garret.ru/threadalloc/readme.html](http://www.garret.ru/threadalloc/readme.html)

At least, it has the same architecture for managing allocated chunks.

------
shereadsthenews
I always find comparisons with tcmalloc hard to parse, since it has a million
knobs and the defaults are terrible. If they are running with 16 threads I
would normally advise increasing the thread cache size far above the default
3MiB. Also interesting would be jemalloc in per-CPU mode.

As always the thing to do is build and run your own workload and see the
results.

------
huhtenberg
The tricky part with allocators is always the multi-threaded setups.

Even something as simple as a bunch of threads doing malloc-free in a loop
will drop performance of a lot of allocators to the floor, due to some sort of
central locking or excessive cache thrashing. This is typically solved by
adding per-thread block pools, free lists or some such.

If you go further down the rabbit hole, there's the case where blocks are
allocated in one thread and freed in another, your very typical
producer-consumer setup. This further complicates the pool/freelist setup and
requires periodic rebalancing of freelists and pools.

So once all this is accommodated, a well-tuned allocator inevitably converges
to a model with central slabs/pools/freelists and per-thread caches of the
same, which are periodically flushed into the former. Then it all comes down
to routine code optimization to make fastpaths fast, through lock-free data
structures, some clever tricks and what not.

In other words, it's always nice to read through someone's allocator code, but
in the end this is a very well-explored area and there's basically a single
stable point once all common scenarios are considered.
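
The "central pool plus per-thread caches" model described above can be sketched roughly as follows. This is a deliberately tiny illustration, not how mimalloc or any other real allocator is implemented: it assumes a single block size, uses a plain mutex instead of lock-free structures, and never returns memory to the OS. All names (`pool_alloc`, `pool_free`, `CACHE_MAX`, and so on) are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 64   /* every block has the same size in this sketch */
#define CACHE_MAX  32   /* blocks a thread may hold before it flushes */

typedef struct block { struct block *next; } block_t;

/* Central pool: shared by all threads, guarded by a mutex. */
static block_t *central_head = NULL;
static pthread_mutex_t central_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread cache: only ever touched by its owning thread. */
static _Thread_local block_t *cache_head = NULL;
static _Thread_local size_t cache_count = 0;

void *pool_alloc(void)
{
    block_t *b = cache_head;
    if (b != NULL) {                      /* fast path: no lock taken */
        cache_head = b->next;
        cache_count--;
        return b;
    }
    pthread_mutex_lock(&central_lock);    /* slow path: try the central pool */
    b = central_head;
    if (b != NULL)
        central_head = b->next;
    pthread_mutex_unlock(&central_lock);
    return b != NULL ? (void *)b : malloc(BLOCK_SIZE);
}

void pool_free(void *p)
{
    assert(p != NULL);
    block_t *b = p;
    b->next = cache_head;                 /* fast path: push onto local cache */
    cache_head = b;
    if (++cache_count <= CACHE_MAX)
        return;

    /* The cache is over its limit: detach the newest half and splice it
       onto the central pool, so that blocks freed by this thread (for
       example a consumer) become available to other threads again. */
    size_t n = cache_count / 2;
    block_t *first = cache_head, *last = first;
    for (size_t i = 1; i < n; i++)
        last = last->next;
    cache_head = last->next;
    cache_count -= n;

    pthread_mutex_lock(&central_lock);
    last->next = central_head;
    central_head = first;
    pthread_mutex_unlock(&central_lock);
}

/* Number of blocks currently held in the calling thread's cache. */
size_t pool_cached_blocks(void)
{
    return cache_count;
}
```

The lock is only taken when the local cache is empty or overflowing, which is the property the comment describes; the producer-consumer case is handled by the overflow flush in `pool_free`.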

~~~
bakery2k
What if I'm writing code that's strictly single-threaded? Presumably malloc
could be simpler and/or faster in this special case?

Is there a production-ready allocator that's optimized for single-threaded
use?

~~~
gok
It's very hard to not accidentally pull in a dependency that makes your code
multithreaded.

~~~
01100011
You guys could argue this all day and both be right. Depending on what type of
system you're working on, you are either very likely to pull in a
multithreaded dependency or you are not. Systems programming is different from
game programming is different from web client programming is different from
kernel development. In most of my career, I would have never accidentally
pulled in a multithreaded dependency, but I could see how that could be easy
to do in some cases.

------
fwip
The benchmarks are very impressive! I am excited to read through this code and
think on it.

Edit: They do mention they're all from AMD's EPYC chip, which is a little
idiosyncratic. Speculation: perhaps page locality is more important on this
architecture.

~~~
the_duke
The benchmark repo contains results with an Intel Xeon.

Looks roughly similar:
[https://github.com/daanx/mimalloc-bench](https://github.com/daanx/mimalloc-bench)

------
longcommonname
Just a general question about using memory allocators, in the context of a
C-only application.

The problems I encounter with allocators and heap managers are almost never
solved by these types of frameworks. These problems include:

1. Improper usage of the returned memory that contradicts the implementation.
2. Pool allocators that don't have separation between individual blocks
   (performance reasons).
3. Specifying the lifetime of the memory to a thread or until specific events
   happen.
4. Corruption that is difficult to diagnose with any available tool.

Here's a specific scenario I deal with very often: There are N persistent
worker threads. These worker threads have their own pool of memory, and prior
to getting work we know this pool is clean. After the work is finished and
before more work is received the memory is cleaned. Any excess requested
memory is returned to the global-pool, and any memory that is "unmanaged" is
dealt with properly.

This means that people can make whatever heap management call you use
(void *obtainMemory(size_t);) in the scope of business logic without having to
worry about infrastructure concerns.
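
A rough sketch of that scenario, assuming a fixed-size per-thread bump pool that is wiped between work items. Only the `obtainMemory` signature comes from the comment above; `POOL_BYTES`, `resetWorkerPool`, and `workerPoolUsed` are hypothetical names, and the step of returning excess memory to a global pool is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_BYTES 4096  /* hypothetical per-worker pool size */

typedef struct {
    _Alignas(max_align_t) unsigned char bytes[POOL_BYTES];
    size_t used;         /* bytes handed out since the last reset */
} worker_pool_t;

/* One pool per worker thread; no locking is needed to use it. */
static _Thread_local worker_pool_t pool;

/* Bump allocation from the calling worker's pool.
   Returns NULL when the request does not fit. */
void *obtainMemory(size_t size)
{
    assert(pool.used <= POOL_BYTES);
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;
    if (rounded < size || rounded > POOL_BYTES - pool.used)
        return NULL;     /* size overflowed while rounding, or pool is full */
    void *p = pool.bytes + pool.used;
    pool.used += rounded;
    return p;
}

/* Called by the infrastructure between work items: zeroes everything
   that was handed out, then makes the whole pool available again. */
void resetWorkerPool(void)
{
    memset(pool.bytes, 0, pool.used);
    pool.used = 0;
}

/* Bytes currently handed out from the calling worker's pool. */
size_t workerPoolUsed(void)
{
    return pool.used;
}
```

Business logic only ever calls `obtainMemory`; there is no per-allocation free, because the worker loop calls `resetWorkerPool` once the work item is finished.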

Having a faster malloc/calloc doesn't benefit me as much as making the usage
of memory easier, and the understanding of what happens easier.

------
civility
Is anyone aware of a good/fast single threaded allocator for cases where you
don't need/want to pay for thread safety?

~~~
kllrnohj
If you're single threaded then you'll never have mutex contention so they'll
always be fast-path. I'd suggest you actually prove that the memory barriers
are actually a problem for you via profiling, since it's somewhat unlikely it
is.

~~~
adwn
> _If you're single threaded then you'll never have mutex contention so
> they'll always be fast-path._

Concurrent algorithms (to which multi-thread capable allocators belong)
typically necessitate design and performance compromises which aren't
nullified by creating only a single thread.

> _I'd suggest you actually prove that the memory barriers are actually a
> problem for you via profiling, since it's somewhat unlikely it is._

civility's reply to your post is somewhat rude, but they're right. It's rather
arrogant to assume that the poster doesn't know what they're doing, without
knowing anything about their specific problem. Modern allocators are complex
pieces of software, typically with a lot of knobs and dials, and it is
entirely plausible that an allocator tuned for single-threaded programs is
more performant for a specific use case than a generic multi-thread allocator.

~~~
kllrnohj
You can get large gains tuning an allocator for a specific use case, yes, but
that's very different from tuning it for thread safety. Particularly since
most major allocators use a first-level allocator that's thread-local and is
therefore not paying any thread-safety tax in the first place.

> It's rather arrogant to assume that the poster doesn't know what they're
> doing

Given the question they asked I don't believe it was arrogant at all to assume
they don't really know what they were doing. Their response seems to justify
the push to start from the basics as well.

~~~
civility
Nice job - you've succeeded in being just another condescending asshole.

~~~
dang
It's true that shallow comments are frustrating, but aggressive ones are
worse. If you keep breaking the site guidelines like you did repeatedly in
this thread, we're going to have to ban you. Can I persuade you, instead, to
review
[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)
and use HN as intended? It sounds like you know a lot, and we want
knowledgeable users. It's just that we want not to have a tirefire wreck of a
website more—so we have no choice but to put out flamewars.

~~~
civility
Dan, go fuck yourself. By letting people get away with snarky condescending
tones and banning people who call them out on it, you're encouraging shitty
behavior. "Don't be snarky" is right up top in the comments section of the
guidelines, but it's easier for you to chastise me because I called the snarky
guy an asshole. That's just lazy on your part.

Aggressive comments aren't worse, they're just easier for you to recognize.

~~~
dang
You're right that "Don't be snarky" is high up in the comment guidelines, but
even before that comes "Be kind." Being unkind is worse than being snarky. I
don't mean morally worse, but worse in the long-term effect it has on the
forum. The guidelines aren't moral edicts, they're heuristics designed to
prevent the system from burning out. Snark is bad, but aggression is worse: it
leads to flamewar and eventually, as people keep upping the ante, to scorched
earth. It's in that sense that what you did was worse than what you were
reacting to. Snark is like vandalism—it eventually wrecks a neighborhood. But
aggression is like arson, or gun battles in the streets.

We care as much as you do about snarky and condescending comments. I
personally share your feeling of being even more averse to those than to the
cruder abuses. But for a bunch of reasons, they're harder to moderate. Here's
one: people's interpretations of what counts as snark or condescension vary
widely. There is little community consensus around this, and moderation can
never get too far ahead of the community view. We have to choose our battles
wisely if we don't want to spark backlashes, protests, and off-topic
distractions that render the cure worse than the disease.

I can't moderate based on my personal views. That's the reason why your advice
wouldn't work. We couldn't moderate based on your personal views either. It
isn't laziness—it's that you can't impose an individual interpretation set on
the community as a whole. People assume we do, but that's only because they
haven't learned about moderation the hard way. Moderation is extremely
different from extrapolating one's likes and dislikes into site rules and then
using power to enforce them.

Does that mean doing nothing about snarky and condescending comments? Hardly,
and if you read my comment history (not that I recommend it), you'll find
plenty of examples of asking people not to do that. But they're ad hoc and I
try to be careful not to demand too big a leap from the reader.

The plan is, over time, to raise the bar for comments so that gradually the
snark and shallow dismissals stand out more prominently as abusive, the way
that "you're an asshole" comments do now. Then it will be possible for
moderators to do more about them. But this needs to happen slowly,
frog-boilingly slowly. If we push too hard, the community balks and moderation
loses power. We can only do what the community will support. We can lead—but
only a bit at a time. Even if
[https://news.ycombinator.com/item?id=20252539](https://news.ycombinator.com/item?id=20252539)
was trite and unhelpful, I guarantee you that the hivemind is not yet ready to
support moderator intervention at that level. It needs to get more refined
before that is possible. That's the long-term hope, but it will take years if
not decades to get there.

~~~
civility
I appreciate your reply, but I don't believe the vandalism and arson metaphor
fits. There are just people you let get away with crap and ones you don't. His
goal was to antagonize me while flying under the radar, and he succeeded. My
goal was to antagonize him in retribution, and I failed.

So where are we now? I'll go back to reading the headlines instead of the
comments because I'm not clever enough to keep people like that from getting
under my skin, and he'll go on to piss off someone else the next chance that
comes up.

Anyways, I regret snapping at you. Take care.

~~~
dang
> There are just people you let get away with crap and ones you don't.

It's really a comment-by-comment thing rather than a people thing. If you're
seeing cases where we're failing to moderate a post that cries out for it, the
likeliest explanation is that we didn't see it. We'd appreciate links, because
we can't come close to seeing everything here. Or of course you can flag the
comment (described at
[https://news.ycombinator.com/newsfaq.html](https://news.ycombinator.com/newsfaq.html)).
In egregious cases, emailing hn@ycombinator.com is best because then we're
guaranteed to see it sooner.

> His goal was to antagonize me while flying under the radar

Really, are you sure? Can you point to the comment that demonstrates this?
Because either I missed something obvious or you're reading perhaps a bit too
much into what was posted. Intent is notoriously difficult to read accurately
in these posts, as I'm sure you know.

~~~
civility
> Really, are you sure?

Pretty sure, and user adwn commented on it too. It's a pretty common pattern
to not answer the question and then indicate it was a dumb question in the
first place. Then he doubled down on the acceptable insults in the follow-on
response.

I'm sure I'm oversensitive to crap like this, but I don't think I'm wrong to
notice it.

I wonder if HN would ever add a feature to block/hide users others don't want
to see, something per reader. I don't see a browser addon for it, and while
I'm sure it could be done purely client-side, you might find stats about who
is being blocked and how often interesting if it was done on the server. Heh,
I'm sure I would end up on a few people's list, but it doesn't seem likely
that we're all just going to get along anytime soon.

Anyways, thank you again for the polite reply.

~~~
dang
You're welcome! and likewise.

I feel like we probably shouldn't add a block/hide/killfile feature because it
would be a step back into the siloed style of forum
([https://hn.algolia.com/?query=by:dang%20siloed&sort=byDate&d...](https://hn.algolia.com/?query=by:dang%20siloed&sort=byDate&dateRange=all&type=comment&storyText=false&prefix=false&page=0)).
That would feel easier in the short term but would perhaps be a retreat from
the hard problem of building a community that actually works. The thing about
that task is that it's always deeply unsatisfying and frustrating. One can
only fall short. Yet it seems like the right task to be working on. Perhaps in
the end, as Bob Dylan put it, we win the war after losing every battle. Or
perhaps it just falls back into the swamp. The ambition of HN has always been
to maybe stave off doom a while longer
([https://hn.algolia.com/?query=by:dang%20stave&sort=byDate&da...](https://hn.algolia.com/?query=by:dang%20stave&sort=byDate&dateRange=all&type=comment&storyText=false&prefix=false&page=0)),
or as pg put it years ago, "make a conscious effort to resist decline"
([https://news.ycombinator.com/newswelcome.html](https://news.ycombinator.com/newswelcome.html)).

------
m0zg
The important thing about all this is to measure perf on realistic workloads
before and after. I don't really believe in allocators that have "excellent
performance" on everything.

------
ksec
The devs at Discourse also tried it with Ruby; the results aren't as good as
with jemalloc. [1]

[1]
[https://twitter.com/samsaffron/status/1143048590555697152](https://twitter.com/samsaffron/status/1143048590555697152)

------
tuananh
The Redis benchmark is interesting! Maybe antirez can make something out of it.

------
john-aj
I never like names that require a “pronounced like” note, but cool project
regardless.

~~~
simias
Given the rather unpredictable nature of English pronunciation that's
basically required for any made up word.

~~~
jing
Particularly since it would not be unreasonable to assume that the "mi" in
mimalloc is incorrectly pronounced like the "mi" in Microsoft.

------
PieUser
they should not have tested on AWS

------
brian_herman__
I should create memealloc which converts all memory allocations to base64
encoded gifs

~~~
Iwan-Zotow
with cats and unicorns inside

