They're not useable for anything serious, i.e. high throughput, low frequency, m...

delta_p_delta_x · 2026-06-13T20:20:39 1781382039

> anything serious, i.e. high throughput, low frequency, massively concurrent work

Why is only 'high throughput, low frequency, massively concurrent work' considered 'serious'?

Pannoniae · 2026-06-13T22:33:08 1781389988

They're just clearly inferior in pretty much any situation.

The map stuff the other posters summed up well but even std::vector is dogshit with pretty much all implementations having inlined grow code in push_back, a not too great API and missed optimisations e.g. no trivial relocation when growing the vector / moving it and no useful APIs such as "grow but don't initialise"...

jstimpfle · 2026-06-13T22:45:29 1781390729

To be fair grow-but-don't-initialize is a pretty fundamental part of the API, the reserve() method.

But already the basic premise that you should push back without thinking is wrong. You will suffer reallocations and invalisations when you least expected them, and frankly you have to architect around that fact which is a terrible restriction. You can work around by pre reserving but at that point it's just a basic fixed heap allocated array but worse because the type gives you a weird look all the time, "I'll realloc as soon as you don't pay attention, harhar"!

tialaramex · 2026-06-13T23:43:39 1781394219

std::vector lacks what I call the "bifurcated reservation API". It has Rust's Vec::reserve_exact but not Vec::reserve -- these APIs serve subtly different purposes, the former (which C++ calls just "reserve") says "Here's a hint for exactly how big this container will ever grow" while the latter says "Here's a hint for how big this container will get in my immediate future, but it might grow further later".

The implementation always tries to grow (if necessary) to the exact size chosen for Vec::reserve_exact, but for plain Vec::reserve if growth is needed it always grows exponentially, not to the exact size, preserving the O(1) push cost.

For a typical "doubling" growable array type, if we're pushing groups of ten items, reserve_exact or C++ grows like 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 ... which is much worse than O(1) whereas the correct reserve grows 10, 20, 40, 80, 160 preserving O(1)

For trivial types you can work around this in C++ with a little work, and for the non-trivial types you can work around it with a bunch more extra code, but you probably won't.

Bjarne among other people teaching C++ recommend just not using the reservation API as a reservation API because of this problem†, and the resulting teaching definitely leaks into CS graduates and even into languages which have the correct API and so you have to un-teach the bad lesson.

In applications where you actually can't afford to pay for growth (or at least in some cases can't afford this) I also like Vec::push_within_capacity which I believe comes from Rust for Linux where the kernel legitimately needs this "If there's room push, otherwise I have a plan B" approach.

† To Bjarne this API is instead conceived of as a way to preserve reference validity. Since it won't grow, our references will still work.

jstimpfle · 2026-06-14T06:53:08 1781419988

Vec::reserve() is the behaviour you get from C++ std::vector push_back() (implicit just-in-time reserve), so right now I can't see a situation where I'd want the explicit Rust version even if I didn't think the whole push realloc thing is mostly a bad idea.

Yes, Rust version allows you to maybe skip a reallocation step or two by doing explicit up front reallocation. But remember most allocation work is always from the last grow anyway. The Rust version seems like a microoptimization, giving a little bit more explicit control in a situation where you've already pretty much given up control and gone like, throw hands in the air, we're doing push_back()!

tialaramex · 2026-06-14T10:45:00 1781433900

As with the likely/ unlikely branch hint the problem is that programmers are wrong much more often than they expect on both sides. They're too often wrong to think they know the final size - hence Bjarne's caution - but they're also too often wrong that they've got no idea how much capacity they need at all. So hence this API.

You're correct that this isn't a huge optimization. But it more than pulls its weight directly because it's a small boon when you're right and it doesn't have the terrible penalty that Vec::reserve_exact has when inevitably the programmer is sometimes wrong. It's very much about saving pennies, but the growable array type is so widely used that counting pennies makes sense.

I have a lot more thoughts about reservation, but these suffice for specifically the growable array type.

jstimpfle · 2026-06-14T19:36:53 1781465813

There are many many situations where I know exactly what is the common path and what is the uncommon path. And when I don't know, it's much harder to optimize!

If you're counting pennies, Vec::reserve() (inexact) is hardly what you want, because in the worst case you're wasting a factor of 1.5x or 2x of elements due up-front overallocation. Maybe chunk lists could be better, overhead is bounded by chunk size and all operations are constant-time. No pointer invalidation either. And you can pool those chunks, preventing memory fragmentation and improving memory utilization, since there aren't a million different sized allocations in your process.

tialaramex · 2026-06-15T07:42:10 1781509330

> If you're counting pennies, Vec::reserve() (inexact) is hardly what you want, because in the worst case you're wasting a factor of 1.5x or 2x of elements due up-front overallocation.

I think this is "Trump math" but assuming we're actually looking at the same over-allocation this isn't new, what you're calling "wasting a factor of 2x" was already what you were paying by using the amortized O(1) growable array type, that's its central trade off, so yeah, the type with that property does still have that property.

> Maybe chunk lists could be better, overhead is bounded by chunk size and all operations are constant-time.

Basically std::deque. Surely no more need be said ?

jstimpfle · 2026-06-15T08:14:12 1781511252

> was already what you were paying by using the amortized O(1) growable array type

As I said, I don't think growable array data structure is a good idea.

> Basically std::deque. Surely no more need be said ?

Ok now you're really out of your depth here. Pretty sure you haven't actually used it, it's a somewhat obscure data structure even though the more serious C++ programmers have heard of it, most have probably never used it.

From a web search, std::deque seems to be pretty universally thought of as a weird and very bad data structure. I've had to learn myself because I did actually try to use it myself recently. Beyond being unintuitive to use due to being "abstract" (had 2-3 serious bugs, e.g. unexpected iterator invalidations that happened only under load), apparently std::deque is not even specified as a chunk list. O(1) random access must be guaranteed. So it must be much more complicated than that, I don't know the details though.

And while actual implementations do use chunks in some way (just not in a simple chunk list), apparently "chunks" is not a part of the std::deque spec, and hence there isn't anything standardized about the size of these chunks either.

- On MSVC, chunk size is 16 elements so chunk size of std::deque char is 16 BYTES !!!!! I was thinking they can't be serious, but apprently it is the case

- On GCC, chunk size is 512 byte fixed.

- On Clang, chunk size is 4096 byte fixed.

It _is_ a shithole, surely no more need be said?

tialaramex · 2026-06-15T13:14:02 1781529242

> As I said, I don't think growable array data structure is a good idea.

It is still quite hard for me to take this seriously as a belief.

> Ok now you're really out of your depth here.

I don't think so. As you found, the consensus is that this sort of thing is a terrible idea. The O(1) complexity looks good superficially but your actual performance is miserable.

> I don't know the details though.

Raymond Chen has a pretty good explanation of the three implementations of std::deque, like he did for their std::string but Raymond's goal is to help you do forensics, he's not here to tell us which data structures are a good idea. https://devblogs.microsoft.com/oldnewthing/20230810-00/?p=10... might work, or, given Microsoft, it might already be dead by the time you click it.

jstimpfle · 2026-06-15T14:47:58 1781534878

> The O(1) complexity looks good superficially

Without knowing more about the details of the spec and of real world implementations [0], I'll boldly claim that the O(1) is the precise problem why this is a weird data structure (I never needed or expected that sort of constraint). It is not a chunk list, like many (including me) seem to have assumed.

I don't even have performance problems measured with this, also given that my std::deque is not currently in production, it is used as a simple streaming queue that only has a couple megabytes/second of bandwidth requirements, and it is not being built with MSVC (so chunks are going to be 512 or 4096 bytes). But it is a latent problem.

[0] Btw. I don't want to know the messy details of this right now, I need simple not complex. I've learned not to add more complexity in a futile attempt to fix things that are fundamentally broken. What use case is this going to be solved by std::deque, name me one such use case and tell me with a straight face that it's not a completely made up case or a super niche case that would likely also be better served by a different approach.

Or take it from someone else if you would, is Chandler Carruth high-profile enough and free enough of conflict of interest to be credible for saying that std::deque is dumb? https://news.ycombinator.com/item?id=22962980

tialaramex · 2026-06-15T15:52:35 1781538755

> It is not a chunk list, like many (including me) seem to have assumed.

The thing C++ programmers tend to expect (though perhaps not you) is Rust's std::collections::VecDeque - the growable array type again but used as backing for a ring buffer.

This type has amortized O(1) push and pop at both ends, I assume since you didn't like the amortized growth of Vec / std::vector you'll feel the same way but that's what most programmers actually want from such a type - in use as a queue it won't repeatedly cycle allocations so long as the size is constrained, because it's a ring buffer. If you actually know the needed size up front the overhead from this structure, which can grow, versus a structure which can't grow, is a single integer for the capacity.

But as you saw, std::deque is... not that. STL wasn't able to explain what it's for when I asked him, so, if anybody knows it apparently doesn't include the people maintaining the C++ standard library implementations.

> Or take it from someone else if you would

I think the trouble here is that you've misconstrued this entire part of the conversation. I was gesturing at std::deque because it's terrible, and your response, that it's terrible, isn't a rebuttal as I think you're expecting. We agree on that, std::deque is terrible, our disagreement seems to be that you think a slightly different terrible data structure would be a good idea somehow and I do not.

jstimpfle · 2026-06-15T18:17:43 1781547463

I don't know the size up front because it goes up and down depending on load. I want to be able to buffer jitters/spikes but not hog memory when there is no data in the queue. I was assuming it's natural to want a simple linked list of chunks to represent this, maybe chunks can be partially full here. So chunk -> chunk -> chunk, and each chunk is actually a node with a buffer pointer + size + fill fields, or maybe the data comes directly after the size + fill fields so no buffer pointer needed.

That easily gives O(1) access to enqueue/dequeue, and the queue naturally grows and shrinks (even within configurable bounds) as part of reading and writing. Blocks can be taken and returned from/to some fixed size block pool. This design would be completely satisfactory. It's also very easy to code from scratch, so why should I settle for anything less, for example a growable vector where a reallocation would mean blocking other threads for extended time? But I was told to "not reinvent the wheel" and to "not overengineer it" so I tried to change my mind and make use of an STL container data structure.

> our disagreement seems to be that you think a slightly different terrible data structure would be a good idea

I don't know why you think that I would be thinking that, and I'm confused that you say you were gesturing at how std::deque is terrible, and not sure why I should be the one misconstruing something, when my premise from the beginning was "STL containers are terrible in many ways". And I don't get at all why you responded with "basically std::deque" to my "maybe chunk lists would better". It doesn't make the least bit of sense to me.

But let's settle this here, this is a pointless fight...

jstimpfle · 2026-06-14T06:31:53 1781418713

Feels like typical improvement/perfection tunnel vision syndrome to me, though. That syndrome is engrained in C++ community and I think also Rust community, although it looks like the latter took the chance to do many things better from the start learning from C++'s mistakes.

Realloc is pretty much never the right way in whatever form, and I've never seen any need to include realloc in any of my own allocators (mostly blockallocators and linear allocators and pools/free lists, sometimes using malloc/free).

jstimpfle · 2026-06-13T21:26:33 1781385993

You are free to make your own definition, what are your suggestions?

(Obviously I meant to say low latency, not low frequency)

delta_p_delta_x · 2026-06-14T10:05:49 1781431549

That is a cop-out. You made the assertion, you define it.

What about these particular workloads (and the environments they're used in) make them 'serious' and why are other workloads 'lesser' and therefore the standard library 'suffices'? Why not use better containers for everything? Google, for instance, universally recommends Abseil.

jstimpfle · 2026-06-14T19:06:38 1781463998

I stand by my statement. You are welcome to contend it, but then please actually suggest alternative workloads that could qualify as serious?

Google also recommends using Golang in many cases, which was explicitly designed for not-so-experienced people. It is more geared towards creating services quickly and robustly, not towards squeezing out the last 10x of performance.

I can't say anything about abseil, having never used it. "By Google" is not an immediate seal of quality, and it doesn't mean that one size fits all. I've recently seen a list of .so objects required in order link grpc (also by Google), there were like dozens of abseil .so's in the list IIRC. I don't like it! In my experience, validated countless times in my own practice, complexity => slow and hard to maintain and simplicity => fast and easy to change.

Templatized C++ containers are generic recipes, they can't make use of local context and hence don't lead to the simplest solutions. They produce tons of boilerplate. They give you a decent single-threaded baseline performance in many cases, sure, but now start hammering these data structures using 16 CPUs in parallel... you might find that you should have designed your application completely differently.

Like WalterBright says, I too use very simple data structures almost exclusively (arrays, linked lists, linear allocation). Most of these coded ad-hoc everywhere, very straightforward, very adaptable. Even hash maps I only use infrequently -- if I can, I just make keys such that I can index directly. I get performance from knowing exactly what happens, and minimize the work that needs to happen. I create immutable data objects (write-once) where possible. I've created my own pooled block-allocator for power of 2 sized allocations with book-keeping in shadow-memory (preventing fragmentation), massively reducing syscalls that modify virtual memory mappings and closely controlling memory used by various subsystems.

I don't expect a library like abseil to solve many of my problems, it probably makes some things "easy" by taking away the very control from me that I need to solve the problem I have, resulting in unsolved problems and complexity.

miroljub · 2026-06-13T22:13:42 1781388822

Because if you don't need any of these, any slop implementation will do.

stinos · 2026-06-14T13:59:50 1781445590

If you haven't been thinking deeply about memory management and concurrency, you won't be able to understand, no offense meant

I do think about that, when needed. My point is that these containers can be 'good enough' in places where it doesn't really matter, not that they're always the go-to thing. E.g. I really don't see any issue using a map as part of a configuration type of object which gets read from args or json and which only gets used once at application start.

jstimpfle · 2026-06-15T07:48:35 1781509715

Yes. When you don't have any specific requirements, any mediocre data structure will do.