First, EASTL provides a set of containers, iterators, and algorithms that are identical in interface and behavior to std STL versions with one exception: allocators. EASTL has a different allocator specification which is simpler, more efficient, more flexible, and easier to use than std::allocator. Both std::allocator and eastl::allocator are described below. EASTL follows the defect reports and TR1 as well. EASTL additionally provides and uses some of the TR1 functionality as well, including most significantly the smart pointers, type traits, and hashing containers.
Second, EASTL provides extension functionality to the above containers, iterators, and algorithms. An example of this is the push_back(void), set_capacity(size), and validate() functions added to eastl::vector. The majority of these extensions are driven by the need for higher performance (push_back(void)), higher clarity (set_capacity), or higher debuggability (validate()). There are about 30 such extensions to the various entities in the library. These are described in detail below.
Third, EASTL provides additional containers and algorithms that don't correspond to std STL. These include intrusive_list, vector_map, fixed_hash_set, slist, ring_buffer, radix_sort, has_trivial_relocate, and others. These are described in detail below.
There are additional differences not related to the functional specification. They include a programming philosophy that emphasizes readability, consistency, and optimal performance on limited hardware.
Other nice additions are the fixed-sized containers (fixed_vector<> and friends) and the somewhat-enhanced string<> class (which adds things like sprintf() support).
I've missed all of these things greatly since leaving EA.
Game companies generally use C++ because they need to interface with existing C and C++ library code (at least at the engine level). C++ also allows for various low-level optimizations when such things become necessary, such as resorting to inline assembler or stashing flag bits in pointers.
That's not to say that other languages aren't capable of these things, of course.
(I spent seven years working for EA, and I contributed to EASTL while I was there.)
A surprisingly large number of companies have their own STL implementations from Investment Banks to Telecoms companies. Often because they're doing funky things (often related to memory allocation) which would be outright impossible to do in languages like Java or C#.
Not trying to argue with you, I'd just like to mention as a curiosity that you would be surprised by how much 'funky' stuff you can do in C#. It has pointers (not just references, but real C-style pointers), it can pin objects in the memory, etc. It's hidden from the sight of the regular programmer, but it's there and it's part of the language, none the less.
Here's my experience. Whenever a C++ programmer uses the word performance, it's never to describe using template-based generic data structures. I do it and the performance is fine, but the hard-core C++ programmers seem to hate STL.
Incidentally, I asked someone about memory management in an interview for a C++ position. He said he wrote his own memory manager (object pool, basically), and never even bothered to use "new" or "delete".
The reason you need to really care about memory management in game development is because most of the consoles don't have virtual memory. Once you run out (and you don't have that much), your game stops working. It's very similar to embedded system / real-time programming in this way. So you need to plan out how much space everything will use in advance, basically. Lots of fixed-sized buffers and pool allocators.
And the reason you don't use garbage collection is because real-time games can't afford a GC pause.
I imagine most STL classes still don't fit game programming paradigms. My very first experience with this was using vectors to store vertices as an intermediary before sticking them into a vertex buffer. Granted, my math was awful, but the memory overhead was absurd. I constantly find myself over using STL in my programs because of how easy it is, then redoing it to use standard C/C++ arrays, linked lists, etc. for efficiency.
Ranting aside, I'm most curious to look at any optimizations they've made.
A bit part of the optimization is the flattening of deeply nested calls to tiny functions that are prevalent in the STL. In the general case of STL usage, these calls serve an important purpose in the design and flexibility of the library. They also generally inline away in optimized builds. However, in the specific case of game development, they greatly slow down the performance of non-optimized builds. That performance is a big deal when you are debugging a game that by design stresses the limits of the fixed hardware platform it is running on (you can't upgrade the CPU in a PS3 devkit). The nested inline functions also make debugging much harder because stepping into observe simple actions can result in walking a long chain of calls that each effectively do nothing.
Any sane implementation of std::vector is probably going to have three pointers of overhead, and might allocate twice as much memory as required in the worst case when you push a bunch of objects into it. That doesn't seem totally absurd to me - was yours much worse, or am I just being more tolerant of it?
Sorry, you are correct; I should have been more specific: the memory management overhead was tremendous. For every object added and removed, the vector and all of it's data was basically reallocated, moved, deleted, repeat. Moving to a fixed size array in C++ I saw an easy 50% drop in CPU usage.
I think STL is a great set of libraries, but I also think they're sometimes too easy to use if you don't consider alternative implementations and the impact of each way.
For every object added and removed, the vector and all of it's data was basically reallocated, moved, deleted, repeat.
That doesn't sound right at all.
Of course, if you add or remove objects at the vector's head, all of its data will be moved, though probably not reallocated. But if you have a need for frequent additions and deletions anywhere but the tail of the vector, you picked a wrong data structure. Double ended queue (<deque>) would would've suited you much better.
Moving to a fixed size array in C++ I saw an easy 50% drop in CPU usage.
Did you reimplement element addition/deletion logic from scratch?
I wouldn't call myself an expert on vertex buffers, but IIRC the OpenGL API to load data into a buffer just takes a pointer to the data and a size, so requires that your data is contiguous in memory beforehand - so deque may not have been an option.
Agreed. But I can't think of any data structure that would offer you contiguous, order-preserving storage and O(1) insertions/deletions of arbitrary elements.
There are actually implementations of deque which can store elements contiguously: imagine a vector where the "first element" is placed in the middle of the vector, and push_{front,back} reallocate whenever either end is reached. It wastes more memory than the typical deque implementation, but the elements are contiguous.
> Sorry, you are correct; I should have been more specific: the memory management overhead was tremendous. For every object added and removed, the vector and all of it's data was basically reallocated, moved, deleted, repeat.
If that was the case, you were doing something wrong.
You're being more tolerant. When you're building a high performance game engine for fixed target hardware, the last thing you want is to be doubling up the size of your arrays.
If you know before the allocation how much entries it must hold at maximum, you can give an std::vector that information. There is simply no case where you are forced to require more memory with an std::vector but not with a manual implementation.
If you have a lot of small vectors those three pointers can be expensive. On a 64 bit machine they're 8 bytes apiece.
If you know the vector isn't going to be enormous you can swap two of those pointers for unsigned integers, cutting overhead down to 16 bytes total. If you know more about your data and usage patterns you can probably get it further. For very small vectors of largely static data you might trade one of the integers for linear-time insertion and deletion. If capacity is known at compile-time you can make it part of the type. If you know that size() will always equal capacity() you can get rid of the size int.
It's also worth noting that you don't have to double your storage on every reallocation to avoid linear-time insertions - you just have to multiply it by some constant that's greater than one. If you don't know how many reallocs you're going to do, but you know that it isn't many, you might want to just multiply your capacity by 1.1 every time to avoid a lot of wasted space. Assuming, of course, that you're happy with bringing floating point computation in, and that you've thought about how you're going to increase the capacity of a vector of size 1...
If you don't overallocate the vector then push_back() runs in linear time instead of amortized constant which is generally a worse tradeoff. You can avoid the whole issue by calling reserve() ahead of time, but if you don't know how many elements you're going to have then you're in trouble, but you won't be able to do much better by replacing vector with your own array either.
It's great until you need to allocate 3.9Gb of vector on a 32bit machine.
Somebody said computer science is solving yesterdays problems on tomorrows hardware, the rest of science is solving tomorrows problems on yesterdays hardware
If you implemented this as follows, then yeah, you'll get log(N) reallocations, where N is your final vector size. But if you have a pretty good idea of what N is, you should set capacity at the time of vector declaration to avoid useless reallocations. You shouldn't expect any library to make this sort of predictions for you.
Differences between std STL and EASTL
First, EASTL provides a set of containers, iterators, and algorithms that are identical in interface and behavior to std STL versions with one exception: allocators. EASTL has a different allocator specification which is simpler, more efficient, more flexible, and easier to use than std::allocator. Both std::allocator and eastl::allocator are described below. EASTL follows the defect reports and TR1 as well. EASTL additionally provides and uses some of the TR1 functionality as well, including most significantly the smart pointers, type traits, and hashing containers.
Second, EASTL provides extension functionality to the above containers, iterators, and algorithms. An example of this is the push_back(void), set_capacity(size), and validate() functions added to eastl::vector. The majority of these extensions are driven by the need for higher performance (push_back(void)), higher clarity (set_capacity), or higher debuggability (validate()). There are about 30 such extensions to the various entities in the library. These are described in detail below.
Third, EASTL provides additional containers and algorithms that don't correspond to std STL. These include intrusive_list, vector_map, fixed_hash_set, slist, ring_buffer, radix_sort, has_trivial_relocate, and others. These are described in detail below.
There are additional differences not related to the functional specification. They include a programming philosophy that emphasizes readability, consistency, and optimal performance on limited hardware.