I cheered when he said "if you want to talk about premature optimization, you get out now".
Has anyone actually benchmarked this recently? It's such a trivial optimization for a modern compiler.
I have trouble believing that they'll produce different code when the return value isn't used.
I still sometimes write "++it" out of habit, but usually I'll just go with whichever variant I feel conveys my intent better and leave the optimisation to the compiler.
It's also not always true that "++it" is faster than "it++", as this quote from Agner Fog's C++ optimisation guide notes:
> For example, x = array[i++] is more efficient than x = array[++i] because in the latter case, the calculation of the address of the array element has to wait for the new value of i, which will delay the availability of x for approximately two clock cycles. Obviously, the initial value of i must be adjusted if you change pre-increment to post-increment.
Edit: The same code with pre-increment notation would be x = array[i]; ++i;
> Obviously, the initial value of i must be adjusted
> if you change pre-increment to post-increment.
But if i is an iterator, you will create an unnecessary copy. Maybe special treatment could be applied for STL types, but in some other cases, getting ++i's behavior when explicitly writing i++ would surprise the user.
I think the compiler should prove that constructing and destroying the object doesn't have side effects before skipping those steps.
I used to wonder the same thing, but C++ is just weird in that way.
However, as was said before, any performant iterators ought to be header-only implementations, and if they have no side effects, I don't see why a compiler couldn't do away with the copy altogether.
Personally I always use the prefix increment (unless the other behaviour is needed).
Post increment: https://godbolt.org/g/k5rcPA
Pre increment: https://godbolt.org/g/WgfMzU
godbolt is a great resource to see just how good modern compilers are.
Do you have anything particular against this suggestion?
Assuming the two are written sensibly, and you don't have a Sleep(1000000) in your prefix version or something, the best case is that the two perform the same. The worst case is that the compiler can't spot that the result of the postfix version can be ignored, and the postfix version is then a bit less efficient.
I'm not saying this is likely to end up being the bottleneck in your program... just that choosing ++it is +EV.
I suspect lots of people use it because they learned it that way and now it's been blah blah blah years and it's too late to change. But for some reason, they don't want to admit this, so they come up with spurious reasons for changing being the wrong thing. Which is odd, because when the two would be equivalent, force of habit is basically the only good reason to prefer it++ over ++it.
Sounds like Pascal's wager.
That's not the case here. There are actually only two alternatives.
Anyway, using ++i is today idiomatic in C++.
For example, the 90/10 "rule": you should focus your optimizations on things that matter. If applying the pre-increment operator vs the post-increment operator makes any substantive difference, then it warrants better solutions such as:
* Don't use STL. If you're running into overhead problems here you can almost certainly build faster custom data structures more specific to your usecases.
* Use inline assembly for creating the tightest inner loops. If the overhead for an increment is significant, then that suggests the total number of operations in a loop is small. There may be some gains to be had by optimizing the assembly for the specific architecture and usage constraints.
* While not all folks get to use C++11, you should also be using range based for loops in general which makes the number of cases where you see a manual increment on an object very small these days.
* An STL vector is typically at least 3 * sizeof(void*) for the begin, end, and allocated size to be a general-purpose solution. You can trivially trim down space if you assume your memory is contiguous and then store deltas rather than pointers for the vector, which can be a nice win with 64-bit pointers.
* A std::function blossoms to 48 bytes to support all the various usecases from lambdas to functors to function pointers. Using a template for lambda functions can allow inlining and be a win here, or writing your own that doesn't support all the usecases.
* The lifetime of a std::string can involve a lot of allocations and deallocations. Fixed size char arrays can be a win.
You can address all these with your own library functions. This is very consistent with the rest of the post's content, and in fact they suggest some of these approaches themselves.
While these don't directly attack some fictitious ++i vs i++ performance bottleneck example, they will improve cache utilization and probably actually improve performance of your application :)
Does C++ assume that i++ and ++i are the same for overloaded operators?
edit: the same, that is, as they are in the case of primitives
No. Two separate overloads need to be provided, and the compiler cannot know whether the behavior is the same.
I always use the same approach as in pure OOP languages and FP ones, assume everything is a method call.
- BUT -
Even if it is different, in my 20 years of production C++ and 10 years of console game dev with plenty of profiling, optimizing and assembly inspection, I have never seen a case where optimizing the loop counter would have made a meaningful difference. It is possible, and I'm sure someone has seen it, but I never have, and I've seen a lot of code.
It could be a bit more costly if, say, you're using a library that does something like added runtime checks for iterator validity. Or maybe you're using an STLish type that has a more complicated iterator.
So, no, it's unlikely that there will be a difference 99% of the time. On the other hand, the pre-increment is never slower so it's a good habit to have anyway... and it's more idiomatic C++ as well.
So it becomes "why depend on maybes?"
In the prefix form (the first one) the expression's result is the incremented value, while in the postfix form, the expression's result is the value before it was incremented.
However, compilers are so crazy with optimizations these days that I wouldn't be surprised if they can figure out that the code isn't actually relying on the semantics of the postfix form and convert it to prefix form.
I found the next posts/chapters, on Marshalling and Encoding, even more interesting, with a comparison and pros/cons of different encoding techniques. I was not aware of the FlatBuffers library; I can see why the author likes it and may have to look into using it in future projects.
But the main thing is that the hash table lookup mostly stays in the same memory area until you try to get the actual value out of it. The author also mentions, as a side note, the problem with hash tables: "(though they do have their drawbacks in the worst-case scenarios, so you need to be careful)"