

Coding Horror: The Sad Tragedy of Micro-Optimization Theater - twampss
http://www.codinghorror.com/blog/archives/001218.html

======
bradtgmurray
Jeff is not quite on the ball with his reference to the "Shlemiel" algorithm.

Joel originally described the "Shlemiel" case when referring to strings in C
not having a length stored in them, and needing to use strlen to walk the
string to find the length. The Shlemiel problem comes up when you use strcat
in a loop, since each strcat call needs to recalculate the length of the
string, which takes longer as the string gets longer.

The problem with concatenating strings is not a length issue, but a memory
issue. In order to concatenate a string in C# or C++ using the built-in string
classes, the location to store the new resulting string has to be allocated
and the two strings have to be copied into this new location. The allocation
does not take a trivial amount of time, and reallocating memory over and over
again each time you concatenate a string is bad practice. There is still a
bit of "Shlemiel" since you have to keep copying your existing string, which
takes longer as the string gets longer, but you have to take into
consideration the allocation, which wasn't present in the original problem
that the "Shlemiel" story described.

~~~
cconstantine
Memory allocation is something that is typically O(1) or O(log n). The
constants attached are fairly large, but doing lots of
allocations/deallocations isn't a big-O issue.

The Schlemiel problem still exists when doing '+=' in a loop. Each copy
requires some code to walk the length of both sides of the operator, same as
strlen in Joel's example. The loop is an O(n^2) operation, even if it has a
smaller constant on it than a memcpy(dest, src, strlen(src)).

It's still "Schlemiel", just with different constants attached. For big enough
N, the O(1) or O(log n) of the allocations loses importance, making the entire
thing O(n^2) just like "Schlemiel".

~~~
tptacek
So we may be saying the same thing, but he didn't say allocation was O(log n)
--- and allocation/deallocation _is_ really expensive; it's consistently at
the top of profiles of programs I work on.

~~~
palish
_... allocation/deallocation is really expensive ..._

The cost of allocating memory in .NET (and probably other GC environments) is
basically equivalent to the cost of incrementing a pointer, which is to say,
basically free. The reason is that the .NET garbage collector periodically
"packs together" objects in memory, so the memory allocation algorithm
essentially becomes:

    
      result = _freeMemoryPtr;
      _freeMemoryPtr += numBytesToAllocate;
      return result;
    

... in other words, since objects in memory are periodically "packed together"
by the GC, there isn't a lot of memory fragmentation, so the allocator can
simply allocate from the end of the heap. It doesn't need to try to find a
"large enough free block" within fragmented memory, because memory isn't
fragmented.

At least, that is my understanding. I could be wrong because I haven't
investigated this thoroughly yet.

http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Generational_GC_.28aka_Ephemeral_GC.29

~~~
tptacek
You're right; I should have said, "malloc/free is really expensive".

~~~
palish
Just curious, what do you do? It's interesting that you optimize a lot of
C/C++ programs... game development maybe?

~~~
tptacek
Network security software, reverse engineering, and (a bit earlier in my
career) streaming video/multicast.

I got a 7x speedup in a performance-critical component at Arbor Networks
within a couple weeks of starting there just by profiling, finding malloc/free
at the top of the profile, and switching to an arena allocator. I have other
malloc stories, but that's the simplest. I've learned to fear malloc.

~~~
palish
I'd like to hear the other malloc/free stories whenever you're inclined.

------
dominik
I'm unclear what point Jeff had in mind...

A micro-summary to illustrate:

1. I love strings.

2. You don't want to concatenate strings in (large) loops.

3. You already knew that, at least if you're a decent programmer.

4. Analysis of various methods of concatenating strings in loops.

5. Reveal that the methods don't matter.

6. It's bad to worry about micro-optimizations like the best way to
concatenate strings in a loop.

7. Just avoid obvious mistakes like concatenating strings in loops.

8. Otherwise, don't worry about micro-optimization. Instead, focus on your
real goal: writing better code.

Maybe it's me, but this article felt disjointed.

------
jerf
Well, the _real_ fastest thing to do is to stream your page out: arrange it so
that you don't have to construct it out of order, so you never "concatenate"
strings at all, and simply ship the page out to the socket chunk by chunk,
with a small I/O buffer in front of the socket. Not only will this take less
time in all but very pathological cases by avoiding string-concatenation
penalties, it'll also get the page to the user sooner so the browser can
progressively render it.

But hardly any web framework I've ever used will even permit this, all
insisting on being handed one big string to ship out to the user. Sigh.

~~~
tptacek
You mean like, Rails helpers and "concat"?

~~~
raganwald
Meaning "no true framework": <http://en.wikipedia.org/wiki/No_true_Scotsman>
allows you to stream results out piecemeal.

------
daleharvey
Am I just being slow, or has he completely missed the point of avoiding
concatenation in loops? Concatenating small strings inside loops is fine, it's
a tiny cost, but concatenating the results of those 100000 iterations is
completely different.

~~~
cconstantine
The Schlemiel problem is a big-O problem. Big-O analysis is really only useful
if your N is large enough. If you always work with data sets that are small
(and have confidence that they will remain small), there is no point in doing
big-O analysis or worrying about bad things like O(n^2).

~~~
daleharvey
I wasn't talking about the length of the iterations but about what their
result is; he is just concatenating tiny strings, then apparently throwing the
result away. If he used the first example in an iteration of 100000, I don't
think the results would be negligible.

~~~
daleharvey
lol, someone else picked up on it and wrote the proper benchmark for it

8678ms for concatenation vs. 19ms for string building

------
aditya
Can anyone explain what he means by "writing better code"?

It seems like he sets up a straw man string concatenation problem and then
tears it down in a pointless argument.

~~~
olavk
I would guess that better code means more readable and maintainable code.

------
coliveira
In the examples that he gives, the main burden is in creating objects, not in
concatenating strings. As long as there are just a small number of +=
operations, the ratio between object creation and concatenation remains
similar and there will be no big difference in the long run.

------
jonknee
Is Jeff really making HTML strings in his methods? Aren't there template
systems for .NET?

~~~
johns
Yes, but they get very ugly when you have a lot of control structures. Jeff
prefers to move these into code, see here:
<http://twitter.com/codinghorror/statuses/1154609179>

------
newt0311
OT: This is why I keep wishing that languages would grow up when it comes to
strings and use data structures with better big-O behavior, like ropes,
saving everybody a lot of headache.

