
Clash of the unlikely: C# versus.. RealBASIC? - stesch
http://tomgrimshaw.wordpress.com/2009/04/16/clash-of-the-unlikely-c-versus-realbasic/
======
ajanuary
It looks like the RealBASIC version was written first and then translated
straight into C#.

In reality, you'd not create the encoding each time, and either use a
StringBuilder to build up the string, or just use Encoding.GetString.

C# on my machine:

As per blog: 50s

Creating Encoding once: 24s

Using StringBuilder: 11s

Using Encoding.GetString: 2.5s

Use Random.NextBytes: 2s
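
For anyone curious, here is a rough sketch of what the faster variants might look like. It assumes the benchmark builds a random string of a given length from ISO-8859-1 bytes; the blog's exact code isn't reproduced here, and `RandomStrings` and its method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class RandomStrings
{
    // Looked up once and reused, instead of calling GetEncoding per character.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // One character at a time, appended to a StringBuilder.
    public static string WithStringBuilder(Random rng, int length)
    {
        var sb = new StringBuilder(length);
        var one = new byte[1];
        for (int i = 0; i < length; i++)
        {
            one[0] = (byte)rng.Next(256);       // random value in 0..255
            sb.Append(Latin1.GetChars(one)[0]); // decode a single byte
        }
        return sb.ToString();
    }

    // Fill the whole buffer in one call, then decode it in one call.
    public static string WithNextBytes(Random rng, int length)
    {
        var buffer = new byte[length];
        rng.NextBytes(buffer);
        return Latin1.GetString(buffer);
    }
}
```

Something like `RandomStrings.WithNextBytes(new Random(), 16)` gives a 16-character string; the two methods differ only in how many calls and allocations they make per character.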

RealBASIC on my machine:

As per blog: 27s

Interestingly, despite advice I'd found on the internet [1], using array joining
didn't really make the RealBASIC version any faster, so from what I can tell
it's about as fast as it can be using the out-of-the-box libraries.

(My machine isn't exactly idle and ideal for microbenchmarking atm, but the
differences are large enough for that not to matter)

This doesn't really provide a good evaluation of the state of JITs.

[1]
[http://forums.realsoftware.com/viewtopic.php?f=13&t=2510...](http://forums.realsoftware.com/viewtopic.php?f=13&t=25107)

[Edit: Added RealBASIC times]

~~~
gjulianm
I'm fairly surprised by the results using StringBuilder. I didn't know it was
so fast compared with plain concatenation. According to what I've read in MSDN
[1], memory allocation has a lot to do with it. I think I'll start using this
class in my code.

By the way, this leads me to one question: does the code

    string a = "a" + "b" + "c" + "d";

generate one or three memory allocations (one to compute each concatenation)?

~~~
ajanuary
Because you're concatenating literals, the compiler will translate it into
string a = "abcd" ;)

Strings are outwardly immutable, so concatenating any two strings will create
an entirely new String object, along with the memory allocation needed for
that. However, there are some internal methods that allow String and other
classes in the same corlib assembly (such as StringBuilder) to mutate them.

`string a = x + y + z` will be translated into `string a = String.Concat(x, y,
z)`. Because it knows the final length it can use the internal String methods
to allocate a new String long enough to fit all the arguments and then copy
the characters into that without any intermediate representations. [1]
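
A small sketch of both cases, in case it helps. `ConcatDemo` and its members are hypothetical names for illustration only. The `const` declaration compiles only because a concatenation of literals is a compile-time constant, which is the folding described above.

```csharp
using System;

public static class ConcatDemo
{
    // Allowed as a const only because the compiler evaluates the whole
    // expression at compile time; no concatenation happens at run time.
    public const string Folded = "a" + "b" + "c" + "d";

    // With non-constant operands the + operator and an explicit
    // String.Concat call produce the same single result string.
    public static bool OperatorMatchesConcat(string x, string y, string z)
    {
        string viaOperator = x + y + z;
        string viaConcat = string.Concat(x, y, z);
        return viaOperator == viaConcat;
    }
}
```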

In a loop it doesn't know how big the final result will be, and so has to
create a new String object on each iteration. If you're doing 16000000
concatenations as in this test that's a lot of intermediate strings that are
largely being wasted.

StringBuilders have a private String object and use the internal String
methods to mutate that when you call Append. When you call ToString it returns
a copy of the private String object. Again, this avoids creating lots of new
intermediate String objects, but does have the overhead of having to guess how
large the initial String should be, and growing it if you append too much
(similar to how Lists work).
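
To make the loop case concrete, here is a minimal sketch of the two approaches. `RepeatDemo` is a made-up name, and passing the expected length to the StringBuilder constructor is just one way to avoid the regrowth mentioned above.

```csharp
using System.Text;

public static class RepeatDemo
{
    // Each += builds a brand new string one character longer than the last,
    // so the loop leaves count - 1 intermediate strings for the GC.
    public static string WithConcat(char c, int count)
    {
        string s = "";
        for (int i = 0; i < count; i++)
        {
            s += c;
        }
        return s;
    }

    // Appends into a growable buffer; the capacity hint means the buffer
    // should not need to grow while filling it.
    public static string WithBuilder(char c, int count)
    {
        var sb = new StringBuilder(count);
        for (int i = 0; i < count; i++)
        {
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Both return the same string; the difference only shows up in allocation counts and time once `count` gets large.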

[1] I believe in Java it's translated into using StringBuffer or
StringBuilder, which has a small overhead of creating the
StringBuffer/StringBuilder object.

[Edit: Corrected how String.Concat and StringBuilders work based on the Mono
codebase]

~~~
Dykam
For others interested in the source of StringBuilder:
[https://github.com/mono/mono/blob/master/mcs/class/corlib/Sy...](https://github.com/mono/mono/blob/master/mcs/class/corlib/System.Text/StringBuilder.cs#L50)

Note that the actual magic consists of internal methods in String.cs:

\- FormatHelper
[https://github.com/mono/mono/blob/master/mcs/class/corlib/Sy...](https://github.com/mono/mono/blob/master/mcs/class/corlib/System/String.cs#L1928)

\- InternalSetChar
[https://github.com/mono/mono/blob/master/mcs/class/corlib/Sy...](https://github.com/mono/mono/blob/master/mcs/class/corlib/System/String.cs#L2585)

\- CharCopy
[https://github.com/mono/mono/blob/master/mcs/class/corlib/Sy...](https://github.com/mono/mono/blob/master/mcs/class/corlib/System/String.cs#L3060)

------
gav
It's yet another flawed microbenchmark.

The RealBASIC version calls the inbuilt "chr" function while in C# it calls:

    public static char Chr(byte src)
    {
      return (System.Text.Encoding.GetEncoding("iso-8859-1").GetChars(new byte[] { src })[0]);
    }

I'm sure there's some overhead to calling this 16 million times.
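
As a sketch of how that overhead could be trimmed (the names here are hypothetical, not from the blog): look the encoding up once, or skip it entirely, since ISO-8859-1 maps each byte value to the Unicode code point with the same number.

```csharp
using System.Text;

public static class Latin1Chars
{
    // Resolved once instead of on every call.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Same result as the blog's Chr, minus the repeated encoding lookup.
    // Still allocates a one-byte array and a one-char array per call.
    public static char ChrCached(byte src)
    {
        return Latin1.GetChars(new byte[] { src })[0];
    }

    // For ISO-8859-1 specifically, a plain cast gives the same character
    // with no allocation at all.
    public static char ChrCast(byte src)
    {
        return (char)src;
    }
}
```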

~~~
profquail
That may not even be the biggest flaw; using DateTime.Now as the timing
mechanism for small benchmarks is pretty inaccurate because the resolution is
rather poor.

The recommended way is to use System.Diagnostics.Stopwatch, since that uses
the system's high-resolution performance counter when one is available (which
may or may not be backed by the HPET).

~~~
ajanuary
It also doesn't do any warm-up (run the code a few times untimed), which puts
the JIT compiler at a disadvantage.
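
A minimal sketch combining both suggestions, Stopwatch plus untimed warm-up runs. `Bench.Time` is a hypothetical helper, and a real benchmark would also repeat the timed run and look at the spread.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Runs the work a few times untimed so it is already JIT-compiled,
    // then times a single run with Stopwatch.
    public static TimeSpan Time(Action work, int warmupRuns)
    {
        for (int i = 0; i < warmupRuns; i++)
        {
            work();
        }

        var sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Usage would be along the lines of `Bench.Time(() => RunTest(), 3)`, where `RunTest` stands in for whatever is being measured. `Stopwatch.IsHighResolution` reports whether a high-resolution counter is actually in use.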

------
chamakits
What I found the most interesting was how quick Miguel de Icaza was to post
on the thread, trying to get involved with helping improve the performance.
I'm not familiar enough with the original poster, nor with OpenSim, so I don't
know if this specific developer is someone the whole Mono community is keeping
an eye on. Regardless, I find it quite encouraging that the main public face of
Mono is always keeping an eye out to help the community.

~~~
makomk
It's a PR thing more than an actually-helping thing. For example, the OpenSim
developers had a problem where memory use exploded rapidly on heavily-loaded
regions running under Mono, to the point that they had uptimes measured in
single-digit hours, yet the same regions running the same code and workload
under .Net were just fine. Miguel de Icaza's response was to run a totally
idle region for 24 hours under both Mono and .Net, produce pretty graphs
showing that Mono used less memory, and claim that the problem didn't exist.
The developers pointed out the flaws in his tests, but he never corrected his
statement or helped to find the actual problem.

------
killface
I'm trying to figure out what piece of the puzzle I'm missing. I converted the
script to java, using apache's RandomStringUtils.randomAscii, and I get:

With StringBuilder: 2.61 seconds

With String Concat: 3.03 seconds

I have to assume there's another loop running these, but I'm not seeing it.
Each dictionary has 1 million elements at the end.

I just implemented it inside a junit test, here's the class:

<http://pastebin.com/sFZjA1p2>

Turns out java performance isn't bad at all.

EDIT:

I'm a dumbass, was converting nanos to full seconds. Post has been edited.

~~~
ajanuary
Strangely your StringBuilder version seems to really slow down for me at
around iteration 6000000. If I replace the repeated use of uuid.toString with
`String uuidStr = uuid.toString()` it works fine. Looks like the performance
of toString differs between versions.

------
MartinCron
It would be interesting to run these numbers again (this article was from
2009) to see if the newer runtimes/JITs are any faster or slower.

~~~
eduo
And if it isn't, that would make it even more interesting and unexpected.

------
kyberias
This article is from April 16th, 2009 !!!

