
Computer Scientists’ Trivia - signa11
https://keon.io/cs/computer-scientists-trivia/
======
tomsmeding
> To improve the performance, take the redundant multiplication out of the
> loop like below:

Um, that multiplication _is already_ outside the loop. You said before that
one shouldn't trust the optimiser, but it did the right thing here...

EDIT: (After reading the entire thing) The memory aliasing point is completely
correct and very important (even though you probably want the `b[i] = val`
_outside_ the inner loop...); in C, I'd argue that this point needs even more
attention if we're micro-optimising already.
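
For readers who have not seen the aliasing issue before, here is a minimal sketch in its classic textbook form. The post's own code is not reproduced here, so the function names `sum_rows_mem` and `sum_rows_local` and the row-major layout are assumptions made for illustration.

```c
#include <stddef.h>

/* Accumulates directly into b[i]. Because a and b might overlap,
 * the compiler has to assume every store to b[i] could change a,
 * so it tends to keep b[i] in memory across the inner loop. */
void sum_rows_mem(const double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        b[i] = 0.0;
        for (size_t j = 0; j < n; j++)
            b[i] += a[i * n + j];
    }
}

/* Accumulates into a local variable and stores once, after the
 * inner loop. The local cannot alias anything, so it can live in
 * a register for the whole inner loop. */
void sum_rows_local(const double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double val = 0.0;
        for (size_t j = 0; j < n; j++)
            val += a[i * n + j];
        b[i] = val; /* the store belongs outside the inner loop */
    }
}
```

The two functions give the same result only when `a` and `b` do not overlap, which is exactly why a compiler cannot turn the first into the second by itself.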

The strlen call is only repeated (leading to quadratic behaviour) because the
string may be modified in the inner loop. Might be worth mentioning that the
calls will be hoisted (i.e. merged to one) if the string is constant and
nothing can alias to it.
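
A small sketch of the `strlen` point, with hypothetical function names that are not taken from the post. Whether a given compiler actually hoists the call in the read-only case depends on the compiler and flags, so treat the comments as the usual expectation rather than a guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: lowercase ASCII letters in place.
 * The loop body writes to s, so the compiler generally cannot prove
 * that strlen(s) is unchanged and may call it again on every
 * iteration, giving quadratic behaviour. */
void lower_repeated_strlen(char *s)
{
    for (size_t i = 0; i < strlen(s); i++) {
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] = (char)(s[i] - 'A' + 'a');
    }
}

/* Same result with the length computed once by hand. */
void lower_hoisted_strlen(char *s)
{
    size_t len = strlen(s);
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] = (char)(s[i] - 'A' + 'a');
    }
}

/* Read-only loop: nothing in the body stores to memory, so an
 * optimising compiler has a much easier time hoisting the call. */
size_t count_upper(const char *s)
{
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); i++) {
        if (s[i] >= 'A' && s[i] <= 'Z')
            count++;
    }
    return count;
}
```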

The "replacing multiplication by additions", the replacing multiplication with
a shift and the hoisting I mentioned above are all redundant optimisations for
a programmer: compilers will trivially do that already. The first two are
instances of "strength reduction" and feature in most compilers' -O1 set.
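
To make "strength reduction" concrete, here is a sketch of the hand-written forms next to the plain ones. All function names are made up for illustration. The claim is only that both versions compute the same values, and an optimising compiler will typically produce similar code for either.

```c
#include <stddef.h>

/* Plain version: one multiplication per element to form the index. */
long sum_column_mul(const long *a, size_t rows, size_t cols, size_t col)
{
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        sum += a[i * cols + col];
    return sum;
}

/* Hand-reduced version: the multiplication is replaced by a running
 * index that grows by cols on each iteration. */
long sum_column_add(const long *a, size_t rows, size_t cols, size_t col)
{
    long sum = 0;
    size_t idx = col;
    for (size_t i = 0; i < rows; i++) {
        sum += a[idx];
        idx += cols;
    }
    return sum;
}

/* Multiplication by a power of two written both ways. */
unsigned times8_mul(unsigned x)   { return x * 8u; }
unsigned times8_shift(unsigned x) { return x << 3; }
```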

Conclusion: I do like the first part about the actual trivia, but I think the
micro-optimisation part misses the point somewhat.

~~~
danbruc
_The memory aliasing point is completely correct and very important (even though
you probably want the `b[i] = val` outside the inner loop...) [...]_

Do you mean the assignment to b should be or must be outside of the inner
loop? I am pretty sure it must be outside of the inner loop in order to make
the assembly version correct.

~~~
tomsmeding
Very true, didn't spot that one. Then the C code is probably a typo, but still
incorrect. :P

------
110011
The latency numbers shown in the post are directly taken from a well known
source (see for example
[https://gist.github.com/jboner/2841832](https://gist.github.com/jboner/2841832)).
Why the lack of acknowledgement though?

As a related rant, let me point out that the standard for quoting seems really
poor in these kinds of blog posts that appear on HN. I wish the authors would take
more responsibility and cite sources where credit is due.

~~~
kwk236
Hey, thanks for pointing that out. I added the reference.

~~~
110011
Awesome!

------
marcosdumay
Can we stop pretending that the default C integer types have any kind of
definitive size? That is wrong, and it's just asking for more portability bugs
down the line.

The correct sizes:

\- char >= 1 byte

\- int, short >= 2 bytes

\- long >= 4 bytes

\- long long >= 8 bytes

You can not rely on any further information about them.
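
A sketch of how this looks in code, assuming a hosted implementation with `<limits.h>` and `<stdint.h>`. The helper names are hypothetical. The standard states the minimums as value ranges, which is what the first function checks. `uint32_t` is technically optional in the standard, though mainstream platforms provide it.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 if the minimum ranges guaranteed by the C standard hold.
 * On a conforming implementation this is always 1; anything beyond
 * these minimums is implementation-defined. */
int minimum_ranges_hold(void)
{
    return SCHAR_MAX >= 127
        && SHRT_MAX  >= 32767
        && INT_MAX   >= 32767
        && LONG_MAX  >= 2147483647L
        && LLONG_MAX >= 9223372036854775807LL;
}

/* When an exact width matters, say so in the type instead of
 * assuming a width for int or long. */
uint32_t low_32_bits(uint64_t v)
{
    return (uint32_t)(v & 0xFFFFFFFFu);
}
```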

~~~
jwilk
char is 1 byte exactly.

~~~
Koshkin
More accurately, _sizeof(char)_ is always 1. This is simply because _sizeof_
returns the size of the object measured in terms of the equivalent number of
_char_ s (rather than 'bytes').

~~~
jwilk
Perhaps your notion of a byte is different, but according to the C99
standard:

\- sizeof returns size in bytes;

\- sizeof (char) == 1;

\- one byte has CHAR_BIT bits;

\- CHAR_BIT >= 8.
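
The four points above can be checked directly. This is a small sketch with hypothetical helper names. Note that the bit count includes any padding bits in the object representation.

```c
#include <limits.h>
#include <stddef.h>

/* Size of a char in bytes: 1 by definition. */
size_t char_size_in_bytes(void)
{
    return sizeof(char);
}

/* Number of bits in one byte on this implementation (at least 8). */
int bits_per_byte(void)
{
    return CHAR_BIT;
}

/* Storage size in bits of an object that occupies size_in_bytes bytes,
 * e.g. storage_bits(sizeof(int)). */
size_t storage_bits(size_t size_in_bytes)
{
    return size_in_bytes * (size_t)CHAR_BIT;
}
```

On a machine with 9-bit bytes, `bits_per_byte()` would return 9 and `char_size_in_bytes()` would still return 1.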

~~~
AnimalMuppet
So per C99, a byte can have more than 8 bits? My mind boggles...

~~~
Someone
For 36-bit CPUs, the 9-bit byte was a natural choice (as was storing 6
characters of a more limited character set in a 36-bit word)

~~~
AnimalMuppet
Ah, yes, 36-bit. I had forgotten about that, despite my mother having worked
on such systems.

------
sspiff

        2^10 = Kilo ~ 10^3
        2^20 = Mega ~ 10^6
        2^30 = Giga ~ 10^9
        2^40 = Tera ~ 10^12
    

This is wrong - power-of-two magnitudes are called kibi-, mebi-, gibi- and
tebibytes. Kilo, Mega, Giga and Tera are prefixes used by the SI unit system,
and denote powers of ten.

The confusion comes from Microsoft using power of two numbers in calculations,
but SI power of ten prefixes in labels.

A kilobyte is 1000 bytes, not 1024 bytes.

~~~
ben0x539
I never hear anyone use the -bi units unless they're trying to be clever, and
I'm fairly sure most storage devices use the power-of-ten units. Am I alone
in that?

~~~
Symbiote
Storage devices use the power of 10 units correctly.

My 6TB drive has 6001175126016 bytes.

Windows probably still reports that as 5.5TB, although many Linux GUI tools
now correctly say 5.5TiB, or 6TB.
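
The arithmetic behind those two figures, as a small sketch (function names are made up for illustration):

```c
#include <stdint.h>

/* Terabytes: SI prefix, 10^12 bytes. */
double bytes_to_tb(uint64_t bytes)
{
    return (double)bytes / 1e12;
}

/* Tebibytes: binary prefix, 2^40 = 1099511627776 bytes. */
double bytes_to_tib(uint64_t bytes)
{
    return (double)bytes / 1099511627776.0;
}
```

For the 6001175126016-byte drive this gives roughly 6.00 TB and roughly 5.46 TiB, which is where the "5.5" comes from.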

(Being clever should be encouraged!)

~~~
animal531
That's even worse, because they're intentionally preying on the ignorance in
an attempt to make the devices appear to have more space than they actually
do.

------
agounaris
Nice info but...

"It is crucial for programmers to understand how long a certain operation
takes in and out of a computer."

Google interview question: "How long does it take to read 4K randomly from an
SSD?".... You are hired, now go fix the CSS colour of www.whatever.com/about.
:D

I would rather call it "good to know", but not really that crucial unless you
write code which operates on such a low level.

~~~
Tloewald
SSDs weren't a thing ten years ago and may not be a thing in ten years. This
is not eternal fundamental knowledge. The important thing is that registers
are way faster than on chip cache which is way faster than ... insert levels
of indirection ... which is way faster than ... insert levels of remoteness.
Even registers vs on chip cache may not be around that long (probably will for
a good long time though).

What's actually surprising is that it might be faster to pull data from
elsewhere in a data center than your own SSD. I recall John Carmack saying he
could ping a server in Europe faster than he could get a pixel onto a display
(triple buffered updates at 120fps iirc)

------
csl
I think adding sources would not only be polite, but make it easier for
interested readers to delve more deeply into the matter. E.g., as in
[https://gist.github.com/jboner/2841832](https://gist.github.com/jboner/2841832)

~~~
110011
Oops, I made a similar comment before seeing yours. Glad to find someone else
peeved enough about it to point it out. It's such a small thing but I don't
understand why it has become more or less acceptable here to just blurt
something out in a long blog post without any references whatsoever. It might
not necessarily be malicious; however, the standard for technical writing needs
to improve.

------
Animats
_Shift, add instead of multiply or divide_

This is usually not worth it on modern superscalar processors, where multiply
is fast and pipelined. It's a win mostly on Arduino-class CPUs. If you need
more than one operation to replace a multiply, such as an add and shift or a
bit operation, the replacement is probably slower.

The costly operation in the example is accessing a large 2D array along the
non-dense axis. For big enough arrays, that's a cache miss every time.
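
A sketch of the two traversal orders, assuming a row-major matrix stored in a flat array (the function names are hypothetical). Both return the same sum; only the memory access pattern differs. Actual timings depend on array size, cache sizes and the prefetcher, so none are claimed here.

```c
#include <stddef.h>

/* Walks the dense axis in the inner loop: consecutive addresses,
 * so each cache line that is fetched gets fully used. */
long sum_dense_order(const int *m, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += m[i * cols + j];
    return sum;
}

/* Walks the non-dense axis in the inner loop: each access jumps
 * cols elements ahead, touching a different cache line every time
 * once a row is longer than a cache line. */
long sum_strided_order(const int *m, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            sum += m[i * cols + j];
    return sum;
}
```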

~~~
yorwba
> The costly operation in the example is accessing a large 2D array along the
> non-dense axis. For big enough arrays, that's a cache miss every time.

I half-remember hearing about memory-prefetchers smart enough to detect this
kind of strided access and fill the cache accordingly.

~~~
mnw21cam
Then it turns from a cache miss problem to a memory bandwidth problem. If you
are only using a small proportion of the cache line in your calculation, then
you effectively multiply the data transferred to the CPU.

------
delta1
OT: beautiful choice of colors and fonts for his blog, I was immediately
struck by how pleasant it was to read.

~~~
foo101
I am seriously unable to tell if you are genuinely appreciating the appearance
of the blog or sarcastically mocking it.

~~~
petters
I think most people would not prefer grey on black fixed width. But it did not
bother me that much, personally.

~~~
eru
I usually go for green on black in my terminals.

~~~
majewsky
I would like to go with green on black just for hacker street-cred, but how
does this work with colors? The normal color set seems to be geared towards
black/white or white/black as base colors.

~~~
eru
MacOsX and Linux terminals support green on black themes out of the box.
(Otherwise, I would adopt the white on black option.)

------
dvirsky
There's an error in the Primitive C types section, the width of long:

> long 8 (32 bit) 8 (64 bit)

the second column should be "4 (32 bit)" AFAIK, but definitely not 8 bytes /
32 bit.

~~~
peapicker
Even that is inaccurate. Long is 4 bytes on 64bit windows, 8 bytes on 64bit
UNIXes (all I've used) and 4 bytes on OS/400 with teraspace (an environment
with 16 byte pointers)

~~~
dvirsky
yeah but certainly 8 bytes are not 32 bit

------
weissi
for

    
    
        int m[2][3] = { {1, 2, 3}, {4, 5, 6}};
        int *n[2];
    
        n[0] = &m[0][0]; //equivalent to n[0] = m[0]
        n[1] = &m[1][0]; //equivalent to n[1] = m[1]
    

it lists that

    
    
      What is n[1][1]? //2
      What is m[1][1]? //same as n[1][1]
    

but it's actually 5. (n[0][1] would be 2)
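
This is easy to verify. The sketch below wraps the quoted declarations in two hypothetical helper functions (`n_at` and `m_at` are not from the post) so the lookups can be compared.

```c
/* Rebuilds the arrays from the quoted snippet and returns n[row][col].
 * Valid for row in 0..1 and col in 0..2. */
int n_at(int row, int col)
{
    int m[2][3] = { {1, 2, 3}, {4, 5, 6} };
    int *n[2];

    n[0] = &m[0][0]; /* equivalent to n[0] = m[0] */
    n[1] = &m[1][0]; /* equivalent to n[1] = m[1] */

    return n[row][col];
}

/* Same lookup directly on the 2D array. */
int m_at(int row, int col)
{
    int m[2][3] = { {1, 2, 3}, {4, 5, 6} };
    return m[row][col];
}
```

`n_at(1, 1)` and `m_at(1, 1)` both give 5, while `n_at(0, 1)` gives 2.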

------
gumoro
Beautiful typo about two's complement:

> Simply filp all bits and add 1.
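
Typo aside, the rule itself is easy to check. A minimal sketch on 8-bit values, with a hypothetical function name:

```c
#include <stdint.h>

/* Two's complement negation of an 8-bit value:
 * invert every bit, add 1, and keep the low 8 bits. */
uint8_t negate8(uint8_t x)
{
    return (uint8_t)(~x + 1u);
}
```

For example, `negate8(5)` yields 251 (0xFB), the same bit pattern as `(uint8_t)-5`.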

------
clarkmoody
To the site author:

Let me just say that I loved how quickly this site loaded. Thank you for
taking the effort to build a fast site.

