
 How Much Processing Power Does it Take to be Fast?
http://prog21.dadgum.com/68.html?1
======
dmbaggett
Intermediate point of reference: Crash Bandicoot, which came out at the end of
1996 and ran on the original PlayStation, used 2MB of RAM and a 33MHz
processor. As in Defender, there was no operating system to speak of, and
though Sony "required" developers to use libraries, we ignored that dictum and
coded the rendering pipeline in R3000 assembly. We could render about 1000
polygons per frame, 30 frames per second.
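Those numbers imply a surprisingly tight cycle budget; a quick back-of-the-envelope check (my arithmetic, just to put the figures in perspective):

```python
# Back-of-the-envelope budget: a 33MHz R3000 drawing 1000 polygons
# per frame at 30 frames per second.
CLOCK_HZ = 33_000_000
POLYS_PER_FRAME = 1000
FRAMES_PER_SEC = 30

cycles_per_frame = CLOCK_HZ // FRAMES_PER_SEC             # 1,100,000 cycles
cycles_per_polygon = cycles_per_frame // POLYS_PER_FRAME  # 1,100 cycles

print(cycles_per_polygon)  # → 1100
```

About 1100 cycles per polygon, including everything else the game had to do that frame.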

Now I have to work similarly hard just to get IE8's crappy JavaScript
implementation to sort 1000 objects in under 33ms on a 3GHz machine. Sigh.
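For comparison, a minimal timing harness for that kind of workload, sorting 1000 objects (the object fields here are made up for illustration):

```python
import random
import time

# Hypothetical objects standing in for the 1000 items being sorted.
items = [{"id": i, "score": random.random()} for i in range(1000)]

start = time.perf_counter()
items.sort(key=lambda obj: obj["score"])
elapsed_ms = (time.perf_counter() - start) * 1000

# One 30fps frame is ~33ms; a modern machine sorts this in well under that.
print(f"sorted {len(items)} objects in {elapsed_ms:.3f}ms")
```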

For even more extreme examples of performance within confines, see the
excellent chronicle of Atari 2600 development, "Racing the Beam".

------
reedlaw
I enjoyed reading this piece from the same site:
<http://prog21.dadgum.com/52.html>

It's about how modern interpreted languages such as Ruby and Python are orders
of magnitude faster than the BASICs found on typical 1980s systems.

~~~
saalweachter
The speedup is actually precisely what you would expect from hardware over
that time period.

2^((2009-1984) / 1.5) = 104031.915

He's accidentally demonstrated that we would expect a 1984 implementation of
BASIC on modern hardware to be _exactly as fast_ as modern Python.
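The arithmetic checks out:

```python
# Doubling every 18 months from 1984 to 2009:
years = 2009 - 1984
speedup = 2 ** (years / 1.5)
print(speedup)  # ~104031.9, matching the figure above
```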

~~~
Wilduck
Which actually isn't damning of Python at all. Python has optimized for things
other than speed. I'm pretty sure that you wouldn't choose a 1984 BASIC
implementation (on modern hardware) over Python for any of your projects. You
can have a Better Language at no cost to performance.

~~~
tectonic
True, I expect (but don't know for sure) that BASIC implementations were
heavily optimized just to be as fast as they were.

~~~
bane
Well, it was also usually written in the native machine language.

------
extension
If you can live without such luxuries as an operating system, libraries, and
drivers then you too can write software as efficient as Defender! But
resources are well spent on abstraction, most of the time.

~~~
Lagged2Death
But is it mere abstraction that causes me (to take a _completely random_
example) to wait on Windows so often, when I've got so many GHz and so many GB
at my disposal?

I think it's more likely to be a matter of priorities. It's more important to
ship your PC software than it is to expend the effort required to make it
slick and responsive. On non-PC platforms (like the iPad) that's not true; the
platform's relative simplicity and responsiveness is a big selling point, so
it's crucial to be fast and slick to fit in.

~~~
drzaiusapelord
>when I've got so many GHz and so many GB at my disposal?

Because, like most laptop and desktop users, you're I/O bound by your slow
mechanical spinning disk. My desktops at work and at home and my laptop all
have SSDs now. I have a level of responsiveness that's very close to what I
get with an iPad or my Transformer.

Don't blame the software, blame the hardware. You can't really compare flash-
based storage to mechanical storage. A lot of the "bloaty" OSes really aren't
bloated; it's the damn mechanical storage. I mean, there's literally a
mechanical arm that roams around spinning platters. You can't compare that to
electrons dancing on flash media.

~~~
Lagged2Death
I'm aware of the mechanical I/O bottleneck, and I'd love to have SSDs in the
machines I use, but funds do not permit.

But I doubt that's the whole story. The first time I type "Ctrl-F" in a VS2010
session, the disk thrashes and there's a noticeable delay of a couple of
seconds before the Find tool loads. The delay is so bad that keystrokes are
lost; if I not-particularly-quickly type "Ctrl-F mystring", the find tool
searches for "tring."

I'm sure an SSD could improve that situation, but the real issue here isn't
that the disk is slow, it's that the software is causing disk access where
none should be necessary.

I'm sure there are reasons behind it; the find tool is a module in a modular
system, it's not hanging around when it's not needed, it makes a smooth fit
with the rest of the framework that was used to build VS2010, etc.

But the end result, the end user experience, is worse than it was in older,
simpler versions of the product. (In this particular way, it's actually worse
than a DOS-era editor, running on far more severely bottlenecked hardware, was
20 years ago.) The find tool was probably built in whatever was the most
straightforward way to get it to fit into the horrendously complicated system
(VS2010 on Windows) it's a part of, and no effort (or not enough effort) was
expended to make it any better than that.

This is just one example that struck me today. There are plenty of other
situations where a Windows machine full of Windows apps will cause the user to
wait in situations that shouldn't require waiting. You can classify them as
I/O bottlenecks, and that's not wrong, but I think it's missing the point.
Everyone knows there's an I/O bottleneck there; when you develop code, you're
supposed to bear that in mind.

But backwards compatibility and shipping the damn thing are more important
than optimizing, and that goes for Windows itself and all the components
thereof as well. The stuff we write today has to accommodate the foolish-in-
retrospect decisions of yesterday.

~~~
rogerbinns
This complaint is the reason why some apps include "quick" loaders under
Windows. Effectively they have a tray icon, run after startup, and load the
files making up an app so they are already in memory when you run the app.
Then when you launch the app for real, it appears very quickly, with all its
functionality.
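A minimal sketch of such a preloader (the paths are hypothetical; all it does is read an app's files once so the OS file cache is warm before the real launch):

```python
def prewarm(paths, chunk_size=1 << 20):
    """Read each file once so its pages land in the OS file cache."""
    for path in paths:
        try:
            with open(path, "rb") as f:
                while f.read(chunk_size):
                    pass
        except OSError:
            pass  # missing or unreadable files are simply skipped

# Hypothetical usage: warm the files an app will load at startup.
# prewarm([r"C:\SomeApp\app.exe", r"C:\SomeApp\lib.dll"])
```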

Of course, people get annoyed at these things for taking up "memory",
"delaying" system start, etc., and get rid of them.

I/O just tends to be more expensive on Windows. Not only does the code have to
deal with backwards compatibility, old drivers and tag-alongs (anti-virus,
backup), potentially obstinate hardware, and numerous other things, but often
the OS is supported for a long time. For example, Windows XP uses a 10MB
buffer cache, irrespective of how much RAM you really have. (This can be
changed via the registry; the code changed in Vista.)

------
lallysingh
Funny, there's no discussion of the modern bottlenecks: the bus in between the
CPU & GPU, and the speed difference of the CPU and RAM. And no discussion of
what else the modern machine has to do now: maintain a lot more devices (like
radios), handle background work, and have a network stack.

It's not like we have faster hardware but suddenly got dumber programmers.

~~~
onemoreact
The CPU & GPU bus is hardly a meaningful bottleneck when dealing with basic UI
interactions. 3d games often have less input lag than the background windowing
environment because they are optimized for that. The technical problem is
simply excessive buffering and a poor interrupt implementation.

~~~
smackfu
The real problem is levels of abstraction that let you do anything, but not
necessarily quickly.

Compared to the old games, where you could do only one thing, but you could do
it fast.

~~~
onemoreact
Abstraction is not what causes modern operating systems to buffer user input.
You can often have the same image drawn to 7-10 buffers before a user can see
it. And honestly, most of the time there is plenty of time for this, so it's
not a big deal; what you don't have time for is doing the full path for each
tiny input. Let's suppose you want to drag a window. The secret is that you
don't need to use the same window location for every stage of the process, as
long as you find an acceptable middle ground so that not rendering part of a
dragged window doesn't look that bad.

Now you need to do this for menu options, etc., etc.

PS: Games often do this, trading a little less accuracy in a single frame for
low-latency responses.
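The trick can be sketched as simple event coalescing (a toy model, not any particular OS's implementation): instead of fully processing every motion event, render only the newest position each frame:

```python
def coalesce_drag_events(events):
    """Collapse a burst of mouse-move positions into the newest one.

    `events` is the list of (x, y) positions received since the last
    frame; only the latest needs a full render pass.
    """
    return events[-1] if events else None

# A burst of moves arriving within one 16ms frame:
burst = [(10, 10), (12, 11), (15, 13), (22, 18)]
print(coalesce_drag_events(burst))  # → (22, 18)
```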

------
s800
There were a lot of other tricks to be had:

- Some form of sprites was the rule, not the exception. Albeit very hardware-
coupled: usually a fixed width and maybe unlimited height. A fixed number of
sprites per line as well, or they'd just turn invisible.

- Tile-based (character) graphics, so there are a lot of games that aren't
bitmapped, but rather character- or tile-based. So think updating a 40x25
screen (e.g.), not a 320x200 screen.

- Palette (indirect) based graphics = some tricks for animation here.

- A hardware-supported transparent color.

- Direct control of the frame buffer pointer = lots of goodness.

- Interrupts based on raster position.

- DMA for sound.

Sometimes:

- Some control of modulus-based scrolling (think starting bitwise horizontal
and vertical for up to 8 bits) = easy tile-based bit scrolling.

- Blitter (with boolean ops) - maybe.

- Planar graphics.

Still, we're lazy today. Lots of layers in between us and the hardware.
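The palette trick is worth sketching, since it's the cheapest animation there is: rotate the palette and never touch the pixel data (a toy model with a hypothetical 4-entry palette):

```python
# Color cycling: the framebuffer stores palette indices and is never
# rewritten; "animation" is just rotating the palette entries.
framebuffer = [0, 1, 2, 3, 0, 1, 2, 3]          # indices, fixed
palette = ["black", "red", "orange", "yellow"]  # hypothetical palette

def rotate_palette(pal):
    """One animation step: shift every palette entry down one slot."""
    return pal[1:] + pal[:1]

palette = rotate_palette(palette)
# Every pixel that pointed at entry 1 ("red") now shows "orange", for free.
print([palette[i] for i in framebuffer[:4]])  # → ['red', 'orange', 'yellow', 'black']
```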

------
tintin
The screen resolution he mentions is 0.1% of a 1024x768 resolution. So let's
say we need 1000MHz to render 1024x768. Then I think modern software is doing
alright, because today we calculate more color bits, keep track of all the
input devices, output 'real audio', keep the device online, and so on.

~~~
bradfa
320x256 is 10% of 1024x768 in terms of raw pixels, not considering bit depth.
But bit depth and raw pixels shouldn't matter here; the GPU takes care of
mucking about with those, and the GPU in the iPad is amazingly fast for its
power consumption.
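A quick check of that raw-pixel arithmetic:

```python
ratio = (320 * 256) / (1024 * 768)
print(f"{ratio:.1%}")  # → 10.4%
```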

I think the point is that the iPad is sluggish and it shouldn't be. 30 years
ago people knew how to make something fast-reacting with way less hardware
behind it; somehow we've lost that and added complexity that doesn't really
seem visible to the user.

~~~
aaronblohowiak
>I think the point is that the iPad is sluggish and it shouldn't be

The OP is arguing the opposite -- the responsiveness of the iPad demonstrates
that we can have responsive systems even with all of today's luxuries.

------
c1sc0
"And of course there's overhead involved in the application framework where
messages get passed around to delegates and so on." Am I the only one who
finds this an amusing & concise summary of iPhone development?

