
The demise of the low level programmer - dr_art
http://altdevblogaday.com/2011/08/06/demise-low-level-programmer/
======
rednum
I am happy about the 'demise of low level programmer'. Why? Because for me
programming is more about algorithms/data structures than circuits,
microprocessors and caches. I am happy that I didn't have to write mouse
handlers, keyboard handlers - because had I started doing that, I'd either
have got bored before managing to do something useful, or changed my passion
towards low-level stuff before writing my first quicksort. I am also happy
that I can write something without understanding all the mechanisms behind it -
that I can build some site with django not even knowing how HTTP works, or
play with some graphics having no idea about graphics cards. Sure, all of this
stuff is interesting - but it's just too much! Low level guys did their job
well, so now people who don't like machines so much can build cool stuff on
top of that.

I don't want to praise ignorance - I will probably read some of the stuff he
linked under the article - I just think it's OK that people can do something
useful with a computer without knowing all the details about it. It's worth
knowing though - but just as you can have no idea about the physics of sound
or how a piano works and still write nice songs, you can build nice stuff with
high-level tools without knowing the low level.

~~~
makmanalp
The problem is the law of leaky abstractions:

[http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...](http://www.joelonsoftware.com/articles/LeakyAbstractions.html)

I agree with you on most counts, but when you traverse your 2d array one way
versus the other and are left scratching your head as to why one of them is
slower, it's usually because something is going on in there that you don't
know about, that the abstraction didn't make clear.
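
For example, a rough C sketch of my own (not from the article) - both loops
touch every element exactly once, but only one walks memory in the order it is
laid out:

    #include <stdio.h>
    #include <time.h>

    #define N 2048                 /* 2048*2048 doubles = 32 MB, bigger than cache */
    static double grid[N][N];

    int main(void) {
        volatile double sum = 0.0;
        clock_t t0, t1;

        /* Row-major: consecutive accesses land in the same cache line. */
        t0 = clock();
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += grid[i][j];
        t1 = clock();
        printf("row-major:    %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        /* Column-major: each access jumps N*8 bytes, so nearly every one misses. */
        t0 = clock();
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += grid[i][j];
        t1 = clock();
        printf("column-major: %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        return 0;
    }

With 64-byte cache lines the first loop uses all eight doubles of every line it
pulls in; the second pays for a whole line and uses one value from it.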

We need those low level programmers. We also need high level programmers to
know a little about this stuff. However, we don't need _everyone_ to be a low
level programmer, and that's why the current state is an improvement over the
past.

~~~
Daishiman
Knowing the proper way to traverse an array is basic knowledge that can be
explained in 10 minutes. Seriously, cache theory is not complicated and even
the highest-level programmer ought to know about it.

That's a huge step from knowing the number of instructions that fit in a trace
cache of a CPU or when loop unrolling becomes optimal. Cache theory and an
understanding of how IEEE 754 works are basics, and fairly universal. It's
completely different from programming a fast subroutine in assembly optimized
for a specific processor generation.
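
For instance, the kind of ten-minute IEEE 754 basics I mean, sketched in C:

    #include <stdio.h>

    int main(void) {
        /* 0.1 and 0.2 have no exact binary representation, so the sum
           carries rounding error and is not bit-equal to 0.3. */
        double a = 0.1 + 0.2;
        printf("%.17g\n", a);          /* prints 0.30000000000000004 */
        printf("%d\n", a == 0.3);      /* prints 0 */

        /* A float carries ~24 bits of mantissa, so at 1e8 adding 1.0 is lost. */
        float big = 1e8f;
        float bumped = big + 1.0f;
        printf("%d\n", bumped == big); /* prints 1 */
        return 0;
    }

None of that needs a specific processor manual; it follows from the format
itself.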

------
wglb
There are lots of very good and detailed examples in this article, and some
very nice links for those interested in the actual low-level stuff.

But I am not sure where a lot of things he notes were ever "taught" in
universities or formally to those of us who were in the trenches. Much of the
hard-core low-level stuff was learned either on your own, or when you got into
an environment where it was necessary to know.

There are some additional skills that are falling by the wayside as well. One
is how to optimize instructions on a type 650 rotating drum computer. Or more
realistically, how to program co-routines in an interrupt-driven environment.
Or how to write a floating-point division routine if you are working on a 286
target that does not have a 287. Or what order to write items to a disk and
where to place them to minimize arm movement in a real-time data-driven
system.

I admit that some of these are actually less necessary now due to technology
being replaced, or machines being ridiculously faster.

But I was around when those technologies were necessary and useful (well, not
the 650, but almost). We built a real-time system in assembler that had to be
seriously well-optimized, and we had to modify the operating system to remove
a hazardous latency that was fouling our 2ms data sampling rate.

But you know, simultaneously with that were COBOL programmers who had to be
schooled in the idea that writing backups to tape was a good idea.

So there always has been a gap--this article suggests that this is something
new.

Having said all that, this is an excellent article, as it is a challenge to
programmers who would like to be better, and has a forest of valuable links to
learn more. (This site now bookmarked.)

~~~
salem
I graduated 10 years ago, in Computer Engineering, and much of this low level
stuff was covered in core classes, such as Microprocessors & Interfacing,
Operating Systems, and Computer Architecture. If you only wanted to study
"higher level concepts", or couldn't pass, you switched to computer science...

------
Macha
Part of the reason why there are so few low level programmers is that it's
quite complicated to actually find information on it:

* Lots of college courses, especially of the Java school variety, simply do not cover anything remotely near low level code.

* For people looking to teach themselves, there are far fewer books and websites with resources than for other subjects. If I walk into my local book shop, there are books on Java, C++, C#, PHP (lots of PHP), Flash and Python. Then a few language-agnostic books like Code Complete and Design Patterns (stuff like this is lacking in quantity, but at least it's there). None on anything low level. There isn't even an x86 assembly for dummies type book there. The internet has free books for other languages too (Dive Into Python, Why's Poignant Guide to Ruby, etc.) and plenty of free resources on good functional programming or good OOP. For assembly, all I keep running into is a wikibook, which I'm rather skeptical about, as they tend to be pretty poor in my experience.

And then, once you've gone through the initial learning phase, how are you
going to get any experience? For 95%+ of projects, it's a case of YAGNI. There
aren't any interesting open source projects coded in assembly today. Maybe a
few Linux drivers, but that's something that would need a lot of domain
knowledge on top of low level knowledge to understand.

~~~
barrkel
All you need is an assembler (a language with a built-in assembler helps with
the learning curve) and the processor manual (ideally the processor manual's
assembler syntax will match your assembler's syntax; this is not the case for
e.g. the defaults in GNU as and Intel x86 or x64). Build a timing loop (timing
instructions like rdtsc are right there in the manual) and get to profiling
instructions. Read up, and find out about the hidden performance models behind
the various instructions (and rework your timing loop to reset these things
where possible); read up on caches, branch prediction, register renaming etc.,
then play around, try to reproduce the positive and negative sides of these
optimizations and develop intuitions etc.

I don't think there's a better - or easier - way.
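
For example, a minimal sketch of the kind of timing loop I mean, using the
GCC/Clang __rdtsc intrinsic rather than a hand-written assembler stub:

    #include <stdio.h>
    #include <stdint.h>
    #include <x86intrin.h>   /* __rdtsc() on GCC/Clang; x86 only */

    #define ITERS 100000000ULL

    int main(void) {
        volatile uint64_t x = 1;   /* volatile so the loop isn't optimized away */

        uint64_t start = __rdtsc();
        for (uint64_t i = 0; i < ITERS; i++)
            x = x * 3 + 1;         /* the instruction sequence under test */
        uint64_t end = __rdtsc();

        /* Amortize loop overhead over many iterations. A real harness would
           also pin the thread, warm up, and serialize with cpuid/lfence --
           exactly the details the manuals cover. */
        printf("%.2f cycles/iteration\n", (double)(end - start) / (double)ITERS);
        return 0;
    }

From there you swap in the instruction sequences you care about and start
comparing the numbers against what the manuals predict.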

~~~
dkersten
A timing loop will tell you how fast it runs on your particular configuration,
but not how fast you can expect it to run in the general case. For that, you
need to get very intimate with the processor manuals, especially optimization
guides. Also, a good profiler, like Intel's VTune, is great for getting low
level performance data such as cache misses.

Also, doing what you suggest on a number of combinations of hardware would be
useful, so you can compare various processor architectures.

------
roel_v
While in general I agree with the fundamental issue in this post, I can't help
but think each time that people, when they write lists like this, write down
the things they themselves know very well. And then call those 'indispensable
for all programmers out there'. Yet every single list is different, because
everybody encounters different specific details in their careers. I guess the
only conclusion is that, in the end, these specific details aren't that
important after all. But I do understand that it's hard to admit that - I
spent years getting to the relatively detailed level of understanding of the
win32 api that I have, yet it's mostly useless already, and will only get
more useless with time. The irrational part of me is bothered by that too.

~~~
jodrellblank
He says _They don’t seem to grasp that one must understand the native
environment you’re working in before going ahead and writing a program to run
within it._

And I thought: or maybe you don't grasp that one need not understand it
first - or, in many cases, at all.

~~~
agentultra
That is the whole point of abstractions after all.

If I had to write my entire operating system from scratch every time I wanted
to get something done I'd find a different profession.

However, I don't think that's the point being reached. Somewhere out there
exists a magical corpus of knowledge that every programmer should be able to
memorize by heart. And then there's the world we live in now, where you have
a corpus of knowledge you do know and can build upon. Somewhere in between is
a practical subset of knowledge that should be common but some feel isn't
being represented well. The debate rages about what that suitable subset is.

What I think most people who bring this argument up fail to realize is that
the full corpus of knowledge that embodies all of programming is far too large
for a single programmer to understand. Good abstractions can be trusted to
hide the unnecessary details. Great abstractions shouldn't mean piss-poor
performance (and should in fact provide just the opposite). They also get out
of your way (or you avoid using them) when working in the problem domain where
your "low-level" knowledge is more useful for the optimizations you can
provide to your code.

I'm just not as OCD as some programmers. Yet I can still get good work done. I
don't think I could do it without useful abstractions.

------
jensnockert
I wouldn't want to be taught low-level optimizations in school. I sometimes
am, and generally they teach you stuff that is 10 to 20 years old.

I think each generation of programmers needs to find many low-level
optimization tricks for themselves; the technology changes so much that old
tricks are not as useful anymore.

Memory wasn't as slow (compared to processors) as it is today, etc. Many
people still count FLOPS, while memory accesses are probably the most relevant
metric today, especially on GPUs.

~~~
salem
My former college switched many years ago to teaching low level concepts on
the PLEB, an ARM/Linux computer developed in-house. I would argue they were 10
years ahead of the game, with that platform now being widely commercially
deployed. In fact, the developer of that platform and his professor went on to
a startup that provides low-level software components for millions of android
phones.

------
wccrawford
"The rumors of my demise have been greatly exaggerated" - Okay, I took some
liberty with the quote.

I see people lamenting that their favorite low-level programming techniques
aren't being taught these days.

That's because the field has expanded so much! Entry-level programmers learn
entry-level things. If they're good programmers, they're going to eventually
teach these things to themselves... If they need them. Chances are they won't
need them, though.

There are still plenty of programmers doing low-level work, and there's more
all the time. There's just so many more high-level programmers that you don't
notice them.

------
pnathan
What about the Arduino community, and mobile phone programmers, and embedded
systems in general?

I mean, yes, web & desktop coders don't really need to know about the details,
but there are people who write operating systems, compilers,
microcontrollers... Those are very low-level areas.

~~~
stonemetal
Considering nvidia is planning on releasing a quad core 1.5GHz phone processor
around the end of the year, I would say no, phone programmers don't really need
such details. The Arduino community specifically, sure - the Arduino sucks
hardware-wise. Embedded systems in general: professionals yes, hobbyists no.
Professionals yes, because shaving per-unit price to the bone can make the dev
time worth it. Hobbyists can buy a pretty beefy microcontroller so that they
don't really need to worry about it.

~~~
salem
That chip would likely still be an ARM+GPU architecture, and would still be
without a floating point unit (except CUDA/OpenCL code). So code could still
run like a dog and burn precious battery power even faster.

~~~
mansr
All current smartphones are ARM based and have a floating-point unit. The
original iPhone had one, and it wasn't the first.

------
kqr2
Also recommend the book: _Hacker's Delight_

[http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201...](http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201914654)

~~~
adestefan
_Programming Pearls_ is another excellent book in this realm.

[http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/...](http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/0201657880/ref=pd_sim_b_1)

~~~
kragen
_Programming Pearls_ is excellent, but it has very little in common with
_Hacker's Delight_, except that both are excellent books about programming.
HD is specifically about the kinds of low-level tricks we're talking about
here, while PP generally is not.

------
hmottestad
At the uni in Oslo (Norway) there are plenty of low level programming courses.
I've taken a few C (no plus plus) courses with plenty of bit shifting. There
is a course with a lot of assembly, where you get to write a converter for
UTF-8 in assembly, working with a lot of shifting on the registers. Then there
is a course where you develop an OS and one where you work with multi-core
architectures.
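
The exercise is done in assembly, but the bit twiddling involved looks roughly
like this in C (my own sketch, not the course material):

    #include <stdio.h>
    #include <stdint.h>

    /* Encode one Unicode code point (up to U+10FFFF) as UTF-8.
       Returns the number of bytes written to out. */
    static int utf8_encode(uint32_t cp, unsigned char out[4]) {
        if (cp < 0x80) {              /* 0xxxxxxx */
            out[0] = (unsigned char)cp;
            return 1;
        } else if (cp < 0x800) {      /* 110xxxxx 10xxxxxx */
            out[0] = 0xC0 | (cp >> 6);
            out[1] = 0x80 | (cp & 0x3F);
            return 2;
        } else if (cp < 0x10000) {    /* 1110xxxx 10xxxxxx 10xxxxxx */
            out[0] = 0xE0 | (cp >> 12);
            out[1] = 0x80 | ((cp >> 6) & 0x3F);
            out[2] = 0x80 | (cp & 0x3F);
            return 3;
        } else {                      /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
            out[0] = 0xF0 | (cp >> 18);
            out[1] = 0x80 | ((cp >> 12) & 0x3F);
            out[2] = 0x80 | ((cp >> 6) & 0x3F);
            out[3] = 0x80 | (cp & 0x3F);
            return 4;
        }
    }

    int main(void) {
        unsigned char buf[4];
        int n = utf8_encode(0x20AC, buf);  /* U+20AC EURO SIGN -> E2 82 AC */
        for (int i = 0; i < n; i++)
            printf("%02X ", buf[i]);
        printf("\n");
        return 0;
    }

In the assembly version all of those shifts and masks become explicit register
operations, which is the point of the exercise.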

And I also feel that that guy in the article should have grey hair. It would
give more weight to his argument.

~~~
hmottestad
And try paying more...that way you'll be able to attract the guys who can
program both lower level and higher level ;)

------
onan_barbarian
We're [low-level programmers] not dead, we're pining for the fjords. Low level
programming is a dark art, and there are naturally plenty of people
celebrating how fantastic it is that they don't _have_ to know this stuff.
Don't celebrate your ignorance; you may be 2-4 orders of magnitude off after
your performance is sucked away by all those layers of abstraction.

My take on it is that "low-level stuff" isn't going away any time soon, and
that there will always be an equilibrium reached between those who understand
it and are fluent with it (and, sometimes, are going to be able to write code
that is 5-100x faster) and those who don't.

It's orthogonal to an understanding of algorithms. No amount of bit-bashing
can fix really poor choices of algorithms (N^2 vs NlogN, say, on a big input).
That being said, there are a lot of tasks where everyone is going to land on
the same linear or logN basic algorithm and the main difference is going to be
the algorithm that misses cache frequently and is stuffed with branch
mispredicts and pipeline stalls vs. some algorithm that avoids all these
problems and runs 10x faster. Sometimes you've just got to go do something to
every bit of data and there's no classic algorithmic trick.

For example, I helped a former academic colleague write the 'world's fastest
floating-point minimum' routine using SSE and lots of unrolling/software
pipelining and I think he got about 8-10x (he had a library for most of the
common operations like vector add, mul, etc but it didn't do 'min'). No amount
of rampant algorithmic cleverness would avoid the need to look at each data
element once when trying to calculate the min of a vector and if you're doing
WORSE than linear, you've got real problems.
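
The shape of it, in SSE intrinsics rather than the hand-scheduled assembly (a
rough sketch of the idea, not his actual library code):

    #include <float.h>
    #include <stddef.h>
    #include <xmmintrin.h>   /* SSE: _mm_min_ps, _mm_loadu_ps, ... */

    /* Minimum of n floats, processing 8 elements per iteration. */
    float vec_min(const float *a, size_t n) {
        __m128 min0 = _mm_set1_ps(FLT_MAX);
        __m128 min1 = _mm_set1_ps(FLT_MAX);

        size_t i = 0;
        for (; i + 8 <= n; i += 8) {
            /* Two independent accumulators hide the min instruction's latency. */
            min0 = _mm_min_ps(min0, _mm_loadu_ps(a + i));
            min1 = _mm_min_ps(min1, _mm_loadu_ps(a + i + 4));
        }

        /* Combine the two vector accumulators, then reduce the 4 lanes. */
        __m128 v = _mm_min_ps(min0, min1);
        float lanes[4];
        _mm_storeu_ps(lanes, v);
        float m = lanes[0];
        for (int k = 1; k < 4; k++)
            if (lanes[k] < m) m = lanes[k];

        /* Scalar tail for the leftover elements. */
        for (; i < n; i++)
            if (a[i] < m) m = a[i];
        return m;
    }

The real routine unrolled further and scheduled the loads and mins by hand,
which is where the last factors of speedup came from.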

The major point that I've discovered from doing this stuff for years (on
considerably more complex cases than the example above) is that programming
efficiently for a modern architecture is qualitatively different than
designing algorithms for the abstract '1 operation counts as 1 operation'
machine in your average algorithms textbook. Oddly, quite a bit of improvement
in my scalar, non-parallel programming came about after having used CUDA
fairly intensively - if you're writing code for a Core 2 Duo or beyond, you're
already parallel programming even if you're designing code that's single-
threaded. Understanding how to rethink your algorithm to have data-parallelism
(not to mention using SSE) is just plain conceptually different and more akin
to parallel programming than scalar programming.

Knowing when to do this is important; I like bashing out a quick Python script
as much as the next guy, or perhaps some pretty random C++ STL code that's
probably an order of magnitude from where it should be (not because the STL is
bad but because, say, I've been bone-lazy with design). So all those 'I have
10 layers of abstraction above this level' nincompoops shouldn't get too smug;
we (low-level guys) can go there too - just because we know how to optimize
C/asm loops to within an inch of their lives doesn't mean that we're going to
compulsively do it with every last one.

Also worthy of note - the right thing to do changes frequently; some of the
resources (especially on branch prediction) listed are already out of date.
Don't bring P4 knowledge to a Sandy Bridge fight. Some bit twiddling hacks are
great, others (especially ones that assume multiply is ultra-expensive) are
obsolete on recent x86. The concepts still keep their validity a lot more
than, say, all that newfangled crud that you youngsters fill your heads with
(LAMP stacks, etc.) :-)

~~~
palish
I'm having an extremely hard time finding a remote dev job, as I have years of
C/C++ knowledge and industry experience. If I worked in webdev, I probably
wouldn't say the same.

~~~
psykotic
That probably has as much to do with the expectations and requirements
different parts of the industry have around on-site vs. off-site work as
anything else.

~~~
palish
Is RAD looking for an experienced graphics programmer / etc, by chance?

~~~
psykotic
Sorry, RAD doesn't really hire in the traditional sense.

If you're looking for game development work, I can see how being stuck in St.
Louis must be tough. Good luck!

~~~
palish
Thanks!

Out of curiosity, what's different about RAD's hiring?

~~~
psykotic
RAD's only ten programmers and that number stays more or less constant. The
way everyone is hired is unique and inevitably somewhat odd. But it usually
involves knowing Jeff or Jeff knowing of you for a long time and eventually he
makes an offer when it makes sense.

~~~
palish
That's so cool!

------
yason
How appropriately timed for the Assembly demo compo weekend. But I don't think
these things are special: what's special is that some people just adapt and
immerse fully into the environment while others try to keep their thinking
together and abstract away the machine.

We have bit-twiddling hacks because we have machines that do bit-twiddling. If
we had had a different kind of programming environment, these low-level
hackers would've been creating different tricks of the trade. Probably nobody
ever taught them this anywhere formally, except programmer peers teaching each
other.

It's not the low-level stuff per se, it's the people who go for the low level
no matter what the stuff is, because they have to in order to get the job
done.

------
pjscott
You have not truly done low-level programming until you start worrying about
gate widths and wire delays. (I'm so happy that I don't have to worry about
these things anymore.)

------
alextp
I think google hires a lot of low-level systems programmers, as should anyone
running thousands of datacenters. After all, the power and network savings you
can get from bit-twiddling, when multiplied by the number of computers in
these networks, are worth quite a lot in dollars.

~~~
nickik
They have a lot of compiler writers for that reason too.

------
forrestthewoods
Don't forget that AltDevBlogADay is a game developer blog and games have very
different requirements than most other software.

~~~
_delirium
Even there it really depends on the kind of game (and target platform). I
would say that for a lot of indie games, worrying about low-level programming
shouldn't be high on your list. In fact many indie-game developers are exactly
the kinds of people who _do_ love to worry about such things, when perhaps
their time would be better spent making sure anybody wants to play their game,
and/or finding a way to market it--most indie games don't fail because of
bad _performance_.

An exception might be if the game's core gameplay mechanic is fundamentally
based around doing some trick on a console or handheld device that can't be
done without careful optimization.

------
salem
I was in the last class at my CS/CompEng college to go through the program
with C as the core teaching language. I had to tutor operating systems
to classes a year after me that had learned mostly Java. There was a definite
loss of understanding of what was going on under the hood. Blank stares on
questions on calling conventions and where variables are stored, let alone
virtual memory. I still wonder how those students can write efficient higher
level code if they don't at least have a general understanding of what happens
at lower levels. A good example of this is choosing a web framework in Python. In
a previous job I had co-workers that wanted to combine an async framework with
a synchronous DB API, in a single thread, and expected high performance.

------
rbanffy
Spoiled kid ;-)

When I got my first job I had to deal with BASIC in ROM and 6502 assembly.
Aztec C on the Apple II was very slow and it wasn't possible to tap the faster
routines in ROM because it messed up page zero completely.

Still, I can't complain. I loved the 6502 and the routines built into the
Apple II+ ROM were incredibly efficient.

------
namank
This is exactly why Computer Architecture should be a mandatory course for
Computer Engineering and EE with CS.

------
helwr
Not dead - this blog is a great example of low level and high level living in
harmony: [http://mechanical-sympathy.blogspot.com/2011/07/memory-barri...](http://mechanical-sympathy.blogspot.com/2011/07/memory-barriersfences.html)

------
wmat
Thanks for this post, I'd love to see more content like this on HN.

------
michaelochurch
I think what's changing from a business perspective is what kills software
projects. What keeps a project manager up at night?

1980: Being too slow to run, or not fitting into an amount of memory we'd
consider tiny, is what kills projects. Large programs are relatively uncommon
because the default languages (C and assembler) are so low-level that writing
them is pretty much impossible. Programs exist to solve defined problems, not
to be million-line ecosystems. In this world, being a decent programmer means
you have to know about things like the performance trade-offs of pre-increment
vs. post-increment.

2011: It's rare that a program, unless it's using O(n^2) sorting algorithms on
large sets, is actually too slow to be useful. What kills software projects is
illegible code. Paying a performance penalty for readability is generally a
good idea, and highly-optimized but illegible code has fallen out of favor.
This also means that fewer people are learning how to write such code, because
people encounter less of it to read.

~~~
vilya
I agree with you, except for the dates. It seems to me that worrying about
programmer productivity over performance is a phase that we're coming out of.
Mobile devices, with less memory and slower CPUs & GPUs than the desktop PCs
we're used to, are the hardware platform du jour. Also, more people are having
to worry about scalability which goes hand in hand with performance: it's not
_just_ about smart algorithm choices, kids...

~~~
salem
Agreed. Also, mobile devices tend to suck at floating point (no floating-point
unit in ARM CPUs), and iOS will kill your app if it is a memory hog.

------
Cushman
Well _yeah_. What the hell are we doing as programmers punching tape for a
Turing machine? That's robot work!

Our abstractions are still leaky, but they're getting better. Soon enough
they'll be good enough that you mostly don't have to think about what's
happening in the physical box at all— just like you shouldn't have to think
about different kinds of wood when you're laying out a housing development.

~~~
adestefan
You know that the Turing machine is the ultimate abstraction in CS, right?

~~~
Cushman
I'd say that depends on what you mean by "ultimate". The Turing machine is the
_most fundamental_ abstraction in CS, sure. But it's not _that_ much of an
abstraction of my laptop. My laptop has state, it has instructions, each
instruction takes it from one state to another. It doesn't have infinite
resources, but for most of my purposes it might as well.

So since my metaphor was obviously unclear, the point I'm getting at is _why_
would I want to be "punching tape" (writing assembly or C instructions) for a
"Turing machine" (a physical computer with unlimited resources that executes
one — okay, two — instructions at a time) when I could instead tell a robot (a
high-level programming language) what I _want_ it to do, and let the robot
figure out the individual instructions to make that happen?

And just for future reference, "You know that #{thing I assume you don't
know}, right?" is a really douchey way to pretend to try to educate somebody.

~~~
arethuza
I would say that TMs are _one_ of the most fundamental abstractions in CS -
but as you note, for an abstraction they are actually quite concrete. I would
perhaps argue that the lambda calculus is a slightly more fundamental
abstraction - even though of course they are equivalent in "power" to TMs.

~~~
Cushman
I see where you're coming from with that, although personally I mostly think
of them as different ways of considering the same thing. A Turing
machine asks the question of what a computer can be, while lambda calculus
asks the question of what a computer _program_ can be. You could also think of
the TM as an abstraction of the imperative paradigm, and LC as an abstraction
of the functional paradigm. We know they're theoretically equivalent, but
they're different ways of conceptualizing it, and probably trying to figure
out which is more basic is not a good use of our time :P

