

The software development final exam: Computer Architecture and Operating Systems - cperciva
http://www.daemonology.net/blog/2012-10-09-software-development-final-exam-part-2.html

======
philbarr
For those who are reading this and are new to software development, please
don't let these questions put you off or take them as some sort of test of
your skills. The questions pertain only to the poster's particular CS degree
and to his particular experience, which for some reason he thinks is the only
CS degree and experience that is valid.

Also note that you will come across people like this quite a lot in the
software industry and you have to be careful not to let them get under your
skin. They can be very difficult to work with. If you are a team lead and
identify someone like this on your team then it can be very difficult to place
them and you often have to gently coerce them into their own "special" area
where they feel their talents are best used.

~~~
jaimzob
Hear, hear. Obsession with trivia and a supercilious tone are sure signals of
a B-player in my experience. "Name and describe the four states in MESI cache
coherence"? Sorry, memorising an algorithm for determining if a graph is
bipartite pushed that knowledge clean out of my head...no doubt I'm very
stupid.

~~~
waterlesscloud
He's definitely not a B-player, but that makes it all the more mysterious why
this project is missing the target so badly.

I don't think the questions are awful, though they do tend to have a trivia
component to them.

What I think has really happened is that the whole thing is completely
mispackaged. By calling it a "software development final exam" and saying it's
things every programmer should know, he's set up an idea that it would be
fairly broad and comprehensive, when in reality it's fairly narrow.

Plus that terminology is a bit socially off-key since it sets people up to be
defensive rather than engaging in meaningful discussion.

~~~
jaimzob
Indeed, presenting this more as a 'guide to interesting things you should know'
rather than 'here's an exam I've condescended to set for everyone dumber than
me' would have worked out much better.

Maybe B-player is harsh but in my career I've worked with some very good
programmers and some real stars. When I talk to the very good programmers I
always feel dumber, but whenever I talk to the real stars I always feel
smarter myself (though heaven knows that's not actually true). This seems to
fall firmly in the "very good programmer" category.

------
peteretep
I am enjoying these, but as a poster on your previous thread predicted, you're
moving further away from anything resembling what 99% of today's programmers
ever see...

Need a car analogy? Fine: you pitched this as "what every driver should know
about driving", and you're talking about the intricacies of combustion
engines. That is to say, it's interesting and may pin together some pieces of
other knowledge, but it's of no real practical use to anyone who doesn't also
fit into the "and I'm a mechanic" demographic.

~~~
antidoh
"What every pilot should know" might be a better analogy. Pilots are more
directly responsible for keeping their vehicle within safe conditions, and to
do that they need to know more about their vehicles than car drivers.

Programmers affect and determine the operation of the computers their code
runs on much more than a driver driving inside the lines.

~~~
randomdata
I really like the first analogy, actually. A driver who knows all the details
of how each engine works will be able to choose the car that provides the best
fuel economy and performance, but in the end, the driver's status will really
only be judged on the style and features of the car they choose.

i.e. the developer who chooses the slower zeroarray algorithm that still
completes in reasonable time, but provides a better user interface and better
feature set is still the better software developer (but perhaps the poorer
computer scientist).

------
cperciva
Some FAQs: I'll be posting the third and fourth parts at 9 AM UTC on Wednesday
and Thursday (and submitting the links to HN).

I got just over 100 responses to part 1, and I'm almost halfway through
grading them. I'll be sending out a batch of emails (form emails, I'm afraid,
to save time) once I've caught up with the part 1 grading.

I will be posting my answers, my grading scheme, and an analysis of the
results next week.

I will not disclose any individual's performance (except to them) without
their permission.

~~~
jsnell
Thanks for taking the time to do this, I think it's an interesting experiment
(even if one can't read too much into the results due to the sampling bias).

------
thaumaturgy
Colin, I've been programming for about 25 years now, in more languages and
architectures than I can ever readily remember, and you're making me feel like
a noob.

Thank you for that. :-)

~~~
nandemo
That's the spirit! I don't understand why people get so defensive.

I have a BS and a master's degree in CS and I didn't know about MESI cache
coherence. And I've never run into it in 7+ years of professional experience.
But that doesn't mean it's a useless question. I don't feel compelled to learn
all about it right now, but at least I'll get acquainted with the Wikipedia-
level facts. One day I might study it further.

~~~
michael_h
I have liked these questions, even if I think a few of them are a little
narrow. It's certainly sparked a bit of motivation for me. However, the intro
blog post is worded very poorly, which is what I'm assuming is causing a lot
of the problems. It reads less as a call to personal improvement and more as a
shaming/institutional snobbery - whether that was intended or not.

------
mmcnickle
I think you have fallen into a trap by pursuing your goal of wanting
"questions which are easy to mark as right or wrong". You end up writing
questions that are extremely narrow in scope and don't provide any opportunity
for any interesting discussion. Anyone could memorise the states in MESI cache
coherence, but does that really test their understanding of the material? That
so many of the answers are one google away shows how shallow these questions
are. All you're doing here is testing knowledge retention.

I'll admit that I couldn't answer many of these questions off hand; I hope the
criticism doesn't come across as bitter.

~~~
randomdata
I did get the feeling the questions were chosen, perhaps unintentionally, to
create a natural bias towards those who have a formal CS education.

Contrived example: Someone who came up with a quicksort algorithm
independently, and is completely aware of its operating conditions, but chose
to call it mysort instead, would be unable to answer the question in the last
test despite having all the theoretical knowledge needed.

There is, of course, still something to be said about being able to use
terminology that is shared with others, but that changes the nature of the
test in ways I'm not sure the author intended.

~~~
JoeAltmaier
A good point, considering a large portion of practicing programmers are self-
taught.

~~~
iamtempleton
Is this true in 2012? I would love to see a study or something. Not doubting
you, just curious.

------
ww520
There isn't much relationship to OS concepts. The lock question can arguably
be related, but it's more about concurrency. They are pretty lightweight
questions.

OS questions would involve process/thread/fiber, scheduling, priority,
multitasking, memory management, virtual memory, IPC, deadlock, distributed
deadlock, interrupt/signal, input system, output system, GUI system, file
system, disk system, IO caching, driver model, kernel space and user space,
privilege and non-privilege operation, security model, user/process permission
and privilege, file permission, authentication, authorization, etc.

~~~
cperciva
I was aiming for questions relevant to general-purpose developers, not
questions relevant to operating system developers -- and a small number of
questions at that.

I covered locking, race conditions, virtual memory, and processes; I think
those are much more relevant to most developers than knowing the internals of
how filesystems work or how interrupts are routed.

------
Ensorceled
I graduated with good marks from a great school. And so far I'd have been, at
best, 7 out of 10 on graduation day. After 20 successful years in the
industry, I might get 4 or 5 depending on time constraints since I'd have to
figure out some stuff from first principles.

Some of these questions aren't even necessarily relevant in getting a computer
science degree let alone software development.

------
wrath
I don't know if I should feel "okay" about not knowing the answers or if I
should feel "dumb". Is this something a system programmer today really needs
to know?

~~~
jaimzob
No. They can be interesting things to read about, and knowing a little about
the themes each question comes from (e.g. how a processor cache works)
will make you a better programmer, but don't worry about cramming your head
with trivia such as "what does MESI stand for?"

------
geebee
So far, I've been enjoying reading these questions. However, I think that the
format prevents using them to answer what _might_ be a more interesting
question - how long would it take someone with a different (non-CS) academic
background to answer the question (with understanding, obviously, not through
cut and paste). In other words, rather than asking whether someone knows the
answer, ask how well prepared they are to research and understand the
question.

~~~
cperciva
You're misunderstanding the purpose of these questions. It isn't to determine
whether you are able to figure out the answers to these questions -- as many
people have pointed out, some of these are simple "type into google"
questions.

The point is that knowing the answers or being able to figure them out on your
own is, I believe, strongly correlated with having a good understanding of the
entire area the question is drawn from.

~~~
geebee
I didn't write the questions, so the purpose of them really isn't up to me.
However, I don't really agree that they are simple "type into google"
questions.

For example, suppose you let someone google around for a half hour to try to
answer the question about how to determine if a graph is bipartite. Now, you
spend about a half an hour doing an oral exam to see how well they understand
what they just regurgitated.

I suspect that you would see a wide range of performance, but that people with
certain academic backgrounds might do much better than others. That's the
"more interesting question" that I had in mind.

Actually, suppose someone who had never taken graph theory came up with a novel
but ultimately flawed attempt at an algorithm. That might be a stronger sign of
talent in this area than someone who had taken the class and was able to
reproduce an algorithm (even if that student showed a genuine understanding of
it).

~~~
cperciva
I did say that _some_ of the questions are "type into google" questions. The
bipartite-graph question isn't -- but most people will have never seen that
particular question in class, either. That one is a "can you take material you
should know and come up with something new" question... just like the TLB
question in part 2 and another question in part 3.
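
For readers who have not seen it, one standard way to test whether a graph is bipartite is to try to 2-color it with a breadth-first search. The sketch below is only an illustration and not necessarily the answer the exam was looking for; the function name `is_bipartite` and the adjacency-matrix input are assumptions chosen for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if the undirected graph on n vertices is bipartite.
 * adj is an n*n adjacency matrix stored row by row: adj[u * n + v] != 0
 * means there is an edge between u and v. (Hypothetical representation.) */
bool is_bipartite(const unsigned char *adj, size_t n)
{
    if (n == 0)
        return true;

    int *color = malloc(n * sizeof *color);     /* -1 = unvisited, else 0 or 1 */
    size_t *queue = malloc(n * sizeof *queue);  /* each vertex is queued at most once */
    assert(color != NULL && queue != NULL);
    for (size_t v = 0; v < n; v++)
        color[v] = -1;

    bool ok = true;
    /* Start a search from every still-unvisited vertex so that
     * disconnected components are covered too. */
    for (size_t start = 0; start < n && ok; start++) {
        if (color[start] != -1)
            continue;
        size_t head = 0, tail = 0;
        color[start] = 0;
        queue[tail++] = start;
        while (head < tail && ok) {
            size_t u = queue[head++];
            for (size_t v = 0; v < n; v++) {
                if (!adj[u * n + v])
                    continue;
                if (color[v] == -1) {
                    color[v] = 1 - color[u];  /* neighbour gets the other color */
                    queue[tail++] = v;
                } else if (color[v] == color[u]) {
                    ok = false;               /* edge inside one color class */
                    break;
                }
            }
        }
    }
    free(color);
    free(queue);
    return ok;
}
```

With an adjacency matrix this takes O(n^2) time; an adjacency-list version of the same idea would run in time linear in vertices plus edges.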

~~~
geebee
I really don't think any of the questions are "type into google". For
instance, take the question of the run time of quicksort. Someone could google
this, but how well would their answer stand up to the slightest bit of probing
if they didn't understand it?

For instance, suppose someone doesn't really remember the quicksort algorithm,
but looks it up and is quickly able to determine the run time by analyzing the
algorithm. To me, that's pretty much as good as knowing the algorithm's run
time off hand. Maybe even better. For all I know, if you changed the question
just slightly and asked if the run time has changed, the first student has shown
the ability to analyze the run time of an algorithm - the second student's
ability to do this is still unproven.
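
As a rough illustration of "determining the run time by analyzing the algorithm", here is a small C sketch that counts comparisons in one common quicksort variant. The pivot rule (last element, Lomuto partition) and the names `quicksort_counted` and `count_quicksort_comparisons` are assumptions made for this example; the exam question does not prescribe them.

```c
#include <assert.h>
#include <stddef.h>

/* Incremented once for every element-versus-pivot comparison. */
static unsigned long comparisons;

static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Quicksort using the last element as the pivot (Lomuto partition). */
void quicksort_counted(int *a, size_t n)
{
    if (n < 2)
        return;
    int pivot = a[n - 1];
    size_t store = 0;                    /* next slot for an element < pivot */
    for (size_t k = 0; k + 1 < n; k++) {
        comparisons++;
        if (a[k] < pivot)
            swap_int(&a[k], &a[store++]);
    }
    swap_int(&a[store], &a[n - 1]);      /* move the pivot to its final place */
    quicksort_counted(a, store);
    quicksort_counted(a + store + 1, n - store - 1);
}

/* Sorts a in place and returns how many comparisons were made. */
unsigned long count_quicksort_comparisons(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    comparisons = 0;
    quicksort_counted(a, n);
    return comparisons;
}
```

With this pivot rule, an already-sorted array of n distinct values splits into n-1 and 0 elements at every level, giving (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons (28 for n = 8). Changing the pivot rule changes which inputs hit that quadratic case, which is the kind of follow-up question described above.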

~~~
cperciva
_how well would their answer stand up to the slightest bit of probing if they
didn't understand it?_

These aren't necessarily the questions I'd use in an interview -- I don't have
the luxury of reading someone's answer and then probing further.

------
LinaLauneBaer
Can someone explain why zeroarray2 is so much slower than zeroarray1
(regarding question 2)? I have profiled the two functions and it turns out that
the inner for-loop and the line which zeros out the entry in the array are much
slower... (optimisations turned off). The only real difference is the name of
the variables. Right? Has this something to do with memory alignment?

~~~
andyjohnson0
In zeroarray1 the address to be zeroed goes linearly from 0 to (1024^2)-1.
This generates fewer cache misses because the address is much more likely to
already be in the cache, which is in turn because the address only differs by
one from the previous value of the address.

zeroarray2 jumps around (0, 1024, 2048, 3072, ..., 1, 1025, 2049, 3073, ...)
in a way that intentionally makes life difficult for the cpu's cache manager,
generating lots of cache misses.
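
A minimal C sketch of the two access patterns being described. The exam's exact code is not quoted in this thread, so the function bodies, the names `zero_sequential` and `zero_strided`, and the constant `N` are assumptions reconstructed from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024  /* assumed dimension, matching the stride of 1024 described above */

/* Sequential pattern: the index goes 0, 1, 2, ..., so consecutive writes
 * fall on adjacent addresses and usually on the same cache line. */
void zero_sequential(double *A)
{
    assert(A != NULL);
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            A[i * N + j] = 0.0;
}

/* Strided pattern: the index goes 0, N, 2N, ..., then 1, N+1, ..., so
 * consecutive writes are N doubles apart and land on different cache lines. */
void zero_strided(double *A)
{
    assert(A != NULL);
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            A[i * N + j] = 0.0;
}
```

Both functions write exactly the same N*N elements; only the order differs. Timing them on the same buffer with optimisations off is one way to see the effect, though the size of the gap will depend on the machine.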

~~~
dagw
Also worth noting that this is language dependent. What you say is true in C
(which is admittedly what the question was about), but not true in for example
Fortran.

edit: Ignore. I'm wrong, see below

~~~
andyjohnson0
I'm not doubting what you say, but could you elaborate on why Fortran is
different?

I could understand a difference for two-dimensional (or more) arrays, where
different languages lay-out the array contents in memory differently. Does
Fortran lay-out the contents of one dimensional arrays in an unusual way?

~~~
dagw
Sorry, I screwed up. I skimmed the question and misremembered it when making
my post. I thought the question looked like:

    
    
      for(i=0...
        for(j=0...
          A[i][j]=0;

vs

    
    
      for(i=0...
        for(j=0...
          A[j][i]=0;

In which case C vs Fortran makes a difference to the way multidimensional
arrays are stored in memory.
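
To make the layout difference concrete, here is a small C sketch of where element (i, j) of a rows-by-cols array ends up under each convention. The helper names are hypothetical, and zero-based indices are used for both so the formulas can be compared directly (Fortran arrays are 1-based by default).

```c
#include <assert.h>
#include <stddef.h>

/* Row-major order (C): the elements of one row are adjacent in memory,
 * so stepping j moves by 1 element and stepping i moves by cols. */
size_t row_major_offset(size_t i, size_t j, size_t rows, size_t cols)
{
    assert(i < rows && j < cols);
    return i * cols + j;
}

/* Column-major order (Fortran): the elements of one column are adjacent,
 * so stepping i moves by 1 element and stepping j moves by rows. */
size_t col_major_offset(size_t i, size_t j, size_t rows, size_t cols)
{
    assert(i < rows && j < cols);
    return j * rows + i;
}
```

So the loop nest whose inner index is the last subscript walks memory sequentially in C, while in Fortran it is the nest whose inner index is the first subscript.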

------
psykotic
I sent in answers to help with your survey, but I'm highly skeptical of the
statistical methodology. You are selecting for people who at least think they
know the answers. My prediction is that you are going to see a rosy picture of
the state of CS knowledge among software developers, even though the reality
is probably closer to the exact opposite.

~~~
cperciva
I'm asking everybody to submit answers, and there are plenty of people
submitting "I don't know"s.

~~~
tmoertel
Even though you _asked_ everybody to submit answers, you should expect people
who expect themselves to do poorly to opt out. So your sample, I suspect, is
strongly biased toward the answers from people who expected not to do poorly.
And that bias is indeed likely to paint a rosy picture of the reality.

~~~
cperciva
Quite true -- I'll definitely have to be careful in how I interpret the data.
I'm hoping that the issues I'm most interested in -- comparing how people do on
different questions, and comparing how different people do -- will be less
strongly biased by the poor sampling method.

------
bentoner
What's the best textbook for this material (caching, concurrency etc.)?

~~~
mdkess
I found this <http://www.akkadia.org/drepper/cpumemory.pdf> to be a very good
starting point for getting into serious low level development. It's a
difficult read, so go through it slowly and with a pad of paper for notes, but
very information dense and useful.

